Backing up remote servers with rsnapshot

Piggybacking on existing tools for easy backups

James Booth

To back up my remote Linux hosts, I use a tool called rsnapshot. This utility is a wrapper for the venerable rsync (think robocopy on steroids if you’re from the Windows world) and leverages SSH as a secure transport mechanism. It can back up local paths as well as remote ones.

The strength of using rsnapshot over plain old rsync is that rsnapshot creates snapshots (who would have guessed?) and rolls them over into a defined number of recovery points. After the first full backup, each subsequent backup is incremental: only changed files are transferred and stored, while unchanged files are hard-linked to the copy already held in the previous snapshot (a hard link is a second directory entry pointing at the same data on disk). This saves bandwidth when transferring data and saves local storage, as unchanged data is never duplicated.
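
To make the hard-linking concrete, here is a small sketch that fakes two snapshots by hand with ln rather than using rsnapshot itself. The folder and file names are made up for illustration, and the "changed" file is simply written out again, which is a simplification of what rsync does.

```shell
#!/bin/sh
# Hand-rolled illustration of hard-linked snapshots (this is not rsnapshot itself).
set -eu
work="$(mktemp -d)"

# Pretend "first full" snapshot: two real files.
mkdir "$work/hourly.1"
echo "never changes" > "$work/hourly.1/static.txt"
echo "version 1"     > "$work/hourly.1/changing.txt"

# Pretend "next" snapshot: the unchanged file is hard-linked,
# the changed file is written as a brand new file.
mkdir "$work/hourly.0"
ln "$work/hourly.1/static.txt" "$work/hourly.0/static.txt"
echo "version 2" > "$work/hourly.0/changing.txt"

# Print the inode number of a file (first column of ls -i).
inode() { ls -i "$1" | awk '{print $1}'; }

# Same inode means both names point at one copy of the data on disk.
if [ "$(inode "$work/hourly.0/static.txt")" = "$(inode "$work/hourly.1/static.txt")" ]; then
  static_shared=yes
else
  static_shared=no
fi
if [ "$(inode "$work/hourly.0/changing.txt")" = "$(inode "$work/hourly.1/changing.txt")" ]; then
  changing_shared=yes
else
  changing_shared=no
fi

# The older snapshot still holds the older version of the changed file.
old_version="$(cat "$work/hourly.1/changing.txt")"

echo "static.txt shared between snapshots:   $static_shared"
echo "changing.txt shared between snapshots: $changing_shared"
echo "changing.txt in the older snapshot:    $old_version"
rm -rf "$work"
```

The unchanged file takes up space once but appears in both snapshots, while the changed file exists as two independent copies.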

Additionally, when setting up rsnapshot, we configure backup intervals which are used to roll backup points over in a Grandfather-Father-Son style rotation. More on this below.

Before I begin: both man rsnapshot and the howto are great places to start.

Backup intervals

I’ll cover configuring backup intervals in the config below, but I wanted to quickly explain how the backup intervals work in rsnapshot.

Backup intervals define snapshot levels that ‘roll over’ after a defined limit, and we must pass the backup interval we wish to run to the rsnapshot command. The most common backup intervals are hourly, daily, weekly and monthly, but they can be named anything, so go with what has the most meaning to you.

For my example, I’ll be using the following backup intervals:

Backup interval    Points to retain
hourly             24
daily              7
weekly             4
monthly            3

Running a backup interval is as simple as defining it in the config file, then running rsnapshot -c <config file> <backup interval, e.g. hourly>.

Snapshot names

The backups themselves will be stored within the defined snapshot_root under a folder named after the backup interval and the current number in the chain, starting at 0.

Upon the first run of the hourly backup, we would have an hourly.0 folder. Upon the second run, hourly.0 would be renamed to hourly.1 and the most recent backup would become hourly.0. This keeps happening until we hit the limit we have defined, at which point the oldest snapshot (hourly.23 in our case) is deleted and all of the other snapshots are rolled over again.

If we only ever ran the hourly backup, we’d only end up with hourly.x folders, which isn’t much good. To roll the hourly backups over into the next interval, we need to run rsnapshot again, this time specifying daily as the backup interval. This takes the oldest snapshot of the next most frequent interval (hourly.23 in our case) and moves it to daily.0.

Rinse and repeat this process with each interval at the desired time, and they’ll all do the job of pulling from the lower interval and rolling over their own backups, up to the defined limit for that interval.
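
The rotation described above can be sketched with nothing but empty folders. This is a toy model under my own assumptions, not rsnapshot's code: the function names (rotate, lowest, higher) are hypothetical, no data is backed up, and the limits are shrunk to 3 hourly and 2 daily snapshots to keep the output short.

```shell
#!/bin/sh
# Toy model of the rotation using labelled folders (no real backups are taken).
set -eu
root="$(mktemp -d)"

# rotate <interval> <retain>: delete the oldest snapshot, shift the rest up by one.
rotate() {
  rm -rf "$root/$1.$(($2 - 1))"
  i=$(($2 - 2))
  while [ "$i" -ge 0 ]; do
    if [ -d "$root/$1.$i" ]; then
      mv "$root/$1.$i" "$root/$1.$((i + 1))"
    fi
    i=$((i - 1))
  done
}

# lowest <interval> <retain> <label>: rotate, then create a fresh .0 snapshot.
lowest() {
  rotate "$1" "$2"
  mkdir "$root/$1.0"
  echo "$3" > "$root/$1.0/label"
}

# higher <interval> <retain> <lower interval> <lower retain>:
# rotate, then move the oldest snapshot of the lower interval into .0.
higher() {
  rotate "$1" "$2"
  if [ -d "$root/$3.$(($4 - 1))" ]; then
    mv "$root/$3.$(($4 - 1))" "$root/$1.0"
  fi
}

lowest hourly 3 run1
lowest hourly 3 run2
lowest hourly 3 run3
higher daily 2 hourly 3   # hourly.2 (run1) becomes daily.0
lowest hourly 3 run4

# Show which run ended up in which snapshot folder.
for dir in "$root"/*; do
  echo "$(basename "$dir"): $(cat "$dir/label")"
done

daily0="$(cat "$root/daily.0/label")"
hourly0="$(cat "$root/hourly.0/label")"
hourly2="$(cat "$root/hourly.2/label")"
rm -rf "$root"
```

After these five runs the folders read daily.0: run1, hourly.0: run4, hourly.1: run3 and hourly.2: run2. The oldest hourly snapshot was not deleted; it was promoted to daily.0 because the daily run happened before the next hourly one.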

PLEASE NOTE: rsnapshot does not perform any scheduling of backups itself, so we’ll hand that over to cron. This means we have to ensure that the hourly backup runs every hour, the daily backup runs once per day, and so on. This is demonstrated below, under the Running the backup section.

Identifying a snapshot

As I’ve covered, the .0 backup is the most recent, with each previous backup being incremented by 1 until the limit is hit.

So, to identify a daily snapshot that was 3 days ago, we would look for daily.2, as…

  • daily.0 is yesterday’s - 1 day ago.
  • daily.1 is the day before yesterday - 2 days ago.
  • daily.2 is the day before that - 3 days ago.

But after all that, the easiest way to identify which backup point is which is to ls -l the snapshot_root directory: the date shown against each folder is when that backup job ran.

Setup

Enough about how it works; let’s get it working.

On the host to be backed up

  • Create a local backup user on the remote system called rsnapshot.

    useradd --user-group --shell /bin/bash --create-home rsnapshot
    
  • Set a secure password for the user.

    passwd rsnapshot
    
  • Allow this user to use sudo without a password, but only for the commands it requires (rsync and mysqldump in my case).

    sudo visudo
    # Add the following line
    rsnapshot  ALL=(ALL) NOPASSWD: /usr/bin/rsync,/usr/bin/mysqldump
    
  • For backing up MySQL: create .my.cnf in the rsnapshot user’s home directory, ensuring only the rsnapshot user has read access.

    [client]
      user = '<mysql user>'
      password = '<mysql password>'
    
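  • Create a folder on this host to stage the mysqldump output (e.g. /var/cache/rsnapshot or somewhere in the home directory) and make sure the rsnapshot user can read and write to it. Whichever path you pick, it must match the one used in the backup script further down.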
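
As a rough sketch of the last two steps, the following creates the .my.cnf and a staging folder with owner-only permissions. TARGET_HOME is a made-up variable for this example; it defaults to a scratch folder so the sketch can be tried safely, and would be pointed at the rsnapshot user's home directory for real use. Putting the staging folder under the home directory and giving it mode 700 are my assumptions, not requirements.

```shell
#!/bin/sh
set -eu
# TARGET_HOME is a hypothetical variable; without it we work in a scratch folder.
target_home="${TARGET_HOME:-$(mktemp -d)}"

# Restrictive umask so nothing created here is ever group/world readable.
umask 077

# MySQL client credentials (placeholders left exactly as in the steps above).
cat > "$target_home/.my.cnf" <<'EOF'
[client]
user = '<mysql user>'
password = '<mysql password>'
EOF
chmod 600 "$target_home/.my.cnf"

# Staging folder for the dump, accessible only by its owner.
staging="$target_home/mysqldump-staging"
mkdir -p "$staging"
chmod 700 "$staging"

# Report the resulting permission strings.
cnf_mode="$(ls -l "$target_home/.my.cnf" | cut -c1-10)"
staging_mode="$(ls -ld "$staging" | cut -c1-10)"
echo ".my.cnf: $cnf_mode"
echo "staging: $staging_mode"
```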

On the server receiving the backups

  • First of all, we’ll need to install rsnapshot via our package manager.

  • Next, generate the keys we’ll be using for passwordless authentication by running ssh-keygen and specifying a path (eg. ./keys/<remote host>/id_rsa).

  • It is usually advisable to ensure that the ssh key files have the following permissions:

    • SSH folder (.ssh/) - 700 (drwx------)
    • Public key (id_rsa.pub) - 644 (-rw-r--r--)
    • Private key (id_rsa) - 600 (-rw-------)
  • Once we have keys created, we will need to transfer them to the remote host for key-based authentication.

    ssh-copy-id -i <path to generated keys>/id_rsa rsnapshot@<remote server>
    
  • We can test this by SSHing into the remote server.

    ssh -i <path to generated keys>/id_rsa rsnapshot@<remote server>
    
  • Now that we have access to the server, it’s time to configure rsnapshot.
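
The key permissions listed in the steps above can be applied in one go. A minimal sketch: secure_key_dir is a hypothetical helper name, and the demo runs against empty stand-in files in a scratch folder rather than real keys.

```shell
#!/bin/sh
set -eu
# secure_key_dir <folder>: hypothetical helper that applies the recommended
# permissions to a folder holding id_rsa and id_rsa.pub, then lists the result.
secure_key_dir() {
  chmod 700 "$1"
  chmod 600 "$1/id_rsa"
  chmod 644 "$1/id_rsa.pub"
  ls -ld "$1" "$1/id_rsa" "$1/id_rsa.pub"
}

# Demo against empty stand-in files; real keys would come from ssh-keygen.
demo="$(mktemp -d)"
: > "$demo/id_rsa"
: > "$demo/id_rsa.pub"
secure_key_dir "$demo"

dir_mode="$(ls -ld "$demo" | cut -c1-10)"
private_mode="$(ls -l "$demo/id_rsa" | cut -c1-10)"
public_mode="$(ls -l "$demo/id_rsa.pub" | cut -c1-10)"
rm -rf "$demo"
```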

rsnapshot configuration

rsnapshot comes with a default config at /etc/rsnapshot.conf and will use it if no config file is explicitly defined. Instead, we will create our own config and pass it to rsnapshot with the -c <config file> argument, as this lets us keep separate configs for different servers.

I have tweaked the default rsnapshot config for my usage and I’ll go through it piece by piece below.

PLEASE NOTE: The whitespace between a config key and its value MUST be tabs, otherwise rsnapshot will complain and refuse to run. To verify a config before use, run rsnapshot -c <config file> configtest. You can also perform a dry run of any backup interval with rsnapshot -c <config file> -t <backup interval, e.g. hourly>.
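
Since editors love to swap tabs for spaces silently, a quick pre-check can save some head-scratching. The sketch below is only a rough heuristic of my own (lines_without_tabs is a hypothetical helper): it flags any non-comment line with no tab in it at all, so it will miss a line that mixes tabs and spaces. rsnapshot's own configtest remains the real check.

```shell
#!/bin/sh
set -eu
# lines_without_tabs <file>: print (with line numbers) every non-blank,
# non-comment line that contains no tab character.
lines_without_tabs() {
  awk '
    /^[ \t]*#/ || /^[ \t]*$/ { next }   # skip comments and blank lines
    !/\t/ { print NR ": " $0 }          # anything left needs at least one tab
  ' "$1"
}

# Demo config: one good line, one comment, one line that uses spaces.
conf="$(mktemp)"
printf 'config_version\t1.2\n' >  "$conf"
printf '# a comment\n'         >> "$conf"
printf 'retain hourly 24\n'    >> "$conf"

bad_lines="$(lines_without_tabs "$conf")"
echo "$bad_lines"
rm -f "$conf"
```

For the demo file this reports line 3, the retain entry written with spaces.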

  • General config

    # rsnapshot.conf - rsnapshot configuration file
    # =================================
    config_version  1.2
    
    # All snapshots will be stored under this root directory.
    # =================================
    snapshot_root   <backup root>/backups
    
    # Verbosity & logging
    # =================================
    verbose         3
    loglevel        3
    logfile         <logs>/rsnapshot.log
    
    # Binary locations
    # =================================
    cmd_cp          /bin/cp
    cmd_rm          /bin/rm
    cmd_rsync       /usr/bin/rsync
    cmd_ssh /usr/bin/ssh
    cmd_logger      /usr/bin/logger
    cmd_du          /usr/bin/du
    #cmd_rsnapshot_diff     /usr/bin/rsnapshot-diff
    #cmd_preexec    /path/to/preexec/script
    #cmd_postexec   /path/to/postexec/script
    
    #ssh_args        
    #rsync_short_args       -a
    #rsync_long_args        --delete --numeric-ids --relative --delete-excluded
    

    Here we define the path to where we want to store our backups (snapshot_root), the logging info and the paths to any binaries that will be used in the process. I’ve also increased the verbose and loglevel to display more information.

    I’ve left in ssh_args, rsync_short_args and rsync_long_args (commented out) in case I want to tweak them later, as these are passed through to the underlying ssh/rsync calls.

  • Backup intervals

    # Backup intervals
    # =================================
    # Must be unique and in ascending order
    # i.e. hourly, daily, weekly, etc.
    retain          hourly  24
    retain          daily   7
    retain          weekly  4
    retain          monthly 3
    

    As I covered at the start of this post, these backup intervals are what define the different increments of backups. The most frequent should go at the top of the list and the most infrequent will be at the bottom, with backups cascading down the list.

  • Backup points

    # Backup points
    # =================================
    backup  rsnapshot@<remote server>:/home/              backup/
    backup  rsnapshot@<remote server>:/etc/               backup/
    backup  rsnapshot@<remote server>:/var/www/           backup/
    backup  rsnapshot@<remote server>:/usr/local/         backup/
    

    These are the paths we wish to back up. In this example, we’re backing up a remote host, but if we removed the rsnapshot@<remote server>: prefix, it would back up the same paths on the local machine instead. They will all be saved into a subdirectory called backup inside each snapshot under the snapshot_root.

    There are some more options to exclude specific files & folders, but I have not included any in mine as I want it as-is.

  • Backup scripts (optional)

    # Backup scripts
    # =================================
    # Dump the mysql backup to /var/cache/rsnapshot_remote/mysqldump (on remote host)
    backup_script   /usr/bin/ssh -i <path to id_rsa> rsnapshot@<remote server> '/usr/bin/mysqldump --all-databases > /var/cache/rsnapshot_remote/mysqldump'    unused0/
    # -- Note: Be careful with quotes as closing the quote before the > operator would result in the file being created in the LOCAL path specified
    # Now back up the mysql dump
    backup  rsnapshot@<remote server>:/var/cache/rsnapshot_remote/        backup/
    

    Here we can run any scripts, either on the local or remote system. Obviously if any scripts need to be run on the remote system, we will prefix them with ssh as appropriate.

    The above example first uses SSH to run a MySQL database dump onto the remote file system. The entry below it then backs up the dump. We can run script files or, as the above example shows, inline remote commands.

  • And all together now…

    # rsnapshot.conf - rsnapshot configuration file
    # =================================
    config_version  1.2
    
    # All snapshots will be stored under this root directory.
    # =================================
    snapshot_root   <backup root>/backups
    
    # Verbosity & logging
    # =================================
    verbose         3
    loglevel        3
    logfile         <logs>/rsnapshot.log
    
    # Binary locations
    # =================================
    cmd_cp          /bin/cp
    cmd_rm          /bin/rm
    cmd_rsync       /usr/bin/rsync
    cmd_ssh /usr/bin/ssh
    cmd_logger      /usr/bin/logger
    cmd_du          /usr/bin/du
    #cmd_rsnapshot_diff     /usr/bin/rsnapshot-diff
    #cmd_preexec    /path/to/preexec/script
    #cmd_postexec   /path/to/postexec/script
    
    #ssh_args        
    #rsync_short_args       -a
    #rsync_long_args        --delete --numeric-ids --relative --delete-excluded
    
    # Backup intervals
    # =================================
    # Must be unique and in ascending order
    # i.e. hourly, daily, weekly, etc.
    retain          hourly  24
    retain          daily   7
    retain          weekly  4
    retain          monthly 3
    
    # Backup points
    # =================================
    backup  rsnapshot@<remote server>:/home/              backup/
    backup  rsnapshot@<remote server>:/etc/               backup/
    backup  rsnapshot@<remote server>:/var/www/           backup/
    backup  rsnapshot@<remote server>:/usr/local/         backup/
    
    # Backup scripts
    # =================================
    # Dump the mysql backup to /var/cache/rsnapshot_remote/mysqldump (on remote host)
    backup_script   /usr/bin/ssh -i <path to id_rsa> rsnapshot@<remote server> '/usr/bin/mysqldump --all-databases > /var/cache/rsnapshot_remote/mysqldump'    unused0/
    # -- Note: Be careful with quotes as closing the quote before the > operator would result in the file being created in the LOCAL path specified
    # Now back up the mysql dump
    backup  rsnapshot@<remote server>:/var/cache/rsnapshot_remote/        backup/
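
The note in the backup_script comment about quote placement is easy to get wrong, so here is a small sketch of the difference. It does not use SSH at all: fake_remote is a hypothetical stand-in that runs a command string in a separate shell sitting in a different folder, which is enough to show which side performs the > redirect.

```shell
#!/bin/sh
set -eu
local_dir="$(mktemp -d)"
remote_dir="$(mktemp -d)"

# fake_remote stands in for "ssh rsnapshot@<remote server>": it hands the
# command string to another shell whose working directory is the pretend remote.
fake_remote() { ( cd "$remote_dir" && sh -c "$1" ); }

cd "$local_dir"

# Redirect inside the quotes: the "remote" shell performs it.
fake_remote 'echo dump > inside.sql'

# Quote closed before the >: the local shell performs the redirect.
fake_remote 'echo dump' > outside.sql

# Work out where each file actually landed.
if [ -f "$remote_dir/inside.sql" ]; then inside_where=remote; else inside_where=local; fi
if [ -f "$local_dir/outside.sql" ]; then outside_where=local; else outside_where=remote; fi
echo "inside.sql landed on:  $inside_where"
echo "outside.sql landed on: $outside_where"

cd /
rm -rf "$local_dir" "$remote_dir"
```

Only the first form leaves the dump on the remote side, where the following backup line expects to find it.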
    

Running the backup

Now that we have set up the remote user, created SSH keys to authenticate and configured rsnapshot, we’re left with actually running rsnapshot!

To manually run an hourly backup now, simply run sudo rsnapshot -c <config file> hourly. As long as all of the SSH auth and permissions have been set up okay, the backup should run.
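
One thing to check if the manual run asks for a password or fails to authenticate: the backup lines go through rsnapshot's own ssh call, which will not know about a key stored at a custom path. The config above leaves ssh_args commented out; my assumption is that uncommenting it and passing -i with the private key path is enough for the key to be picked up. The sketch below just prints such a line with a real tab in it. conf_line is a hypothetical helper and the key path is a placeholder.

```shell
#!/bin/sh
set -eu
# conf_line <key> <value>: hypothetical helper that joins a config key and its
# value with a literal tab, the separator rsnapshot insists on.
conf_line() { printf '%s\t%s\n' "$1" "$2"; }

# Placeholder path; substitute wherever ssh-keygen wrote your private key.
key_path="/path/to/keys/id_rsa"

ssh_args_line="$(conf_line ssh_args "-i $key_path")"
echo "$ssh_args_line"
```

The printed line can be pasted into the config in place of the commented-out ssh_args entry, followed by another configtest.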

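But we won’t remember to do this at every interval, so we will hand it to cron. This is a slightly altered version of the crontab that ships with rsnapshot; because the entries include a user column (root), they belong in /etc/crontab or a file under /etc/cron.d/ rather than a per-user crontab. Each entry passes our config with -c, otherwise rsnapshot would fall back to /etc/rsnapshot.conf.

# At minute 0, every hour
0 *     * * *           root    /usr/bin/rsnapshot -c <config file> hourly
# At 03:30 every day
30 3    * * *           root    /usr/bin/rsnapshot -c <config file> daily
# At 03:00 every Monday
0  3    * * 1           root    /usr/bin/rsnapshot -c <config file> weekly
# At 02:30 on the first day of every month
30 2    1 * *           root    /usr/bin/rsnapshot -c <config file> monthly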

And if you’re as rubbish as me at remembering crontabs, you could always cheat and use crontab.guru.

Other bits

Taking a full backup from a snapshot

Because hard links are pointers to the data itself (not symbolic links or shortcuts), every snapshot folder looks like a complete copy. So if we want a full backup from a snapshot (not just the incremental data), we can simply take a copy of that snapshot.

  • Enter the snapshot root (/var/cache/rsnapshot by default)
  • Run an ls -l to list all of the folders. The date next to the folder name is the backup time (When the directory was created).
  • Simply copy, or even better, compress a directory.

    tar -zcvf Monday2017-07-25.tar.gz ./daily.3
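
To see why archiving a single snapshot gives a full copy, here is a small sketch with two hand-made snapshot folders that share one hard-linked file. The folder and file names are invented for the demo; only daily.3 goes into the archive, yet the file's contents come along with it.

```shell
#!/bin/sh
set -eu
work="$(mktemp -d)"
mkdir "$work/daily.2" "$work/daily.3" "$work/restore"

# One file, two names: the newer snapshot hard-links to the older one's data.
echo "important data" > "$work/daily.3/file.txt"
ln "$work/daily.3/file.txt" "$work/daily.2/file.txt"

# Archive just one snapshot; -C keeps the paths inside the archive relative.
tar -czf "$work/daily.3.tar.gz" -C "$work" daily.3

# Unpack somewhere else and confirm the contents are really in the archive.
tar -xzf "$work/daily.3.tar.gz" -C "$work/restore"
restored="$(cat "$work/restore/daily.3/file.txt")"
echo "restored: $restored"
rm -rf "$work"
```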
    

Restoring a snapshot

Easy enough: just copy your data back (as root). If you have any database dumps, import them as you normally would. They’re all just files.
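
The only thing that tends to trip people up is where a file lives inside a snapshot. With the config above, a remote /etc/hosts should end up under <snapshot_root>/<interval>.<n>/backup/etc/hosts. The sketch below fakes that layout in a scratch folder and copies one file back out; all of the paths are stand-ins.

```shell
#!/bin/sh
set -eu
work="$(mktemp -d)"
snapshot_root="$work/backups"   # stands in for <backup root>/backups
restore_to="$work/restored"     # stands in for wherever the data should go

# Fake snapshot using the layout described above.
mkdir -p "$snapshot_root/daily.2/backup/etc" "$restore_to"
echo "127.0.0.1 localhost" > "$snapshot_root/daily.2/backup/etc/hosts"

# Restore: copy out of the snapshot, preserving mode and timestamps.
cp -p "$snapshot_root/daily.2/backup/etc/hosts" "$restore_to/hosts"

restored_hosts="$(cat "$restore_to/hosts")"
echo "$restored_hosts"
rm -rf "$work"
```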

Monitoring backups

The next step is to monitor backups. You can do this via cron or, alternatively, wrap this process in another script. I use a Python wrapper that reports status to Sensu.
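
This is not my Python wrapper, just a shell sketch of the core idea. rsnapshot exits with 0 on success, 1 on a fatal error and 2 when the backup finished with warnings, while Sensu and Nagios-style checks expect 0 for OK, 1 for WARNING and 2 for CRITICAL, so the two non-zero codes need swapping. check_status is a hypothetical helper name.

```shell
#!/bin/sh
# check_status <rsnapshot exit code>: print a status label and return the
# matching Nagios/Sensu-style exit code (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN).
check_status() {
  case "$1" in
    0) echo "OK";       return 0 ;;   # all operations completed
    2) echo "WARNING";  return 1 ;;   # backup finished, but with warnings
    1) echo "CRITICAL"; return 2 ;;   # fatal error
    *) echo "UNKNOWN";  return 3 ;;   # anything unexpected
  esac
}

# In real use the code would come from the backup itself, along the lines of:
#   rsnapshot -c <config file> hourly; code=$?
for code in 0 1 2; do
  status=0
  label="$(check_status "$code")" || status=$?
  echo "rsnapshot exit $code -> $label (check exit $status)"
done
```

A wrapper would then exit with that translated code so the monitoring system can alert on it.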

Next steps

To improve the process, we might want to do some of the following:

  • Ship the logs (Path specified in the config file) to a logging server, eg. Elasticsearch.
  • Set up an iSCSI/SMB volume to store my backups on.
  • Write a Salt state/Ansible playbook to deploy rsnapshot backup jobs (along with templated config files) and crontab entries.
  • Check out something like ElkarBackup to extend this (on my to-do list!)