Patrick Ward words, code, and music

A Few Notes on Using Rsync

I’ve had to work with Rsync recently for several projects that required syncing up code between servers. So, before I forget the little bits of information that I’ve gleaned in the last few days, here are some notes to remind me and anyone else that might need a refresher.

I have not needed to set up an Rsync daemon yet, so this post will focus on using it with an SSH shell.

Dual Rsyncing

#### Rsync must be installed on both ends of the pipe

If you’re using Rsync without a daemon through the SSH shell, you still need to have Rsync installed on both the local and remote computers. I’ve used Rsync in the past, but never had to worry about this simple fact because Rsync was always installed. So, when I created a new server that didn’t have Rsync installed by default, I was quickly reminded that Rsync works on both ends, not just the client side.

When Rsync communicates with a remote non-daemon server via a remote shell the startup method is to fork the remote shell which will start an Rsync server on the remote system. Both the Rsync client and server are communicating via pipes through the remote shell. As far as the rsync processes are concerned there is no network. In this mode the rsync options for the server process are passed on the command-line that is used to start the remote shell.
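A quick pre-flight check can catch a missing install before a transfer fails. The following is only a sketch: the helper names `has_cmd` and `remote_has_rsync` are made up for this example, and `username@remotehost` is a placeholder for a host you can already reach over SSH.

```shell
# Hypothetical helper: succeeds if the named command is on the PATH.
has_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Hypothetical helper: run the same check on a remote host over SSH.
# Needs working SSH access to the host passed as $1.
remote_has_rsync() {
    ssh -q "$1" 'command -v rsync >/dev/null 2>&1'
}

# Local check
if has_cmd rsync; then
    echo "local: rsync found"
else
    echo "local: rsync missing"
fi

# Remote check (uncomment and substitute a real host)
# remote_has_rsync username@remotehost || echo "remote: rsync missing"
```

If the remote check fails, installing Rsync through the remote system's package manager is usually all that is needed.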

Customize SSH access

#### Use -e to control how SSH is used during the Rsync session

You’re using SSH, right? When you’re using it with Rsync, you can customize the SSH session itself, setting any SSH flags you need for the session. For example, I have rather large login banners with lots of legal wording that pop up when you sign in to any of my servers. That’s nice, but I don’t need those when I’m running automated scripts for my own purposes. So, I always pass the -q flag to SSH, which suppresses most warning and diagnostic messages so the session runs with less noise. Using Rsync’s -e flag, you can pass a quoted command string that names the remote shell and any flags it needs.

The following snippet uses Rsync with the SSH shell, passing both a specific port and the quiet flag:

#
# Syncing files over ssh, using the quiet flag
#
$ rsync -avz -e 'ssh -q -p 22' /local/dir username@remotehost:/destination/dir

The above command also makes use of some additional useful flags:

  • -a: use archive mode (explained next)
  • -v: use verbose mode (to tell me what’s happening)
  • -z: compress the file data during transfer (using less bandwidth)
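When the same SSH options are needed for several transfers, it can help to build the -e value once and reuse it. A minimal sketch, assuming a non-default port and a dedicated key file; the port `2222`, the key path `~/.ssh/deploy_key`, and the variable names are all invented for illustration. The `printf` only prints the resulting command rather than running it.

```shell
# Assumed values, for illustration only
ssh_port=2222
ssh_key="$HOME/.ssh/deploy_key"

# -q: quiet, -p: port, -i: identity (private key) file
rsync_ssh="ssh -q -p $ssh_port -i $ssh_key"

# Print the command that would be run, with the -e value quoted
printf "rsync -avz -e '%s' %s %s\n" \
    "$rsync_ssh" /local/dir username@remotehost:/destination/dir
```

Rsync also reads the RSYNC_RSH environment variable, which accepts the same kind of value as -e, so exporting it once in a script is another option.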

Archive mode

#### An explanation of the combination of flags when using -a

Most of the Rsync examples I see use the -a flag, which is really just a shortcut for several other flags put together. So, I thought I’d list them here as a quick reminder to my future self.

Option -a is equivalent to a combination of the following:

  • -r: recurse into directories
  • -l: copy symlinks as symlinks
  • -p: preserve permissions
  • -t: preserve modification times
  • -g: preserve group
  • -o: preserve owner (super-user only)
  • -D: preserve device files (super-user only) and preserve special files
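Spelled out, -a is the same as -rlptgoD. A small local experiment (no remote host needed) can confirm that modification times survive the copy. The scratch directory and file names below are made up for the demo, and the Rsync step is skipped if Rsync is not installed.

```shell
# Scratch area for the demo (names are arbitrary)
work=$(mktemp -d)
mkdir -p "$work/src" "$work/dst"
echo "hello" > "$work/src/file.txt"

# Give the source file an old timestamp: 2020-01-01 12:00
touch -t 202001011200 "$work/src/file.txt"

if command -v rsync >/dev/null 2>&1; then
    # Long form of -a; the trailing slash copies the contents of src
    rsync -rlptgoD "$work/src/" "$work/dst/"
    # Show both files; the timestamps should match
    ls -l "$work/src/file.txt" "$work/dst/file.txt"
fi
```

Dropping the `t` from the flag list and rerunning against an empty destination should show the copy stamped with the current time instead.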

Checksum calculation

#### Using checksum when modification times and size don’t work

Sometimes, my modification times don’t match up correctly, or I’ve used a program that tends to update modification times on a file even though the file hasn’t changed. In that case, I find it’s useful to send the -c flag, or --checksum, to Rsync, which tells it to skip a file if its checksum matches the checksum of the file on the other side of the pipe, rather than relying on its modification time and file size.

Be warned, though: this can be quite a bit slower than a normal Rsync transfer, because every file has to be read in full and checksummed on both ends of the pipe.

Using checksum instead of modification times:

#
# Checksum
#
$ rsync -avzc /local/dir username@remotehost:/destination/dir
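One way to see the difference locally is the mirror image of the case above: two files with the same size and modification time but different contents. The default quick check treats them as identical, while -c notices the change. This is a sketch with made-up file names and variables, and it skips the Rsync steps if Rsync is not installed.

```shell
work=$(mktemp -d)
mkdir -p "$work/src" "$work/dst"

# Same size, different contents
printf 'aaaa\n' > "$work/src/f.txt"
printf 'bbbb\n' > "$work/dst/f.txt"

# Force identical modification times on both copies
touch -t 202001011200 "$work/src/f.txt" "$work/dst/f.txt"

if command -v rsync >/dev/null 2>&1; then
    # Quick check: size and mtime match, so the file is skipped
    rsync -a "$work/src/" "$work/dst/"
    after_quick=$(cat "$work/dst/f.txt")

    # Checksum: contents differ, so the file is transferred
    rsync -ac "$work/src/" "$work/dst/"
    after_sum=$(cat "$work/dst/f.txt")

    echo "after quick check: $after_quick"
    echo "after checksum:    $after_sum"
fi
```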

Show Me the Progress

#### Use --progress to get a better picture of what’s happening during the transfer

Most of the time my Rsync transfers are fairly quick, but in those cases where I’m transferring larger files or lots of files at a time, I like to keep an eye on which files are being copied, at what rate they are being copied, etc. For those times, it’s useful to add the --progress option to get a clearer picture of what Rsync is currently doing.

Using --progress gives realtime information during the transfer:

#
# Show me the progress
#
$ rsync -avz --progress assets/ username@remotehost:/destination/assets

Syncing assets...
building file list ...
94 files to consider
chunks/
site/less/
site/less/bootstrap/
templates/
templates/home.tpl
        4812 100%    3.92MB/s    0:00:00 (xfer#1, to-check=0/94)

sent 2970 bytes  received 108 bytes  2052.00 bytes/sec
total size is 1755190  speedup is 570.24
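For large files, the -P shorthand may be handy as well: it is equivalent to --partial --progress, so an interrupted transfer keeps the partially transferred file instead of discarding it. A local sketch, using a made-up 1 MiB file of zeros as a stand-in for something big; the Rsync step is skipped if Rsync is not installed.

```shell
work=$(mktemp -d)
mkdir -p "$work/src" "$work/dst"

# 1 MiB of zeros as a stand-in for a large file
dd if=/dev/zero of="$work/src/big.bin" bs=1024 count=1024 2>/dev/null

if command -v rsync >/dev/null 2>&1; then
    # -P = --partial --progress
    rsync -aP "$work/src/" "$work/dst/"
fi
```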

Making Exclusions

#### Using --exclude and --exclude-from to tell Rsync what not to sync

For many of my projects, I have certain files that don’t belong on the destination side of the Rsync equation. For those, I create a simple file that follows the same basic pattern as excludes in GNU Tar or Git. For example, the following file can be used to exclude OS X .DS_Store or Vim .swp files from being transferred to the destination.

Filename: .rsync-exclude

.DS_Store
*.swp

I can then pass this filename to Rsync, telling it to ignore any files that match the patterns in that file.

Telling Rsync to exclude certain patterns from a file:

#
# Transfer, but ignore patterns in .rsync-exclude
#
$ rsync -avz --exclude-from '.rsync-exclude' assets/ username@remotehost:/destination/assets

If you don’t have a lot of patterns to match, you can also use the --exclude option to tell it specifically what not to transfer based on a pattern.

Telling Rsync not to transfer files that start with Foo:

#
# Ignore Foo files
#
$ rsync -avz --exclude 'Foo*' assets/ username@remotehost:/destination/assets
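The exclude file can be tried out locally before pointing it at a real server. The sketch below recreates the .rsync-exclude file from above in a scratch directory; the other file names are invented, and the Rsync step is skipped if Rsync is not installed.

```shell
work=$(mktemp -d)
mkdir -p "$work/assets" "$work/dest"

# One file to keep, two that the patterns should filter out
touch "$work/assets/keep.txt" "$work/assets/.DS_Store" "$work/assets/notes.swp"

# Same two patterns as the .rsync-exclude file above
printf '%s\n' '.DS_Store' '*.swp' > "$work/.rsync-exclude"

if command -v rsync >/dev/null 2>&1; then
    rsync -a --exclude-from "$work/.rsync-exclude" "$work/assets/" "$work/dest/"
    # Only keep.txt should be listed
    ls -A "$work/dest"
fi
```

Adding -n (--dry-run) together with -v is another way to preview what a set of patterns would transfer without copying anything.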

Transfer by Inclusion Only

#### Using --include to tell Rsync what not to exclude

On the flip side of exclusion are the inclusions you can give Rsync. There is also an --include-from option for matching patterns from a file. I find this can be useful if you want to specifically include certain files while excluding all others.

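For example, the following will include only Bar files, but exclude all others. Note that the '*' pattern also excludes directories, so this only picks up Bar files at the top level of assets/; adding --include '*/' ahead of the exclude lets Rsync descend into subdirectories as well.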

Include only Bar files, excluding everything else:

#
# Only work with Bar files
#
$ rsync -avz --include 'Bar*' --exclude '*' assets/ username@remotehost:/destination/assets
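Rsync applies the first include or exclude pattern that matches, so the order of the options matters. A local sketch of a recursive variant, with an extra --include '*/' so that subdirectories are not filtered out by the final '*'. The directory layout and file names are made up for the demo, and the Rsync step is skipped if Rsync is not installed.

```shell
work=$(mktemp -d)
mkdir -p "$work/assets/sub" "$work/dest"
touch "$work/assets/Bar1.txt" "$work/assets/other.txt" \
      "$work/assets/sub/Bar2.txt" "$work/assets/sub/skip.txt"

if command -v rsync >/dev/null 2>&1; then
    # '*/' keeps directories in play so Rsync can descend into them;
    # 'Bar*' keeps the wanted files; '*' drops everything else.
    rsync -a --include '*/' --include 'Bar*' --exclude '*' \
        "$work/assets/" "$work/dest/"
    # Expect Bar1.txt and sub/Bar2.txt only
    find "$work/dest" -type f
fi
```

With this pattern set, directories that contain no Bar files are still created on the destination; the -m (--prune-empty-dirs) option can be added to leave them out.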