Network Synchronization with RsyncX and OSX

RsyncX is a great tool for synchronizing data. It has a GUI as well as the rsync command line utility. Not only can it synchronize folders locally, but more importantly, when used with a private ssh key, it can synchronize across a network without a password. This is ideal for automated tasks (like cron jobs) where storing a clear-text password in a file is not acceptable. It can come in extremely handy when working with everyone’s favorite operating system, OS X Server. You could use it to keep the content of a web server and a backup web server synchronized like Jen does, or you could back up your home directory to a server like I do. No matter how many times I set this up, I always forget some mundane detail. So while it is fresh in my head...

For readability, the computer we want to synchronize from will be referred to as PITCHER and the computer we want to synchronize to will be referred to as CATCHER.

Install rsync on the PITCHER - Downloading and installing the RsyncX package is the easiest way to do this. Purists will want to build it themselves, and I’m sure there are DarwinPorts and Fink packages, but I’m too lazy to look.

Create a key on the PITCHER - I like to run these jobs as root using the command line. It adds the danger that if you screw something up you are totally hosed, but it also does away with any pesky permissions problems, so proceed at your own risk. First open up a Terminal window and get yourself some root access by typing sudo -s. Enter your administrator password when prompted and press return. If you haven’t enabled root access, here are some instructions. Type ssh-keygen -t rsa and press return to create your key.

Hit return to save to the default location, return again for no passphrase, and return once more to confirm. A hidden directory (.ssh) should appear in /var/root (or wherever you have root’s home directory). Navigate to that directory and list the directory contents. You should see two files: id_rsa and id_rsa.pub. Copy id_rsa.pub to an easy to access location like your desktop.
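If you would rather script the key creation than press return three times, here is a minimal sketch. The function name ensure_key is my own invention, and it assumes the stock OpenSSH ssh-keygen; it refuses to overwrite a key that already exists, since other jobs may depend on it.

```shell
#!/bin/sh
# Sketch: create a passphrase-less RSA key pair only if one is not already there.
# ensure_key is a hypothetical helper name, not part of RsyncX or OpenSSH.

ensure_key() {
    key_file="$1"
    if [ -f "$key_file" ]; then
        # Never overwrite an existing key; other jobs may depend on it.
        echo "exists: $key_file"
        return 0
    fi
    mkdir -p "$(dirname "$key_file")"
    # -N "" sets an empty passphrase, -f names the output file, -q keeps it quiet.
    ssh-keygen -t rsa -N "" -f "$key_file" -q && echo "created: $key_file"
}

# Example (as root on PITCHER, using the default location mentioned above):
#   ensure_key /var/root/.ssh/id_rsa
```

The public half ends up next to the private key with a .pub suffix, just as in the interactive run.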

Add that key to the CATCHER - Now get root access with sudo -s on the CATCHER. Type ssh-keygen -t rsa on this machine as well and give it the same number of return key presses listed above. The .ssh directory should appear in /var/root (or wherever you have root’s home directory). Navigate to that directory and list the directory contents. You should see two files: id_rsa and id_rsa.pub*. Yeah, that was all the same stuff. Now comes the tricky part. You need to take the id_rsa.pub from the “easy to access” location on PITCHER, rename it to authorized_keys, and save it in the .ssh directory of the CATCHER.

Now the root account on PITCHER can transmit to the root account on CATCHER without a password. Yay!
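The key installation on CATCHER can also be sketched as a small script. This is only a sketch under my own assumptions: install_pubkey is a made-up helper name, and it assumes the copied id_rsa.pub holds a single key on one line. It appends the key instead of overwriting authorized_keys, and it tightens permissions, since sshd is typically picky about those.

```shell
#!/bin/sh
# Sketch: add a public key to authorized_keys without clobbering existing keys.
# install_pubkey is a hypothetical helper name.

install_pubkey() {
    pubkey_file="$1"   # e.g. the id_rsa.pub copied over from PITCHER
    ssh_dir="$2"       # e.g. /var/root/.ssh on CATCHER
    auth_file="$ssh_dir/authorized_keys"

    mkdir -p "$ssh_dir"
    touch "$auth_file"

    # Append only if this exact key line is not already present.
    key=$(cat "$pubkey_file")
    if ! grep -Fqx -- "$key" "$auth_file"; then
        printf '%s\n' "$key" >> "$auth_file"
    fi

    # Keep the directory and file private to their owner.
    chmod 700 "$ssh_dir"
    chmod 600 "$auth_file"
}

# Example (as root on CATCHER; the source path is an assumption):
#   install_pubkey /Users/doug/Desktop/id_rsa.pub /var/root/.ssh
```

Running it twice with the same key leaves only one copy in the file, so it is safe to repeat.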

Customize your sync job - Read the man page for rsync for insight on your options, but here is the basic formula for rsyncing over the network.

rsync [source folder on PITCHER] root@[IP or hostname]:[destination folder on CATCHER]

Here is (something like) the command I use to back up my home directory:

rsync -az -e ssh /Users/doug/ root@123.123.123.123:/Volumes/raid/doug_backup --delete --exclude=".Trash*" --exclude="*/Cache*" --exclude="*.cache"

-az - (a) archive preserves just about everything including directory structure, and (z) compresses the data for transfer to reduce network bandwidth
-e ssh - specifies the use of the secure shell for the network communication
--delete - delete any files that are on CATCHER but not PITCHER (files that may have been within my source at the last backup but are no longer)
--exclude="[file(s)]" - do not copy these files. This option comes in handy when you are trying to prevent the backup of garbage files like cache or trash.
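Because the delete option removes files on CATCHER, it can be worth previewing a run before trusting it. Here is a hedged sketch that wraps the same options in a function with a preview mode. The names run_backup and RSYNC are my own, and the preview relies on rsync's standard --dry-run and --verbose options.

```shell
#!/bin/sh
# Sketch: the backup command from above, with an optional preview mode.
# run_backup and the RSYNC override are hypothetical names for illustration.

run_backup() {
    src="$1"
    dest="$2"
    mode="${3:-real}"

    # Same options as the command above.
    set -- -az -e ssh --delete \
        --exclude=".Trash*" --exclude="*/Cache*" --exclude="*.cache"

    if [ "$mode" = "preview" ]; then
        # --dry-run reports what would change without touching CATCHER.
        set -- --dry-run --verbose "$@"
    fi

    # RSYNC can be pointed at another command for testing; defaults to rsync.
    "${RSYNC:-rsync}" "$@" "$src" "$dest"
}

# Example with the paths used in the article:
#   run_backup /Users/doug/ root@123.123.123.123:/Volumes/raid/doug_backup preview
```

Note the trailing slash on the source folder: with it, rsync copies the contents of the folder; without it, rsync creates the folder itself inside the destination.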

Create a cron job - Once you have your rsync statement working flawlessly, it is time to automate. I use the crontab for this purpose; it is pretty straightforward and part of a standard UNIX build. On PITCHER, make sure you still have root access and type crontab -u root -e in Terminal. Your text editor will open with root’s section of the crontab. It is probably blank, but you may need to append to entries that are already in there for other cron jobs root is responsible for. Cron statements have 6 fields separated by tabs or spaces. They look like this:

[minute] [hour] [day] [month] [day of the week] [command]

So if you want your command to run every night at 10pm (22:00) it would look like this:

0 22 * * * [command]
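If you prefer not to edit the crontab by hand, a nightly entry can be built and merged into whatever is already there. This is a rough sketch, not the way I actually set mine up: cron_line and merge_cron_line are made-up helper names, and the script path is a hypothetical wrapper around your rsync command.

```shell
#!/bin/sh
# Sketch: build a nightly cron line and merge it into existing crontab text.
# cron_line, merge_cron_line and the script path are hypothetical.

# Print a crontab line that runs "$3" every day at hour $1, minute $2.
cron_line() {
    printf '%s %s * * * %s\n' "$2" "$1" "$3"
}

# Read existing crontab text on stdin and print it back, adding "$1"
# at the end only if that exact line is not already there.
merge_cron_line() {
    new_line="$1"
    existing=$(cat)
    if [ -n "$existing" ]; then
        printf '%s\n' "$existing"
    fi
    if ! printf '%s\n' "$existing" | grep -Fqx -- "$new_line"; then
        printf '%s\n' "$new_line"
    fi
}

# 22:00 every night, running a wrapper script that holds the rsync command.
line=$(cron_line 22 0 "/var/root/bin/nightly_backup.sh")
echo "$line"

# To install for the current user (root, after sudo -s), something like:
#   crontab -l 2>/dev/null | merge_cron_line "$line" | crontab -
```

Merging rather than replacing keeps any other cron jobs root is responsible for.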

That’s it. Each time that cron job runs, the data on PITCHER and CATCHER will be compared, and any changes on PITCHER will be copied to CATCHER. Let me know if you have any questions or run into any snags.

* - If authorized_keys already exists on CATCHER, a key has already been copied to this machine for some other purpose. You can overwrite that file if you know that functionality is unused, or you can add your new key by opening the id_rsa.pub from PITCHER with a text editor, copying its entire contents, and appending them as a new line to the pre-existing authorized_keys on CATCHER.

One Response to “Network Synchronization with RsyncX and OSX”

  1. jenz Says:

    Awesome article…this is one of those processes I always miss something when setting up.
