A New Backup Strategy

I’m about to run out of disk storage for my Lightroom image catalog, so in preparation for a new iMac — one that supports a Thunderbolt disk system — I’ve decided it’s time to upgrade my backup systems. This is a long blog post, but it thoroughly covers what I’m now using for backup and what I learned in the process of getting to the final result.

My Former Backup Scheme

My backup strategy for the past three or four years was “pretty good”. My first-level backup was from my iMac to an Apple Time Capsule via Time Machine. For the second level, at the end of every month I made two copies of each of my internal SSD and my 2TB rotating drive, onto two pairs of portable USB drives. Why two sets of removable backups? One set I kept off site in a storage locker. The other set I kept here at home, for two reasons: (1) I’m actually paranoid enough that I wanted one set always off site (i.e., never in transit), so I took a new set to the storage locker each month and only then retrieved the previous month’s set; otherwise I’d have both the old and new sets at home simultaneously. (2) Although I’ve never had to recover from a major disaster like fire or theft, I have occasionally needed to recover a corrupted or accidentally deleted file, and having a full backup here in the house makes that very easy. Sure, I’ve got Time Machine, but I’ve had it completely fail on me and lost everything on the Time Capsule. Time Machine by itself is not an adequate backup solution.

New Backup Requirements

My new requirements are as follows:

  • 8TB of usable, local backup storage, updated multiple times/day from my iMac’s internal and external drives.
  • An identical server located at a remote location, replicated via the Internet daily and automatically from the local backup.
  • 12TB non-redundant Time Machine storage, separate from the above, for versioned files.

My Solution

After lots of research and testing, here’s what I’ve ended up with:

  • One Synology DS214 DiskStation NAS (network-attached storage) server [US$300] with two Western Digital 4TB Red drives [US$175 each], configured as a single RAID0 (striped, non-redundant) disk group in a single 8TB volume for the local backup, connected to my iMac via Gigabit Ethernet.
  • An identical system to the one above, but located at a remote location and linked to the first one via the Internet. This is like having my own remote cloud server.
  • Carbon Copy Cloner (CCC) [US$40] app for backing up the iMac drives to the local NAS server.
  • A 16GB USB 2.0 flash drive [US$9] as an OS X Recovery Drive.
  • Total cost: US$1,349, which doesn’t include my Time Machine storage.

I sync each of the two drives in my iMac to a separate shared folder on the local NAS backup server every six hours using Carbon Copy Cloner. Once a day, at midnight, I replicate those shared folders from the local backup server to the remote one using Synology’s built-in Shared Folder Sync. I’m storing the files on the NAS servers in sparsebundles, the same format used by Time Machine. The synchronization uses the standard rsync utility, which transfers only the pieces (in this case, the sparsebundle band files) that have changed since the last pass.
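Shared Folder Sync takes care of the replication automatically, but conceptually the nightly pass is similar to an rsync invocation like this (a sketch only; the hostname and volume paths are placeholders, not what DSM actually runs):

    # Mirror the local shared folder to the remote DiskStation.
    # Because a sparsebundle is a folder of 8MB band files, only the
    # bands that changed since the last pass get transferred.
    rsync -a --delete /volume1/imac-backup/ admin@remote-nas:/volume1/imac-backup/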

I don’t recommend this solution for beginners. I’ve got quite a bit of experience configuring and managing Linux servers, so these DiskStations are almost like old friends to me. Synology has done an excellent job in making their servers easy to setup and manage, but I still think it would be a bit scary and frustrating for someone who wasn’t already familiar with Linux and disk/file servers.

Intermission: If all you care about is the solution, you can stop here. But if you want to understand why I’ve settled on this solution for backup, and the tests and considerations that went into making these selections, read on!

Synology DiskStations

When I began this project, I’d never heard of Synology. But once I started looking around, the company’s name came up over and over again. They’re well regarded and now that I’ve had two of their DiskStations for a few weeks, I think the praise is well deserved. The boxes are compact Linux servers based on a variety of ARM processors. The DiskStation Manager (DSM) software includes an excellent browser-based management GUI and supports plug-in apps, some from third parties. There’s a large community of users and sysadmins with a lot of experience with Synology DiskStations, so there’s an abundance of information and support available online.

Some of the features I’ve configured in my two DiskStations include:

  • ssh (secure shell/terminal) access
  • email and SMS error notifications
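
For example, with ssh enabled (the first feature above), a DiskStation can be poked at like any other Linux box. A quick sketch; the hostname and account are whatever you’ve configured:

    ssh admin@diskstation.local
    df -h /volume1        # once logged in, standard Linux tools work: check free space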

Why Not Just Use Time Machine?

You can certainly do that. The Synology DiskStations support Time Machine backups, but Time Machine doesn’t let me control the scheduling of backups; they happen every hour. And although it supports versioning, I just don’t trust it, due to failures that I and others have experienced in the past.

That said, I do plan to add another Synology NAS server for Time Machine once I’ve outgrown the 3TB capacity of my existing Time Capsule (ie, real soon now). My current plan is to use a Synology DS414j [US$409] and four Western Digital 4TB Red drives in a RAID5 configuration to give me 12TB of usable storage.

Note that Time Machine will not back up any NAS or other networked drives; only drives that are internal or connected to your computer via USB or Thunderbolt can be archived.

What About the Cloud?

I considered and even tested cloud-based solutions like CrashPlan, Amazon Glacier and Backblaze. Bottom line: they’re impractical for multi-terabyte storage. It would take many months to perform the initial backup via the Internet; at my 11Mbps uplink, 8TB works out to more than two months of continuous, best-case uploading. Some services allow you to “seed” the backup by sending them one or more hard drives, but then you have to go through the reverse procedure if you need to restore after a major data loss. And most are relatively expensive. (I’m suspicious of those that are free.) And then there are the questions of security and redundancy: I’d rather know where my files are and who has access to them. In any case, 8TB in the cloud just isn’t practical using today’s services and bandwidth.

What About CrashPlan?

CrashPlan supports backup in two ways. Most people know the company for its unlimited cloud storage service. But the CrashPlan app can also be used to back up one computer to another, even if the second is a remote machine, e.g., at a friend’s house. I decided CrashPlan (the cloud service) wouldn’t work for me for the reasons above, but I did want to see if the app might be helpful.

First I tried running CrashPlan to sync my files to the local NAS server. It has the advantage that it properly backs up special folders like /Applications, and it also saves permissions and other attributes. The problem is that it’s very slow: local transfers rarely hit 10MB/second, even over a GB Ethernet link. Another disadvantage is that CrashPlan’s backups are stored in a proprietary format that only CrashPlan understands. At least that’s how it appeared to me.

There’s also a version of the CrashPlan app that can be installed on the Synology DiskStations. This could be used to sync a Synology server to the CrashPlan cloud or to another Synology DiskStation. I did not install or test this configuration.

How About Amazon Glacier?

Amazon Web Services offers inexpensive storage for archiving. Data are rumored to be written to good old-fashioned magnetic tape, which means write speeds are reasonably fast but reads can take a long, long time. Storage is $0.01/GB/mo, or $10/TB/mo. That would be $960/year for 8TB, more than the one-time cost of an entire 8TB Synology NAS server at the remote location.

NAS or DAS?

With the new iMac, I’ll be using a multi-drive Thunderbolt-connected DAS (direct-attached storage) system in a RAID0 configuration. Thunderbolt because it’s fast, and RAID0 because it’s fast, too. The advantage of DAS in this case is that the drives appear to OS X pretty much as built-in drives once they’re connected. NAS devices, on the other hand, look to OS X like remote servers with a much more generalized and abstracted interface. (Note: I haven’t yet selected a DAS vendor or model.)

A single Thunderbolt v1 channel operates at up to 10Gbps, or about 1,000MB/s, so reading and writing to any Thunderbolt drive is generally limited by the speed of the disk(s). For comparison, an Apple Fusion drive can read and write at roughly 470MB/s and 280MB/s respectively for the first 100GB, then drops to about 80MB/s (read/write) once its SSD cache fills up. [MacWorld]

NAS volumes don’t support OS X’s Trash folder, but Synology provides a comparable #recycle folder, which can be enabled per shared folder.

Why Not Drobo?

Many photographers (including many whom I trust and respect) use Drobo systems for backup. You can get a Drobo system for either NAS or DAS (ie, Thunderbolt). They use a redundant disk configuration similar to RAID5 (superior to it, according to the company). Personally, I and other users have been burned by the proprietary nature of this scheme and the company’s unwillingness to support failed hardware. I lost all my files on a Drobo backup a few years ago when the enclosure itself failed: the company refused to repair it, and the drives couldn’t be read in another brand’s enclosure because of their proprietary format.

Drobo also doesn’t support server-to-server syncing, which is critical for my scheme. Nor does it offer a high-performance striping option such as RAID0, which is my preference for the local backup server.

My understanding is that, thanks to a management change, things have improved at Drobo. So although the above reasons disqualify the company’s products for my primary and remote backups, I may yet use one of their systems for my higher-capacity Time Machine backups.

Why RAID0 vs RAID5?

I’ve organized my disk drives as RAID0, which means they’re optimized for maximum capacity and speed. The speed comes from the fact that as data are written, half goes to each drive (or 1/n to each drive, where n is the number of drives). That means you can write and read data at speeds approaching twice (or n times) the speed of a single drive. For example, if one drive can write at 150MB/s, two drives in RAID0 might be able to write at 290MB/s, and four drives at 580MB/s. [BSN] With RAID0, writes are usually even faster than reads, but the relative numbers are comparable. The downside of RAID0 is that you actually reduce reliability compared to any other disk configuration: you’re twice (or n times) as likely to experience a drive failure, and the failure of any single drive means you’ve lost all the data.

A RAID5 configuration is pretty much the opposite of RAID0: slower and with less usable capacity, but more reliable. An n-drive RAID5 configuration gives you (n-1) times the usable storage of a single drive (eg, 4x4TB yields 12TB usable), but your data will survive the failure of any single disk drive in the group.

Since this is for backup storage (ie, I don’t need fast access to it), why RAID0 instead of the more-reliable RAID5? Two reasons:

  • RAID5 (or any configuration) is still vulnerable to failures of power supplies, controllers or software. A hardware or software glitch can wipe out all your files in an instant, so RAID5 doesn’t really mitigate the “single point of failure” problem. It only addresses failures of disk drives.
  • There’s little advantage in redundant redundancy. The local NAS server is the first level of redundancy. The remote server is another level. Because my local backup is synced to a remote copy, there’s not much to be gained by increasing the reliability of the local server. If I lose the local backup server and the original files, I still have the remote backups. That’s what it’s for. Besides, RAID5 costs more, yields less capacity and is slower.

One other consideration is the propagation of corrupt files. A file that becomes damaged on my iMac will, within a few hours, be synced, and therefore corrupted, on both my local and remote backups as well. Likewise, if I accidentally delete a file on the iMac, it will soon be deleted on both of my NAS backup servers. This is why I also run Time Machine to a Time Capsule or (eventually) to yet another NAS server. In practice, I recover individual files far more often from Time Machine than from my local and remote synced servers.

When I outgrow my 3TB Time Capsule, I will deploy the RAID5 server for Time Machine. I’ll gain the ability to recover from the failure of a single drive, but nothing else. This server won’t itself be backed up or replicated, but it seems like a reasonable compromise.

Sparsebundles

While CCC and other apps can back up to any locally connected storage system that supports an HFS+ file system (including USB- and Thunderbolt-connected drives), they can’t back up certain files to an NAS system because NAS file systems don’t support OS X’s extended file attributes. [Bombich] This means you can’t back up /System, /Applications and other critical directories.
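If you’re curious, you can inspect those extended attributes from Terminal; the paths here are only examples:

    ls -l@ /Applications                 # "@" lists each item's extended attributes
    xattr -l /Applications/Safari.app    # show one item's attributes in full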

The solution is to write backups to disk images: block-for-block copies of a drive, rather than file-by-file duplicates of its contents. There are three types of disk images in OS X:

  • A .dmg file is a literal copy of a disk. If the hard drive is 1TB, the .dmg file will also be 1TB.
  • To be more efficient, Apple created the sparseimage format, which is only as large as the amount of disk space you’re actually using. So if you’re using 350GB of your 1TB drive, the .sparseimage copy will be only 350GB. I say “only” because a sparseimage, like a .dmg, is still one huge file, which makes it difficult to maintain and replicate when only a few files on the drive have changed or been deleted. Sparseimages expand as necessary, but can only be shrunk manually using Disk Utility or hdiutil.
  • To make disk images easier and more efficient to maintain, Apple next created the sparsebundle format. Instead of one huge file, the disk image is broken into 8MB files called bands. This is the format now used by Time Machine, and it makes it much easier to change, add and delete individual files within the disk image.

For these reasons, and because it’s supported by Carbon Copy Cloner, I’ve chosen to use the sparsebundle format for my backups.
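
CCC creates the destination sparsebundle for you, but you can also build one by hand with hdiutil. A sketch, where the size, volume name and destination path are all examples:

    # Create a sparsebundle that can grow to 2TB as data is written:
    hdiutil create -size 2t -type SPARSEBUNDLE -fs HFS+J \
        -volname "iMac-Backup" /Volumes/nas-share/imac.sparsebundle

    # Reclaim space after deleting files inside the image:
    hdiutil compact /Volumes/nas-share/imac.sparsebundle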

Sparsebundles appear to be somewhat vulnerable to corruption. Here’s some info on how you might repair one that’s damaged: http://goo.gl/g633FY. I don’t know the cause, but in the past I have lost an entire sparsebundle on my Time Capsule. At the time I didn’t know repair might be possible, so I can’t say how effective the procedure is. Let’s hope I don’t have to find out!
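I can’t vouch for it, but the commonly suggested repair procedure goes roughly like this; the paths and disk identifier are examples that will differ on your system, so use it at your own risk:

    chflags -R nouchg /Volumes/nas-share/imac.sparsebundle   # clear any locked flags
    hdiutil attach -nomount -noverify -noautofsck \
        /Volumes/nas-share/imac.sparsebundle                 # note the device it reports
    fsck_hfs -drfy /dev/disk5s2                              # repair, using that device
    hdiutil detach /dev/disk5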

Carbon Copy Cloner

I’ve been using Mike Bombich’s Carbon Copy Cloner [US$40 Bombich] for many years and have found it to be simple, reliable and flexible. It does exactly what I’ve needed. When I started this project it never occurred to me that I’d end up using CCC as my new backup application, but after trying or testing many alternatives, that’s where I’ve ended up.

Carbon Copy Cloner is the only app I’ve found that can do all of the following:

  • read and write sparsebundles to an NAS server
  • allow for the selection of what is and isn’t backed up
  • back up all of OS X’s extended attributes and hidden files
  • support flexible scheduling of backups
  • generate OS X-integrated log files
  • email notifications of errors

What About SuperDuper?

SuperDuper [US$27.95] is a very popular package for making full-drive backups from OS X, so I thought I’d give it a try. I was able to back up to a sparsebundle on the NAS server, but SuperDuper didn’t give me the option of selecting which directories or files to include or exclude. True, I’m not excluding anything at the moment, but it would be nice to have the choice. And the transfer speed was only about 65% of Carbon Copy Cloner’s when backing up the same data.

The real killer was that I couldn’t convince SuperDuper to read/restore from the same sparsebundle it created on the NAS. Even more surprisingly, CCC could read that sparsebundle. If anyone can figure out how SuperDuper can read sparsebundles, please leave a comment and let me know. For now, SuperDuper just doesn’t make the cut.

What About ChronoSync?

ChronoSync is another popular backup app for OS X. It’s highly configurable and can back up the extended permissions and attributes via AFP. Unfortunately, it was very slow (3.3MB/second), even over a GB Ethernet link. Furthermore, some files, such as .wav, .gz and Linux executables, weren’t backed up. (Maybe I had some option set wrong?) And finally, the app didn’t allow me to select individual directories and files to back up or restore.

Ethernet vs. Thunderbolt

Thunderbolt v1 has a data rate of 10Gbps, so it can transfer data at up to about 1,000MB/s. Thunderbolt v2 supports up to twice that rate. Gigabit Ethernet runs at 1/10th the speed of Thunderbolt v1, so it can handle just a bit more than 100MB/s. Since a RAID0 system can easily read and write faster than GB Ethernet can carry, some systems (including some 2-port Synology servers) support link aggregation, which lets you use two GB Ethernet connections between devices for transfers of up to about 220MB/s.

While link aggregation is valuable when you use an NAS box for primary storage, it’s less important for backup. In fact, because I want to use my computer during backups, I don’t want the backups to heavily load my network connections or CPU. The only time I need high-speed transfers for backup is when I’m creating the initial copies of my drives; that can take many hours, so every bit of speed is appreciated. Speed is also the main reason I plan to use Thunderbolt and DAS, not NAS (which always runs over the LAN), for my primary image storage. My old iMac doesn’t support Thunderbolt, so I’m holding on with the internal drive for as long as possible.

When I initially backed up my 2TB internal drive to a Synology DiskStation, my transfer rate averaged 62MB/s over a single GB Ethernet link. It would have been faster, but the overhead of encryption on the DiskStation slowed it a bit.

Encryption

I knew I wanted my data to be encrypted on the remote NAS server and during transmission over the public Internet. I thought about not encrypting my Lightroom images and other non-critical data for the sake of speed and simplicity, but I gave up on that and just decided to encrypt everything.

There were two choices:

  • have Carbon Copy Cloner encrypt the data as it left my iMac, or
  • have my local Synology DiskStation encrypt the data at the Shared Folder level.

It turned out that having CCC encrypt the data used quite a lot of CPU time, and since I didn’t want to burden the iMac while I was using it for other things, I ruled that option out. The Synology servers have fairly slow processors, so encryption is a lot to ask of them. But other than during the initial backups I’m not in much of a hurry, so it doesn’t really matter that the encryption slows down the process. In fact, I like that it reduces the transfer rate between the iMac and the local DiskStation, further decreasing the impact on whatever else I might be doing and the load on my LAN.

I had the option of encrypting the data only during transfer between the local and remote DiskStations (over the Internet), but I wanted the simplicity of using Synology’s Shared Folder Sync between the local and remote servers, and that means both copies had to be either encrypted or unencrypted. Of course, once you decide to encrypt the folders, there’s no need to also encrypt the transfer, so I’ve turned that off.

I would have preferred to keep the data unencrypted on the local backup server (for performance) then have it encrypted over the Internet and on the remote server. But I couldn’t figure out a way to do this.

Backup Schedule

I haven’t detected any problems from backing up a drive on the iMac to my local DiskStation while that DiskStation is also syncing to the remote one. But because of the way sparsebundles work, I don’t want to take the chance of their getting out of sync. Remember, Synology’s Shared Folder Sync is just replicating the individual 8MB band files within the sparsebundle; who knows what happens if some bands are updated but others are not? It’s fairly easy to imagine corrupting a sparsebundle. For this reason I’ve staggered the operations so that I’m less likely to be backing up an iMac drive to the local NAS server at the same time that server is syncing with the remote one. My schedule is, therefore:

  • SSD: Using CCC to back up to the local NAS every six hours, starting at 4am.
  • 2TB Drive: Same as above, but starting the local CCC backup at 4:30am, again repeating every six hours.
  • At midnight, the local NAS server syncs to the remote one. This time was chosen to minimize the disruption due to Internet bandwidth utilization. The sync uses nearly 100% of my 11Mbps upload bandwidth and 20% of the remote location’s download throughput. (I’m in the process of testing router-based bandwidth throttling.)

Each drive gets backed up every six hours. The remote copies are at most 24 hours old.
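
Neither CCC nor Synology’s DSM uses cron for this (each has its own scheduler), but expressed as crontab entries the stagger would look something like the following, with hypothetical task names:

    0  4,10,16,22 * * *  ccc-backup-ssd      # SSD -> local NAS, every six hours
    30 4,10,16,22 * * *  ccc-backup-2tb      # 2TB drive -> local NAS, 30 minutes later
    0  0          * * *  shared-folder-sync  # local NAS -> remote NAS, nightly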

I considered the option to “Run sync on modification” between the local and remote NAS servers, but it just seems too risky. Because it’s the bands of the sparsebundles that are actually replicated, and it’s not possible to know how files are spread across the bands, there are too many opportunities for replication errors and for incomplete backups due to file locks when a sync starts while a CCC backup is underway. (I may need to revisit this option after more experience.)

Performance Tests

I measured the following performance, all using a single GB Ethernet connection between devices. The test consisted of 13.27GB of actual files from my /Applications folder. My selected configurations are marked below by an asterisk (*):

  • Cloning using CCC from the iMac to a sparsebundle in an NAS shared folder:
    • unencrypted: 68.0MB/s
    • encrypting via CCC on the iMac to an unencrypted shared folder: 58.2MB/s (1.17x slower)
    • to an encrypted shared folder: 14.3MB/s (4.8x slower)*
  • Syncing a shared folder between two local Synology NAS DiskStations:
    • unencrypted: 40MB/s
    • folder encrypted but no additional encryption during transfer: 35.5MB/s*
    • folder not encrypted, but using encryption during transfer: 17.7MB/s (2.3x slower)
  • Syncing a shared folder over the public Internet:
    • Speed is limited by my upload bandwidth (maximum 11Mbps, average 8.9Mbps).
    • 11 minutes of overhead to compare local and remote folders even if there are no changes to sync. Plus…
    • 15 minutes/GB (one hour to sync images from a full 4GB SD card)

A Bootable Recovery Disk

If you ever need to completely restore your Mac, such as after replacing a trashed primary drive, you’ll need a Recovery Disk. The easiest way I’ve found to create one is to use a small USB 2.0 flash drive. First, erase/format it using Disk Utility with an OS X Journaled, case-sensitive file system. Then download and run Apple’s Recovery Disk Assistant. Put the USB drive away and hope you never need it. You should create a new recovery disk after every major OS X update.
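If you prefer Terminal to the Disk Utility GUI, the erase/format step looks something like this. Run “diskutil list” first: /dev/disk3 is only an example, and erasing the wrong device will destroy its contents:

    diskutil list                                   # identify the flash drive
    diskutil eraseDisk JHFSX Recovery /dev/disk3    # JHFSX = case-sensitive, journaled HFS+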

7 thoughts on “A New Backup Strategy”

  1. Hi Doug,

    Thanks for the writeup. Do you know of any disk clone/copy software (Carbon Copy Cloner, ChronoSync, etc.) that will verify the integrity of the source disk before it clones?

    I am currently using a G-Drive Studio 8TB running in JBOD, with the second disk as a backup, cloned to daily with ChronoSync. I am afraid, however, that my backup is pointless. If the first disk ever gets corrupted before I notice, it will clone all the corrupted data to the backup drive. So right now I fear I am getting very little protection beyond single-drive failure.

    Thanks for your input.
    Connor


    1. Connor: You can run the “fsck” utility (file system consistency check and interactive repair) from the command line. If you know how to write Linux/OS X shell scripts, you can create a cron job that first runs fsck and then, depending on its success, either reports an error or runs rsync, which is what most of the cloning apps use. I’m not doing that. Instead, I’m relying on Time Machine to protect me against the corruption of individual files, since Time Machine should keep a pre-corruption version. In theory, at least.
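
      A minimal sketch of what that might look like; the volume paths and email address are hypothetical, and on OS X “diskutil verifyVolume” is the live-volume stand-in for fsck (plain fsck wants the volume unmounted):

        #!/bin/sh
        # Check the source volume; clone only if the check passes.
        if diskutil verifyVolume /Volumes/Source; then
            rsync -a --delete /Volumes/Source/ /Volumes/Backup/
        else
            echo "verify failed; clone skipped" | mail -s "Backup aborted" you@example.com
        fi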


  2. I think (hope) you’re wrong about the Glacier costs. The cost is $0.01/GB, not $0.01/GB/month. Well, we’ll find out next month if I get billed again.

    Dan


  3. What about an iSCSI Initiator/Target on the Synology for Time Machine? Then the Synology would appear as local native storage. Am just thinking out loud about ways to reduce the number of disks in play. Also, the small app “Time Machine Editor” does allow you to schedule specific TM backups, rather than every hour.

