Highly Reliable Systems: Removable Disk Backup & Recovery

The Reliability of Using Removable Drives and Mirroring

By Tom Hoops

Customer Question: We plan to use the 2 Bay Premier as a target for a continuous backup (either AppAssure or ShadowProtect) as described in your recent blog post on mirroring removable drives. Basically, the backup job would run continuously (every 15 minutes or maybe every hour, creating incremental updates). You suggested swapping the bottom drive each day, and that the automatic mirroring (AMT) would start a new mirror each night. My tech is concerned about the strain of breaking the mirror and recreating the mirror each day. I thought he had a good point. What kinds of issues does that pose for the integrity of the unit and the drives?

Answer: The 2-bay is specifically designed to accommodate the “broken mirror” concept for softwareless backup. What we mean by softwareless is that the backup software and host machine are unaware that an additional copy of the data is being made. Suppose we had 3 swap drives (4 hard drives total) and left one drive in at all times as the “primary”. There are several issues we could discuss here:

1) Will the connectors on the back of the removable tray accommodate hundreds or even thousands of plug/unplug cycles? The answer to that concern is yes. We’ve been asked why we don’t expose the bare SATA drive and use that as the rear “docking plug” to save costs. Those SATA connectors are spec’d at only 50 insertions by the standards committee. If you look at the type of connector we use, you’ll see it is a pin-type connector with high insertion ratings. While somewhat non-traditional, it is this connector that provides a reliable daily connection.

2) The primary drive (the one left in place) gets high read activity every day. We assume you will swap media every day, causing a full remirror. This requires reading every data block on the primary so that it can be written (mirrored) to the secondary drive. It could be argued that this extra activity creates wear and tear on the primary during the daily full remirror. Will the primary drive fail more rapidly for this reason? Well, we haven’t seen a failure correlation like this. Our head engineer suggests that if this is a concern, there is no inherent reason why you couldn’t alternate the swaps: swap the right-hand (or bottom) drive one day and, after the mirror is sync’d, swap the left-hand (or top) drive the next. The Automatic Mirroring Technology (AMT) doesn’t care, and you could balance total read activity this way. If it makes you feel better, by all means do it. But you will be “fixing” a problem that we’ve never seen happen.

3) The secondary drives (the ones swapped each day) will have power removed and applied each day. This power-cycle load is spread out over the 3 swap drives (in this example). The question is: are hard drives like light bulbs? Do they often fail when power is applied? Well, we’ve never seen a drive go “poof” when it was turned on, at least not in a way that we attribute to a power surge. The raw drive has its own hot-plug ability (hot plug was added to the SATA II spec) and our trays do have protection circuitry. We also mitigate this issue as best we can by requiring the key to be turned before the High-Rely Classic media is removed. This additional step provides even more protection.

4) Are there any anomalies (bugs) in the mirroring circuit that could cause corruption to occur after many swaps? We aren’t aware of any. It’s been in use this way since we first introduced it back in 2008, and we have not seen issues with the re-mirroring process. It does bring up an interesting point though. We think it would be a good “best practice” to periodically run CHKDSK /F on your backup media (as well as your source drive). This could be invoked as a scheduled job or done manually. Scheduling CHKDSK is a bit scary in that a repair pass could itself create data loss or other problems. If it were scheduled, it’d be important to view the logs to see if problems were found and fixed (Event Viewer, Windows Logs, Application). We HAVE seen successful backups (images were created fine by ShadowProtect and other programs) in which the source drive was later found to have corruption. This corruption was merrily imaged onto the xxxx.spf file located on the High-Rely Classic removable disk media. When the image was successfully restored, the host machine still wouldn’t boot because the original source boot partition was corrupted (and had been for over 30 days, so all the removable drives were equally useless). In other words, for the prior 30 days the server was up and running, but it was sick, and had anyone tried to reboot it, it wouldn’t have come up.
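One cautious way to act on the CHKDSK suggestion is to schedule only a read-only pass (CHKDSK without /F) and run the repair by hand when that pass reports trouble. The Python sketch below is just an illustration of that idea, which is a variation on the scheduled /F job described above. The helper names are hypothetical, and it assumes a Windows host and an elevated prompt. The exit-code meanings are the ones Microsoft documents for chkdsk.

```python
import subprocess

# Exit codes as documented by Microsoft for the chkdsk command.
CHKDSK_EXIT_CODES = {
    0: "no errors found",
    1: "errors found and fixed",
    2: "disk cleanup performed, or skipped because /F was not specified",
    3: "could not check the disk, or errors were not fixed",
}


def describe_exit_code(code):
    """Translate a chkdsk exit code into a short human-readable message."""
    return CHKDSK_EXIT_CODES.get(code, f"unrecognized exit code {code}")


def needs_attention(code):
    """Anything other than a clean result deserves a look at the logs."""
    return code != 0


def check_volume_read_only(volume):
    """Run chkdsk WITHOUT /F, so nothing on the volume is modified.

    `volume` is a drive letter such as "E:" (hypothetical example).
    Returns the exit code and the captured console output.
    """
    result = subprocess.run(["chkdsk", volume], capture_output=True, text=True)
    return result.returncode, result.stdout
```

A scheduled task could call `check_volume_read_only` on the backup media's drive letter, log `describe_exit_code` for the result, and alert someone whenever `needs_attention` is true, leaving the decision to run CHKDSK /F to a person.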

Clearly, it is reasonable to check for corruption on any drive periodically, whether or not AMT is in use. I hope this helps. We think Automatic Mirroring Technology is an awesome way to duplicate your backup!
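As a footnote to point 2, the effect of alternating which bay gets swapped can be pictured with a small simulation. This is a simplified model, not a description of how AMT works internally: it assumes one swap per day, a full remirror after each swap that reads the drive left in place, and four drives with the made-up labels A through D (A starts as the primary).

```python
from collections import Counter, deque


def simulate_full_reads(days, alternate, drives=("A", "B", "C", "D")):
    """Count how many full-drive reads each drive serves as the mirror source.

    alternate=False: the bottom bay is swapped every day (fixed primary).
    alternate=True:  bottom and top bays are swapped on alternating days.
    """
    bays = {"top": drives[0], "bottom": drives[1]}
    shelf = deque(drives[2:])  # drives currently out of the unit
    reads = Counter({d: 0 for d in drives})
    for day in range(days):
        # Decide which bay is swapped today and which drive stays put.
        swap = "top" if (alternate and day % 2 == 1) else "bottom"
        keep = "bottom" if swap == "top" else "top"
        shelf.append(bays[swap])      # the pulled drive goes on the shelf
        bays[swap] = shelf.popleft()  # the oldest shelved drive comes back
        reads[bays[keep]] += 1        # the drive left in place is read in full
    return dict(reads)
```

Under these assumptions, 12 days of swapping only the bottom bay has drive A serve all 12 full reads, while alternating bays spreads them evenly at 3 per drive.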

Tom Hoops

About Tom Hoops

CTO/VP Engineering, Highly Reliable Systems, Inc.
