Highly Reliable Systems: Removable Disk Backup & Recovery


Using High-Rely Netswap with HP DataProtector

By Darren McBride

In this white paper we discuss the features of Netswap that support backup software and make it work as an alternative to conventional tape, cloud, or NAS solutions.

In HP Data Protector, a new device type called File Library provides advanced backup-to-disk functionality.  A detailed white paper from HP describing the full software functionality with disk and how to set up and configure it can be found here.  Specifications of the latest 6.1 version of the software can be viewed here.  However, the High-Rely system will also work with all older versions of the software that support backup to network disk.  The rest of this paper discusses features specific to High-Rely products and the HP software.

Removable Drives with High Duty Cycle

High-Rely removable drive backup systems include fully enclosed aluminum trays that are designed to be swapped on a regular basis. We sometimes refer to this as “Highly Removable” to contrast our product with servers or NAS units that use unprotected raw drives.  Because normal server and NAS RAID array drives are only swapped when they fail, those systems tend to use unprotected (exposed) raw hard drives with connectors that wear out after 50 plug/unplug cycles.  With our daily swap design, the drives can replicate the functionality of tape or be used as “seed” drives to accelerate backup or restore in cloud backup designs.

Hidden Mirror Drives

High-Rely can support operating systems and backup software like Data Protector (which are often designed to use fixed disks) because of its use of mirrored drive sets.  Both our Netswap and RAIDPac products support this functionality by providing secondary drives: the units “hide” the removable mirrored drives from the software.  This allows the software to view the hard drive(s) as permanently connected network storage, while giving the user the option of swapping drives on a regular basis so that the data set can be transported off-site manually.

Currently Netswaps support drives of up to 10TB, and RAIDPacs support up to 30TB in RAID0 mode or 20TB in RAID5 mode.  Backup jobs should be configured to fit within these modular sizes to maintain easy portability. Larger jobs should be split into multiple backups by source.
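
As an illustration of splitting a large backup by source, the following is a minimal sketch that groups sources into jobs that each fit on one removable unit. The source names and sizes are hypothetical, and the first-fit-decreasing packing shown here is one simple choice for illustration, not a High-Rely or Data Protector feature. Usable capacity after formatting will be lower than the nominal drive size, so a real plan should leave headroom.

```python
from typing import Dict, List


def split_into_jobs(source_sizes_tb: Dict[str, float], capacity_tb: float) -> List[List[str]]:
    """Group backup sources into jobs that each fit on one removable unit.

    First-fit decreasing: place the largest sources first, each into the
    first job that still has room, opening a new job when none does.
    """
    jobs: List[List[str]] = []
    free: List[float] = []  # remaining capacity of each job, same order as jobs
    ordered = sorted(source_sizes_tb.items(), key=lambda kv: kv[1], reverse=True)
    for name, size in ordered:
        if size > capacity_tb:
            raise ValueError(f"{name} ({size} TB) exceeds one unit ({capacity_tb} TB)")
        for i, room in enumerate(free):
            if size <= room:
                jobs[i].append(name)
                free[i] -= size
                break
        else:
            # No existing job has room, so start a new one.
            jobs.append([name])
            free.append(capacity_tb - size)
    return jobs


# Hypothetical sources (sizes in TB) packed onto 10 TB drives.
sources = {"fileserver": 6.0, "mail": 3.5, "database": 5.0, "vm-images": 4.0}
jobs = split_into_jobs(sources, capacity_tb=10.0)
print(jobs)
```

Each inner list is one backup job sized to travel on a single drive or RAIDPac.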

Disk Sharing Methods

There are 3 ways of sharing hard drives across a network so that the backup software can find them:

1.    CIFS also known as SMB or simply a “Windows Share”.  This is the most common and allows multiple servers to push their backups to the Backup NAS.  Drive letters can be mapped or the Netswap can be accessed using a UNC path such as \\servername\sharename.

2.    NFS is typically used with Linux, Mac, or sometimes with virtualization platforms such as VMware.

3.    iSCSI – a one-to-one connection between a server and the High-Rely NAS device.  This can sometimes be a faster way of connecting and can also work around compatibility problems with the other two methods.
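
Whichever sharing method is used, the backup server ends up seeing the device as a path (a mapped drive letter, a UNC path, or a mount point). A pre-flight check before a job starts can be sketched as follows. The function name is hypothetical and the demonstration runs against the local temporary directory so that it works anywhere; in practice the path would be the share itself, and behavior against a UNC path is assumed here rather than verified.

```python
import os
import shutil
import tempfile


def target_is_ready(path: str, required_bytes: int) -> bool:
    """Return True if path is a writable directory with enough free space."""
    if not os.path.isdir(path):
        return False          # share not mounted or not reachable
    if not os.access(path, os.W_OK):
        return False          # backup account cannot write here
    return shutil.disk_usage(path).free >= required_bytes


# Stand-in for the real target, e.g. a mapped drive or \\servername\sharename.
demo_path = tempfile.gettempdir()
print(target_is_ready(demo_path, required_bytes=0))
```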

Cloud Replication

Any Netswap can be configured to replicate data off-site automatically over an Internet connection. Ports must be open in the firewalls and bandwidth must be available to deal with the large amount of data. Options for this type of offsite replication include box to box (where two Netswaps at different locations are configured to replicate in one direction or bi-directionally) or popular cloud options such as Amazon S3, Google Cloud Storage, DreamHost DreamCloud, and Dropbox. However, be aware that some of these public cloud options are “object” storage, meaning that uploaded files cannot be modified in place in the cloud.  Rather, they are designed around specific functions called “put”, “get”, and “delete” that operate on whole files.  This limitation matters when large files that are modified daily are uploaded (as with the synthetic backup used in backup software like HP Data Protector), because each change requires the entire file to be uploaded again.  For this reason a box to box replication scenario, or a Data Protector replication configuration with a second copy of the backup software at the remote site, may be a more practical option.
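
The cost of having only “put”, “get”, and “delete” can be shown with a toy model. The class below is an in-memory stand-in written for this illustration, not the API of any real cloud provider, and the 1 MB file size is an arbitrary assumption. It shows that changing a few bytes of a large backup file still means uploading the whole file again.

```python
from typing import Dict


class ToyObjectStore:
    """In-memory stand-in for storage that offers only put/get/delete."""

    def __init__(self) -> None:
        self._objects: Dict[str, bytes] = {}
        self.bytes_uploaded = 0

    def put(self, key: str, data: bytes) -> None:
        # A put always sends the whole object, even if one byte changed.
        self._objects[key] = data
        self.bytes_uploaded += len(data)

    def get(self, key: str) -> bytes:
        return self._objects[key]

    def delete(self, key: str) -> None:
        del self._objects[key]


store = ToyObjectStore()
backup = bytearray(1_000_000)            # hypothetical 1 MB backup file
store.put("full.bak", bytes(backup))     # initial upload

backup[0:4] = b"\x01\x02\x03\x04"        # a daily update touches 4 bytes
store.put("full.bak", bytes(backup))     # no in-place edit: upload it all again

print(store.bytes_uploaded)              # 2000000
```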

HP DataProtector’s new Distributed File Media Format (DFMF)

To improve backup speed with cloud replication, HP Data Protector software 6.0 introduces a new media format named distributed file media format (DFMF). This format can only be used with the HP Data Protector software File Library and is not enabled by default. Without this format, HP Data Protector software writes all data and catalog segments into one file. This is done per session; hence each session creates its own file. With the new media format, pure data blocks are written into different files. This is done for each file bigger than the block size in use (default 64 KB); hence for each such backed-up file, a dedicated file holding its data blocks is created on the File Library. If a consolidation session is performed on backups that are all located in the same File Library, the data to be consolidated is already stored in one or more media files. DFMF tries to reuse those files: consolidation sessions that create a virtual full do not copy the files hosting the data blocks; the new session only refers to them by way of pointers. Note that only consolidation sessions use pointers; normal backups, both full and incremental, always create new data block files.
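
The pointer idea can be sketched with a toy model. This is an illustration of the concept as described above, not HP's on-disk format: the `library`, `backup`, and `consolidate` names are hypothetical, and the 64 KB size threshold is ignored for brevity. Normal backups always write new data-block files, while consolidation only builds a new set of pointers.

```python
from itertools import count
from typing import Dict, List

_ids = count(1)
library: Dict[int, bytes] = {}   # data-block files held in the File Library


def backup(files: Dict[str, bytes]) -> Dict[str, int]:
    """Full or incremental backup: always writes new data-block files."""
    session: Dict[str, int] = {}
    for path, data in files.items():
        file_id = next(_ids)
        library[file_id] = data
        session[path] = file_id      # session records a pointer to the data
    return session


def consolidate(sessions: List[Dict[str, int]]) -> Dict[str, int]:
    """Virtual full: newest pointer per path wins; no data is copied."""
    merged: Dict[str, int] = {}
    for session in sessions:         # oldest first
        merged.update(session)
    return merged


full = backup({"a.doc": b"v1", "b.xls": b"v1"})
incr = backup({"a.doc": b"v2"})
files_before = len(library)

virtual_full = consolidate([full, incr])

print(virtual_full)
print(len(library) == files_before)
```

The virtual full points at the newest copy of `a.doc` and the original `b.xls`, and the number of data-block files in the library is unchanged by consolidation.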

Backup Speed and Connectivity

The Netswap backup NAS is a series of network-connected devices that can be plugged in via either Gigabit Ethernet or 10GigE (optional).  By default the performance of the backup will be limited to the speed of 1 Gigabit Ethernet (approximately 200 Gigabytes per hour of actual data backup) unless 10GigE is utilized, in which case the speed will be bottlenecked by the SATA hard drives or RAIDPacs used inside the backup NAS device.  With 10GigE it is possible to get 200-300 Gigabytes per hour of backup data for standalone High-Rely classic drives and up to 400 Gigabytes per hour for RAIDPacs.  These speeds are highly variable and can depend on source drive speed, network utilization, software used for backup, compression, encryption, and size of files.  Backup times can be radically reduced using synthetic backup, discussed in the next section.
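
To put these figures in context, the arithmetic below compares the theoretical ceiling of a 1 Gigabit link with the sustained rates quoted above and estimates a backup window. The 2000 GB data set is an assumed example, and decimal gigabytes are used throughout.

```python
def line_rate_gb_per_hour(link_gbit_per_s: float) -> float:
    """Theoretical ceiling of a link in decimal gigabytes per hour."""
    bytes_per_second = link_gbit_per_s * 1e9 / 8
    return bytes_per_second * 3600 / 1e9


def backup_hours(data_gb: float, rate_gb_per_hour: float) -> float:
    """Estimated backup window at a sustained rate."""
    return data_gb / rate_gb_per_hour


print(line_rate_gb_per_hour(1))    # 450.0  (wire-speed ceiling of 1 GbE)
print(backup_hours(2000, 200))     # 10.0   (hours at the quoted 1 GbE rate)
print(backup_hours(2000, 400))     # 5.0    (hours at the quoted RAIDPac rate)
```

The gap between the 450 GB per hour ceiling and the roughly 200 GB per hour quoted for real backups reflects the variables listed above, such as source drive speed, protocol overhead, and file size.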

Synthetic Backup

HP Data Protector Software A.06.00 introduces an advanced backup solution called synthetic backup. This solution enables you to create synthetic full backups and virtual full backups with an operation called object consolidation.  Synthetic backup is a backup solution that eliminates the need to run regular full backups. Instead, incremental backups are run, and subsequently merged with the full backup into a new, synthetic full backup. This can be repeated indefinitely, with no need to run a full backup again. In terms of restore speed, such a backup is equivalent to a conventional full backup.

The following figures present different situations, supposing you need to restore your data to the latest possible state. In all examples, a full backup and four incremental backups of the backup object exist. The difference is in the use of synthetic backup.

[Figure: HPINcremental]
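
The merge behind a synthetic full backup can be sketched as follows, using the same scenario of one full backup and four incrementals. This is a conceptual model with hypothetical file names, not Data Protector's object consolidation code; each incremental is assumed to record changed files and to mark deletions with `None`.

```python
from typing import Dict, List, Optional

Snapshot = Dict[str, Optional[str]]   # path -> content; None marks a deletion


def synthetic_full(full: Dict[str, str], incrementals: List[Snapshot]) -> Dict[str, str]:
    """Merge a full backup with its incrementals, applied oldest first."""
    result = dict(full)
    for incr in incrementals:
        for path, content in incr.items():
            if content is None:
                result.pop(path, None)   # file was deleted since the last backup
            else:
                result[path] = content   # file was added or changed
    return result


full = {"a.txt": "v1", "b.txt": "v1", "c.txt": "v1"}
incrementals = [
    {"a.txt": "v2"},     # incremental 1: a.txt changed
    {"d.txt": "v1"},     # incremental 2: d.txt added
    {"b.txt": None},     # incremental 3: b.txt deleted
    {"a.txt": "v3"},     # incremental 4: a.txt changed again
]
merged = synthetic_full(full, incrementals)
print(merged)
```

Restoring from `merged` takes a single pass, whereas restoring without consolidation means reading the full backup and then each of the four incrementals in order.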

Summary

If backup software supports backup to fixed disk, High-Rely can generally make it work with removable disk by hiding the removable drives from the software and operating system.  This technique means almost any backup software can utilize the additional protection of removable drives.