Have you heard of the General Data Protection Regulation (GDPR)? Do you do business in Europe or with European citizens? The GDPR is intended to protect the privacy of individuals and has wide-ranging implications for technology companies and data backup. After four years of preparation and debate, the GDPR was finally approved by the EU Parliament on 14 April 2016. Enforcement begins on 25 May 2018, at which point organizations not in compliance may face heavy fines.

GDPR was designed to harmonize data privacy laws across Europe, to protect and empower all EU citizens' data privacy, and to reshape the way organizations across the region approach data privacy. The GDPR creates an EU-wide set of standards for the protection of digital personal data relating to the online or real-world behavior of EU internet users. Importantly, these standards apply to the personal data of EU internet users regardless of the location of the entity holding their data.

There are a variety of provisions, but the one we will discuss here is the so-called “Right to be Forgotten” (RTBF), under which a user can demand that his or her personal information be removed. The regulation states:

Data subjects have the right to request the controller to erase his or her personal data without undue delay where: the data is no longer necessary for the purposes collected; the data subject withdraws consent; or the data subject objects to data processing. Where the controller has made the data public, the controller shall take reasonable steps to inform other controllers processing that data of the erasure request.

Let’s discuss this requirement in the context of backup. Many backups are “image-based”: files are packed into one large container structure. This “wad” of files may take the form of anything from a .zip to a virtual machine image (VMDK, VDI, VHD, etc.). The central question is:

If a user requests his data be deleted, does that also apply to all backups?

This question is key because in most cases it’s impossible to delete a file (or really a record inside that file) from an image-based data store without corrupting the rest of the data. Since many backups are image-based, not record-based, even if you could identify the blocks pertaining to a given person inside a database stored in a file or series of files, deleting those blocks would corrupt the rest of the backup. And the idea that you would restore every copy of the hard drive you have, delete the files or records associated with the person in question, then back the drive up again is difficult to contemplate. Ironically, since that process is also risky (you could accidentally corrupt the backups), you could end up violating another part of the GDPR: the requirement to be able to safely restore personal data that was deleted or corrupted.
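To make the restore, delete, and re-backup cycle concrete, here is a minimal Python sketch that uses a .zip file as a stand-in for a backup image. The helper names (make_archive, rewrite_without) and the users/alice/ path layout are hypothetical, chosen only for illustration. Rather than editing the archive in place, the sketch copies every surviving member into a brand-new archive and checks its integrity before returning it. A real VM image or database dump would need far more than this, and the same rewrite would have to be repeated for every retained backup.

```python
import io
import zipfile


def make_archive(files: dict[str, bytes]) -> bytes:
    """Build an in-memory zip archive standing in for one backup image."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for name, data in files.items():
            zf.writestr(name, data)
    return buf.getvalue()


def rewrite_without(archive: bytes, should_erase) -> bytes:
    """Copy every member into a fresh archive, skipping those flagged for erasure."""
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(archive)) as src, \
            zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as dst:
        for info in src.infolist():
            if should_erase(info.filename):
                continue  # this member is left out of the new archive
            dst.writestr(info, src.read(info))

    # Check the rewritten archive before it is allowed to replace the old one.
    rewritten = out.getvalue()
    with zipfile.ZipFile(io.BytesIO(rewritten)) as check:
        if check.testzip() is not None:
            raise ValueError("rewritten archive failed its integrity check")
    return rewritten


if __name__ == "__main__":
    # Hypothetical layout: one folder per person inside the backup.
    backup = make_archive({
        "users/alice/profile.txt": b"alice's personal data",
        "users/bob/profile.txt": b"bob's personal data",
    })
    cleaned = rewrite_without(backup, lambda name: name.startswith("users/alice/"))
    with zipfile.ZipFile(io.BytesIO(cleaned)) as zf:
        print(zf.namelist())
```

Note that this only works because the person's data sits in separately named files. If the data were rows inside a database file stored in the archive, the member could not simply be skipped without losing everyone else's records too.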

One option is to go back to doing backup at the file level using a product like RNAS with High-Sync, but it’s unlikely that will be a choice for everyone. Even then, deleting one person’s record inside a structured relational database poses problems similar to those discussed above. As with HIPAA in the United States, much has been written speculating about what the regulation means and how to comply with it. With the UK leaving the European Union, the regulation and the question of who enforces it are mired in politics. Time will tell how it impacts your backup policies.