If you have not heard of Schrödinger’s cat, it is a thought experiment devised by the Austrian physicist Erwin Schrödinger in 1935. As the experiment goes, if you seal a cat in a box with a vial of poison that could be released at any moment, you will not know whether the cat is alive or dead until you open the box. Thus, until you open the box, the cat is simultaneously dead and alive.
So, what do cats and backups have in common? For starters, they both seem to have a plan. Further, they both have a habit of knocking down our dearest belongings at the moments we least expect.
Backing up data is a critical part of any organization’s disaster recovery plan. Why? Because data is precious! Watching hours of work disappear before our eyes because the computer crashed, the power went out, or for any of a myriad other reasons is not something we look forward to. Backups provide a safety net in the face of unexpected data loss. Justifying the need for data backup within any enterprise today should be a simple task, but determining an organization’s best backup strategy is not as easy. There are various software, hardware and cloud options to choose from, combined with a number of suggested policies and procedures.
One of the most popular data backup strategies originated with a creative professional, the photographer Peter Krogh, rather than with someone working in information technology or at a standards organization, as one might have expected. The rule that Krogh described is known as the 3-2-1 Backup Rule. A backup setup that follows it satisfies the following requirements:
• 3 Copies of Data – Maintain three copies of data: the original and at least two backups.
• 2 Different Media – Use two different media types for storage; this limits the impact of a failure that is tied to one specific type of storage media.
• 1 Copy Offsite – Keep one copy offsite; this prevents data loss due to a site-specific failure.
The 3-2-1 Backup rule is a revered and time-honoured backup strategy. And it is a rule to live by. But is it enough?!
Let us revisit the infamous incident at Pixar Animation Studios. In 1998, Pixar was about a year away from releasing Toy Story 2 when disaster struck. One of the film’s animators, while routinely clearing out files, entered the deletion command rm -rf * at the root directory of the Toy Story 2 project on Pixar’s internal servers. The team noticed that character models were disappearing from their work in progress. They pulled the plug on the file servers, but realized that 90% of the work from the last two years was gone.
The team reacted quickly by bringing in their tape backups. But the tapes they were using for backups in 1998 had an upper limit of 4 GB. Unfortunately, the movie project had grown to more than 10 GB in size, and the error log was also saved at the end of the tape, rendering all the backups useless. They only realized this when they actually attempted to restore the data. Luckily for us all, the movie’s technical director Galyn Susman was able to save the day. She had been working from home following the recent birth of her child, and thus had a backup copy of the film on her home computer. Staff were able to carry her computer into the office, where the team successfully recovered a two-week-old backup with almost all the original data, allowing them to resume work and deliver the finished film on schedule.
The Pixar team recovered nearly all of the lost assets, save for a few recent days of work, which allowed the film to proceed. If it had not been for Galyn Susman’s baby boy Eli, we might never have had this version of Toy Story 2. In reality, though, it was the offsite backup that saved the story.
This near disaster at Pixar Animation Studios shows us the importance of backups, especially for critical data, and, most importantly, of verifying the integrity of the backed-up data. Without that verification, the loss of data can be catastrophic.
Invoking Schrödinger’s cat again, let us extrapolate as follows…
Schrödinger’s Backup: The condition of any backup is unknown until a restore is attempted.
Almost all people and organizations are running their own Schrödinger’s Backup experiment. They configure backups the way they guess will work, see the backups run a few times without error, and assume everything will run smoothly from then on. When disaster strikes, they try to restore the data they backed up and realize to their horror that it does not restore the way they thought it would.
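One way to open the box before disaster strikes is to run a test restore and compare it with the original. The following is a minimal sketch in Python, not a definitive tool: the function names (file_digest, tree_digests, verify_restore) are hypothetical, and it assumes the original files are still available for comparison. In practice you would more likely compare against a list of checksums recorded when the backup was taken.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def tree_digests(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 digest."""
    return {
        p.relative_to(root).as_posix(): file_digest(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify_restore(source: Path, restored: Path) -> list[str]:
    """Compare a test restore with the source tree.

    Returns a list of problems; an empty list means every source file
    was restored with identical content.
    """
    expected = tree_digests(source)
    actual = tree_digests(restored)
    problems = []
    for name, digest in expected.items():
        if name not in actual:
            problems.append(f"missing: {name}")
        elif actual[name] != digest:
            problems.append(f"corrupt: {name}")
    return problems
```

Calling verify_restore(Path("project"), Path("restore_test")) on two example directories returns an empty list only when the restore is complete and intact. Anything else is a backup that was dead in the box all along.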
So, let us build some redundancy into the 3-2-1 Backup rule that we commonly follow…
The upgraded rule takes the 3-2-1 Backup rule as a starting point and adds two further conditions to help ensure recovery from any type of incident. This new 3-2-1-1-0 Backup rule should satisfy the following requirements:
• 3 Copies of Data – Maintain three copies of data: the original and at least two backups.
• 2 Different Media – Use two different media types for storage; this limits the impact of a failure that is tied to one specific type of storage media.
• 1 Copy Offsite – Keep one copy offsite; this prevents data loss due to a site-specific failure.
• 1 Copy Offline – Keep one copy offline, immutable, or air-gapped; this helps ensure recovery in case of a ransomware event.
• 0 Errors – Verify that the backed-up data has no errors.
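The five requirements above can also be written down as a simple checklist in code. The sketch below is only an illustration: the DataCopy class, its field names, and the check_3_2_1_1_0 function are made up for this example and do not belong to any backup product. It also assumes that the result of the last restore test is recorded by hand as an error count.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DataCopy:
    """One copy of the data (hypothetical model for illustration)."""
    name: str
    media: str                           # e.g. "disk", "tape", "cloud"
    offsite: bool = False                # kept away from the primary site
    isolated: bool = False               # offline, immutable, or air-gapped
    is_original: bool = False            # the live data, not a backup
    verify_errors: Optional[int] = None  # last restore test; None = never tested


def check_3_2_1_1_0(copies: list[DataCopy]) -> dict[str, bool]:
    """Report which parts of the 3-2-1-1-0 rule a set of copies meets."""
    backups = [c for c in copies if not c.is_original]
    return {
        "3 copies": len(copies) >= 3,
        "2 media": len({c.media for c in copies}) >= 2,
        "1 offsite": any(c.offsite for c in backups),
        "1 isolated": any(c.isolated for c in backups),
        # An untested backup (None) counts as a failure, not a pass.
        "0 errors": bool(backups) and all(c.verify_errors == 0 for c in backups),
    }


# Example plan: meets the classic 3-2-1 rule, but not the two additions.
plan = [
    DataCopy("workstation", media="disk", is_original=True),
    DataCopy("office NAS", media="disk", verify_errors=0),
    DataCopy("cloud bucket", media="cloud", offsite=True),
]

for rule, ok in check_3_2_1_1_0(plan).items():
    print(f"{rule}: {'ok' if ok else 'NOT MET'}")
```

The example plan passes the three classic checks but fails the last two: no copy is isolated, and the cloud copy has never been through a restore test. Treating "never tested" as a failure is a deliberate choice here, in the spirit of Schrödinger’s Backup.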
These two additions are critically important for any backup scenario. Keeping a copy of the backup data that is offline, immutable, or air-gapped makes the backup far more resilient and helps ensure recovery after a ransomware event. Zero errors is the baseline we should start from: once the backup process completes, the integrity of the backed-up data should be verified. If your backup method or application does not support this potentially lifesaving feature, it is time to switch.
Thus we see that even a single modification to your backup strategy can make all the difference in the world. Remember to back up your data successfully, test your backups regularly, and have a strategy ready to restore your data and/or infrastructure within a predefined time frame. If you just back up your data and hope you will be able to recover in time, you are betting against Murphy.