Only 60% of organizations have a disaster recovery plan for the cloud, and most companies find it difficult to back up data from hybrid clouds, big data applications, and containers. In short, it’s hard to back up and store data if you’re not sure where it lives — which is a central difficulty with these newer technologies. Considering that, on average, up to a third of your data is stored in a cloud environment, it may be time to dust off your plans for recovering that data in the event of a catastrophe.
How Do You Lose Data in the Cloud?
It’s a little hard to imagine losing data in the cloud. After all, the cloud is where most of your on-premises information is likely backed up. If most major backup services use the cloud, how can the cloud itself also be a place where you lose data?
First, outside of your existing backups, the cloud is a production environment. Over a typical day, developers and engineers create and delete large volumes of information as they support and expand your product. It’s entirely possible for someone to click the wrong button on the wrong volume and destroy something that shouldn’t be lost. By the same token, it’s also easy to give someone the wrong level of access, granting them the ability to read, copy, and delete data they shouldn’t be able to manipulate. Lastly, you can misconfigure the infrastructure itself — in other words, build a system that crumbles just when you need extra capacity.
In addition to human error, there’s also software error. Say you’ve configured one of your applications to update automatically. Not all patches work well: some may cause your applications to stutter or even fail entirely, and the result can be downtime or data loss.
In less traditional environments, such as containers, data loss can occur when engineers don’t fully understand the limitations of the solution. In Kubernetes environments, for example, using an asynchronous replication mode may mean database transactions are committed only to the primary database, leaving a single point of failure. And unlike container images, which can simply be restarted, containers that write application data to their own storage will lose that data permanently when they fail.
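To sketch that last point, a named volume keeps data outside any single container, so it survives the container being removed. A minimal illustration (the volume, container, and file names here are hypothetical):

```shell
# Create a named volume; its contents live outside any one container
docker volume create app-data

# Write data through a throwaway container, then discard the container
docker run --rm -v app-data:/data alpine \
    sh -c 'echo "order-1001" > /data/orders.log'

# The container is gone, but the volume (and the data) persists
docker run --rm -v app-data:/data alpine cat /data/orders.log
```

Data written to a container’s own writable layer, by contrast, disappears with the container.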
Acts of Hackers
Finally, let’s not discount enemy action. If someone steals a cloud password, discovers an unprotected volume, or finds the right vulnerability, they’re free to do whatever they want with the information they find. You need to prepare for the possibility that they can disable backups, encrypt your data, or simply delete it wholesale.
On rare occasions, you may also have to worry about cloud outages. Although this is less common with larger providers, backup generators failed during a recent power outage at an AWS data center, leaving some EBS volumes unrecoverable. Even a service with five-nines uptime can still lose your data completely.
So, How Do You Protect Your Data in the Cloud?
Cloud data protection isn’t necessarily different from data protection on-premises. You still need to do the same four things:
- Identify the data that most critically needs protecting
- Develop and deploy a strategy for protection
- Make redundant backups
- Test your strategy
The problem, of course, is that the architecture and methods are a little different.
First, let’s talk about accidental deletion. This issue is common enough that Google, Microsoft, and Amazon all allow users to flag specific VMs, volumes, and instances so that they can’t be deleted on a whim. In keeping with the workflow above, you need to perform an audit to find mission-critical instances, apply these anti-deletion flags, and then back up these volumes in case someone ignores the flags anyway.
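On AWS, for instance, this flag is called termination protection, and it can be set from the CLI (the instance ID below is a hypothetical placeholder):

```shell
# Enable termination protection on a mission-critical instance
aws ec2 modify-instance-attribute \
    --instance-id i-0123456789abcdef0 \
    --disable-api-termination

# A later terminate call will now be rejected until the flag is lifted:
# aws ec2 terminate-instances --instance-ids i-0123456789abcdef0
```

Google Cloud and Azure offer comparable deletion-protection settings for VMs and disks; the flag stops an accidental click, not a determined attacker, which is why the backup step still matters.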
With more complex systems such as Docker, you need to be careful about your methods. For example, you can back up a Docker container by committing it as an image and saving that image as a .tar file, but this method won’t back up the data volumes associated with the container. For those, you need to copy the volume and compress it – the compressed copy will serve as your backup.
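Both halves of that workflow might look like this (container, image, and volume names are illustrative):

```shell
# 1. Snapshot the container's filesystem as an image, then archive it
docker commit mycontainer mycontainer-backup
docker save -o mycontainer-backup.tar mycontainer-backup

# 2. Data volumes are NOT included above. Archive them separately by
#    mounting the volume read-only into a helper container and
#    compressing its contents to the host
docker run --rm \
    -v app-data:/volume:ro \
    -v "$(pwd)":/backup \
    alpine tar -czf /backup/app-data.tar.gz -C /volume .
```

The resulting .tar and .tar.gz files are what you’d ship to off-site or cloud storage.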
It’s important to note, however, that as the popular saying goes, “you don’t have a backup unless you’ve restored data from it.” Unless you set up a test environment to understand how your cloud backups will work in the event of a disaster, your backup strategy is a castle in the air. You need to simulate backup and restore scenarios on a regular basis and update your plans based on the results.
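At its simplest, a restore drill means restoring a backup into a scratch location and verifying it matches the source. A minimal local sketch of that loop (the paths and file contents are illustrative, not a production drill):

```shell
set -e
# Stand in for production data
mkdir -p /tmp/drill/data /tmp/drill/restore
echo "critical record" > /tmp/drill/data/record.txt

# Take the "backup"
tar -czf /tmp/drill/backup.tar.gz -C /tmp/drill/data .

# Restore it to a separate location and verify it matches the source;
# diff exits non-zero (failing the drill) if anything differs
tar -xzf /tmp/drill/backup.tar.gz -C /tmp/drill/restore
diff -r /tmp/drill/data /tmp/drill/restore
```

The same shape applies to cloud snapshots and container volumes: restore to an isolated environment, compare against expectations, and record how long the restore took.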
With Device42, it becomes easier for you to evolve a backup strategy. Our application dependency mapping tools help you find everything that’s mission critical, no matter where it lives – in the cloud, in VMs, or in containers. This means less uncertainty when it comes to protecting your users and customers – and less time spent restoring when disaster strikes. Get a free demo today!