Azure Storage Service Interruption…Time For Disaster Recovery Plan
Yesterday evening Pacific Standard Time, Azure storage services experienced a service interruption across the United States, Europe and parts of Asia, which impacted multiple cloud services in these regions.
As part of a performance update to Azure Storage, an issue was discovered that resulted in reduced capacity across services utilizing Azure Storage, including Virtual Machines, Visual Studio Online, Websites, Search and other Microsoft services.
Read the whole report on the Azure blog. http://azure.microsoft.com/blog/2014/11/19/update-on-azure-storage-service-interruption/
So what does this outage mean to those thinking about a cloud deployment? Global “interruptions” of this magnitude certainly cannot occur on any regular basis for any cloud provider. Especially if it intends to remain in the cloud business, whether they are Microsoft, Amazon, Google or other. However, as a cloud architect or person responsible for a cloud deployment, you have a responsibility to your customer to have a “Plan B” in your back pocket. Time to plan for disaster recovery in case the worst case scenario actually happens.
What Is A Good Disaster Recovery Plan?
Plan B involves having a documented procedure for recovering data and services in an alternate location in the event of a wide spread outage that impacts a cloud provider’s ability to deliver their service. This plan is important even if you had a highly resilient cloud deployment designed to keep running even in the event of localized outages within a region, availability zone or fault domain.
Data Recovery, Application Recovery, and Client Access
At a high level you should be concerned about three things: Data Recovery, Application Recovery, and Client Access. There are many ways to address these concerns, some more automated than others. Some with a better Recovery Time Objective (RTO) and Recovery Point Objective (RPO) than others.
What Configuration To Beat The Outage?
It was just last week that I blogged about how to create a multisite cluster that stretched between the AWS cloud and the Azure cloud. This type of configuration is just what is needed in the event of an outage of the magnitude that we just experienced yesterday in the Azure cloud.
“Cloud-To-Cloud” Replication Model
Another alternative to the “cloud-to-cloud” replication model is utilizing your own datacenter as a disaster recovery site for cloud deployment. It is advantageous to have physical ownership of your data. But it would mean you are back in the business of managing a datacenter. This can negate some of the benefit of a pure cloud deployment.
If you are not ready to go full on cloud, make use of the cloud as a disaster recovery site. This is probably the easiest and most cost effective way to implement an offsite datacenter for disaster recover. Start taking advantage of what the cloud has to offer without fully committing to moving all workloads into the cloud.
DataKeeper Cluster Edition
The illustrations shown above make use of the host based replication solution called DataKeeper Cluster Edition to build multisite SQL Server clusters. However, DataKeeper can be used to keep any data in sync. Either between different cloud providers or in the hybrid cloud model.
Reminder! Have A Plan B
Microsoft is not alone in dealing with cloud outages as outages have impacted Google, Microsoft, Amazon, DropBox and many others just this year alone. Having a “Plan B” in place is a must have anytime you are relying on any cloud service.
Reproduced with permission from https://clusteringformeremortals.com/2014/11/20/azure-storage-service-interruptiontime-for-plan-b/