During a recent Twitter conversation about disaster recovery and business continuity testing, I began to consider how we communicate during a disaster. We do so not through normal channels but, more often than not, through interruption: constant requests for updates, criticisms, and outright demands for attention directed at the very people doing the work of recovering a system. Why does communication break down during a disaster recovery effort? Generally, because not enough testing has been performed to surface communication issues, or any other type of issue, ahead of time. How can we improve this communication, or even get the proper people involved, when six feet of snow, water, or mud surrounds our place of work?
Recently, we experienced a fairly catastrophic SAN failure: we lost two drives of a RAID-5 array. Recovery was time-consuming, needless to say, but it also exposed some general issues with the disaster recovery, business continuity, and overall architectures of many virtual environments. Luckily, we were able to restart one of the failed drives, let the hot spare take over for the second failure, and recover the vast majority of our data. There was corruption, of course, and that is where our backups came in: they were the ultimate dependency for restoration. How do you recover from a catastrophic failure? Do you fail over automatically to a hot site or cloud environment? Even if you fail over, how do you recover from the catastrophic failure itself?
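That dependency on backups is worth making concrete: a backup you have never verified is a hope, not a recovery plan. Below is a minimal sketch of the kind of periodic verification job I have in mind; the directory layout, manifest format, and checksum scheme are all illustrative assumptions, not a description of any particular product.

```python
#!/usr/bin/env python3
"""Sketch: verify backup integrity against checksums recorded at backup time.

Assumes the backup job writes a 'manifest.txt' of '<sha256>  <relative path>'
lines alongside the data. All paths and names here are hypothetical; adapt
them to whatever your backup tooling actually produces.
"""
import hashlib
import sys
from pathlib import Path

BACKUP_ROOT = Path("/mnt/backups/latest")  # hypothetical mount point


def sha256_of(path: Path) -> str:
    """Stream the file in 1 MiB chunks so large images don't exhaust memory."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(root: Path) -> int:
    """Return the number of missing or corrupt files listed in the manifest."""
    failures = 0
    for line in (root / "manifest.txt").read_text().splitlines():
        if not line.strip():
            continue  # tolerate blank lines in the manifest
        expected, _, rel = line.partition("  ")
        target = root / rel
        if not target.is_file():
            print(f"MISSING  {rel}")
            failures += 1
        elif sha256_of(target) != expected:
            print(f"CORRUPT  {rel}")
            failures += 1
    return failures


if __name__ == "__main__":
    sys.exit(1 if verify(BACKUP_ROOT) else 0)
```

Run from a scheduler, a job like this turns "we have backups" into "we have backups we can prove are restorable," which is exactly the distinction a double-drive failure tests.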
VMware has been aggressively building and executing its hybrid cloud vision, extending the cloud outside of the data center. In line with this vision, VMware recently announced an expansion of its VMware vCloud Hybrid Service, adding disaster recovery to its list of offered services. This expansion will put VMware in direct competition with companies such as IBM, Sungard AS, Amazon, Rackspace, and Zerto in the Recovery-as-a-Service space.
Is automation killing engineering? When MTV first went on the air, the first video it played was "Video Killed the Radio Star." Fast-forward a few decades, and I have to wonder whether automation is killing engineering. In the early days of virtualization, administrators were expected to be proficient at the command line; to be honest, if you wanted to really understand how things worked, command-line administration was an absolute must-have skill. Virtualization has evolved from those early days. More and more features and services are being added to the infrastructure, and the need for that vast set of skills seems to be fading as the technology continues to mature. Looking forward to a time when cloud computing is working toward complete and total automation, I have to wonder how administrators will handle the stress of getting issues resolved when automation is not an option.
A major aspect of virtualizing any business-critical application is data protection, which encompasses not only backup but also disaster recovery and business continuity. It is imperative that our data be protected. While this is true of all workloads, it becomes a bigger concern when virtualizing business-critical applications. Not only do we need backups, but we also need to protect the business, which is where business continuity comes into play.
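One way to make "protect the business" measurable is to check, continuously, that the newest restorable copy is younger than the recovery point objective (RPO) the business has agreed to. Here is a minimal sketch along those lines, assuming backups land as timestamped files in a known directory; the path and the four-hour RPO are illustrative assumptions, not a recommendation.

```python
#!/usr/bin/env python3
"""Sketch: alert when the newest backup exceeds the agreed RPO.

The backup directory and the four-hour RPO are hypothetical; substitute
whatever your environment and business agreements actually specify.
"""
import sys
import time
from pathlib import Path

BACKUP_DIR = Path("/mnt/backups")  # hypothetical backup landing zone
RPO_SECONDS = 4 * 60 * 60          # assumed 4-hour recovery point objective


def newest_backup_age(directory: Path) -> float:
    """Age in seconds of the most recently modified file in the directory."""
    newest = max(
        (p.stat().st_mtime for p in directory.iterdir() if p.is_file()),
        default=0.0,  # an empty directory reads as an infinitely old backup
    )
    return time.time() - newest


if __name__ == "__main__":
    age = newest_backup_age(BACKUP_DIR)
    if age > RPO_SECONDS:
        print(f"RPO violated: newest backup is {age / 3600:.1f} hours old")
        sys.exit(1)
    print(f"OK: newest backup is {age / 3600:.1f} hours old")
```

Backup is only the first leg: pointing the same check at the replica in your disaster recovery site tells you whether the business continuity leg is healthy, too.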
The East Coast is experiencing the tail end of a very large storm named Sandy. We all had plenty of time to prepare for the storm, but did we? Individually, we probably did, but what about our data, and the 24/7 critical processes that allow our customers to view and respond to the data our organizations provide? We were lucky: we had no issues during the storm, but now we wait to see what issues arise during storm cleanup. So how do you prepare for such disasters? Do you move to the cloud?