How to Create a Better Disaster Recovery Plan

Bob Violino November 27, 2012
Cloud services, virtualization, mobile devices, and social networking can keep your business going when catastrophes hit. Here’s how these tech trends can help you improve your DR and BC planning.
As we've seen in recent years, natural disasters can lead to long-term downtime. Because earthquakes, hurricanes, snow storms, or other events can put datacenters and other corporate facilities out of commission for a while, it's vital that companies have a comprehensive disaster recovery (DR) plan in place. DR is a subset of business continuity (BC), and like BC, it's being influenced by some of the key trends in the IT industry, foremost among them cloud services, server and desktop virtualization, the proliferation of mobile devices, and the growing popularity of social networking as a business tool.
These trends are forcing many organizations to rethink how they plan, test, and execute their DR strategies. IT and security executives need to consider how these developments can best be leveraged so that they improve, rather than complicate, DR efforts.
Cloud Services
As organizations use more internal and external cloud services, they're finding that these resources can become part of a disaster recovery strategy. Marist College in New York provides numerous private cloud services to internal users and customers. It also hosts services for 17 school districts and large enterprise clients. "The cloud configuration allows us to perform software upgrades across the multiple tenant systems quickly, easily and without disruptions," says Bill Thirsk, vice president of IT and CIO at the college.
"Because our storage is virtualized, we can replicate data across SANs that we have placed strategically on our campus in numerous locations and in our datacenter. A loss of a SAN means only that production operations switch over to another."
Because Marist can perform server-level backups across partitions, it can move data from one server platform to another should an event occur, Thirsk says.
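The switch-over behavior Thirsk describes can be illustrated with a minimal sketch. It assumes a priority-ordered list of replicated storage targets, each with a health flag. The class, function, and SAN names are hypothetical and are not taken from Marist's environment. Real failover is handled by the storage and virtualization layers rather than by application code like this.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class StorageReplica:
    """A replicated storage target (hypothetical model for illustration)."""
    name: str
    healthy: bool


def pick_active_replica(replicas: List[StorageReplica]) -> Optional[StorageReplica]:
    """Return the first healthy replica in priority order, or None if all are down."""
    for replica in replicas:
        if replica.healthy:
            return replica
    return None


# Hypothetical example: the primary SAN is lost, so production
# switches over to the next healthy replica on the list.
replicas = [
    StorageReplica("san-datacenter", healthy=False),
    StorageReplica("san-campus-north", healthy=True),
    StorageReplica("san-campus-south", healthy=True),
]

active = pick_active_replica(replicas)
print(active.name if active else "no healthy replica available")
```

The point of the sketch is only that losing one replica changes which target is active. It does not interrupt service as long as at least one healthy copy remains.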
There's big potential value in cloud-based DR services, says Rachel Dines, senior analyst, infrastructure and operations, at Forrester Research. Cloud-based DR has the potential to give companies lower costs yet faster recovery, with easier testing and more flexible contracts, Dines says.
In a 2012 report from Forrester, the firm says cloud-based DR threatens to shake legacy approaches and offer a viable alternative to organizations that previously couldn't afford to implement disaster recovery or found it to be a burdensome task.
Perhaps the biggest downside to the cloud from the standpoint of DR is concern about security and privacy management. "You still see with some major events, such as the lightning strike in Dublin [in 2011] that took out the cloud services of Amazon and Microsoft, that there can be some temporary loss of service," says John Morency, research VP at research firm Gartner. "The cloud shouldn't be considered 100 percent foolproof. If organizations do need that 100 percent availability guaranteed they need to put some serious thought into what they need to develop for contingencies."
A growing number of larger companies with complex IT infrastructures are putting in private clouds and using these as part of their disaster recovery strategies, rather than relying on public cloud services, Morency says. "They worry about being left out in the cold during a disaster" if service providers are not able to provide service, he says.
Morency notes that this is only true in the case of DR subscription services that provide floor space and actual equipment at a specific geographical location. "Given the more distributed and virtual nature of public clouds, this is much less of an issue," he says.
What the cloud has done for traditional disaster recovery service providers is make testing of their backup capabilities more flexible and less costly, Morency says.
Virtualization

For many organizations, server virtualization has become a key component of the disaster recovery strategy because it enables greater flexibility with computing resources. "Virtualization has the potential to speed up the implementation of a DR strategy and the actual recovery in a disaster," says Ariel Silverstone, an independent information security consultant and former CISO of Expedia.
"It also has the ability to make disaster recovery more of an IT function rather than a corporate audit-type function," Silverstone says. "If you have the right policies and processes in place, [with virtualization] disaster recovery can become part of automatically deploying any server."
For Teradyne, a supplier of test equipment for electronic systems, virtualization has been an enabler for a much improved DR capability, says Chuck Ciali, CIO.
"We have leveraged virtualization for DR significantly," Ciali says. Using virtualization technology, Teradyne can seamlessly fail over to redundant blade servers in the case of hardware problems. It can also use the technology to move workloads from its commercial datacenter to its research and development datacenter in case of disasters.
"This has taken our recovery time from weeks [or] days under our former tape-based model to hours for critical workloads," Ciali says. The change also saves $300,000 (about Rs 1.6 crore) per year in DR contract services, he says.
Marist College has deployed virtualization, and one of the benefits is avoiding system unavailability. "We do all we can to avoid any event that would cause users dissatisfaction, loss of access or loss of functionality," Thirsk says. "To do so, we utilize massive virtualization of our processors, our network topology and our storage."
Because Marist IT can now provide a virtual server and a virtual network, as well as spin out storage, "our systems assurance activities move along at a very rapid rate," Thirsk says.
"If at any point of testing something goes horribly wrong, we can decide to trash it and start over or continue forward, all without much trouble at all on the system side."
On the whole, server virtualization has made DR a lot easier, Dines says. "Because virtual machines are much more portable than physical machines and they can be easily booted on disparate hardware, a lot of companies are using virtualization as a critical piece of their recovery efforts," she says.
There are lots of offerings in the market that can perform tasks such as automating rapid virtual machine rebooting, replicating virtual machines at the hypervisor layer with heterogeneous storage, and turning backups of physical or virtual machines into bootable virtual machines, Dines says.
"Ultimately, virtualization means companies can get a faster RTO [recovery time objective] for less money," she says.
On the downside, the popularity of virtualization has led to virtual machine sprawl at many organizations, which can make DR more complex. "Companies have the [virtualization] structure in place that gives them the ability to create many more images, including some they do not even know about or plan for," Silverstone says. "And they can do so very quickly."
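One way to keep sprawl from undermining a DR plan is to compare what is actually running against what the plan covers. The sketch below assumes two plain lists of VM names: one exported from the hypervisor's inventory and one taken from the DR plan. The function names and VM names are hypothetical, and how the lists are obtained depends on the virtualization platform in use.

```python
from typing import Iterable, List


def find_unplanned_vms(running_vms: Iterable[str], dr_plan_vms: Iterable[str]) -> List[str]:
    """VMs that exist in the environment but are missing from the DR plan."""
    return sorted(set(running_vms) - set(dr_plan_vms))


def find_stale_plan_entries(running_vms: Iterable[str], dr_plan_vms: Iterable[str]) -> List[str]:
    """DR plan entries that no longer correspond to a running VM."""
    return sorted(set(dr_plan_vms) - set(running_vms))


# Hypothetical inventory export and DR plan contents.
running = ["erp-db-01", "erp-app-01", "mail-01", "test-clone-17", "dev-sandbox-03"]
planned = ["erp-db-01", "erp-app-01", "mail-01", "payroll-01"]

print("Not covered by DR plan:", find_unplanned_vms(running, planned))
print("In plan but not running:", find_stale_plan_entries(running, planned))
```

Run on a schedule, a check like this would surface the images Silverstone warns about, the ones created quickly and never planned for, before a disaster does.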
Mobile Devices 

From a disaster recovery standpoint, the growing use of mobile devices such as smartphones and tablets facilitates the continuation of IT operations and business processes even after a disaster strikes.
"People will carry their mobile devices with them," says George Muller, vice president, sales planning, supply chain and IT at Imperial Sugar, a processor and marketer of refined sugar.
"I might not carry my laptop wherever I go, but if all of a sudden we've got a disaster I've probably got my BlackBerry in my shirt pocket. Anything that facilitates connectivity in a ubiquitous way is a plus."
One of the positive impacts of the prevalence of mobile devices is that it gives people a greater ability to work remotely and communicate using their devices in an emergency, says Malcolm Harkins, vice president of the IT group and CISO at microprocessor manufacturer Intel.
But mobile device proliferation has also made disaster recovery slightly more complex, Dines says. "Along with mobile devices comes more datacenter infrastructure, such as mobile device management and [products] such as the BlackBerry Enterprise Server, which are often very critical," she says. "This becomes one more system that must be planned for and properly protected."
Another possible negative with mobility in a disaster recovery scenario is that some critical enterprise applications, such as payroll, might not be available for mobile devices, Silverstone says.
Harkins notes that there are potential security risks, such as non-encrypted mobile devices being lost or stolen, and unauthorized access to corporate networks from these devices. But these risks can be mitigated by the ability to wipe data from devices remotely over the Internet.
Social Networking

Like mobile devices, social networking gives people another way to stay in contact during or after a disaster.
"We've seen instances such as a couple of years ago when we had major snow storms on the East Coast and a lot of businesses shut down and employees kept in touch with each other via Facebook and Twitter vs. e-mail," Morency says.
In some cases it might take days or weeks for a corporate datacenter to recover after a disaster. And if the company is relying on internal e-mail systems, that might put e-mail service out of commission, Morency says.
"Assuming that either public or wireless networks are still available you can now be using social media to communicate, as an alternative to in-house e-mail which may not be available," Morency says.
"If you're using a service like Gmail then it's less of an issue. But if you're using an Exchange-based internal e-mail or directory services, then social media may be a more available alternative."
During a recent disaster test that Marist College performed, "we were curious to see how social networking would be used in case of an actual event," Thirsk says. One early morning the IT department launched an unannounced disaster drill. "While we had warned staff we would be doing this, they had no idea how real we were going to make it," he says.
First, Thirsk sent a message that the college was experiencing a massive system failure. Due to building conditions, staffers could not report to their workplace or to the datacenter. "We shut down our enterprise communications systems and then watched how the staff responded," Thirsk says.
Managers quickly began communicating to their staff via outside e-mail accounts, chat rooms, Facebook, and Twitter. "They even found my personal e-mail account off campus and began messaging me," Thirsk says.
In a matter of 20 minutes, all staff had reported to a command center in the campus library, where they were tasked with performing a number of system checks, verifications and processes. "All of this activity occurred using alternate communications methods," Thirsk says. "We documented this exercise and now use it as part of our plan."
Forrester says there are several reasons why social networking should play a role in an emergency communications strategy. For one thing, social technology adoption is increasing, and a greater portion of employees and customers have continuous access to social sites such as Twitter and Facebook.
In addition, social channels are essentially free. It costs very little to set up a Facebook, Twitter, or Yammer profile, recruit followers, and send out status updates.
Social media sites can also facilitate mass communication with external parties, the firm says. Typically, during a crisis immediate communication is limited to internal staff. However, companies should also plan for situations that call for communication with partners, customers, public officials, and the public at large. Social media sites make it easy to establish these external connections.
Finally, the environment of social discussions provides mass mobilization and situational awareness. Social networking sites offer unique advantages in the crisis communications arena, Forrester says.