A Standardized Approach for Better Disaster Recovery Planning

John J. Germain, VP, Infrastructure & Security Services, Xylem

Disaster Recovery planning is a difficult and tedious endeavor, and it often feels like you never quite get the full benefit of the effort you put into it. Because it is so daunting, it often gets pushed back on the priority list, way back behind things like cleaning up that wiring closet or doing your expense reports. But the reality is that if you don’t have a solid plan when disaster strikes, the ramifications could also be a disaster for your career.

“The better approach is to focus less on the type of potential disasters and more on what critical functions the business can no longer perform”

I have tried to make the effort as painless as possible for my team by standardizing our approach and by not trying to boil the ocean all at once. Below is how we are executing in general terms, and although success is hard to measure until lightning strikes, it has been an effective way for us to establish our program. To be honest, this is not rocket surgery. It is just a lot of hard work and attention to detail, held together by strong governance and a solid framework.

Three questions to answer before hunkering down and writing plans:

1. Who owns disaster recovery? The IT function might seem like the obvious choice, and maybe it is, but it is also important to consider that the people running the business will need to be involved as well. IT may not have the intimate knowledge of how the business operates, how revenue is generated or what systems and services are most critical. This information is vital when determining recovery priority.

2. What do you expect to get out of the effort? You only get out of it what you put in, and it is important to decide your goals upfront. If all you want to do is check a box for audit, then you may not end up with plans that deliver if and when needed. DR planning is hard and takes a lot of time, and in the end you hope that you never have to use the plans. As an organization you have to decide what your true intentions are and then dedicate the appropriate resources.

3. How do you define a disaster? A difficult part of DR planning is defining what constitutes a disaster or, maybe better put, trying to account for every type of calamity. In my opinion the better approach is to focus less on the type of potential disasters and more on what critical functions the business can no longer perform (shipping product, communicating with customers, closing the financial books, maintaining web site presence, etc.) and then plan for those scenarios.

The answers to the above can help establish your DR governance, which is an important first step. Policy, standards and a few templates can provide a consistent approach to the program, and following a good framework makes this easier. No need to re-invent the wheel here. ISO 22301, ISACA ITAF, NIST SP 800-34 and NFPA 1600 are just a few places to start. The important point here is that a well-defined program with well-defined objectives will go a long way towards making the journey easier.

After the above is agreed upon, I’ve broken down the approach into five areas:

Asset Inventory- Like I said, this is hard work, and creating and maintaining an accurate inventory of your IT assets is probably the hardest part. There are of course other benefits to having this inventory, and it is well worth the effort if you can do it effectively.

Business Impact Analysis- This is key in identifying what the critical IT systems and services are. This is where the non-IT people are needed most. Don’t start a DR plan without a BIA.

DR Plan- It is hard to include everything in a comprehensive plan, but the more detailed the plan, the better the results. However, don’t strive for perfection in the first version; instead focus on the most critical resources. The plan will evolve over time and will be refined as it is tested both in tabletops and real events.

Testing- There are big benefits in exercising the plan to work out any bugs and validate the plan elements. Try not to have an actual disaster be the first time the plan is used. We select a subset of the plan's elements to test each year, but if you can arrange it, a full test can be fun for the whole family.

Refresh- Another challenge with DR is keeping the plans fresh. This is not a one-and-done effort. Plans should be reviewed at least annually. The BIA should also be reviewed annually to account for changes to the business and technology.
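
The annual review rule in the Refresh step is easy to automate as a reminder. Below is a minimal sketch, not a description of any tool we actually use: the `DrDocument` record, the sample documents and the 365-day interval are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Assumption: "at least annually" is treated as 365 days.
REVIEW_INTERVAL = timedelta(days=365)


@dataclass
class DrDocument:
    """Hypothetical record for a DR plan or a BIA."""
    name: str
    kind: str            # "plan" or "bia"
    last_reviewed: date


def overdue(documents, today):
    """Return documents last reviewed more than REVIEW_INTERVAL ago, oldest first."""
    stale = [d for d in documents if today - d.last_reviewed > REVIEW_INTERVAL]
    return sorted(stale, key=lambda d: d.last_reviewed)


# Made-up sample data.
documents = [
    DrDocument("Order shipping DR plan", "plan", date(2023, 1, 15)),
    DrDocument("Financial close DR plan", "plan", date(2024, 3, 1)),
    DrDocument("Enterprise BIA", "bia", date(2022, 11, 30)),
]

for doc in overdue(documents, today=date(2024, 6, 1)):
    print(f"{doc.kind.upper()} overdue for review: {doc.name} "
          f"(last reviewed {doc.last_reviewed})")
```

Passing `today` in explicitly, rather than reading the system clock inside the function, keeps the check easy to test.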

For organizations with multiple sites, you also need to spend time on prioritizing the criticality of the sites. This can be done by looking at the revenue generated by the sites, the systems and services hosted at the sites (and their importance to the overall organization) or even based on where the most important customers are located. You should consider physically visiting the most critical sites to meet with the key staff at these locations. Include the operations manager, the local finance leader, HR leader and physical security leader if available. As noted above, these are the people who know how the site operates and will provide the detail needed to form the BIA and DR elements.
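
One way to turn those site criteria into a first-pass ranking is a simple weighted score. The sketch below is only an illustration: the `Site` fields, the weights and the sample figures are hypothetical, and the result should be a starting point for discussion with the business, not a replacement for it.

```python
from dataclasses import dataclass


@dataclass
class Site:
    """Hypothetical site record; field names are assumptions."""
    name: str
    annual_revenue: float   # revenue generated by the site
    critical_systems: int   # critical systems and services hosted at the site
    key_customers: int      # important customers served from the site


# Hypothetical weights; agree on real ones with the business owners.
WEIGHTS = {"annual_revenue": 0.5, "critical_systems": 0.3, "key_customers": 0.2}


def rank_sites(sites):
    """Return sites ordered from most to least critical by weighted score."""

    def normalized(site, field):
        # Scale each criterion to 0..1 against the largest value across sites.
        top = max(getattr(s, field) for s in sites)
        return getattr(site, field) / top if top else 0.0

    def score(site):
        return sum(weight * normalized(site, field)
                   for field, weight in WEIGHTS.items())

    return sorted(sites, key=score, reverse=True)


# Made-up sample data.
sites = [
    Site("Warehouse", annual_revenue=30.0, critical_systems=1, key_customers=10),
    Site("Headquarters", annual_revenue=40.0, critical_systems=8, key_customers=1),
    Site("Plant A", annual_revenue=120.0, critical_systems=2, key_customers=5),
]

for position, site in enumerate(rank_sites(sites), start=1):
    print(position, site.name)
```

Normalizing each criterion before weighting keeps revenue, which is measured in much larger numbers, from drowning out the other two.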

Another key consideration is the people who are needed to execute the plan. Gartner says to focus on roles and not individual names. Make sure plans reflect these roles and the associated responsibilities.

In the end, sustaining the plans will become routine, and history tells us that inevitably they will be needed. I could reinforce this with some cool quotations like “a stitch in time saves nine” or “hope for the best, but plan for the worst,” but I won’t.
