The Business is the Dog and it Wags the IT Disaster Recovery Tail

Michael R. Galin, Director – Risk Management, TELUS

The toughest message I ever have to deliver to IT professionals is this: Business IT exists only to support business processes. IT Disaster Recovery (DR) is a component of business continuity planning, so your IT recovery must serve and support the business recovery. Simply stated, if it is not required for business recovery, do not invest time, money, or resources in its speedy recovery.

I have asked many DR managers how they decide which applications, systems, databases, and libraries to protect and recover. The answer is often, “We just know.” No matter how qualified it is, “We just know” is an opinion. Executives are naturally reluctant to invest in opinions (except perhaps their own).

A business impact analysis (BIA) should guide business continuity planning and it should guide DR as well. You would not begin a course of medication without first undergoing a thorough examination. For the same reasons, you should not decide on DR strategies without first conducting a BIA. Similar to a medical examination, a BIA uncovers essential dependencies and tells you which risks need addressing. You then decide on treatment options. DR decisions need to be tied to business objectives, and risk acceptance decisions need to be made by the appropriate risk owners within the business.

The BIA should inventory business process dependencies and include a risk assessment. The inventory tells you the minimum number and type of “things” you need to recover (dependencies), and the risk assessment tells you how long you can wait to recover them before unacceptable losses are incurred. The business will decide what is unacceptable.

The BIA should also identify how each business process uses your IT services, including telephony. This data will provide you with a very detailed usage profile to form a baseline. You will know who uses what, when they use it, and what their recovery time objectives (RTOs) are. From that, you can determine your base service levels, IT RTOs, and recovery point objectives (RPOs). You can not only determine them, but justify them with data.

If there are multiple processes using an application, you will be able to compare their requirements against the application’s RTO in a spreadsheet. In fact, I have created spreadsheets that compare the RTOs of thousands of applications against the RTOs of hundreds of business processes. The more data the better. Apply some conditional formatting to visually highlight conflicts and you will easily see the entire picture. This will also allow you to do some modelling. If you change the RTO value for an application, you will see how it affects the various users of that application. If you have established policies for DR support in relation to user RTOs, you will be able to see your “score” for each application. It creates a very valuable dashboard. This is the type of data people need to make decisions.
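The same comparison can be sketched outside a spreadsheet. The following is a minimal illustration, not a prescribed method: the application names, process names, and RTO values are invented, and the "score" is assumed here to be the share of an application's user processes whose RTO the application meets, which is only one possible policy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProcessUse:
    """One row of hypothetical BIA data: a process that depends on an application."""
    process: str
    application: str
    process_rto_hours: float


def find_rto_gaps(app_rto_hours, uses):
    """Return (process, application, gap_hours) wherever the application
    is recovered later than the process needs it."""
    gaps = []
    for use in uses:
        gap = app_rto_hours[use.application] - use.process_rto_hours
        if gap > 0:
            gaps.append((use.process, use.application, gap))
    return gaps


def app_score(app_rto_hours, uses, application):
    """Assumed scoring policy: fraction of user processes whose RTO is met."""
    relevant = [u for u in uses if u.application == application]
    if not relevant:
        return None
    met = sum(1 for u in relevant
              if app_rto_hours[application] <= u.process_rto_hours)
    return met / len(relevant)


# Illustrative data only; real values would come from the BIA.
app_rto_hours = {"billing": 24.0, "crm": 8.0}
uses = [
    ProcessUse("invoicing", "billing", 18.0),
    ProcessUse("collections", "billing", 48.0),
    ProcessUse("customer support", "crm", 12.0),
]

gaps = find_rto_gaps(app_rto_hours, uses)
print(gaps)
print(app_score(app_rto_hours, uses, "billing"))

# Modelling: lower the billing RTO and re-run the comparison.
modelled = {**app_rto_hours, "billing": 12.0}
print(find_rto_gaps(modelled, uses))
```

Changing a single value in `app_rto_hours` and re-running the comparison is the code equivalent of editing one cell and watching the conditional formatting change.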

Be prepared. Not everyone will get what they want from you. Your IT recovery may not support their process recovery needs. That is where business decisions have to be made. Business continuity is not about listing everything the business needs and duplicating it or making it fully available. Business continuity requires the business to find ways to sustain or restore time-critical processes in the absence of their regular dependencies. The objective is to economically recover, replace, work-around, or substitute the processes’ dependencies within a timeframe that is acceptable. This is true for any type of dependency, whether IT, facilities, supply chain, human capital, or equipment. For example, if your RTO for a particular application is 24 hours, but a business process which uses that application has an RTO of 18 hours, their business continuity plan should include a strategy to work without that application until it is available. This is where the science of the BIA meets the art of the business continuity plan. It requires creativity and resourcefulness. It is basic risk management, but at least you will have data to support the decisions.

If a work-around is not possible, weigh the cost of improving the application RTO against the cost of the affected process being down for the additional 6 hours in the example above (the residual gap). Factor in your corporate risk appetite and any relevant policies for guidance. BIA data should be able to identify the cost of a disruption (delayed process recovery) and how that cost changes over time, i.e. the process’s time-criticality. Labels such as “critical” or “non-critical” do not tell you what you need to know; it’s all about time-criticality.
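As a rough sketch of that weighing, suppose the BIA yields a few points on a cumulative loss curve for the process. Every figure below is hypothetical, straight-line interpolation between BIA points is an assumption, and the comparison deliberately ignores likelihood and risk appetite, which the risk owner would still need to factor in.

```python
from bisect import bisect_right


def cumulative_loss(hours_down, curve):
    """Estimate cumulative loss after `hours_down` hours by interpolating
    linearly between sorted (hours, cumulative_loss) points from the BIA."""
    hours = [h for h, _ in curve]
    if hours_down <= hours[0]:
        return curve[0][1]
    if hours_down >= hours[-1]:
        return curve[-1][1]
    i = bisect_right(hours, hours_down)
    (h0, l0), (h1, l1) = curve[i - 1], curve[i]
    return l0 + (l1 - l0) * (hours_down - h0) / (h1 - h0)


def residual_gap_cost(app_rto_hours, process_rto_hours, curve):
    """Extra loss incurred because the application returns after the process RTO."""
    extra = (cumulative_loss(app_rto_hours, curve)
             - cumulative_loss(process_rto_hours, curve))
    return max(0.0, extra)


# Hypothetical BIA output: losses accelerate the longer the process is down.
bia_curve = [(0, 0), (18, 10_000), (24, 40_000), (48, 200_000)]

gap_cost = residual_gap_cost(24, 18, bia_curve)  # the 6-hour gap per disruption
improvement_cost = 50_000  # assumed cost of bringing the application RTO to 18 hours

print(gap_cost)
print("improve RTO" if gap_cost > improvement_cost else "consider accepting the gap")
```

The shape of the curve is what matters: a label such as “critical” has no place in the calculation, while the rate at which losses grow between hour 18 and hour 24 drives the answer.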

The residual gaps that cannot be remedied with business continuity plans are risks that need to be managed. There are only four things you can do with any risk. Pick one, or mix and match.

• Accept it: “We can live with that”.
• Eliminate it: “It won’t happen if we ...”
• Mitigate it: “It won’t be as bad if we ...”
• Transfer it: “Let’s outsource or buy insurance”.

Keep in mind that some risks are acceptable, especially when the risk treatment would be more expensive than the disruption itself. The trick is to be able to provide data to the decision makers and risk owners. That data must come from an objective analysis such as a BIA. The dog wags the tail.
