Increasingly, companies are implementing business continuity plans to ensure the constant availability and support of the company infrastructure and business processes.
But a plan is merely the first step on the road to continuing the business in the event of an interruption and, unless tested regularly, quickly becomes out of date. Worse still, out-of-date plans can even engender a false sense of confidence in their fitness for purpose! So how does a company go about testing, and what problems might it encounter?
Testing is the part of business continuity (BC) that verifies, supports, matures and develops the plan. A business continuity plan is no good sitting on a shelf behind the IT director. Plans have to be useful and, more importantly, usable. They must reflect the fact that organizations are dynamic entities where the only constant is change. Regular testing ensures the plan is fit for its purpose and in step with the recovery needs of the business. It is the fire drill of the business continuity world.
Commerzbank, for example, was grateful for its regular testing schedule. Last September, it was in the middle of implementing a test program in New York. This involved upgrading a recovery site to perform simultaneous failover (that is, one of two separate machines holding data is switched off while the other is activated) and testing each phase with users as it was implemented. Before a final full-scale test could be performed, the plan had to be invoked on September 11.
Types of test
Firstly, companies need to decide the scope of the test. Which individuals, teams or office locations need to be involved in the test? Is the test to recover the entire IT infrastructure or only one system? Is it necessary to test moving an entire office to an alternative work area/recovery facility?
For example, it may be that your call center building is inaccessible due to fire or flood, and your employees have to relocate to your recovery location. But until you have tested this, how do you know when, how or whom to inform regarding relocation to the recovery premises? How long will it take to get everyone there; can they do so using their normal forms of transport; and will they know how to get there or need a map?
The next question is whether it is an internal or external test. Business partners, suppliers or customers will need to be included in some types of test. If you outsource your call center function, for example, how would you ensure access to data held on your server if it went down?
You can ask your business continuity supplier to make the test very close to the real thing. You and your BC team can plan to close down an office, manufacturing plant or data center to test that you could open it up again.
Expect the unexpected
Of course, many disasters happen 'out of hours'. It can therefore be useful to simulate such events to check that the people and suppliers included in the plan are there when needed. For instance, will your security be as effective in the middle of a weekend night as during the working week? Do holy days prevent key personnel being available in a crisis? Do your contact lists enable 24x7 access to the people you need? Does everyone have a copy of the plan at home?
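One of these questions, whether the contact list gives 24x7 access, lends itself to a simple automated check between tests. The sketch below is illustrative only: the record layout, the field name `out_of_hours_phone` and the function name are assumptions, not part of any particular BC tool. It lists the roles for which nobody has an out-of-hours number on file.

```python
def roles_without_out_of_hours_cover(contacts):
    """Return the roles that have no contact reachable out of hours.

    `contacts` is a list of dicts with a "role" key and an optional
    "out_of_hours_phone" key (hypothetical field names).
    """
    all_roles = {c["role"] for c in contacts}
    # A role counts as covered if at least one contact has a non-empty number.
    covered = {c["role"] for c in contacts if c.get("out_of_hours_phone")}
    return sorted(all_roles - covered)


# Example data (invented for illustration).
contacts = [
    {"role": "crisis manager", "name": "A", "out_of_hours_phone": "555-0100"},
    {"role": "security", "name": "B", "out_of_hours_phone": ""},
    {"role": "security", "name": "C"},
    {"role": "IT recovery", "name": "D", "out_of_hours_phone": None},
    {"role": "IT recovery", "name": "E", "out_of_hours_phone": "555-0101"},
]

print(roles_without_out_of_hours_cover(contacts))  # prints ['security']
```

A check like this only confirms that a number is recorded. Whether anyone answers it at 3am on a Sunday is what the out-of-hours test itself has to establish.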
Crisis management is an essential part of BC. Senior management should be involved in some tests as their role during a real crisis would be to manage the overall business recovery process, deal with any issues that arise, and maintain internal and external communications. Your BC partner can run consultancy role-play exercises to test management reaction to crises.
The importance of testing people's reactions to crises cannot be emphasized enough. Those responsible for managing the disaster response, without the benefit of prior testing, have been known to panic. Senior managers running round like headless chickens, or adopting a militant management style and barking orders at subordinates, will be seen as unprepared to deal with a disaster. Nor are they a pretty or confidence-inspiring sight!
Mistakes you can make
Firstly, I must stress that in testing and continuity planning, you can never assume anything. Ask yourself - who is actually going to do the recovery? Could those designated to recover applications work from the plan if they did not write it themselves? Users may assume they could take a laptop home if the office were unavailable - but not if it burns down with the laptop inside. The plan should make provision for this.
Many tests exclude the initial process of assessing the extent of the problem and its impact on the business. Yet this assessment influences the decision on whether or not to invoke disaster recovery procedures and, if so, which ones. A business continuity plan is not one large document, but several sets of procedures addressing the recovery needs of different parts of the business.
Execute the test carefully, and ensure that applications you have taken down are fully functional before allowing users to log on. I would also advise that you are honest with yourself and with management if a plan doesn't work exactly as you need. You can then update it accordingly - this is the whole point of testing. As you identify errors and oversights, you can correct them, as well as familiarize people with the plan. This ensures the validity of the plan from the outset, as well as ensuring it remains in line with business needs.
You need to conduct plan tests at regular intervals - aim for at least one per year for all systems and business functions covered, either as standalone or integrated tests. Remember to bear the needs of the business in mind - fast-moving environments may need to test as frequently as once every three, six or nine months instead.
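The interval rule above can be tracked with a few lines of code. The following is a minimal sketch, assuming you record the date each system or business function was last tested; the function names, system names and dates are invented for illustration. The default interval is 12 months, with shorter intervals set per system for fast-moving environments.

```python
import calendar
from datetime import date


def add_months(d, months):
    """Return the date `months` calendar months after `d`.

    The day is clamped to the last day of the target month,
    so 30 November plus 3 months gives 28 (or 29) February.
    """
    month_index = d.month - 1 + months
    year = d.year + month_index // 12
    month = month_index % 12 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def overdue_tests(last_tested, interval_months, today):
    """Return the names of systems whose next test date has passed.

    `last_tested` maps a system name to the date of its last test.
    `interval_months` maps a system name to its test interval;
    systems not listed default to 12 months.
    """
    overdue = []
    for name, tested_on in last_tested.items():
        months = interval_months.get(name, 12)
        if add_months(tested_on, months) < today:
            overdue.append(name)
    return sorted(overdue)


# Example data (hypothetical systems and dates).
last_tested = {
    "payroll": date(2001, 1, 15),
    "call_center": date(2001, 9, 1),
    "email": date(2001, 6, 1),
}
interval_months = {"call_center": 3}  # fast-moving: test quarterly

print(overdue_tests(last_tested, interval_months, date(2002, 1, 20)))
# prints ['call_center', 'payroll']
```

Passing `today` explicitly rather than reading the system clock keeps the check repeatable, which makes it easy to verify the schedule logic itself.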
Developing the plans
It is imperative to remember that companies change and develop. Are your recovery plans up to date? Have you changed/upgraded your technology since the plan was written, and does the plan reflect this? If not, why not? And if so, have you tested your recovery process since? Have key personnel left the company or moved to other jobs within it? Have their replacements been included in the plan? Have backup policies changed and do they affect recovery times or efficiency? Only diligent updates to and regular testing of recovery plans will ensure they develop as your IT and business require.
If testing becomes too daunting or begins to take up too much time, a BC consultant could run assisted tests for you. Such consultants can work at your own premises or from remote facilities to keep your plans up to date. This promotes confidence that you can rely on the plans for business recovery in the event of any interruption.
A business continuity plan is not worth the paper it is written on if it has not been tested. The whole point of running a test is that it will not run smoothly - there will be problems, oversights, omissions and more! Testing enables you to identify the shortcomings and update your plan. It also familiarizes your personnel with the recovery process, as well as helping to create a spirit of partnership with your recovery provider. Testing promotes confidence and competence - both vital to a successful recovery. As the Scout Association says - Be Prepared!
Phil Carter is director of SunGard planning solutions, SunGard Availability Services (www.sungard.com/availability).