The Most Common Data Disaster Recovery Testing Mistakes

Are you expecting a disaster to strike? Have you accepted the fact that the unimaginable could happen to you? If not, then it’s time to start thinking about it. Your data is invaluable, and that’s why Superb has a failover system in place for its data centers. Our cloud servers are protected by a system that automatically restarts your cloud instance in another part of our network if the server it’s running on experiences an issue. What this means for you is virtually no downtime whatsoever.

Whether your data and/or website live on a cloud, shared, virtualized or dedicated server – or even on scraps of paper; let’s hope it’s not that last one – you need a disaster recovery (DR) plan in place to get your hardware, applications and data back up and running should a power outage, flood, hurricane or any of dozens of other potential disasters strike. Small, medium and large businesses all have data that impacts the way they conduct, well, business. Some of it is mission-critical, while other bits are less vital but still significant. Regardless of the importance placed on each individual bit of data your company has stored away, you don’t want to lose any of it permanently, and you likely want to regain access to it as quickly as possible should you ever lose it.

This is where DR plans come into play. Whether you’re hit with an environmental, technical or human-based threat, you want to be confident that you can bounce right back from it. You need to formulate your plan to help ensure business continuity, and then you need to test that plan. Every aspect of the DR plan should be in place, from something as fundamental as having enough clean energy generators in case of a power outage, to enterprise-wide virtualization strategies. Developing a plan is great, but without properly testing it you won’t know how effective it is until disaster strikes and you have to deploy it. For reasons that should be self-apparent, finding out how effective your disaster recovery plan is, or is not, in the midst of an actual disaster is not a recommended practice.

The folks over at Tech Target recently asked their Twitter followers, “What is the No. 1 pitfall to avoid when doing a disaster recovery test?” They received some interesting and varied responses from both practitioners and experts and put together a top-five list of no-nos for disaster recovery testing. So what should you avoid when testing? Keep reading to find out.

Ignoring the Boy Scouts of America’s Motto

Anyone proudly polishing an Eagle Scout medal sitting on their desk right now – or any reader who has ever spent so much as a single day in the scouting program – knows what we’re talking about here: being prepared. It works in the scouts, and it works in business, too.

Paul Kirvan, an independent continuity advisor and IT auditor, posited to Tech Target that the biggest issue with disaster recovery testing is not being properly prepared for a test – or not having the most knowledgeable and crucial staffers ready and involved in the process. IT staff, whether internal or external, and the rest of a business’ key employees need to work together in testing a DR plan. Figure out the best way to guarantee that your most mission-critical systems will be back online as soon as possible, and involve all of the key players in testing the plan you create around that goal.

Keeping the vital people in the know about the DR test will help them all document how efficiently (or poorly) it works. Depending on your organization’s needs, structure and data infrastructure, your disaster recovery test might be placed on the calendar weeks, months or even a year in advance. Whenever it’s scheduled, make sure everyone who needs to know about it, knows about it.

Poor Scheduling

As we just discussed, being prepared is a big part of successful disaster recovery testing. The schedule for various tests needs to be carefully planned out, or else the entire exercise could end up being a huge waste of time. Inattentive test scheduling could result in a plan that offers nothing more than a false veneer of safety.

Tech expert and blogger Jon Toigo explains how such a scenario could play out.

“A test plan is prepared and presented to management so that resources can be pre-allocated for the coming series of test activities,” he says. “Then, discrete plans are made for each of the testing events scheduled for the next 12 months. Each test event is executed per schedule, with recovery tasks and procedures played out in a nonlinear fashion to optimize test time.”

Bottom line: interdependent tests should be conducted separately from one another. If they’re not, overlapping tests can produce false results.
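
To make that concrete, here’s a minimal sketch – our own illustration, not something from Tech Target or Toigo – of how a team might sanity-check a 12-month test calendar so that tests touching the same systems never overlap. The test names, dates and dependency groups below are hypothetical.

from datetime import date

# Hypothetical 12-month DR test calendar: (name, start, end, dependency group).
# Tests in the same dependency group rely on the same systems, so their dates must not overlap.
SCHEDULE = [
    ("Database failover drill",   date(2025, 2, 10), date(2025, 2, 11), "database"),
    ("Backup restore validation", date(2025, 5, 5),  date(2025, 5, 6),  "database"),
    ("Network rerouting test",    date(2025, 5, 5),  date(2025, 5, 7),  "network"),
    ("Full site failover",        date(2025, 9, 15), date(2025, 9, 17), "database"),
]

def overlapping_interdependent_tests(schedule):
    """Return pairs of tests in the same dependency group whose dates overlap."""
    conflicts = []
    for i, (name_a, start_a, end_a, group_a) in enumerate(schedule):
        for name_b, start_b, end_b, group_b in schedule[i + 1:]:
            if group_a == group_b and start_a <= end_b and start_b <= end_a:
                conflicts.append((name_a, name_b))
    return conflicts

for a, b in overlapping_interdependent_tests(SCHEDULE):
    print(f"Scheduling conflict: '{a}' overlaps with '{b}'")

With the sample dates above, nothing conflicts – which is exactly what you want to confirm before presenting the schedule to management.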

Making the First Three Letters of “Assumption” out of You and Me

There’s an old saying about those who assume that goes a little something like…well, you see what we’re driving at here. In any case, when it comes to DR strategies, you don’t want to make any assumptions about either unknown or known service interruption causes. If you do, you could end up paying for it.

The list of possible outage causes is too long for you to risk guessing at which one might be affecting your system. Programming mistakes, acts of God/nature, human errors and technical snafus are some potential reasons for outages, and each of them could mean a zillion different things. Your DR testing won’t improve your planning unless it relies on evidence. Conclusions reached by guesswork may be easy, but they’re lazy and not much help to you. Just because there’s a storm outside doesn’t mean it’s the cause of your server woes. Continually run tests and then analyze the results for undeniable evidence of the problem. Leave assumptions out of the process.

Failure to Document

So you got yourself fully prepared, scheduled your tests appropriately and came to evidence-based conclusions, huh? Great! Oh, you did remember to write down all of your experiments and findings and hold onto them, right? You’d better have.

DR plan test runs that are documented in detail, with after-action reviews to increase the effectiveness of future tests, better prepare your company for success when an actual disaster hits. Standardizing the documentation of testing and the subsequent analysis ensures that your business learns what does and does not work. Even if the parties involved have perfect memories – and they probably don’t, because almost no human being does – future employees won’t have any historical testing knowledge if it’s not recorded someplace. So write it down!
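
The article doesn’t prescribe any particular format, but as an illustrative sketch, a standardized after-action record could be as simple as a structured file written after every test. The field names and sample values below are our own assumptions, not an established standard.

import json
from dataclasses import dataclass, field, asdict
from typing import List

@dataclass
class DrTestReport:
    """A standardized after-action record for one DR test run (field names are illustrative)."""
    test_name: str
    date: str                      # ISO date of the test, e.g. "2025-02-10"
    scenario: str                  # the disaster the test simulated
    participants: List[str]
    recovery_time_minutes: float   # how long it took to restore service
    what_worked: List[str] = field(default_factory=list)
    what_failed: List[str] = field(default_factory=list)
    follow_up_actions: List[str] = field(default_factory=list)

report = DrTestReport(
    test_name="Database failover drill",
    date="2025-02-10",
    scenario="Primary database server loses power",
    participants=["IT operations", "DBA team", "Service desk"],
    recovery_time_minutes=42.0,
    what_worked=["Automatic failover to the replica"],
    what_failed=["Stale DNS entries delayed application reconnects"],
    follow_up_actions=["Lower DNS TTL before the next test"],
)

# Persist the record so future employees inherit the testing history.
with open("dr-test-2025-02-10.json", "w") as fh:
    json.dump(asdict(report), fh, indent=2)

Whatever format you settle on, the point is that the same fields get filled in after every test, so results stay comparable from one test run to the next.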

Setting Yourself up for Success

This one may sound a bit weird on the surface. Usually in business, you want to set yourself up to succeed. Real-world disasters, however, aren’t going to be easy to recover from; they’re going to be difficult. DR testing absolutely must be grounded in real-world complexities, not dumbed down to ensure success.

Make your tests as realistic as possible. If you don’t, you’re wasting your money and resources along with everyone’s time. Your tests should be difficult for your staff to work through; otherwise, you shouldn’t even bother running them at all.


