Tuesday, February 25, 2014

I’ve Got Nothing: The DR Checklist

So what do you have to lose? If you’ve been reading along with the blog series, I hope you’ve been thinking a bit about ways you can bring your disaster recovery plans to the next level. My first post in the series on what to consider might have gotten you started on some of the items in this list. If you need some ideas of where to go next, or if you happen to be just starting out, here is an even longer list of things you might need.

Disclaimer: I love technology, and I think that cloud computing and virtualization are paramount to increasing the speed at which you can get your data and services back online. But when disaster strikes, you can bet I’m reaching for something on paper to lead the way. You do not want your recovery plans to hinge on finding the power cable for that dusty laptop that is acting as the offline repository for your documentation. It’s old school, but it works. If you have a better suggestion than multiple copies of printed documentation, please let me know. Until then, finding a ring binder is my Item #0 on the list. (Okay, Hyper-V Recovery Manager is a pretty cool replacement for paper if you have two locations, but I'd probably still have something printed to check off...)

The Checklist
  1. Backups - I always start at the backups. When your data center is reduced to a pile of rubble, the only thing you may have to start with is your backups; everything else supports turning those backups into usable services again. Document your backup schedule, which servers and data are backed up to which tapes or sets, and how often those backups are tested and rotated. Take note if you are backing up whole servers as VMs, or just the data, or both. (If you haven’t yet, read Brian’s post on the value of virtual machines when it comes to disaster recovery.)
  2. Facilities - Where are you and your backups going to come together to work this recovery magic? Your CEO’s garage? A secondary location that’s been predetermined? The Cloud?  List out anything you know about facilities. If you have a hot site or cold site, include the address, phone numbers and access information. (Look at Keith’s blog about using Azure for a recovery location.)
  3. People - Your DR plan should include a list of people who are part of the recovery process. First and foremost, note who has the right to declare a disaster in the first place. You need to know who can and can’t kick off a process that will start with having an entire set of backups delivered to an alternate location. Also include the contact information for the people you need to successfully complete a recovery - key IT staff, facilities staff and department heads might be needed. Don’t forget to include a backup person for each of them.
  4. Support Services - Do you need to order equipment?  Will you need support from a vendor? Include names and numbers of all these services and if possible, include alternatives outside of your immediate area. Your local vendor might not be available if the disaster is widespread like an earthquake or weather incident.
  5. Employee Notification System - How do you plan on sharing information with employees about the status of the company and what services will be available to use?  Your company might already have something in place - maybe a phone hotline or externally hosted emergency website. Make sure you are aware of it and know how you can get updates made to the information.
  6. Diagrams, Configurations and Summaries - Include copies of any diagrams you have for networking and other interconnected systems. You'll be glad you have them for reference even if you don't build your recovery network the same way.
  7. Hardware - Do you have appropriate hardware to recover to? Do you have the networking gear, cables and power to connect everything together and keep it running? You should list out the specifications of the hardware you are using now and what the minimum acceptable replacements would be. Include contact information for where to order hardware from and details about how to pay for equipment. Depending on the type of disaster you are recovering from, your hardware vendor might not be keen on accepting a purchase order or billing you later. If you are looking at Azure as a recovery location, make sure to note what size of compute power would match up.
  8. Step-By-Step Guides - If you’ve started testing your system restores, you should have some guides formed. If your plans include building servers from the ground up, your guides should include references to the software versions and licensing keys required. When you are running your practice restores, anything that makes you step away from the guide should be noted. In my last disaster recovery book, I broke the binder out into sections, in order of recovery, with the step-by-steps and supporting information in each area. (Extra credit if you have PowerShell ready to automate parts of this.)
  9. Software - If a step in your process includes loading software, it needs to be available on physical media. You do not want to have to rely on having a working, high-speed Internet connection to download gigs of software.
  10. Clients - Finally, don’t forget your end users. Your plan should include details about how they will be connecting, what equipment they would be expected to use if the office is not available and how you will initially communicate with them.  Part of your testing should include having a pilot group of users attempt to access your test DR setup so you can improve the instructions they will be provided. Chances are, you’ll be too busy to make individual house calls. (For more, check out Matt’s post on using VDI as a way to protect client data.)
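
For the Hardware item, writing down "minimum acceptable replacements" gets easier if the comparison is mechanical. Here is a minimal sketch in Python, assuming you record each minimum as cores, RAM and disk. The catalog entries and the server minimum below are made-up placeholders, not real vendor or Azure size names; swap in whatever your vendor or cloud provider actually offers.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Spec:
    name: str
    cores: int
    ram_gb: int
    disk_gb: int


def smallest_fit(minimum: Spec, candidates: List[Spec]) -> Optional[Spec]:
    """Return the smallest candidate that meets every minimum, or None."""
    fits = [
        c for c in candidates
        if c.cores >= minimum.cores
        and c.ram_gb >= minimum.ram_gb
        and c.disk_gb >= minimum.disk_gb
    ]
    # "Smallest" here means fewest cores, then least RAM, then least disk.
    return min(fits, key=lambda c: (c.cores, c.ram_gb, c.disk_gb), default=None)


# Hypothetical catalog: placeholder names and numbers for illustration only.
CATALOG = [
    Spec("small", cores=2, ram_gb=4, disk_gb=100),
    Spec("medium", cores=4, ram_gb=16, disk_gb=250),
    Spec("large", cores=8, ram_gb=32, disk_gb=500),
]

# Hypothetical minimum for one server in the recovery plan.
file_server_min = Spec("file-server-minimum", cores=2, ram_gb=8, disk_gb=200)

match = smallest_fit(file_server_min, CATALOG)
print(match.name if match else "no acceptable replacement listed")
```

Print the result next to each server in the binder so nobody has to work out sizing in the middle of a recovery.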
Once you have a first-pass gathering of all your disaster recovery items and information, put it all in a container that you can send out to your off-site storage vendor or alternate location. Then when you practice, start with just the box - if you can’t kick off a recovery test with only the contents (no Internet connection and no touching your production systems), improve them and try again. Granted, if you are using the cloud as part of your plan, make sure you know which parts require Internet access, have a procedure for alternative connectivity and know what parts of your plans would stall while securing that connection. You won't be able to plan for every contingency, but knowing where parts of the plan can break down makes it easier to justify where to spend money for improvement, or not.
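
One way to find out which parts of a plan would stall without connectivity is to tag each step with whether it needs Internet access. The sketch below is only an illustration in Python: the step descriptions are hypothetical examples, and a real plan would pull them from your own step-by-step guides.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Step:
    order: int
    description: str
    needs_internet: bool


def stalled_steps(steps: List[Step], internet_available: bool) -> List[Step]:
    """Return the steps that cannot run right now, in recovery order."""
    if internet_available:
        return []
    return sorted((s for s in steps if s.needs_internet), key=lambda s: s.order)


# Hypothetical recovery plan, for illustration only.
PLAN = [
    Step(1, "Restore file server from backup set", needs_internet=False),
    Step(2, "Start replica VMs in the cloud recovery site", needs_internet=True),
    Step(3, "Publish status to the hosted emergency website", needs_internet=True),
    Step(4, "Rebuild local network from printed diagrams", needs_internet=False),
]

for step in stalled_steps(PLAN, internet_available=False):
    print(f"Step {step.order} waits on connectivity: {step.description}")
```

The steps it lists are the ones that need an alternative connectivity procedure written into the binder.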

No matter the result of your testing, it will be better than the last time. Go forth and be prepared.

Oh, one more thing: if you live in a geographic area where weather or other "earthly" disasters are probable, please take some time to do some DR planning for your home as well. I don't care who you work for; if your home and family aren't secure after a disaster, you certainly won't be effective at work. Visit www.ready.gov or www.redcross.org/prepare/disaster-safety-library for more information.

This post is part of a 15-part series on Disaster Recovery and Business Continuity planning by the US-based Microsoft IT Evangelists. For the full list of articles in this series, see the intro post located here: http://mythoughtsonit.com/2014/02/intro-to-series-disaster-recovery-planning-for-i-t-pros/
