10 Questions You Need to Answer to Survive a Data Disaster
Good business relies on good planning. Anticipating scenarios, detailing responses and understanding consequences are essential parts of your business survival kit. Technology is no different. We all want and hope for the best outcome, but, statistically, we know that trouble may be lurking around the corner.
This is the same reason you likely insure your home and automobile; you hope to never need to file a claim but are very happy you planned and invested in protection when an accident does occur.
In technology-speak, redundancies, high-availability and disaster recovery are insurance policies for your business operations. Understanding what you need in that technology policy is dictated by several factors. If you equip your business with the right technology safeguards, you will be sure to survive a data disaster.
What Constitutes a Data Disaster?
The best way to survive a data disaster is to prepare for the possibility that one could take place. The biggest mistake most companies make is waiting until a disaster actually happens to figure out what to do next.
When it comes to visualizing a disaster, the obvious ones immediately come to mind: tornadoes, hurricanes, earthquakes, fires, blizzards, floods. But do we all agree on what a disaster looks like for your business technology operations?
Disasters in technology more often come from either human error in controlling technology or a malfunction of a technology system. These can include hardware failure, software corruption, malware, ransomware, failure to apply updates and, most common of all, the human error factor.
Defining the Human Error Factor
The human error factor in technology is defined as something that has been done that was “not intended by the actor,” such as lack of planning, simple keystroke errors that wreak havoc on code, failure to change passwords and accidental deletion of data. Deliberate malicious activity against your technology is not an error in that strict sense, but it is a closely related human risk.
The human error factor holds the keys that often allow viruses and malware to infect your system and take it for a joyride. Remember, all the technology in the world cannot prevent human error from eventually impacting your business.
People trip over power cables or unplug the wrong one, and sometimes maintenance is performed correctly but on the wrong server.
A disgruntled employee could also cause disruption. Don’t allow something like this to shutter your business by hoping for the best with your existing technology.
Consider the Possibilities
Taking a hard look at the possibilities, and carefully considering the impact they could cause, is an important part of any disaster recovery plan. Things such as hardware redundancy, automated off-site backups and geographical separation can go a long way in ensuring your business remains online and profitable.
Of course, the cost is always a factor. While there is certainly a fine balance between playing the odds and paying for something you may never need, are you willing to bet your business on a coin flip possibility?
Two important factors that can help determine the right pieces of the disaster recovery equation for your business are:
- Recovery Time Objective (RTO)
- Recovery Point Objective (RPO)
Once you have established your “catastrophic” point on each element, you can prepare for the possibilities. Let’s explore what each of these means and, more importantly, what they could mean for your business.
Recovery Time Objective (RTO)
Recovery Time Objective (RTO) is the measure of how long your business can be offline before the damages are catastrophic. Factors to consider include the financial impact of lost orders, reduced production capacity, delayed time to market and what is often the most detrimental: business reputation and customer loyalty. The financial fallout of downtime is different for each business.
For example, if you have an informational website that is down for an hour, you’ll likely experience a few complaints with little direct financial loss. However, if an eCommerce retailer’s website goes down for an hour on Black Friday or Cyber Monday, that can seriously threaten its bottom line and genuinely challenge business survival.
Case Example #1 – Macy’s Department Stores
Even with its size and volume, Macy’s Department Stores, in both 2016 and 2017, experienced website slowdowns and outages during the Black Friday/Cyber Monday rush.
According to Adobe Insights, 2017 produced a record $6.59 billion in online transactions on Cyber Monday, a 16.8 percent increase from the previous year. Black Friday brought in another $5.03 billion in online transactions. Mobile sales also reached $2 billion over a 24-hour period for the first time, with mobile becoming the “de facto device” for on-the-go shoppers.
What would an hour of downtime do to your bottom line? Two hours?
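One rough way to put a number on that question is sketched below. It assumes a simple model in which lost revenue is average hourly revenue, scaled by a peak-season multiplier, times the hours offline. The function names and parameters are hypothetical, and the model ignores reputation damage and customer churn, which are harder to price.

```python
def downtime_cost(hourly_revenue: float, hours_down: float,
                  peak_multiplier: float = 1.0) -> float:
    """Estimate direct revenue lost during an outage.

    hourly_revenue  -- average online revenue per hour (your own figure)
    hours_down      -- length of the outage in hours
    peak_multiplier -- assumed uplift for peak days such as Black Friday
    """
    if hourly_revenue < 0 or hours_down < 0 or peak_multiplier < 0:
        raise ValueError("inputs must be non-negative")
    return hourly_revenue * peak_multiplier * hours_down


def exceeds_rto(hours_down: float, rto_hours: float) -> bool:
    """Return True when the outage lasted longer than the agreed RTO."""
    return hours_down > rto_hours
```

With illustrative figures only, `downtime_cost(10_000, 2, peak_multiplier=3.0)` returns 60000.0, and `exceeds_rto(2, 1)` returns True for a two-hour outage against a one-hour RTO.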
Case Example #2 – Fidelity Investments
Of course, it’s not always the lost revenue of the event that can have the biggest impact. In February 2018, following a volatile trading day that saw the Dow Jones Industrial Average drop 1,175 points, heavy traffic caused outages and slowdowns at Fidelity Investments.
In a moment of market panic, investors want access to their accounts.
This followed another incident in November 2017 in which Fidelity Investments had a temporary outage that blocked customers from accessing their online accounts. Aside from the lost trading revenue and free trades offered to compensate clients, the impact extended to potential increases in account churn and damage to customer loyalty.
The negative press also could sway potential new customers to other brokerage firms.
How long can your business operations realistically be down before they become a threat to your livelihood?
Recovery Point Objective (RPO)
Recovery Point Objective (RPO) is the measure of how much data you can reasonably lose during a catastrophic event before your ability to do business, and in some cases to remain in business, is in question. If that sounds harsh, it’s meant to. The City of Atlanta has become a prime example of how important it is to have a disaster recovery solution in place.
Case Example #3 – City of Atlanta
You may have read in the news in March about the ransomware attack against the City of Atlanta, the center of a metropolitan area of nearly 6 million people. On March 22, 2018, multiple municipal computer systems were crippled by a massive ransomware attack called SamSam, impacting at least five out of thirteen departments.
The hackers responsible had encrypted files and were demanding $51,000 worth of bitcoin to provide the digital keys necessary to unscramble the files, threatening to delete the data if they were not paid.
Atlanta police had to resort to taking case notes on paper, and multiple investigative databases were inaccessible.
Case Example #4 – Orleans Parish Civil District Court
On October 25, 2010, the Orleans Parish Clerk of Court’s Land Records Office had a catastrophic system failure on the servers storing all of the conveyance and mortgage records.
In total, 180,000 digital records were impacted. The records were essential for title abstractors, buyers and sellers of real estate, lending institutions, title insurers and closing notaries.
While many of those records were eventually recovered, the index related to those records was completely lost. Without the index, buyers could not obtain title insurance or the clear title on property required to complete and record a sale. This essentially stopped all land-related transactions in Orleans Parish.
A team of 108 people worked extended hours and weekends to restore the system, costing more than $300,000 in overtime, extra staffing and contractual services.
In the months after, the root cause appeared to come down to a software update from the backup vendor, i365. The update had been sent to the court’s technology staff and had apparently failed at the time of installation in July 2010, so the backup agent did not run properly from that point on.
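A failure like this is the kind a routine backup-age check can surface. The sketch below is a minimal illustration, not a description of any vendor's tooling: it assumes you can obtain the timestamp of the last successful backup from your own backup system, and the function names are hypothetical.

```python
from datetime import datetime, timedelta


def backup_age(last_success: datetime, now: datetime) -> timedelta:
    """How long ago the last successful backup finished."""
    return now - last_success


def rpo_violated(last_success: datetime, now: datetime,
                 rpo: timedelta) -> bool:
    """True when a failure right now would lose more data than the RPO allows."""
    return backup_age(last_success, now) > rpo


# Illustration loosely based on the timeline above. The exact day in
# July 2010 is an assumption; the source only gives the month.
last_good = datetime(2010, 7, 1)
failure = datetime(2010, 10, 25)
alert = rpo_violated(last_good, failure, rpo=timedelta(hours=24))
```

Run daily, a check like this would have raised an alert long before the October failure, because the age of the last good backup would have passed a 24-hour RPO on the second day.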
Have a Tested Backup System
Having a backup system that you have not tested is like having no backup at all. According to the Ponemon Institute’s 2016 Cost of Data Breach Study, of businesses that experienced major data loss, 43 percent never reopened and 51 percent were closed within a two-year period. The longer it takes to recover the data, the worse it becomes.
Of companies that go 10 days or more without access to their data, 93 percent file for bankruptcy within a year.
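One simple way to test a backup is to restore it to a scratch location and compare every file against the original. The sketch below assumes both copies are reachable as local directories and that the source has not changed since the backup was taken. The function names are hypothetical, and a real test would also cover databases and application-level checks.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_restore(source_dir: Path, restored_dir: Path) -> list[str]:
    """Return a list of files that are missing or differ in the restored copy."""
    problems = []
    for src in sorted(source_dir.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(source_dir)
        restored = restored_dir / rel
        if not restored.is_file():
            problems.append(f"missing: {rel}")
        elif sha256_of(src) != sha256_of(restored):
            problems.append(f"mismatch: {rel}")
    return problems
```

An empty list from `verify_restore` means every source file was found in the restored copy with identical contents. Anything else is a reason to investigate before you need the backup for real.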
10 Questions You Need to Answer to Survive a Data Disaster
- What are your key systems and assets?
- What risks does your business face if your core applications are not available for an hour? A day? A week?
- What disasters are you guarding against?
- What is your recovery time objective (RTO)? Is this specific to each application and/or customer-facing system?
- What is your recovery point objective (RPO)? Is this specific for each application and/or customer-facing system?
- Who are the key stakeholders and decision-makers who will need to be involved in any data recovery process? Do they know their roles?
- Do you have a written recovery plan and, if so, does it also meet all compliance objectives?
- Are you currently backing up your data? To an off-site location?
- How long would it take to recreate all your proprietary data? Could you? At what cost?
- When is the last time your organization completed a full recovery test?
These questions should help you outline your data disaster recovery plan and really get a feel for what your organization would do if faced with data vulnerabilities. Be sure to plan ahead so you are always prepared to survive a data disaster.
The good news is Liquid Web can help you address these challenges and survive a data disaster. Contact us for a personal review and analysis of your current technology. The Most Helpful Humans in Hosting are here to help.
This guest blog is part of a Channel Futures sponsorship.