One of the worst phrases any IT professional can hear? "The system is down." A system outage of any duration is going to mean a day of headaches, calls from every person whose job relies on IT support (basically, everyone), and hours spent troubleshooting, recovering data, and reporting on what went wrong.
According to the Ponemon Cost of Data Center Outages report, sponsored by Vertiv, the average cost of a data center outage is more than $740,000. That’s almost $9,000 per minute! While the study reported many causes of data center downtime, from support system and IT equipment failure to cybercrime to human error, the data centers that had fewer and shorter outages shared some common best practices.
If your data center customer is looking for ways to reduce the risk of downtime, consider the following:
- Standardize and Automate Security Management
Use console servers for secure remote access to servers, which simplifies patch management and enables early detection of attacks.
- Monitor UPS Batteries
Batteries are the weak link in the UPS system. Use remote battery monitoring to identify battery problems before they impact operations.
- Use Intelligent Thermal Controls with Cooling Units
These controls improve protection by monitoring component data points, providing unit-to-unit communications, matching airflow and capacity to room loads, automating self-healing routines, providing faster restarts, and preventing hot/cold air mixing during low load conditions.
- Perform Preventive Maintenance
An increase in the number of annual preventive maintenance visits correlates directly with an increase in UPS mean time between failures (MTBF). Going from zero to one preventive maintenance visit a year creates a 10x improvement in MTBF; going from zero to two visits a year creates a 23x improvement. This applies even at the network edge, where these services can help avoid unplanned downtime of smaller, business-critical UPS systems.
- Strengthen Policies and Training
Make sure the emergency power off (EPO) button is clearly labeled and shielded to prevent accidental shutdown. Document and communicate policies and conduct regular training.
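To make the battery monitoring recommendation above more concrete, here is a minimal Python sketch of the kind of check a remote monitoring tool might run on each reading. It is an illustration only, not Vertiv's method: the `BatteryReading` fields, the function names, and all three thresholds are hypothetical, and real limits should come from the battery manufacturer.

```python
from dataclasses import dataclass

# Hypothetical alert thresholds (assumptions for illustration only).
MAX_RESISTANCE_RISE = 0.25   # allowed fractional rise in internal resistance
MAX_TEMPERATURE_C = 30.0     # battery temperature ceiling, degrees Celsius
MIN_FLOAT_VOLTAGE = 13.2     # minimum float voltage for a 12 V block, volts


@dataclass
class BatteryReading:
    unit_id: str
    voltage: float           # measured float voltage, volts
    resistance_mohm: float   # measured internal resistance, milliohms
    baseline_mohm: float     # resistance at installation, milliohms (must be > 0)
    temperature_c: float     # measured temperature, degrees Celsius


def check_battery(reading):
    """Return a list of warning strings for one battery reading."""
    warnings = []
    # Rising internal resistance relative to the baseline suggests aging.
    rise = (reading.resistance_mohm - reading.baseline_mohm) / reading.baseline_mohm
    if rise > MAX_RESISTANCE_RISE:
        warnings.append(f"internal resistance up {rise:.0%} over baseline")
    if reading.temperature_c > MAX_TEMPERATURE_C:
        warnings.append(f"temperature {reading.temperature_c:.1f} C above limit")
    if reading.voltage < MIN_FLOAT_VOLTAGE:
        warnings.append(f"float voltage {reading.voltage:.2f} V below limit")
    return warnings


def find_problem_batteries(readings):
    """Map unit_id to its warnings for every battery with at least one warning."""
    report = {}
    for reading in readings:
        warnings = check_battery(reading)
        if warnings:
            report[reading.unit_id] = warnings
    return report
```

Passing a list of `BatteryReading` objects to `find_problem_batteries` returns only the units that need attention, so a technician can be dispatched before a weak battery affects operations.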
To learn more about data center downtime and how to minimize risk, visit www.VertivCo.com/Benchmarks
Jessica Kaiser is Director, Global Partner Programs, for Vertiv.
Guest blogs such as this one are published monthly and are part of The VAR Guy's annual platinum sponsorship.