How to Achieve a High Level of Data Center Uptime
A data center will not survive unless it can deliver uptime of more than 99%. Businesses can quickly grind to a halt if their website, email systems, databases or e-commerce capabilities stop working. Even a few seconds of downtime can have a significant impact on revenue. Long-term downtime will cause considerable losses in profitability, disrupt business continuity and severely damage business reputation. Fortunately, there are several effective ways to increase data center uptime and availability:
1. Implement a preventative maintenance strategy
The key to maintaining data center uptime is to prevent downtime before it even occurs – the process of recovery can take significant time and by that stage the damage has already been done.
Preventative maintenance includes regularly scheduled reviews of data center infrastructure to make sure that all power and cooling systems are optimized and to check for signs of wear and tear that may lead to equipment failures. Regularly scheduled reviews also make it possible to plan maintenance, upgrades and replacements for each system in advance.
A preventative maintenance plan should be based on the actual usage of the data center's equipment rather than on a generic schedule or a reactive approach.
Preventative maintenance is often perceived as an inessential cost and is therefore overlooked. In reality, spending the money up front gives you peace of mind and saves money overall by extending the life of your equipment and preventing unscheduled downtime.
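As a rough illustration of a usage-based plan, the sketch below flags equipment whose running hours since its last service are approaching its service interval. The asset names, the hour figures and the 90% warning fraction are hypothetical values chosen for the example, not recommendations.

```python
from dataclasses import dataclass


@dataclass
class Asset:
    name: str
    runtime_hours_since_service: float
    service_interval_hours: float  # hypothetical; in practice taken from vendor guidance


def due_for_maintenance(assets, warn_fraction=0.9):
    """Return the names of assets that have used up at least
    warn_fraction of their service interval."""
    return [
        a.name
        for a in assets
        if a.runtime_hours_since_service >= warn_fraction * a.service_interval_hours
    ]


# Illustrative inventory (all figures are made up for the example).
assets = [
    Asset("UPS-A", 7800, 8000),
    Asset("CRAC-1", 2000, 4000),
    Asset("Generator-1", 460, 500),
]

print(due_for_maintenance(assets))
```

Driving the schedule from measured running hours means heavily used equipment is reviewed sooner than equipment that sits mostly idle, which is the difference between a usage-based plan and a generic calendar.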
2. Design a proper air management system
Rapid and substantial rises in temperature can cause major power issues within a data center. An efficient air management system is therefore integral to data center design, because it keeps temperature and humidity stable.
An air management system will reduce heat-related processing interruptions and failures. It can also lower operating costs, reduce initial capital investment and increase the data center’s capacity.
3. Reduce human error through automation
Human error is estimated to account for 60-70% of all data center downtime. Fortunately, much of what is needed to keep systems running and available can now be done in a lights-out environment. It is possible to automate patch deployments, updates and many other software tasks.
Advances in data center design, such as open channel busbar systems, mean that fewer people are needed in the white space, which reduces the chance of human error.
4. Implement effective monitoring
Effective monitoring systems within the data center can report on the status of the system down to an individual rack. Real-time reporting helps to identify problems early, so that data center managers can remediate them before any serious issues occur. When problems are caught at an early stage, systems can be switched over without any noticeable change in service. This proactive approach is much more attractive to data center customers than a standard reactive response.
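A minimal sketch of rack-level monitoring might look like the following. It assumes each rack reports an inlet temperature in degrees Celsius. The rack identifiers and the two thresholds are illustrative assumptions only, and a real deployment would take its readings from the facility's own monitoring system.

```python
# Illustrative thresholds in degrees Celsius (assumptions, not recommendations).
WARN_C = 27.0
CRIT_C = 32.0


def classify(temp_c):
    """Map a single inlet temperature to a status label."""
    if temp_c >= CRIT_C:
        return "critical"
    if temp_c >= WARN_C:
        return "warning"
    return "ok"


def rack_alerts(readings):
    """Given {rack_id: inlet_temp_c}, return only the racks that need attention."""
    statuses = {rack: classify(temp) for rack, temp in readings.items()}
    return {rack: status for rack, status in statuses.items() if status != "ok"}


# Hypothetical readings for three racks.
readings = {"rack-01": 22.5, "rack-02": 28.1, "rack-03": 33.0}
print(rack_alerts(readings))
```

Reporting only the racks that have crossed a threshold keeps attention on the early warnings, which is what allows a problem to be handled before it turns into an outage.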
5. Modular, not monolithic
As more and more data centers are hosted within a virtual, cloud-based environment, the malfunction of an individual piece of hardware is much less likely to affect availability or to be a leading cause of downtime. It is possible, however, that older pieces of hardware will cause difficulties for the overall data center strategy. Large, monolithic applications cause similar difficulties, so a modular approach to data center design will reduce the likelihood of downtime caused by equipment issues.
With virtually zero tolerance for downtime in today's world, data centers have to focus their efforts on achieving constant availability. Fortunately, by implementing practices such as efficient monitoring and preventative maintenance, higher levels of uptime are achievable.
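To put uptime percentages such as the 99% figure mentioned at the start into concrete terms, the short sketch below converts an availability percentage into the downtime it permits per year. It assumes a 365-day year, and the function name is chosen only for this example.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours, assuming a non-leap year


def downtime_hours_per_year(availability_percent):
    """Hours of downtime per year permitted at a given availability level."""
    return (1 - availability_percent / 100) * HOURS_PER_YEAR


for pct in (99.0, 99.9, 99.99, 99.999):
    print(f"{pct}% uptime allows {downtime_hours_per_year(pct):.2f} hours of downtime per year")
```

At 99% uptime this works out to about 87.6 hours, or more than three and a half days, of downtime per year, which is why higher availability targets matter so much to customers.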