Power Outages Continue to Plague the Data Center

by Drew Robb | Oct 11, 2023 | Blog

Power availability is largely taken for granted in today’s world. Consumers expect the lights to stay on, the air conditioning to keep running and their phones to always be charged. However, places like California continue to experience rolling blackouts during the hot summer months.

Similarly, data centers treat grid power as a given. Yet recent surveys from the Uptime Institute show that 55% of data center operators have had an outage at their site in the past three years. The good news is that things are a little better than in earlier surveys, but not by much.

“Outage rates may have fallen gently as have their severity levels,” said Chris Brown, CTO of the Uptime Institute.

4% of operators admitted to having suffered a severe outage in the past three years. Another 6% said they had experienced a serious outage, 17% classified the incident as significant and 32% as minimal. The cause of these small improvements is unclear. Brown believes it suggests increased vigilance, more investment and a gradual replacement of aging and inefficient data centers. The newest facilities, after all, tend to incorporate the latest efficiency and sustainability features and to build in greater levels of redundancy.

Outage Costs Soar

While outages may be slightly less frequent in data centers overall, their repercussions appear worse than ever. Outages not only hit the pocketbook in direct costs but also incur hefty indirect costs such as stock price drops and reputational damage.

“While there is a lower frequency of serious or severe outages, those that do occur are often very expensive,” said Brown.

One reason for higher outage costs is the increased reliance on online services. If Amazon Web Services (AWS) goes down, an awful lot of websites and enterprise systems go down with it.

Uptime Institute found that about half of all impactful outages cost less than $100,000. These costs include everything from outage to full recovery and take into account direct, indirect and reputational costs. 38% of data centers said the price tag ranged from $100,000 to $1 million. 16% of data centers said the overall cost exceeded $1 million.

Why are outages getting more expensive? Inflation is one answer. Chiller costs have risen sharply, for example, as have power system costs. Lead times for equipment have also lengthened of late.

“Lead times are a big issue, especially for older systems,” said Brown. “Engine generators can have a lead time of a year or more. Even standard panel boards can take 28 weeks to arrive.” 

Power Demand

The expectation of uninterrupted power availability is understandable. Power is vital to everything in the modern world – and especially in the data center. It is no surprise, then, that power challenges continue to be the main cause of significant, serious or severe outages.

“Power is the lifeblood of equipment; a fraction of a second of a power anomaly can impact IT,” said Brown. “So, it will always be the biggest cause as it is so easily interrupted.”

52% of outages are directly related to power and 19% to cooling. Next comes loss of service from a third-party provider at 9%, hardware or software failures at 8%, network problems at 7%, fires or fire suppression systems at 3%, security-related issues at 1% and unknown at 1%.
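To see how heavily the breakdown leans on the top two causes, the percentages can be ranked and summed. The short Python sketch below uses only the figures quoted above. The names `OUTAGE_CAUSES` and `cumulative_shares` are illustrative and do not come from the Uptime Institute.

```python
# Outage causes as quoted in the article (percent of outages).
OUTAGE_CAUSES = {
    "power": 52,
    "cooling": 19,
    "third-party provider": 9,
    "hardware or software": 8,
    "network": 7,
    "fire or fire suppression": 3,
    "security": 1,
    "unknown": 1,
}


def cumulative_shares(causes):
    """Rank causes by share and attach a running total to each."""
    ranked = sorted(causes.items(), key=lambda item: item[1], reverse=True)
    running = 0
    rows = []
    for name, share in ranked:
        running += share
        rows.append((name, share, running))
    return rows


if __name__ == "__main__":
    for name, share, running in cumulative_shares(OUTAGE_CAUSES):
        print(f"{name:<26}{share:>4}%  cumulative {running:>3}%")
```

On these figures, power and cooling together account for 71% of outages, which is why the resilience measures discussed in this article concentrate on the power chain.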

“UPS issues are among the most common causes of IT outages,” said Brown.

Improving Power Resilience

Better technology can boost power resilience and reduce the number of outages. Better transfer switches, more reliable engine generator systems to take up the load and better power management systems will all help. But Brown warns that these systems may not always function as intended. The very tools introduced to lower outages are themselves subject to disruption from time to time.

“We can drop the number of outages due to power, but we won’t get rid of it as a persistent challenge,” he said.

Redundancy is another obvious area to improve. Uptime Institute noted that physical site redundancy levels have been climbing slowly. 37% of data centers have improved redundancy levels over the past five years. The reason the figure isn’t higher is largely down to logistics. New builds have the advantage of being able to design redundancy in from the start. It is a different matter for existing data centers.

“It is difficult to increase redundancy in existing data centers as it is far more costly than designing it in,” said Brown.

The lucky few have enough space to expand and to change out old gear for newer, more efficient and more redundant systems. But for most data centers, added redundancy entails extending existing premises, adding buildings or adopting onsite power.

Power Generation Upgrades

Ihab Chaaban, Global Commercial Development Director of GE Vernova, advocates the use of gas turbines and batteries in a hybrid power arrangement as one possible approach to redundancy. By generating all or some of its power onsite (or using onsite generation purely as backup to the grid), the data center lowers its exposure to grid power outages.

He recommended that data centers trade in aging diesel gensets for compact turbines with high power density. He gave one example of 45 diesel gensets (100 MW of backup power) being replaced by four GE LMxpress aeroderivative turbines.

“These machines are used to lots of starts and stops, have high reliability and availability and are fast to commission,” said Chaaban. “If batteries are included, the battery kicks in within milliseconds and stays online until the turbine is up and running in about five minutes.”
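A rough calculation shows what that five-minute bridge implies for battery size. The Python sketch below takes the 100 MW, 45 gensets, four turbines and five minutes from the example above. Everything else is an assumption made for illustration: that the battery carries the full load for the whole start-up window, that the four turbines together match the same 100 MW, and that conversion losses and depth-of-discharge limits are ignored. The function names are hypothetical and this is not a GE Vernova sizing method.

```python
# Back-of-the-envelope sizing for the hybrid turbine-plus-battery example.
# From the article: 100 MW of backup, 45 diesel gensets, four turbines,
# about five minutes for a turbine to come up.
# Assumed here (not from the article): full load on the battery for the
# entire start-up window, turbines matching 100 MW, no losses.

BACKUP_LOAD_MW = 100.0
DIESEL_GENSETS = 45
TURBINES = 4
TURBINE_START_MINUTES = 5.0


def per_unit_mw(total_mw: float, units: int) -> float:
    """Average capacity each unit must supply to cover the total load."""
    return total_mw / units


def bridge_energy_mwh(load_mw: float, minutes: float) -> float:
    """Energy the battery delivers while holding the load for `minutes`."""
    return load_mw * minutes / 60.0


if __name__ == "__main__":
    print(f"Per diesel genset: {per_unit_mw(BACKUP_LOAD_MW, DIESEL_GENSETS):.1f} MW")
    print(f"Per turbine:       {per_unit_mw(BACKUP_LOAD_MW, TURBINES):.1f} MW")
    energy = bridge_energy_mwh(BACKUP_LOAD_MW, TURBINE_START_MINUTES)
    print(f"Battery bridge:    {energy:.1f} MWh at {BACKUP_LOAD_MW:.0f} MW")
```

Under these assumptions the battery must deliver the full 100 MW but only about 8.3 MWh of energy, so the bridge is a power-heavy, short-duration duty. A real design would add margin for losses, aging and failed turbine starts.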

But such changeouts of generation hardware or building a whole new generating station onsite may be beyond the means of many data centers. Sometimes the cost of greater redundancy is so much that the only viable approach is to hand the business over to outside providers.

Drew Robb


Writing and Editing Consultant and Contractor

Drew Robb has been a full-time professional writer and editor for more than twenty years. He currently works freelance for a number of IT publications, including eSecurity Planet and CIO Insight. He is also the editor-in-chief of an international engineering magazine.
