The data centre that froze solid and other liquid cooling challenges

"Liquid cooling is still a niche technology. But where it is seen as needed, especially among the hyperscale operators, the demand for skills and equipment far outstrips the supply."

Data centres are known to fail during heat waves. But one facility in Montana had the opposite problem and "froze solid overnight", leading to a six-figure loss.

Rick Bentley, founder of AI surveillance company Cloudastructure and of Hydro Hash, which uses hydroelectric power to fuel a crypto mining data centre, said that the problem came from water block cooling and an extremely rapid drop in temperature in Montana’s frosty Glacier Park.

The temperature dropped from -6C to -34C in just over 24 hours, Bentley says, and frozen car batteries meant that repair workers could not make it to the site.

He said: "We thought we were ready but, coupled with a power outage, we weren’t."

Liquid cooling is big business: the market is expected to grow from $4.45 billion in 2023 to $39.96 billion in 2033, according to ResearchAndMarkets, and other analysts predict similarly meteoric growth.
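
For scale, those endpoints imply a compound annual growth rate of roughly 24.5%. A minimal Python sketch of the arithmetic, using only the ResearchAndMarkets figures quoted above (the CAGR formula itself is standard):

```python
# Implied compound annual growth rate (CAGR) from the forecast above:
# $4.45bn in 2023 growing to $39.96bn in 2033, i.e. over ten years.
start, end, years = 4.45, 39.96, 10

cagr = (end / start) ** (1 / years) - 1
print(f"Implied CAGR: {cagr:.1%}")  # Implied CAGR: 24.5%
```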

Nvidia has embraced liquid cooling for its DGX supercomputer system. OpenAI has hired Google's former liquid cooling lead to fuel its "thirsty" models, which reportedly drank enough water during the training of ChatGPT to fill a nuclear power station cooling tower.

READ MORE: Shining a light on the UK's first quantum-secure connection between two data centres

But there are several factors hampering growth at present, cooling experts have told The Stack. These include fears over the skills and up-front expense needed to install and operate liquid-cooled data centres, alongside worries over the lasting business value of generative AI.  

Bentley has learnt the lessons of his liquid cooling disaster. Water block cooling, he says, is only suitable for moderate climates and for air-conditioned, sealed buildings with full power backup; antifreeze is useful risk mitigation, and oil-based systems have drawbacks of their own.

He says: "Unlike water, oil doesn’t freeze: but there are other issues. Oil can be messy. You are sure you won’t have a spill but you will have several leaks/spills.

"Gear will eventually fail. An integrated circuit can go bad, short out, melt the circuit board and start a fire. In oil, this is actually safer than in air – there’s no oxygen to burn. However, as that IC/PCB melts, it pollutes the entire oil immersion system."

He says that the skills required to deal with such problems mean that the adoption of liquid cooling is progressing "rapidly for the smart and daring, slowly for the mainstream". 

A balancing act for data centre operators

The drawbacks of liquid cooling are higher up-front capital expenditure and a steep learning curve in running the systems. The advantages are equally clear: lower operational expenditure, data centres running faster on less energy, and more compute in the same space.
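
One way to frame that balance is as a simple payback calculation. The sketch below, in Python, uses entirely hypothetical figures, chosen only to show the shape of the decision rather than real costs:

```python
# Hypothetical capex-vs-opex payback sketch for liquid cooling.
# None of these figures come from the article; they are placeholders
# to illustrate how the trade-off is typically weighed.
liquid_capex_premium = 1_500_000  # extra up-front cost vs forced air, $
annual_energy_saving = 400_000    # lower cooling opex, $/year
annual_density_value = 100_000    # extra compute in the same space, $/year

payback_years = liquid_capex_premium / (annual_energy_saving + annual_density_value)
print(f"Break-even after {payback_years:.1f} years")  # 3.0 years with these inputs
```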

Space is the central issue driving demand for data centres, but many operators are still holding back, says Tim Negris of MOCA Systems.

MOCA Systems makes Touchplan, a software platform used by data centre constructors. Negris has consulted with customers on 200 data centre construction projects and says the central trend driving demand is "data centre densification": computing equipment becoming more powerful while taking up less physical space.

He says: "This applies to both processing hardware (CPUs, GPUs and supporting circuitry) as well as connectivity hardware and solid state storage devices. Densification leads to greater cooling requirements in general, as well as the need to bring the cooling closer to the components."

Data centre operators face a choice between upgrading forced air cooling systems and switching to liquid cooling, which requires significant retrofitting.

High maintenance and capital costs have been highlighted as the main challenges hindering adoption of liquid cooling, along with a lack of standardisation in cooling technologies.

In colocation data centres, where equipment, space, and bandwidth can be rented by retail customers, the business model benefits from densification because more tenants fit into the same space.

READ MORE: Data centres given Critical National Infrastructure designation and promised government "protections"

The other factor driving demand for liquid cooling is the switch from CPUs to GPUs to drive generative AI applications, Negris says. But some data centre operators are waiting for hard data about the real-world adoption of generative AI before accepting the costs of liquid-cooled data centres.

While the explosive growth of data centre construction has been fuelled by anticipated interest in generative AI, the realisation of its business value is lagging behind.

"This puts data centre developers and operators in a position where they must balance the considerable cost of retrofitting and designing new builds to accommodate liquid cooling against demand that may not grow as quickly as the hype around Generative AI," Negris says. "They worry that they may be betting on a bubble."

This is particularly true for operators of existing data centres, as opposed to new builds, with many still opting for a "wait and see" approach. For virgin data centre builds, the energy efficiency of water cooling may make it a "more actionable choice" for operators, Negris says.

There is also the problem that liquid cooling requires new skills beyond those needed for the better-established technology of air cooling.

But for providers who have taken the plunge, business is good, Negris says. 

Negris says: "Liquid cooling is still something of a niche technology that is far from commodification. But where it is seen as needed, especially among the hyperscale operators, the demand for skills and equipment far outstrips the supply. The designers, builders, and equipment suppliers who are specialising in liquid cooling have more business than they can handle."

Liquid assets: The cost of cooling

For many data centre operators, the expense and disruption of switching to liquid cooling is slowing adoption, argues James Lupton, CTO at server manufacturer Blackcore Technologies.

This applies even more in highly regulated industries, he believes.

Lupton says: "A data centre is a significant investment and making sweeping changes to infrastructure is costly and disruptive. 

"You can’t just replace thousands of standard racks with new ones that support rack-level liquid cooling, or immersion cooling. This would be very difficult to implement in financial exchange data centres where a level playing field is a key goal, and even regulated."

But over the longer term, the power demands of AI and high performance computing (HPC) will push rack power density ever higher, driving demand for the technology, Lupton believes.

Lupton added: "Colos are only supplying 6-10kW to a rack; this is no longer enough for modern systems that can draw 2-3kW+ in 2U of rack space. This leads to half-empty racks that can’t support more systems.

"These trends are going to force data centres to adopt any technology that can reduce the power burden on not only their own infrastructure, but also the local infrastructure that supplies them."
