Thermal management is (literally) a hot-button topic in electric vehicle design, as we’ve discussed in a past article. The design challenge is just as pressing in computing applications, especially as 5G and machine learning mature further.
These high-performance computing (HPC) applications, from small edge computing devices to data centers, require vast amounts of data processing at high speeds. Together, these two factors produce substantial heat that must be dissipated. As a testament to this reality, some companies like Microsoft have even sent data centers to the bottom of the ocean to beat the heat.
For two years, Microsoft’s Project Natick stationed a data center on the seabed off Scotland’s Orkney Islands. Image used courtesy of Microsoft
Poor thermal management can lead to slower computing capabilities, processing errors, and the risk of hardware failure. What exactly are companies doing to keep large and small computing systems cool?
How Do Companies Keep Computing Cool?
Companies use a number of methods to manage heat in computing applications. Most of the main methods use either air or a liquid coolant to carry heat away from hot spots on processing units.
Below are some common cooling methodologies. Note that these are just a few of many tactics used by designers of thermal management systems, and they can also be used in conjunction with one another on an as-needed basis.
Computer Room Air Conditioning and Air Handling
Computer room air conditioning and air handling (CRAC and CRAH) use mechanical refrigeration to keep equipment cool. However, these systems consume massive amounts of energy and are not equipped to cool high-density server racks.
Rendering of a CRAC and CRAH system. Image used courtesy of AMAX
In-row Cooling
In-row cooling refers to a system that uses cool air or chilled water to cool units between two rows of server racks.
Rear Door Heat Exchangers
For high-density computing, companies often turn to rear-door heat exchangers (RDHx), which are radiator-like doors fitted with tubes of chilled water. The doors attach to the back of racks and transfer heat from the racks to the chilled water.
Depiction of RDHx. Image used courtesy of AMAX
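To give a feel for the chilled-water side of a rear-door heat exchanger, here is a minimal sketch based on the standard heat balance Q = ṁ · c_p · ΔT. The 30 kW rack load and 10 K water temperature rise are hypothetical numbers chosen for illustration, not figures from AMAX or any vendor.

```python
# Sketch: chilled-water flow a rear-door heat exchanger would need to
# absorb a given heat load, from Q = m_dot * c_p * delta_T.
# The rack load and temperature rise below are assumed, illustrative values.

WATER_SPECIFIC_HEAT_J_PER_KG_K = 4186.0  # approximate value for liquid water


def required_water_flow_kg_per_s(heat_load_w: float, water_temp_rise_k: float) -> float:
    """Solve Q = m_dot * c_p * delta_T for the mass flow m_dot."""
    if water_temp_rise_k <= 0:
        raise ValueError("water temperature rise must be positive")
    return heat_load_w / (WATER_SPECIFIC_HEAT_J_PER_KG_K * water_temp_rise_k)


if __name__ == "__main__":
    # Hypothetical rack: 30 kW of heat, water warming by 10 K across the door.
    flow = required_water_flow_kg_per_s(30_000.0, 10.0)
    print(f"Required water flow: {flow:.2f} kg/s")
```

Because the flow scales inversely with the allowed temperature rise, halving the rise doubles the water flow needed for the same rack.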
Cold Plates
A cold plate is a liquid-cooled module that sits directly on top of a CPU. It removes heat as coolant flows through pipes to and from the plate.
Liquid-immersion Cooling
In a liquid-immersion cooling system, all equipment is placed in a bath of dielectric fluid that absorbs dissipated heat. The liquid used in this system is both non-flammable and non-conductive.
Combining an RDHx and a Liquid-cooling Cold Plate
AMAX and CoolIT Systems are two companies that teamed up to create a custom thermal management system for data center products. According to the press release, the partnership was formed to meet a customer’s need for thermal management of 2U four-node high-density servers on a rack.
The resulting solution uses a rear door heat exchanger and a liquid cooling cold plate. According to AMAX, the two methods together captured 80% of the heat while also cutting fan speeds by 50% from their previous levels.
In addition to saving energy, this cooling method allowed the data center to make the most of its space. Dr. Rene Meyer, the VP of technology at AMAX, even claims that the methods allow companies to house 60 nodes on a single rack while maintaining high stability and performance.
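The energy savings follow largely from the fan speed reduction. As a rough sketch, the idealized fan affinity law says fan power scales with the cube of fan speed, so a 50% speed cut would drop fan power to about 12.5% of its original value. The 2 kW chassis load below is a hypothetical figure, and real fans deviate from the ideal cube law.

```python
# Sketch: what 80% heat capture and a 50% fan speed cut could mean in numbers.
# The chassis heat load is an assumed value; the cube law is an idealization.


def uncaptured_heat_w(total_heat_w: float, capture_fraction: float) -> float:
    """Heat left over after the cooling system captures a given fraction."""
    if not 0.0 <= capture_fraction <= 1.0:
        raise ValueError("capture fraction must be between 0 and 1")
    return total_heat_w * (1.0 - capture_fraction)


def relative_fan_power(speed_fraction: float) -> float:
    """Idealized fan affinity law: power scales with the cube of speed."""
    if speed_fraction < 0:
        raise ValueError("speed fraction must be non-negative")
    return speed_fraction ** 3


if __name__ == "__main__":
    chassis_heat_w = 2_000.0  # hypothetical 2U four-node chassis
    leftover = uncaptured_heat_w(chassis_heat_w, 0.80)
    print(f"Heat not captured: {leftover:.0f} W")
    print(f"Fan power at half speed: {relative_fan_power(0.5):.1%} of original")
```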
A Two-phase Immersion Liquid Cooling System
Another company innovating thermal management in the computing realm is LiquidStack, a startup specializing in liquid cooling technology.
LiquidStack is looking to advance a two-phase immersion liquid cooling system in which all electronics are completely submerged in a dielectric fluid, negating the need for heat sinks or fans. As the chips heat up, the fluid at their surfaces begins to boil. Heat transfer then proceeds in two steps: hot vapor rises, and the vapor then condenses on a specialized coil. The “two-phase” name refers to this cycling of the fluid between liquid and vapor.
LiquidStack’s immersion cooling method. Image used courtesy of LiquidStack
The condensed fluid falls back into the tank, where it is reused. Boiling drives natural convection in the fluid and raises heat rejection capacity. Since this technique is a fully passive process, it doesn’t require any pumps.
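The boil-and-condense cycle can be sized with the latent heat relation Q = ṁ · h_fg: in steady state, the fluid boils off at a rate equal to the heat load divided by its latent heat of vaporization, and the coil must condense vapor at the same rate. The sketch below uses a placeholder latent heat of 100 kJ/kg and a hypothetical 100 kW tank. Neither is a LiquidStack specification or a datasheet value.

```python
# Sketch: steady-state vapor generation in a two-phase immersion tank,
# from Q = m_dot * h_fg. Both input values are assumed placeholders.


def vapor_mass_flow_kg_per_s(heat_load_w: float, latent_heat_j_per_kg: float) -> float:
    """Rate at which fluid boils off, which the coil must condense back."""
    if latent_heat_j_per_kg <= 0:
        raise ValueError("latent heat must be positive")
    return heat_load_w / latent_heat_j_per_kg


if __name__ == "__main__":
    tank_heat_load_w = 100_000.0       # hypothetical 100 kW tank
    latent_heat_j_per_kg = 100_000.0   # placeholder for a dielectric fluid
    rate = vapor_mass_flow_kg_per_s(tank_heat_load_w, latent_heat_j_per_kg)
    print(f"Vapor generated and condensed: {rate:.2f} kg/s")
```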
When compared with air cooling, the LiquidStack solution is said to:
- Reject 21 times more heat per IT rack
- Cut energy costs by 41%
- Reduce whitespace for computer infrastructure by 60%
According to the company, this is all accomplished without consuming any water for outside heat rejection.
The two-phase immersion cooling technology has already paid off in past LiquidStack projects. In 2014, LiquidStack, called Allied Control at the time, built the so-called “most efficient data center in the world.” The 500 kW center, located in the hot and humid Hong Kong climate, saved upwards of 95% in energy consumption compared to its previous air cooling solution.
Integrated Heat Sinks for Edge Computing Devices
While thermal management solutions are essential for all computing applications, they are especially important in edge computing.
Cooling systems for edge devices often integrate heat sinks. One specific design includes vapor chambers, which are flat heat pipes with high thermal conductance that spread heat efficiently. These systems integrate with a heat sink to draw heat away from the chip effectively, a necessity in edge computing, where processors may experience hot spots.
NVIDIA had a similar solution for its Jetson AGX Xavier device.
NVIDIA’s Jetson AGX Xavier with an integrated heat sink. Image used courtesy of NVIDIA
Normally, this device has an embedded thermal transfer plate, but for more demanding applications, it uses an integrated heat sink with heat pipes. The transfer plate moves heat from the device to the heat pipes and then to the heat sink, where it is drawn away from the device.
A Passive Approach to Heat Removal
Advanced Thermal Solutions cites Arrow’s SAM Car as an example of heat management for IoT devices in harsh environments. The company designed an aluminum enclosure with cutouts and pockets that connect to the components of the PCB, so the enclosure effectively acts as a heat sink. This dual-purpose design provides sufficient thermal management while also protecting all internal parts.
The internal (left) and top (right) views of SAM Car’s heat sink plus chassis design. Image (modified) used courtesy of Arrow and Advanced Thermal Solutions Inc.
Advanced Thermal Solutions Inc. also recommends air cooling options for edge computing processors with varying power dissipation.
Recommended air cooling techniques for different power consumption requirements. Image used courtesy of Advanced Thermal Solutions Inc.
Thermal solutions begin with basic heat sinks and chassis that passively remove heat from electronic components. If the thermal requirements are more stringent, the recommendation is to turn to active solutions, which blow air through the device to remove heat. Alternatively, staying passive requires a more intricate, design-specific heat sink that targets individual hot spots.
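One common way to frame the passive-versus-active decision is with thermal resistance: the cooling path must keep (maximum chip temperature − ambient temperature) / power at or above the resistance of the heat sink in use. The sketch below applies that rule. The temperatures, power level, and heat sink ratings are hypothetical, and the function names are ours rather than anything from Advanced Thermal Solutions.

```python
# Sketch: check whether a passive heat sink meets a thermal budget.
# All numeric inputs in the example are assumed, illustrative values.


def required_thermal_resistance_c_per_w(
    max_chip_temp_c: float, ambient_temp_c: float, power_w: float
) -> float:
    """Largest chip-to-ambient thermal resistance that stays within budget."""
    if power_w <= 0:
        raise ValueError("power must be positive")
    if max_chip_temp_c <= ambient_temp_c:
        raise ValueError("chip limit must exceed ambient temperature")
    return (max_chip_temp_c - ambient_temp_c) / power_w


def suggest_cooling(required_c_per_w: float, passive_sink_c_per_w: float) -> str:
    """Passive is enough only if the sink's resistance fits the budget."""
    return "passive" if passive_sink_c_per_w <= required_c_per_w else "active"


if __name__ == "__main__":
    # Hypothetical edge processor: 85 C limit, 40 C ambient, 15 W dissipation.
    budget = required_thermal_resistance_c_per_w(85.0, 40.0, 15.0)
    print(f"Thermal budget: {budget:.1f} C/W")
    print(suggest_cooling(budget, passive_sink_c_per_w=4.0))
    print(suggest_cooling(budget, passive_sink_c_per_w=2.5))
```

A hotter environment or a higher power draw shrinks the budget, which is why the same passive sink can work in one deployment and fall short in another.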
New Approaches to Beat the Heat
As computing applications grow more complex, design teams must produce support systems to accommodate the vast amounts of data processing in harsh environments. Thermal management is one important consideration in advancing these systems.
Startups like LiquidStack and computing giants like NVIDIA are now mixing and matching traditional cooling methods to handle the massive data processing, and the resulting heat, of modern computing systems.