In the previous article, we discussed heat dissipation in servers and why it is important for system stability. We also explained the services and assistance Allion offers for these issues. In this article, we continue the discussion with how Allion’s consulting team conducts evaluations, giving examples of issues discovered during the testing process and their results.

There are three key points for heat dissipation in AI servers, namely:

  1. GPU Air Duct: Trying different GPU air duct structures to concentrate the airflow into the server and improve heat dissipation performance.
  2. GPU Tray: Modifying the GPU tray to observe the impact on heat dissipation.
  3. CPU Air Duct: Closing the gap in the CPU air duct to concentrate the airflow for better heat dissipation performance.

Allion’s professional consulting team first discusses the situation with the client. After confirming the structure, the team places thermocouples to monitor the temperature. The stress (pressurization) program is then executed and data collection begins. During the stress test, different components of the server, such as the GPU or CPU, are put under load, at levels ranging from 30% to 100%. At the same time, the fan speed is controlled to simulate fan failure, verifying that the server can still maintain heat dissipation performance in unexpected situations.
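To make the procedure above more concrete, the combinations of stressed component, load level, and fan condition can be thought of as a test matrix. The following is only a minimal sketch: the 30% to 100% range comes from the text, while the 10-point step, the fan scenario names, and the function name `build_test_matrix` are assumptions for illustration, not Allion’s actual tooling.

```python
from itertools import product

# The 30%..100% load range is from the article; the 10-point step,
# the scenario names, and all identifiers here are hypothetical.
COMPONENTS = ["GPU", "CPU"]
LOAD_LEVELS = list(range(30, 101, 10))  # 30, 40, ..., 100
FAN_SCENARIOS = ["all_fans_normal", "one_fan_failed"]


def build_test_matrix(components=COMPONENTS, loads=LOAD_LEVELS, fans=FAN_SCENARIOS):
    """Return every (component, load %, fan scenario) combination to run."""
    return [
        {"component": c, "load_percent": load, "fan_scenario": f}
        for c, load, f in product(components, loads, fans)
    ]


matrix = build_test_matrix()
print(len(matrix), "test cases planned")
```

Each entry would correspond to one run during which the thermocouple readings are logged. A real plan would likely add dwell times and pass/fail limits, which the article does not specify.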

Allion has collected data on two types of heat dissipation structures for this project. After analysis, it was confirmed that the heat dissipation performance of Structure 1 was better. The figure below shows the results of the test.

We found that the temperature curve did not peak in the center and that two of the measurement points were reversed relative to each other. After further discussion with our client, we traced this to the backflow of hot air on the sides and the gaps near the PSU. The readings before and after the adjustment are as follows.

Before adjustment (abnormal PSU temperature): Temperature_2 (near the core) is lower than Temperature_1 (outer) ➔ Possibly caused by heat accumulation or hot-air backflow resulting from the structural design and related issues

After adjustment (normal PSU temperature): Temperature_3 (PSU core) > Temperature_2 (near the core) > Temperature_1 (outer)
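The normal ordering described above (core hotter than near-core, near-core hotter than outer) can be expressed as a simple check on three thermocouple readings. This is a hedged sketch: the function name and the sample values are invented for illustration and are not measurements from this project.

```python
def psu_gradient_is_normal(t1_outer, t2_near_core, t3_core):
    """True when temperature rises monotonically toward the PSU core,
    i.e. Temperature_3 > Temperature_2 > Temperature_1."""
    return t3_core > t2_near_core > t1_outer


# Illustrative readings in degrees Celsius (made up, not test data).
before = {"t1_outer": 55.0, "t2_near_core": 50.0, "t3_core": 60.0}
after = {"t1_outer": 40.0, "t2_near_core": 50.0, "t3_core": 60.0}

print("before adjustment normal:", psu_gradient_is_normal(**before))
print("after adjustment normal:", psu_gradient_is_normal(**after))
```

A failed check like the "before" case is only a symptom. Identifying the cause, such as backflow through gaps near the PSU, still requires inspecting the structure.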

Allion has decades of experience and skills regarding heat dissipation in servers. We can assist clients with completing evaluations of various designs and selecting the best solution in the shortest amount of time. Allion has also built walk-in chambers that withstand various heat loads, perfect for servers that need verification testing for heat dissipation.

  1. 13 kW Walk-in Chamber
    • Temperature Range: -20 ℃ to 80 ℃
  2. 20 kW Walk-in Chamber
    • Temperature Range: -40 ℃ to 150 ℃
  3. 65 kW Walk-in Chamber
    • Temperature Range: -40 ℃ to 90 ℃

Comprehensive Custom Consulting to Improve Server Heat Dissipation

If you have any further needs for testing, verification, or consulting services related to the server ecosystem, please feel free to explore the following services online or contact us through the inquiry form.

In the next article, we’ll share more effective solutions. Stay tuned!

Allion Labs_Faster, Easier, Better

Inquiry Form

Contact Us

 
