Nvidia H100 debuts at MLPerf with record-breaking test results

The MLCommons community released the latest MLPerf 2.1 benchmark test results. The new round of benchmark tests has nearly 5,300 performance results and 2,400 power consumption measurement results, which are 1.37 times and 1.09 times higher than the previous round, respectively, and the scope of application of MLPerf is further expanded. Alibaba, ASUS, Azure, Biren Technology, Dell, Fujitsu, Gigabyte, H3C, HPE, Inspur, Intel, Krai, Lenovo, Moffett, Nettrix, Neural Magic, NVIDIA, OctoML, Qualcomm, SAPEON, and Supermicro are all tested in this round contributors. Among them, NVIDIA still performed well, taking H100 to participate in the MLPerf test for the first time and setting a world record in all workloads.

NVIDIA released the H100 GPU based on the new architecture NVIDIA Hopper in March this year, which achieved an order of magnitude performance leap compared to the NVIDIA Ampere architecture launched two years ago. Huang Renxun once said at GTC 2022 that 20 H100 GPUs can support traffic equivalent to the global Internet, and can help customers launch advanced recommendation systems and large-scale language models that run data inference in real-time.

The H100, which many AI practitioners are looking forward to, was originally scheduled to be officially shipped in the third quarter of 2022 and is currently in a state of accepting reservations. The actual usage of users and the actual performance of the H100 are still unknown, so it can pass the latest round of MLPerf tests. Score an early feel of the H100's performance.

In this round of testing, compared to Intel Sapphire Rapids, Qualcomm Cloud AI 100, Biren BR104, SAPEON X220-enterprise, NVIDIA H100 not only submitted test scores for all six neural network models in the data center, but also in single server and offline scenarios. Demonstrated leadership in throughput and speed. Compared to the NVIDIA A100, the H100 showed a 4.5x performance improvement in one of the largest and most performance-demanding MLPerf models, the BERT model for natural language processing, and 1 in all five other models. Up to 3x performance improvement. The reason why the H100 can perform well on the BERT model is mainly due to its Transformer Engine.

Among other products that also submitted scores, only Biren BR104 has more than doubled the performance of NVIDIA A100 under the ResNet50 and BERT-Large models in offline scenarios, and other products that submitted scores did not surpass A100 in performance. In the data center and edge computing categories, the A100 GPU's test results are still good, thanks to the continuous improvement of NVIDIA AI software, compared with the MLPerf debut in July 2020, the A100 GPU achieved 6 times the performance.

Since users usually need to use many different types of neural networks to work together in practical applications, for example, an AI application may need to understand the user's voice request, classify images, make suggestions, and then respond with voice, each step needs to be used. Different AI models.

That's why MLPerf benchmarks cover popular AI workloads and scenarios including computer vision, natural language processing, recommender systems, speech recognition, and more to ensure users get reliable and deployment-flexible performance. This also means that the more models covered by the submitted test scores, the better the results, and the more versatile it is AI capabilities are.

In this round of testing, NVIDIA AI remains the only platform capable of running all MLPerf inference workloads and scenarios in data centers and edge computing. On the data center side, both the A100 and H100 submitted six model test scores. In edge computing, NVIDIA Orin ran all MLPerf benchmarks and was the most tested chip of all low-power SoCs.

Orin is the integration of NVIDIA Ampere architecture GPU and Arm CPU core into a single chip, which is mainly used for robots, autonomous machines, medical machinery, and other forms of edge embedded computing. Currently, Orin has been used in the NVIDIA Jetson AGX Orin developer kit as well as robotics and autonomous systems to generate mock tests and supports the full NVIDIA AI software stack, including autonomous vehicle platforms, medical device platforms, and robotics platforms.

Compared to its debut at MLPerf in April, Orin is 50% more energy efficient, running 5x and 2x faster than the previous generation of Jetson AGX Xavier modules, respectively. The pursuit of general purpose NVIDIA AI is being supported by the industry's broad machine learning ecosystem. In this round of benchmarking, more than 70 submissions were run on the NVIDIA platform. For example, Microsoft Azure submitted results running NVIDIA AI on its cloud service.

Post a Comment