NVIDIA Investor Presentation Deck
NVIDIA Sets New LLM Training Record With Largest MLPerf Submission Ever
● NVIDIA set six new performance records in this round, with the performance gains stemming from a combination of software advances and scaled-up hardware:
• 2.8x faster on generative AI, completing a training benchmark based on a GPT-3 model with 175 billion parameters trained on 1 billion tokens in just 3.9 minutes
• 1.6x faster on training recommender models
• 1.8x faster on training computer vision models
● The GPT-3 benchmark ran on NVIDIA Eos, a new AI supercomputer powered by 10,752 H100 GPUs and NVIDIA Quantum-2 InfiniBand networking
● The 10,752 H100 GPUs far surpassed the scale of NVIDIA's June AI training submission, which used 3,584 Hopper GPUs
● The 3x increase in GPU count delivered a 2.8x increase in performance, a 93% scaling efficiency thanks in part to software optimizations (see the back-of-envelope check after this list)
● Microsoft Azure achieved similar results on a nearly identical cluster, demonstrating the efficiency of NVIDIA AI in public cloud deployments
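The 93% efficiency figure and the implied token throughput follow from simple arithmetic on the numbers above. Here is a minimal Python sketch of that back-of-envelope check; the formulas are generic weak-scaling arithmetic, not an NVIDIA-published methodology, and all inputs are taken from this slide:

```python
# Back-of-envelope check of the scaling and throughput figures above.

JUNE_GPUS = 3_584   # H100 GPUs in the June (v3.0) GPT-3 submission
NOV_GPUS = 10_752   # H100 GPUs in Eos for the v3.1 submission
SPEEDUP = 2.8       # reported time-to-train improvement

gpu_scaling = NOV_GPUS / JUNE_GPUS   # 3.0x more GPUs
efficiency = SPEEDUP / gpu_scaling   # ~0.93, i.e. the 93% figure

TOKENS = 1_000_000_000  # benchmark trains on 1 billion tokens
MINUTES = 3.9           # reported time to train
tokens_per_second = TOKENS / (MINUTES * 60)  # cluster-wide throughput

print(f"GPU scaling: {gpu_scaling:.1f}x")              # 3.0x
print(f"Efficiency:  {efficiency:.0%}")                # 93%
print(f"Throughput:  {tokens_per_second/1e6:.1f}M tokens/s")  # ~4.3M
```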
Six New Performance Records
The fastest gets even faster
GPT-3 175B (1B tokens): 3.9 minutes (2.8x faster)
DLRM-DCNv2: 1 minute (1.6x faster)
RetinaNet: 55.2 seconds (1.8x faster)
Stable Diffusion: 2.5 minutes (new workload)
BERT-Large: 7.2 seconds (1.1x faster)
3D U-Net: 46 seconds (1.07x faster)
MLPerf™ Training v3.1. Results retrieved from www.mlperf.org on November 8, 2023. Format: chip count, MLPerf ID | GPT-3: 3584x 3.0-2003, 10752x 3.1-2007 | Stable Diffusion: 1024x 3.1-2050 | DLRM-DCNv2: 128x 3.0-2065, 128x 3.1-2051 | BERT-Large: 3072x 3.0-2001, 3472x 3.1-2053 | RetinaNet: 768x 3.0-2077, 2048x 3.1-2052 | 3D U-Net: 432x 3.0-2067, 768x 3.1-2064. The MLPerf™ name and logo are trademarks of MLCommons Association in the United States and other countries. All rights reserved. Unauthorized use strictly prohibited. See www.mlcommons.org for more information.