NVIDIA Financial and Market Overview
NVIDIA Sets New LLM Training Record With Largest MLPerf Submission Ever
Six New Performance Records
The fastest gets even faster
NVIDIA set six new performance records in this round, with the performance gains stemming from a combination of software advances and scaled-up hardware:
• 2.8x faster on generative AI - completing a training benchmark based on a GPT-3 model with 175 billion parameters trained on 1 billion tokens in just 3.9 minutes
• 1.6x faster on training recommender models
• 1.8x faster on training computer vision models
• The GPT-3 benchmark ran on NVIDIA Eos - a new AI supercomputer powered by 10,752 H100 GPUs and NVIDIA Quantum-2 InfiniBand networking
• The 10,752 H100 GPUs far surpassed NVIDIA's previous largest AI training submission in June, which used 3,584 Hopper GPUs
• The 3x scaling in GPU count delivered a 2.8x scaling in performance - a 93% efficiency rate, thanks in part to software optimizations (see the arithmetic sketch after this list)
• Microsoft Azure achieved similar results on a nearly identical cluster, demonstrating the efficiency of NVIDIA AI in public cloud deployments
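
To make the efficiency claim concrete, here is a minimal Python sketch of the arithmetic. The GPU counts, speedup, and training time come from the slide; the implied token throughput is our back-of-the-envelope estimate, not a published MLPerf metric.

    # Scaling efficiency: realized speedup as a fraction of ideal linear scaling.
    JUNE_GPUS = 3_584    # Hopper GPUs in the v3.0 (June) submission
    NOV_GPUS = 10_752    # H100 GPUs in the v3.1 Eos submission
    SPEEDUP = 2.8        # measured performance scaling between rounds

    ideal = NOV_GPUS / JUNE_GPUS        # 3.0x more GPUs
    efficiency = SPEEDUP / ideal        # fraction of linear scaling realized
    print(f"Ideal speedup: {ideal:.1f}x, realized efficiency: {efficiency:.0%}")
    # -> Ideal speedup: 3.0x, realized efficiency: 93%

    # Implied aggregate throughput for the GPT-3 run: 1B tokens in 3.9 minutes
    # (a derived estimate, not an MLPerf-reported figure).
    TOKENS = 1_000_000_000
    SECONDS = 3.9 * 60
    print(f"Implied throughput: ~{TOKENS / SECONDS / 1e6:.1f}M tokens/s")
    # -> Implied throughput: ~4.3M tokens/s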
Benchmark                 Time to Train    vs. Prior Round
GPT-3 175B (1B Tokens)    3.9 minutes      2.8x faster
Stable Diffusion          2.5 minutes      New workload
DLRM-DCNv2                1 minute         1.6x faster
BERT-Large                7.2 seconds      1.1x faster
RetinaNet                 55.2 seconds     1.8x faster
3D U-Net                  46 seconds       1.07x faster
MLPerf™ Training v3.1. Results retrieved from www.mlperf.org on November 8, 2023. Format: chip count, MLPerf ID | GPT-3: 3584x 3.0-2003, 10752x 3.1-2007 | Stable Diffusion: 1024x 3.1-2050 | DLRMv2: 128x 3.0-2065, 128x 3.1-2051 | BERT-Large: 3072x 3.0-2001, 3472x 3.1-2053 | RetinaNet: 768x 3.0-2077, 2048x 3.1-2052 | 3D U-Net: 432x 3.0-2067, 768x 3.1-2064. The MLPerf™ name and logo are trademarks of MLCommons Association in the United States and other countries. All rights reserved. Unauthorized use strictly prohibited. See www.mlcommons.org for more information.