NVIDIA Investor Presentation Deck

NVIDIA Sets New LLM Training Record With Largest MLPerf Submission Ever

● NVIDIA set six new performance records in this round, with the performance increases stemming from a combination of advances in software and scaled-up hardware:
  - 2.8x faster on generative AI, completing a training benchmark based on a GPT-3 model with 175 billion parameters trained on 1 billion tokens in just 3.9 minutes
  - 1.6x faster on training recommender models
  - 1.8x faster on training computer vision models
● The GPT-3 benchmark ran on NVIDIA Eos, a new AI supercomputer powered by 10,752 H100 GPUs and NVIDIA Quantum-2 InfiniBand networking
● The 10,752 H100 GPUs far surpassed NVIDIA's AI training submission from June, which used 3,584 Hopper GPUs
● The 3x scaling in GPU count delivered a 2.8x scaling in performance, a 93% efficiency rate, thanks in part to software optimizations
● Microsoft Azure achieved similar results on a nearly identical cluster, demonstrating the efficiency of NVIDIA AI in public cloud deployments

Six New Performance Records: The fastest gets even faster

  Benchmark                 Time          Improvement
  GPT-3 175B (1B tokens)    3.9 minutes   2.8x faster
  DLRM-dcnv2                1 minute      1.6x faster
  RetinaNet                 55.2 seconds  1.8x faster
  Stable Diffusion          2.5 minutes   New workload
  BERT-Large                7.2 seconds   1.1x faster
  3D U-Net                  46 seconds    1.07x faster

MLPerf™ Training v3.1. Results retrieved from www.mlperf.org on November 8, 2023. Format: Chip Count, MLPerf ID | GPT-3: 3584x 3.0-2003, 10752x 3.1-2007 | Stable Diffusion: 1024x 3.1-2050 | DLRMv2: 128x 3.0-2065, 128x 3.1-2051 | BERT-Large: 3072x 3.0-2001, 3472x 3.1-2053 | RetinaNet: 768x 3.0-2077, 2048x 3.1-2052 | 3D U-Net: 432x 3.0-2067, 768x 3.1-2064. The MLPerf™ name and logo are trademarks of MLCommons Association in the United States and other countries. All rights reserved. Unauthorized use strictly prohibited. See www.mlcommons.org for more information.
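As a back-of-the-envelope check on the 93% figure above, here is a minimal Python sketch that recomputes the scaling efficiency from the numbers on the slide (the two GPU counts and the 2.8x speedup); the scaling_efficiency helper is an illustrative name, not part of any NVIDIA tooling.

# Recompute the ~93% scaling efficiency quoted on the slide.
# All inputs come from the slide; the helper itself is a hypothetical illustration.
def scaling_efficiency(gpus_before: int, gpus_after: int, speedup: float) -> float:
    """Fraction of the ideal (linear) speedup actually achieved."""
    ideal_speedup = gpus_after / gpus_before  # linear scaling would match the GPU ratio
    return speedup / ideal_speedup

# June 2023 run: 3,584 Hopper GPUs; November 2023 run: 10,752 H100 GPUs, 2.8x faster.
eff = scaling_efficiency(gpus_before=3584, gpus_after=10752, speedup=2.8)
print(f"GPU ratio: {10752 / 3584:.1f}x, efficiency: {eff:.0%}")  # GPU ratio: 3.0x, efficiency: 93%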