NVIDIA RTX Ada GPUs and Financial Overview
Modern AI is a Data Center Scale Computing Workload
Data centers are becoming AI factories: data as input, intelligence as output

[Chart: AI Training Computational Requirements. Training compute (petaFLOPS, log scale) by year, 2012–2022. Transformer AI models: 275X/2yrs; all AI models excluding Transformers: 8X/2yrs. Models plotted range from AlexNet (2012) through ResNet, VGG 19, Seq2Seq, BERT Large, GPT-1, GPT-2, Megatron, and Microsoft T-NLG up to GPT-3 and Megatron-Turing NLG 530B.]
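As a rough check on what the chart's trend lines imply, here is a minimal sketch (my own arithmetic, not part of the slide) that converts the stated two-year growth factors into approximate per-year growth rates, assuming smooth exponential growth:

```python
# Convert the slide's "per 2 years" growth factors into approximate
# per-year growth rates (assumes smooth exponential growth).
transformer_2yr = 275.0   # Transformer AI models: 275x every 2 years (from the slide)
other_2yr = 8.0           # All AI models excluding Transformers: 8x every 2 years (from the slide)

transformer_per_year = transformer_2yr ** 0.5   # ~16.6x per year
other_per_year = other_2yr ** 0.5               # ~2.8x per year

print(f"Transformer models: ~{transformer_per_year:.1f}x per year")
print(f"Other models:       ~{other_per_year:.1f}x per year")
```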
Fueling Giant-Scale AI Infrastructure
NVIDIA compute & networking: GPU | DPU | CPU
Large Language Models, based on the Transformer architecture, are one of
today's most important advanced AI technologies, involving up to trillions of
parameters that learn from text.
Developing them is an expensive, time-consuming process that demands deep
technical expertise, distributed data center-scale infrastructure, and a full-stack
accelerated computing approach.