Modern AI is a Data Center Scale Computing Workload
Data centers are becoming AI factories: data as input, intelligence as output
[Chart: AI Training Computational Requirements — training compute (petaFLOPS, 10^2 to 10^10, log scale) vs. year, 2012-2022. All AI models excluding Transformers: 8X/2yrs; Transformer AI models: 275X/2yrs. Models plotted include AlexNet, VGG 19, ResNet, Seq2Seq, InceptionV3, Xception, DenseNet201, ResNeXt, ELMo, Transformer, GPT-1, BERT Large, GPT-2, XLNet, Megatron, MoCo ResNet50, Wav2Vec 2.0, Microsoft T-NLG, GPT-3, Megatron-Turing NLG 530B.]
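As a quick illustration of what the chart legend's rates imply, the sketch below compounds the quoted per-two-year factors over a few time spans. The 8X and 275X figures come from the slide; the compounding itself is plain exponential arithmetic, assumed here for illustration.

```python
# Illustrative arithmetic only: the chart legend quotes growth factors per two
# years (8X for non-Transformer models, 275X for Transformer models). This
# sketch compounds those quoted rates over a few spans; the rates come from
# the slide, the math is an assumed simple exponential model.

def compound_growth(factor_per_2yrs: float, years: float) -> float:
    """Total growth multiple after `years`, at a fixed factor every 2 years."""
    return factor_per_2yrs ** (years / 2.0)

for label, rate in [("All AI models excl. Transformers", 8.0),
                    ("Transformer AI models", 275.0)]:
    for span in (2, 4, 6):
        print(f"{label}: {rate:g}X/2yrs over {span} yrs -> "
              f"{compound_growth(rate, span):,.0f}X")
```

At 275X every two years, even a four-year span implies a roughly 75,000X increase in training compute, which is why the Transformer trend dominates the chart's log-scale axis.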
Fueling Giant-Scale AI Infrastructure
NVIDIA compute & networking: GPU | DPU | CPU
Large Language Models, based on the Transformer architecture, are among today's most important advanced AI technologies, with up to trillions of parameters learned from text.
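For a rough sense of where "trillions of parameters" comes from: a decoder-only Transformer's size is dominated by its attention and feed-forward weight matrices, giving the common back-of-envelope estimate of about 12 × layers × d_model². This approximation and the example configurations below are illustrative assumptions, not figures from the slide.

```python
# Back-of-envelope Transformer sizing (a common approximation, not an NVIDIA
# figure): each decoder block holds ~4*d^2 attention weights (Q, K, V, output)
# and ~8*d^2 feed-forward weights (two d x 4d matrices), so
#   total params ~= 12 * n_layers * d_model^2,
# ignoring embeddings and biases.

def approx_params(n_layers: int, d_model: int) -> float:
    return 12 * n_layers * d_model ** 2

# Hypothetical configurations assumed for illustration.
for name, layers, width in [("~175B-class", 96, 12288),
                            ("~1T-class (hypothetical)", 128, 25600)]:
    print(f"{name}: ~{approx_params(layers, width) / 1e9:,.0f}B parameters")
```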
Developing them is an expensive, time-consuming process that demands deep
technical expertise, distributed data center-scale infrastructure, and a full-stack
accelerated computing approach.
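The slide names no specific software stack, so as one assumed example of distributed, data-parallel training at multi-GPU scale, here is a minimal PyTorch DistributedDataParallel sketch. The model and data are placeholders; training models at trillion-parameter scale additionally layers tensor and pipeline parallelism (e.g., in frameworks such as Megatron-LM) on top of data parallelism.

```python
# Minimal data-parallel training sketch using PyTorch DDP (an assumed stack;
# the slide itself names no framework). Launch one process per GPU with:
#   torchrun --nproc_per_node=8 train.py
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def main():
    dist.init_process_group(backend="nccl")     # NCCL backend for GPUs
    local_rank = int(os.environ["LOCAL_RANK"])  # set by torchrun
    torch.cuda.set_device(local_rank)

    # Placeholder model and data standing in for a real Transformer and corpus.
    model = torch.nn.Linear(1024, 1024).cuda()
    model = DDP(model, device_ids=[local_rank])
    opt = torch.optim.AdamW(model.parameters(), lr=1e-4)

    for step in range(10):
        x = torch.randn(32, 1024, device="cuda")
        loss = model(x).pow(2).mean()
        opt.zero_grad()
        loss.backward()  # DDP all-reduces gradients across ranks here
        opt.step()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```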
NVIDIA