
NVIDIA Investor Presentation Deck

Structural Sparsity Brings Additional Speedups

A100 Tensor Core: a sparse matrix gives 2X faster execution than a dense matrix.

Structured sparsity:
- Half the values are zero
- Skip half of the compute and memory fetches
- Compute at up to 2x the rate of non-sparse

BERT Large inference: 1x on A100, 1.5x on A100 with sparsity.

BERT Large Inference | Precision = INT8 with and without sparsity | Batch sizes - no sparsity: bs256, with sparsity: bs49, A100 with 7 MIGs
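
To make the "half the values are zero, skip half of the compute" idea concrete, here is a minimal sketch in plain Python. It is an illustration only, not NVIDIA's implementation. The slide does not say how the zeros are arranged; the sketch assumes a pattern that keeps the two largest-magnitude values in every group of four. The function names, the sample weights, and the compressed layout (kept values plus their column indices) are all hypothetical.

```python
GROUP = 4  # assumed group width (not stated on the slide)
KEEP = 2   # values kept per group, so half of all values become zero


def prune_and_compress(row):
    """Keep the KEEP largest-magnitude values in each group of GROUP.

    Returns the kept values and their column indices; everything else
    is treated as zero and is not stored.
    """
    if len(row) % GROUP != 0:
        raise ValueError("row length must be a multiple of GROUP")
    values, columns = [], []
    for start in range(0, len(row), GROUP):
        group = row[start:start + GROUP]
        by_magnitude = sorted(range(GROUP), key=lambda i: abs(group[i]), reverse=True)
        for i in sorted(by_magnitude[:KEEP]):  # restore left-to-right order
            values.append(group[i])
            columns.append(start + i)
    return values, columns


def decompress(values, columns, length):
    """Rebuild the dense, pruned row for comparison."""
    dense = [0.0] * length
    for v, c in zip(values, columns):
        dense[c] = v
    return dense


def dense_dot(row, x):
    """Dot product that touches every element: len(row) multiplies."""
    return sum(w * a for w, a in zip(row, x))


def sparse_dot(values, columns, x):
    """Dot product over stored values only: len(row) / 2 multiplies."""
    return sum(v * x[c] for v, c in zip(values, columns))


# Hypothetical weights and activations, chosen only for illustration.
row = [0.9, -0.1, 0.05, -0.7, 0.2, 0.3, -0.8, 0.01]
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]

values, columns = prune_and_compress(row)
pruned = decompress(values, columns, len(row))

print("stored values:", len(values), "of", len(row))
print("sparse result:", sparse_dot(values, columns, x))
print("dense result on pruned row:", dense_dot(pruned, x))
```

The sparse path does half as many multiplies and reads half as many weights as the dense path, and both give the same result on the pruned row. In pure Python this does not run any faster; the speedup on the slide comes from hardware that skips the zeros. Pruning also changes the weights, so the result differs from the unpruned dot product.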