NVIDIA Investor Presentation Deck

Slide: Triton Inference Server, open-source software for scalable, simplified inference serving. An AI application sends a query over standard HTTP/gRPC and receives the result from the server. Triton provides dynamic batching (real-time, batch, and streaming requests), per-model scheduler queues, and multiple GPU and CPU backends (TensorFlow, PyTorch, TensorRT, ONNX, and custom). Models are loaded flexibly (all or selective) from a model store, and utilization, throughput, and latency metrics are exported for integration with Kubernetes and Prometheus.
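
A minimal sketch of the client side of this flow, using the tritonclient Python package to send one inference request to a running Triton server over HTTP. The server address, the model name ("resnet50"), and the tensor names ("INPUT__0", "OUTPUT__0") are placeholder assumptions, not values from the slide; they would come from whatever model is actually deployed in the model store.

    # Sketch: query a Triton server over standard HTTP, assuming it serves a
    # hypothetical "resnet50" model with tensors named "INPUT__0" / "OUTPUT__0".
    import numpy as np
    import tritonclient.http as httpclient

    client = httpclient.InferenceServerClient(url="localhost:8000")

    # Build the request: one FP32 image-shaped tensor filled with dummy data.
    image = np.random.rand(1, 3, 224, 224).astype(np.float32)
    inputs = [httpclient.InferInput("INPUT__0", list(image.shape), "FP32")]
    inputs[0].set_data_from_numpy(image)
    outputs = [httpclient.InferRequestedOutput("OUTPUT__0")]

    # Send the query; Triton's per-model scheduler may dynamically batch this
    # request with others before dispatching it to the GPU or CPU backend.
    result = client.infer(model_name="resnet50", inputs=inputs, outputs=outputs)
    print(result.as_numpy("OUTPUT__0").shape)

Server-side concerns shown on the slide (dynamic batching parameters, backend selection, selective model loading) live in each model's configuration within the model store rather than in the client code.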