Posts by Zhihan Jiang
Data Center / Cloud
Apr 02, 2025
NVIDIA Blackwell Delivers Massive Performance Leaps in MLPerf Inference v5.0
The compute demands for large language model (LLM) inference are growing rapidly, fueled by the combination of growing model sizes, real-time latency...
9 MIN READ
Data Center / Cloud
Aug 28, 2024
NVIDIA Blackwell Platform Sets New LLM Inference Records in MLPerf Inference v4.1
Large language model (LLM) inference is a full-stack challenge. Powerful GPUs, high-bandwidth GPU-to-GPU interconnects, efficient acceleration libraries, and a...
13 MIN READ
Generative AI
Mar 27, 2024
NVIDIA H200 Tensor Core GPUs and NVIDIA TensorRT-LLM Set MLPerf LLM Inference Records
Generative AI is unlocking new computing applications that greatly augment human capability, enabled by continued model innovation. Generative AI...
11 MIN READ
Data Center / Cloud
Sep 09, 2023
Leading MLPerf Inference v3.1 Results with NVIDIA GH200 Grace Hopper Superchip Debut
AI is transforming computing, and inference is how the capabilities of AI are deployed in the world’s applications. Intelligent chatbots, image and video...
13 MIN READ
Data Center / Cloud
Apr 05, 2023
Setting New Records in MLPerf Inference v3.0 with Full-Stack Optimizations for AI
The most exciting computing applications currently rely on training and running inference on complex AI models, often in demanding, real-time deployment...
15 MIN READ
Simulation / Modeling / Design
Sep 08, 2022
Full-Stack Innovation Fuels Highest MLPerf Inference 2.1 Results for NVIDIA
Today’s AI-powered applications are enabling richer experiences, fueled by both larger and more complex AI models as well as the application of many models in...
14 MIN READ