Huizi Mao – NVIDIA Technical Blog News and tutorials for developers, data scientists, and IT admins 2025-03-24T18:58:36Z http://www.open-lab.net/blog/feed/ Huizi Mao <![CDATA[NVIDIA Blackwell Delivers World-Record DeepSeek-R1 Inference Performance]]> http://www.open-lab.net/blog/?p=97352 2025-03-24T18:58:36Z 2025-03-18T17:41:42Z NVIDIA announced world-record DeepSeek-R1 inference performance at NVIDIA GTC 2025. A single NVIDIA DGX system with eight NVIDIA Blackwell GPUs can achieve over...]]>

NVIDIA announced world-record DeepSeek-R1 inference performance at NVIDIA GTC 2025. A single NVIDIA DGX system with eight NVIDIA Blackwell GPUs can achieve over 250 tokens per second per user or a maximum throughput of over 30,000 tokens per second on the massive, state-of-the-art 671 billion parameter DeepSeek-R1 model. These rapid advancements in performance at both ends of the performance…

Source

]]>
1
Huizi Mao <![CDATA[Accelerate Generative AI Inference Performance with NVIDIA TensorRT Model Optimizer, Now Publicly Available]]> http://www.open-lab.net/blog/?p=81860 2024-06-13T22:22:46Z 2024-05-08T19:00:00Z In the fast-evolving landscape of generative AI, the demand for accelerated inference speed remains a pressing concern. With the exponential growth in model...]]>

In the fast-evolving landscape of generative AI, the demand for accelerated inference speed remains a pressing concern. With the exponential growth in model size and complexity, the need to swiftly produce results to serve numerous users simultaneously continues to grow. The NVIDIA platform stands at the forefront of this endeavor, delivering perpetual performance leaps through innovations across…

Source

]]>
3
���˳���97caoporen����