Jinho Suh – NVIDIA Technical Blog News and tutorials for developers, data scientists, and IT admins 2024-11-14T15:53:12Z http://www.open-lab.net/blog/feed/ Jinho Suh <![CDATA[NVIDIA H200 Tensor Core GPUs and NVIDIA TensorRT-LLM Set MLPerf LLM Inference Records]]> http://www.open-lab.net/blog/?p=80197 2024-11-14T15:53:12Z 2024-03-27T15:29:05Z Generative AI is unlocking new computing applications that greatly augment human capability, enabled by continued model innovation. Generative AI...]]>

Generative AI is unlocking new computing applications that greatly augment human capability, enabled by continued model innovation. Generative AI models—including large language models (LLMs)—are used for crafting marketing copy, writing computer code, rendering detailed images, composing music, generating videos, and more. The amount of compute required by the latest models is immense and…

Source

]]>
Jinho Suh <![CDATA[Leading MLPerf Inference v3.1 Results with NVIDIA GH200 Grace Hopper Superchip Debut]]> http://www.open-lab.net/blog/?p=70450 2023-09-22T16:17:33Z 2023-09-09T16:00:00Z AI is transforming computing, and inference is how the capabilities of AI are deployed in the world's applications. Intelligent chatbots, image and video...]]>

AI is transforming computing, and inference is how the capabilities of AI are deployed in the world’s applications. Intelligent chatbots, image and video synthesis from simple text prompts, personalized content recommendations, and medical imaging are just a few examples of AI-powered applications. Inference workloads are both computationally demanding and diverse, requiring that platforms be…

Source

]]>
Jinho Suh <![CDATA[New MLPerf Inference Network Division Showcases NVIDIA InfiniBand and GPUDirect RDMA Capabilities]]> http://www.open-lab.net/blog/?p=67021 2023-07-27T18:54:26Z 2023-07-06T16:00:00Z In MLPerf Inference v3.0, NVIDIA made its first submissions to the newly introduced Network division, which is now part of the MLPerf Inference Datacenter...]]>

In MLPerf Inference v3.0, NVIDIA made its first submissions to the newly introduced Network division, which is now part of the MLPerf Inference Datacenter suite. The Network division is designed to simulate a real data center setup and strives to include the effect of networking—including both hardware and software—in end-to-end inference performance. In the Network division…
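The key property of the Network division is that a query must traverse a real network hop between the load generator and the system under test (SUT), so network transfer time counts toward measured latency. As a minimal illustration only (this is not MLPerf or LoadGen code, and the function and variable names are invented for the sketch), the idea can be shown with a toy echo SUT and a client that times the full round trip:

```python
# Toy sketch of end-to-end inference timing over a network hop.
# Not MLPerf code: run_sut and measure_end_to_end are hypothetical names.
import socket
import threading
import time

def run_sut(server_sock):
    """Toy SUT: accept one connection and echo an 'inference result'."""
    conn, _ = server_sock.accept()
    with conn:
        query = conn.recv(1024)
        conn.sendall(b"result:" + query)

def measure_end_to_end(port):
    """Toy load generator: time the full round trip, so the network
    transfer is included in the measured latency (the Network
    division's defining difference from compute-only measurement)."""
    start = time.perf_counter()
    with socket.create_connection(("127.0.0.1", port)) as s:
        s.sendall(b"query-0")
        reply = s.recv(1024)
    latency = time.perf_counter() - start
    return reply, latency

server = socket.socket()
server.bind(("127.0.0.1", 0))  # ephemeral port on loopback
server.listen(1)
port = server.getsockname()[1]
threading.Thread(target=run_sut, args=(server,), daemon=True).start()
reply, latency = measure_end_to_end(port)
server.close()
print(reply, latency)
```

In the real benchmark the two sides run on separate machines over InfiniBand or Ethernet, so the measured latency also reflects the NIC, driver, and fabric rather than a loopback socket.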

Source

]]>
Jinho Suh <![CDATA[Setting New Records in MLPerf Inference v3.0 with Full-Stack Optimizations for AI]]> http://www.open-lab.net/blog/?p=62958 2023-07-05T19:23:50Z 2023-04-05T19:10:55Z The most exciting computing applications currently rely on training and running inference on complex AI models, often in demanding, real-time deployment...]]>

The most exciting computing applications currently rely on training and running inference on complex AI models, often in demanding, real-time deployment scenarios. High-performance, accelerated AI platforms are needed to meet the demands of these applications and deliver the best user experiences. New AI models are constantly being invented to enable new capabilities…

Source

]]>
Jinho Suh <![CDATA[Getting the Best Performance on MLPerf Inference 2.0]]> http://www.open-lab.net/blog/?p=46305 2023-07-05T19:28:16Z 2022-04-07T00:26:22Z Models like Megatron 530B are expanding the range of problems AI can address. However, as models continue to grow complexity, they pose a twofold challenge for...]]>

Models like Megatron 530B are expanding the range of problems AI can address. However, as models continue to grow in complexity, they pose a twofold challenge for AI compute platforms. What's needed is a versatile AI platform that can deliver the needed performance on a wide variety of models for both training and inference. To evaluate that performance, MLPerf is the only industry…

Source

]]>