Ashwin Nanjappa – NVIDIA Technical Blog News and tutorials for developers, data scientists, and IT admins 2024-11-14T15:53:12Z http://www.open-lab.net/blog/feed/ Ashwin Nanjappa <![CDATA[NVIDIA Blackwell Platform Sets New LLM Inference Records in MLPerf Inference v4.1]]> http://www.open-lab.net/blog/?p=87957 2024-09-05T17:57:17Z 2024-08-28T15:00:00Z Large language model (LLM) inference is a full-stack challenge. Powerful GPUs, high-bandwidth GPU-to-GPU interconnects, efficient acceleration libraries, and a...]]>

Large language model (LLM) inference is a full-stack challenge. Powerful GPUs, high-bandwidth GPU-to-GPU interconnects, efficient acceleration libraries, and a highly optimized inference engine are required for high-throughput, low-latency inference. MLPerf Inference v4.1 is the latest version of the popular and widely recognized MLPerf Inference benchmarks, developed by the MLCommons…
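The throughput/latency tension at the heart of LLM inference benchmarking can be illustrated with a toy batching model. This is a hedged sketch with a simulated engine call, not TensorRT-LLM or any real inference API; the function names, timings, and token counts are all illustrative assumptions:

```python
import time

def run_inference(batch, latency_per_token=0.001, tokens_per_request=32):
    """Stand-in for a batched inference engine call: one simulated
    forward pass generates tokens for every request in the batch."""
    time.sleep(latency_per_token * tokens_per_request)
    return [f"output-{i}" for i in range(len(batch))]

def measure(batch_size, num_requests=64):
    """Return (throughput in requests/sec, worst per-batch latency in sec)."""
    start = time.time()
    latencies = []
    for i in range(0, num_requests, batch_size):
        t0 = time.time()
        run_inference(list(range(batch_size)))
        latencies.append(time.time() - t0)
    elapsed = time.time() - start
    return num_requests / elapsed, max(latencies)

# Larger batches amortize each forward pass over more requests:
# aggregate throughput rises, but every request waits on the whole batch.
for bs in (1, 8, 32):
    thr, lat = measure(bs)
    print(f"batch={bs:2d}  throughput={thr:7.1f} req/s  batch latency={lat*1000:5.1f} ms")
```

This is why MLPerf Inference reports results under distinct scenarios (e.g., offline throughput vs. latency-bounded serving) rather than a single number: the same stack lands at very different points depending on how requests are batched.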

Source

]]>
Ashwin Nanjappa <![CDATA[NVIDIA H200 Tensor Core GPUs and NVIDIA TensorRT-LLM Set MLPerf LLM Inference Records]]> http://www.open-lab.net/blog/?p=80197 2024-11-14T15:53:12Z 2024-03-27T15:29:05Z Generative AI is unlocking new computing applications that greatly augment human capability, enabled by continued model innovation. Generative AI...]]>

Generative AI is unlocking new computing applications that greatly augment human capability, enabled by continued model innovation. Generative AI models—including large language models (LLMs)—are used for crafting marketing copy, writing computer code, rendering detailed images, composing music, generating videos, and more. The amount of compute required by the latest models is immense and…

Source

]]>
Ashwin Nanjappa <![CDATA[Leading MLPerf Inference v3.1 Results with NVIDIA GH200 Grace Hopper Superchip Debut]]> http://www.open-lab.net/blog/?p=70450 2023-09-22T16:17:33Z 2023-09-09T16:00:00Z AI is transforming computing, and inference is how the capabilities of AI are deployed in the world's applications. Intelligent chatbots, image and video...]]>

AI is transforming computing, and inference is how the capabilities of AI are deployed in the world’s applications. Intelligent chatbots, image and video synthesis from simple text prompts, personalized content recommendations, and medical imaging are just a few examples of AI-powered applications. Inference workloads are both computationally demanding and diverse, requiring that platforms be…

Source

]]>
Ashwin Nanjappa <![CDATA[New MLPerf Inference Network Division Showcases NVIDIA InfiniBand and GPUDirect RDMA Capabilities]]> http://www.open-lab.net/blog/?p=67021 2023-07-27T18:54:26Z 2023-07-06T16:00:00Z In MLPerf Inference v3.0, NVIDIA made its first submissions to the newly introduced Network division, which is now part of the MLPerf Inference Datacenter...]]>

In MLPerf Inference v3.0, NVIDIA made its first submissions to the newly introduced Network division, which is now part of the MLPerf Inference Datacenter suite. The Network division is designed to simulate a real data center setup and strives to include the effect of networking—including both hardware and software—in end-to-end inference performance. In the Network division…
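The frontend/backend split that the Network division measures can be sketched with a minimal loopback client and server. This is a hedged illustration only: plain TCP sockets stand in for the actual InfiniBand/GPUDirect RDMA fabric, and the message format, port handling, and timing constant are assumptions, not part of the MLPerf setup:

```python
import socket
import threading
import time

def backend(server_sock):
    """Accelerator-node stand-in: receive a query, return a 'prediction'."""
    conn, _ = server_sock.accept()
    with conn:
        data = conn.recv(1024)
        time.sleep(0.005)  # stand-in for on-device inference time
        conn.sendall(b"result:" + data)

# Frontend (load-generator) side: connect over the network and time
# the full round trip, so the measurement includes networking overhead.
server = socket.socket()
server.bind(("127.0.0.1", 0))  # OS-assigned port
server.listen(1)
port = server.getsockname()[1]
threading.Thread(target=backend, args=(server,), daemon=True).start()

client = socket.socket()
client.connect(("127.0.0.1", port))
t0 = time.time()
client.sendall(b"query-0")
reply = client.recv(1024)
rtt = time.time() - t0
client.close()
print(f"end-to-end latency (network + inference): {rtt*1000:.1f} ms, reply={reply!r}")
```

The point of the division is visible even in this toy: the timed span starts when the query leaves the frontend and ends when the result returns, so serialization, transport, and software-stack costs all count toward the reported performance.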

Source

]]>
Ashwin Nanjappa <![CDATA[Setting New Records in MLPerf Inference v3.0 with Full-Stack Optimizations for AI]]> http://www.open-lab.net/blog/?p=62958 2023-07-05T19:23:50Z 2023-04-05T19:10:55Z The most exciting computing applications currently rely on training and running inference on complex AI models, often in demanding, real-time deployment...]]>

The most exciting computing applications currently rely on training and running inference on complex AI models, often in demanding, real-time deployment scenarios. High-performance, accelerated AI platforms are needed to meet the demands of these applications and deliver the best user experiences. New AI models are constantly being invented to enable new capabilities…

Source

]]>
Ashwin Nanjappa <![CDATA[Full-Stack Innovation Fuels Highest MLPerf Inference 2.1 Results for NVIDIA]]> http://www.open-lab.net/blog/?p=54638 2023-07-05T19:26:31Z 2022-09-08T18:10:00Z Today's AI-powered applications are enabling richer experiences, fueled by both larger and more complex AI models as well as the application of many models in...]]>

Today’s AI-powered applications are enabling richer experiences, fueled by both larger and more complex AI models as well as the application of many models in a pipeline. To meet the increasing demands of AI-infused applications, an AI platform must not only deliver high performance but also be versatile enough to deliver that performance across a diverse range of AI models.

Source

]]>
Ashwin Nanjappa <![CDATA[Getting the Best Performance on MLPerf Inference 2.0]]> http://www.open-lab.net/blog/?p=46305 2023-07-05T19:28:16Z 2022-04-07T00:26:22Z Models like Megatron 530B are expanding the range of problems AI can address. However, as models continue to grow in complexity, they pose a twofold challenge for...]]>

Models like Megatron 530B are expanding the range of problems AI can address. However, as models continue to grow in complexity, they pose a twofold challenge for AI compute platforms: What's needed is a versatile AI platform that can deliver the needed performance on a wide variety of models for both training and inference. To evaluate that performance, MLPerf is the only industry…

Source

]]>