Elias Bermudez – NVIDIA Technical Blog News and tutorials for developers, data scientists, and IT admins 2025-04-03T18:44:20Z http://www.open-lab.net/blog/feed/ Elias Bermudez <![CDATA[LLM Benchmarking: Fundamental Concepts]]> http://www.open-lab.net/blog/?p=98215 2025-04-03T18:44:20Z 2025-04-02T17:00:00Z The past few years have witnessed the rise in popularity of generative AI and large language models (LLMs), as part of a broad AI revolution. As LLM-based...]]>

The past few years have witnessed the rise in popularity of generative AI and large language models (LLMs), as part of a broad AI revolution. As LLM-based applications are rolled out across enterprises, there is a need to determine the cost efficiency of different AI serving solutions. The cost of an LLM application deployment depends on how many queries it can process per second while being…

Elias Bermudez <![CDATA[Measuring Generative AI Model Performance Using NVIDIA GenAI-Perf and an OpenAI-Compatible API]]> http://www.open-lab.net/blog/?p=85839 2024-08-22T18:25:47Z 2024-08-01T15:00:00Z NVIDIA offers tools like Perf Analyzer and Model Analyzer to assist machine learning engineers with measuring and balancing the trade-off between latency and...]]>

NVIDIA offers tools like Perf Analyzer and Model Analyzer to assist machine learning engineers with measuring and balancing the trade-off between latency and throughput, crucial for optimizing ML inference performance. Model Analyzer has been embraced by leading organizations such as Snap to identify optimal configurations that enhance throughput and reduce deployment costs. However…
