Ryan Olson – NVIDIA Technical Blog
News and tutorials for developers, data scientists, and IT admins
Feed: http://www.open-lab.net/blog/feed/ (last updated 2025-03-20)

Introducing NVIDIA Dynamo, A Low-Latency Distributed Inference Framework for Scaling Reasoning AI Models
http://www.open-lab.net/blog/?p=95274 (published 2025-03-18)

NVIDIA announced the release of NVIDIA Dynamo today at GTC 2025. NVIDIA Dynamo is a high-throughput, low-latency, open-source inference serving framework for deploying generative AI and reasoning models in large-scale distributed environments. The framework boosts the number of requests served by up to 30x when running the open-source DeepSeek-R1 models on NVIDIA Blackwell.

Deploying Deep Neural Networks with NVIDIA TensorRT
http://www.open-lab.net/blog/parallelforall/?p=7674 (published 2017-04-03, updated 2022-08-21)
[Figure 1: NVIDIA TensorRT provides 16x higher energy efficiency for neural network inference with...]

Editor’s Note: An updated version of this post, with additional tutorial content, is now available. See “How to Speed Up Deep Learning Using TensorRT”. NVIDIA TensorRT is a high-performance deep learning inference library for production environments. Power efficiency and speed of response are two key metrics for deployed deep learning applications, because they directly affect the user experience…

NVIDIA Docker: GPU Server Application Deployment Made Easy
http://www.open-lab.net/blog/parallelforall/?p=6854 (published 2016-06-28, updated 2022-08-21)

Over the last few years, there has been a dramatic rise in the use of containers for deploying data center applications at scale. The reason is simple: containers encapsulate an application’s dependencies, providing reproducible and reliable execution of applications and services without the overhead of a full virtual machine. If you have ever spent a day provisioning a server with a…
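The excerpt above describes the container workflow that NVIDIA Docker was built for. A minimal sketch of that workflow with a current Docker CLI (the original post predates Docker 19.03 and used the nvidia-docker wrapper; the image tag and Dockerfile name here are hypothetical):

```shell
# Build a CUDA-enabled image from a hypothetical Dockerfile.
docker build -t cuda-app -f Dockerfile.cuda .

# Run it with GPU access. The post's nvidia-docker wrapper provided this;
# Docker 19.03+ exposes the same capability natively via --gpus.
docker run --rm --gpus all cuda-app nvidia-smi
```

The container carries its CUDA userspace libraries with it, so the same image runs reproducibly on any host with a compatible driver and the NVIDIA Container Toolkit installed.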

Production Deep Learning with NVIDIA GPU Inference Engine
http://www.open-lab.net/blog/parallelforall/?p=6823 (published 2016-06-20, updated 2022-08-21)
[Figure 1: NVIDIA GPU Inference Engine (GIE) provides even higher efficiency and performance for...]

[Update September 13, 2016: GPU Inference Engine is now TensorRT] Today at ICML 2016, NVIDIA announced its latest Deep Learning SDK updates, including DIGITS 4, cuDNN 5.1 (the CUDA Deep Neural Network library), and the new GPU Inference Engine. NVIDIA GPU Inference Engine (GIE) is a high-performance deep learning inference solution for production environments. Power efficiency and speed of response…
