Data centers are an essential part of the modern enterprise, but they come with a hefty energy cost. To complicate matters, energy costs are rising and the need for data centers continues to expand, with a market size projected to grow 25% from 2023 to 2030. Globally, energy costs are already negatively affecting data centers and high-performance computing (HPC) systems. To alleviate the energy…
This post is part of a series about optimizing end-to-end AI. The performance of AI models is heavily influenced by the precision of the computational resources being used. Lower precision can lead to faster processing speeds and reduced memory usage, while higher precision can contribute to more accurate results. Finding the right balance between precision and performance is crucial for…
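As a minimal, assumed sketch of that trade-off (the post's own examples may differ), casting an FP32 ONNX model to FP16 with the onnxconverter-common package roughly halves weight storage at the cost of reduced numeric range; the file names here are placeholders:

```python
# Assumed sketch: cast an FP32 ONNX graph to FP16 to trade precision
# for speed and memory. Model file names are placeholders.
import onnx
from onnxconverter_common import float16

model = onnx.load("model_fp32.onnx")
model_fp16 = float16.convert_float_to_float16(model)
onnx.save(model_fp16, "model_fp16.onnx")
```

Accuracy should be re-validated after such a cast, as some operators are sensitive to the reduced dynamic range.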
This post is part of a series about optimizing end-to-end AI. While NVIDIA hardware can process the individual operations that constitute a neural network incredibly fast, it is important to ensure that you are using the tools correctly. Using tools such as ONNX Runtime or TensorRT out of the box with ONNX usually gives you good performance, but why settle for good performance…
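For instance (an illustrative sketch, not the post's code), ONNX Runtime exposes knobs worth checking before drawing performance conclusions, such as the graph optimization level and the built-in profiler:

```python
# Illustrative sketch: confirm ONNX Runtime's full graph optimizations
# are enabled and profile where time actually goes.
import onnxruntime as ort

opts = ort.SessionOptions()
opts.graph_optimization_level = ort.GraphOptimizationLevel.ORT_ENABLE_ALL
opts.enable_profiling = True  # emits a JSON trace of per-op timings

sess = ort.InferenceSession("model.onnx", sess_options=opts)
```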
This post is the fifth in a series about optimizing end-to-end AI. NVIDIA TensorRT is a solution for speed-of-light inference deployment on NVIDIA hardware. Provided with an AI model architecture, TensorRT can be used pre-deployment to run an exhaustive search for the most efficient execution strategy. TensorRT optimizations include reordering operations in a graph…
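As a hedged illustration of that pre-deployment step (using the TensorRT 8.x-style Python API; the post may use trtexec or different builder settings), building a serialized engine from an ONNX file looks roughly like this:

```python
# Assumed sketch: build a TensorRT engine from an ONNX model.
# TensorRT searches over kernel tactics to pick the fastest plan.
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
flags = 1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
network = builder.create_network(flags)

parser = trt.OnnxParser(network, logger)
with open("model.onnx", "rb") as f:
    assert parser.parse(f.read()), parser.get_error(0)

config = builder.create_builder_config()
config.set_flag(trt.BuilderFlag.FP16)  # allow FP16 tactics in the search

engine = builder.build_serialized_network(network, config)
with open("model.plan", "wb") as f:
    f.write(engine)
```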
This post is the fourth in a series about optimizing end-to-end AI. As explained in the previous post in the End-to-End AI for NVIDIA-Based PCs series, there are multiple execution providers (EPs) in ONNX Runtime that enable the use of hardware-specific features or optimizations for a given deployment scenario. This post covers the CUDA EP and TensorRT EP using the highly optimized NVIDIA…
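In code, choosing an EP is a one-line decision at session creation. A minimal sketch (provider availability depends on how your ONNX Runtime build was compiled):

```python
# Sketch: request EPs in priority order; ONNX Runtime falls back
# left to right if a provider is unavailable in this build.
import onnxruntime as ort

providers = [
    "TensorrtExecutionProvider",  # TensorRT-backed kernels
    "CUDAExecutionProvider",      # cuDNN/cuBLAS-backed kernels
    "CPUExecutionProvider",       # always-available fallback
]
sess = ort.InferenceSession("model.onnx", providers=providers)
print(sess.get_providers())  # the EPs actually in use
```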
This post is the third in a series about optimizing end-to-end AI. When your model has been converted to the ONNX format, there are several ways to deploy it, each with advantages and drawbacks. One method is to use ONNX Runtime. ONNX Runtime serves as the backend, reading a model from an intermediate representation (ONNX), handling the inference session, and scheduling execution on an…
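A minimal, assumed sketch of that deployment path (the model path and input shape are placeholders):

```python
# Sketch: ONNX Runtime reads the ONNX file, creates an inference
# session, and schedules execution on the chosen provider.
import numpy as np
import onnxruntime as ort

sess = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])
name = sess.get_inputs()[0].name
x = np.random.rand(1, 3, 224, 224).astype(np.float32)  # placeholder input
outputs = sess.run(None, {name: x})
```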
This post is the second in a series about optimizing end-to-end AI. In this post, I discuss how to use ONNX to transition your AI models from research to production while avoiding common mistakes. Considering that PyTorch has become the most popular machine learning framework, all my examples use it, but I also supply references to TensorFlow tutorials. ONNX (Open Neural Network Exchange)…
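As an assumed minimal example of that research-to-production step (the post's own models and shapes will differ), exporting a PyTorch model with torch.onnx.export:

```python
# Sketch: export a PyTorch model to ONNX with named, batch-dynamic I/O.
import torch
import torchvision

model = torchvision.models.resnet18(weights=None).eval()
dummy = torch.randn(1, 3, 224, 224)  # example input used to trace the graph

torch.onnx.export(
    model, dummy, "resnet18.onnx",
    input_names=["input"], output_names=["output"],
    dynamic_axes={"input": {0: "batch"}, "output": {0: "batch"}},
)
```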
This post is the first in a series about optimizing end-to-end AI. The great thing about the GPU is that it offers tremendous parallelism; it allows you to perform many tasks at the same time. At its most granular level, this comes down to the fact that there are thousands of tiny processing cores that run the same instruction at the same time. But that is not where such parallelism stops.
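One way to see parallelism beyond a single kernel (an illustrative sketch, not the post's example) is overlapping independent work on separate CUDA streams; this requires a CUDA-capable device:

```python
# Illustrative sketch: independent work submitted to different CUDA
# streams may execute concurrently on the GPU.
import torch

a = torch.randn(4096, 4096, device="cuda")
b = torch.randn(4096, 4096, device="cuda")

s1, s2 = torch.cuda.Stream(), torch.cuda.Stream()
with torch.cuda.stream(s1):
    c = a @ a  # enqueued on stream 1
with torch.cuda.stream(s2):
    d = b @ b  # enqueued on stream 2; may overlap with stream 1
torch.cuda.synchronize()  # wait for both streams to finish
```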