Large language models (LLMs) have shown remarkable generalization capabilities in natural language processing (NLP). They are used in a wide range of applications, including translation, digital assistants, recommendation systems, context analysis, code generation, cybersecurity, and more. In automotive applications, there is growing demand for LLM-based solutions for both autonomous driving and…
The past decade has seen a remarkable surge in the adoption of deep learning techniques for computer vision (CV) tasks. Convolutional neural networks (CNNs) have been the cornerstone of this revolution, exhibiting exceptional performance and enabling significant advancements in visual perception. By employing localized filters and hierarchical architectures, CNNs have proven adept at…
The training stage of deep learning (DL) models consists of learning numerous dense floating-point weight matrices, which results in a massive amount of floating-point computations during inference. Research has shown that many of those computations can be skipped by forcing some weights to be zero, with little impact on the final accuracy. In parallel, previous posts have shown that…
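The truncated teaser does not show the post's specific sparsification scheme, but the idea it describes can be sketched with plain magnitude pruning in PyTorch. This is an illustrative assumption, not the post's method; the layer sizes and the 50% sparsity target are also assumed.

```python
# Minimal sketch: magnitude-based weight pruning in PyTorch (illustrative only;
# layer sizes and the 50% sparsity target are assumptions, not from the post).
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

model = nn.Sequential(
    nn.Linear(512, 256),
    nn.ReLU(),
    nn.Linear(256, 10),
)

# Force the smallest-magnitude 50% of weights in each Linear layer to zero.
for module in model.modules():
    if isinstance(module, nn.Linear):
        prune.l1_unstructured(module, name="weight", amount=0.5)
        prune.remove(module, "weight")  # make the zeros permanent

# Zeroed weights contribute nothing to the matmul, so a sparsity-aware runtime
# can skip those multiply-accumulates at inference time.
total = sum(p.numel() for p in model.parameters() if p.dim() > 1)
zeros = sum((p == 0).sum().item() for p in model.parameters() if p.dim() > 1)
print(f"sparsity: {zeros / total:.2%}")
```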
We’re excited to announce the NVIDIA Quantization-Aware Training (QAT) Toolkit for TensorFlow 2, with the goal of accelerating quantized networks with NVIDIA TensorRT on NVIDIA GPUs. This toolkit provides you with an easy-to-use API to quantize…
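The teaser cuts off before showing the toolkit's API, so as a rough, hedged illustration of what quantization-aware training looks like in TensorFlow 2, here is the generic tensorflow_model_optimization flow. This is not the NVIDIA QAT Toolkit's own interface, and the toy model is an assumption.

```python
# Generic quantization-aware training flow in TensorFlow 2 using
# tensorflow_model_optimization. NOT the NVIDIA QAT Toolkit API; it only
# illustrates the QAT idea: insert fake-quantization ops, then fine-tune.
import tensorflow as tf
import tensorflow_model_optimization as tfmot

base_model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(16, 3, activation="relu", input_shape=(28, 28, 1)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(10),
])

# Wrap the model so weights and activations are fake-quantized during training.
qat_model = tfmot.quantization.keras.quantize_model(base_model)

qat_model.compile(
    optimizer="adam",
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=["accuracy"],
)
# qat_model.fit(train_images, train_labels, epochs=1)  # fine-tune as usual
```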
This post was updated July 20, 2021 to reflect NVIDIA TensorRT 8.0 updates. NVIDIA TensorRT is an SDK for deep learning inference. TensorRT provides APIs and parsers to import trained models from all major deep learning frameworks. It then generates optimized runtime engines deployable in the datacenter as well as in automotive and embedded environments. This post provides a simple…
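As a minimal sketch of the import-and-build flow described here, the TensorRT Python API can parse an ONNX file and serialize an optimized engine. The model.onnx and model.engine file names below are assumptions for illustration, not paths from the post.

```python
# Minimal sketch: build a TensorRT engine from an ONNX model (file names assumed).
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
)
parser = trt.OnnxParser(network, logger)

# Import the trained model through the ONNX parser.
with open("model.onnx", "rb") as f:
    if not parser.parse(f.read()):
        for i in range(parser.num_errors):
            print(parser.get_error(i))
        raise RuntimeError("ONNX parse failed")

# Build and serialize an optimized runtime engine for deployment.
config = builder.create_builder_config()
serialized_engine = builder.build_serialized_network(network, config)

with open("model.engine", "wb") as f:
    f.write(serialized_engine)
```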
After the first successes of deep learning, designing neural network architectures with desirable performance criteria for a given task (for example, high accuracy or low latency) has been a challenging problem. Some call it alchemy and some intuition, but the task of discovering a novel architecture often involves a tedious and costly trial-and-error process of searching in an exponentially large…
TensorRT is an SDK for high-performance deep learning inference. It includes a deep learning inference optimizer and a runtime that delivers low latency and high throughput for deep learning applications. TensorRT uses the ONNX format as an intermediate representation for converting models from major frameworks such as TensorFlow and PyTorch. In this post, you learn how to convert PyTorch…
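A common way to produce that ONNX intermediate representation from PyTorch is torch.onnx.export. The pretrained ResNet-50, input shape, opset, and output file name below are illustrative assumptions, not necessarily the model used in the post.

```python
# Export a PyTorch model to ONNX so TensorRT's ONNX parser can import it.
# Model choice, file name, input shape, and opset are assumptions for illustration.
import torch
import torchvision

model = torchvision.models.resnet50(pretrained=True).eval()
dummy_input = torch.randn(1, 3, 224, 224)

torch.onnx.export(
    model,
    dummy_input,
    "resnet50.onnx",
    input_names=["input"],
    output_names=["output"],
    opset_version=13,
    dynamic_axes={"input": {0: "batch"}, "output": {0: "batch"}},
)
```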
Autonomous vehicles require fast and accurate perception of the surrounding environment in order to accomplish a wide set of tasks concurrently in real time. Systems need to handle obstacle detection, lane boundary determination, intersection detection, and sign recognition, among many other functions, across a wide variety of environments, conditions, and situations, and do this work…