Jay Rodge – NVIDIA Technical Blog
News and tutorials for developers, data scientists, and IT admins
2025-03-18T18:23:55Z http://www.open-lab.net/blog/feed/
Jay Rodge <![CDATA[Creating RAG-Based Question-and-Answer LLM Workflows at NVIDIA]]> http://www.open-lab.net/blog/?p=90872 2024-11-11T20:00:23Z 2024-10-28T16:00:00Z The rapid development of solutions using retrieval-augmented generation (RAG) for question-and-answer LLM workflows has led to new types of system...]]>

The rapid development of solutions using retrieval-augmented generation (RAG) for question-and-answer LLM workflows has led to new types of system architectures. Our work at NVIDIA applying AI to internal operations has yielded several important findings about aligning system capabilities with user expectations. We found that regardless of the intended scope or use case…
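The core of any RAG-based question-and-answer flow is the same three steps: retrieve relevant documents, assemble an augmented prompt, and hand that prompt to an LLM. The sketch below illustrates the pattern with a deliberately naive keyword-overlap retriever and hypothetical sample documents; a production system would use an embedding model, a vector store, and an LLM endpoint instead.

```python
import re

def tokens(text: str) -> set[str]:
    """Lowercased alphanumeric tokens, so punctuation doesn't break matching."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def score(query: str, doc: str) -> int:
    """Naive relevance score: number of query tokens that appear in the doc."""
    return len(tokens(query) & tokens(doc))

def retrieve(query: str, docs: list[str], k: int = 1) -> list[str]:
    """Return the k documents with the highest keyword overlap."""
    return sorted(docs, key=lambda d: score(query, d), reverse=True)[:k]

def build_prompt(query: str, context: list[str]) -> str:
    """Assemble the augmented prompt that would be sent to an LLM."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {query}"

docs = [
    "TensorRT is an SDK for high-performance deep learning inference.",
    "cuDF is a GPU DataFrame library that mirrors the pandas API.",
]
prompt = build_prompt("What is TensorRT?", retrieve("What is TensorRT?", docs))
print(prompt)
```

Swapping the retriever or the prompt template changes system behavior without touching the rest of the pipeline, which is one reason RAG architectures decompose so cleanly.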

Source

]]>
Jay Rodge <![CDATA[Deploying Accelerated Llama 3.2 from the Edge to the Cloud]]> http://www.open-lab.net/blog/?p=89436 2024-11-07T05:08:12Z 2024-09-25T18:39:49Z Expanding the open-source Meta Llama collection of models, the Llama 3.2 collection includes vision language models (VLMs), small language models (SLMs), and an...]]>

Expanding the open-source Meta Llama collection of models, the Llama 3.2 collection includes vision language models (VLMs), small language models (SLMs), and an updated Llama Guard model with support for vision. When paired with the NVIDIA accelerated computing platform, Llama 3.2 offers developers, researchers, and enterprises valuable new capabilities and optimizations to realize their…
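Vision language models like those in the Llama 3.2 collection are typically served behind an OpenAI-compatible chat API that accepts mixed text-and-image content. The sketch below only constructs such a request payload; the model identifier and image URL are placeholders for illustration, so check your serving stack's documentation for the names it actually exposes before sending anything.

```python
import json

# Hypothetical model identifier for illustration only.
MODEL = "meta/llama-3.2-11b-vision-instruct"

def build_vlm_request(question: str, image_url: str) -> dict:
    """Return an OpenAI-style chat-completions payload pairing text with an image."""
    return {
        "model": MODEL,
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text", "text": question},
                {"type": "image_url", "image_url": {"url": image_url}},
            ],
        }],
        "max_tokens": 256,
    }

payload = build_vlm_request("What is shown in this image?",
                            "https://example.com/cat.png")
print(json.dumps(payload, indent=2))
```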

Source

]]>
Jay Rodge <![CDATA[Generative AI Agents Developer Contest: Top Tips for Getting Started]]> http://www.open-lab.net/blog/?p=82980 2024-10-18T20:21:31Z 2024-05-29T16:01:10Z Join our contest that runs through June 17 and showcase your innovation using cutting-edge generative AI-powered applications using NVIDIA and LangChain...]]>

Join our contest that runs through June 17 and showcase your innovation using cutting-edge generative AI-powered applications using NVIDIA and LangChain technologies. To get you started, we explore a few applications for inspiring your creative journey, while sharing tips and best practices to help you succeed in the development process. There are many different practical applications…

Source

]]>
Jay Rodge <![CDATA[RAPIDS cuDF Accelerates pandas Nearly 150x with Zero Code Changes]]> http://www.open-lab.net/blog/?p=72591 2024-05-15T15:55:04Z 2024-03-18T22:00:00Z At NVIDIA GTC 2024, it was announced that RAPIDS cuDF can now bring GPU acceleration to 9.5 million pandas users without requiring them to change their code....]]>

At NVIDIA GTC 2024, it was announced that RAPIDS cuDF can now bring GPU acceleration to 9.5 million pandas users without requiring them to change their code. Update: RAPIDS cuDF now instantly accelerates pandas with zero code changes in Google Colab. Try out the tutorial in a Colab notebook today. pandas, a flexible and powerful data analysis and manipulation library for Python…
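"Zero code changes" means the pandas code itself stays exactly as written; the accelerator is enabled outside the script, for example with `%load_ext cudf.pandas` in a notebook or `python -m cudf.pandas app.py` on the command line. The snippet below is ordinary CPU pandas with a made-up toy dataset; with the accelerator loaded, the same lines run on the GPU, falling back to pandas for anything cuDF does not cover.

```python
# Plain pandas code: nothing here is cuDF-specific.
import pandas as pd

df = pd.DataFrame({
    "vendor": ["A", "A", "B", "B", "B"],
    "fare": [10.0, 14.0, 7.5, 9.0, 12.5],
})
# With cudf.pandas enabled, this groupby runs on the GPU unchanged.
summary = df.groupby("vendor")["fare"].mean()
print(summary)
```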

Source

]]>
Jay Rodge <![CDATA[Accelerated Data Analytics: Machine Learning with GPU-Accelerated Pandas and Scikit-learn]]> http://www.open-lab.net/blog/?p=67937 2024-05-15T16:11:39Z 2023-07-11T20:00:00Z If you are looking to take your machine learning (ML) projects to new levels of speed and scalability, GPU-accelerated data analytics can help you deliver...]]>

If you are looking to take your machine learning (ML) projects to new levels of speed and scalability, GPU-accelerated data analytics can help you deliver insights quickly with breakthrough performance. From faster computation to efficient model training, GPUs bring many benefits to everyday ML tasks. Update: The below blog describes how to use GPU-only RAPIDS cuDF…
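The workflow the post describes, wrangle data in a DataFrame and then fit an estimator, is what makes the GPU port cheap: RAPIDS offers near drop-in equivalents (cuDF for pandas, cuML for scikit-learn), so the CPU baseline below, built on a tiny invented dataset, carries over with little more than changed imports.

```python
# CPU baseline with pandas + scikit-learn. RAPIDS cuDF/cuML expose largely
# matching APIs, so the same shape of code can run GPU-accelerated.
import pandas as pd
from sklearn.linear_model import LinearRegression

df = pd.DataFrame({
    "x1": [0.0, 1.0, 2.0, 3.0],
    "x2": [1.0, 0.0, 1.0, 0.0],
    "y":  [1.0, 3.0, 5.0, 7.0],   # exactly y = 1 + 2*x1
})
model = LinearRegression().fit(df[["x1", "x2"]], df["y"])
print(model.intercept_, model.coef_)
```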

Source

]]>
Jay Rodge <![CDATA[Optimizing and Serving Models with NVIDIA TensorRT and NVIDIA Triton]]> http://www.open-lab.net/blog/?p=50553 2025-03-18T18:23:55Z 2022-07-20T16:00:00Z Imagine that you have trained your model with PyTorch, TensorFlow, or the framework of your choice, are satisfied with its accuracy, and are considering...]]>

Join the NVIDIA Triton and NVIDIA TensorRT community to stay current on the latest product updates, bug fixes, content, best practices, and more. As of March 18, 2025, NVIDIA Triton Inference Server is part of the NVIDIA Dynamo Platform and has accordingly been renamed NVIDIA Dynamo Triton. Imagine that you have trained your model with PyTorch, TensorFlow, or the framework of…

Source

]]>
Jay Rodge <![CDATA[NVIDIA Announces TensorRT 8.2 and Integrations with PyTorch and TensorFlow]]> http://www.open-lab.net/blog/?p=41607 2022-11-14T22:22:08Z 2021-12-02T17:00:00Z Today NVIDIA released TensorRT 8.2, with optimizations for billion parameter NLU models. These include T5 and GPT-2, used for translation and text generation,...]]>

Join the NVIDIA Triton and NVIDIA TensorRT community to stay current on the latest product updates, bug fixes, content, best practices, and more. Today NVIDIA released TensorRT 8.2, with optimizations for billion parameter NLU models. These include T5 and GPT-2, used for translation and text generation, making it possible to run NLU apps in real time. TensorRT is a high-performance…

Source

]]>
Jay Rodge <![CDATA[Optimizing T5 and GPT-2 for Real-Time Inference with NVIDIA TensorRT]]> http://www.open-lab.net/blog/?p=41964 2023-06-12T21:06:31Z 2021-12-02T17:00:00Z The transformer architecture has wholly transformed (pun intended) the domain of natural language processing (NLP). Over the recent years, many novel network...]]>

Join the NVIDIA Triton and NVIDIA TensorRT community to stay current on the latest product updates, bug fixes, content, best practices, and more. The transformer architecture has wholly transformed (pun intended) the domain of natural language processing (NLP). Over the recent years, many novel network architectures have been built on the transformer building blocks: BERT, GPT, and T5…

Source

]]>
Jay Rodge <![CDATA[ICYMI: New AI Tools and Technologies Announced at NVIDIA GTC Keynote]]> http://www.open-lab.net/blog/?p=39300 2023-03-22T01:16:48Z 2021-11-09T19:08:00Z At NVIDIA GTC this November, new software tools were announced that help developers build real-time speech applications, optimize inference for a variety of...]]>

At NVIDIA GTC this November, new software tools were announced that help developers build real-time speech applications, optimize inference for a variety of use cases, optimize open-source interoperability for recommender systems, and more. Watch the keynote from CEO Jensen Huang to learn about the latest NVIDIA breakthroughs. Today, NVIDIA unveiled a new version of NVIDIA Riva with a…

Source

]]>
Jay Rodge <![CDATA[NVIDIA GTC: Can't-Miss Sessions in AI and Deep Learning this November]]> http://www.open-lab.net/blog/?p=38083 2022-08-21T23:52:45Z 2021-10-05T15:00:00Z Join NVIDIA November 8-11, showcasing over 500 GTC sessions covering the latest breakthroughs in AI and deep learning, as well as many other GPU technology...]]>

Join NVIDIA November 8-11, showcasing over 500 GTC sessions covering the latest breakthroughs in AI and deep learning, as well as many other GPU technology interest areas. Below is a preview of some of the top AI and deep learning sessions, including topics such as training, inference, frameworks, and tools, featuring speakers from NVIDIA. Deep Learning Demystified: AI has evolved and…

Source

]]>
Jay Rodge <![CDATA[NVIDIA Announces TensorRT 8 Slashing BERT-Large Inference Down to 1 Millisecond]]> http://www.open-lab.net/blog/?p=34937 2024-10-28T19:28:53Z 2021-07-20T14:28:27Z Today, NVIDIA announced TensorRT 8.0 which brings BERT-Large inference latency down to 1.2 ms with new optimizations. This version also delivers 2x the accuracy...]]>

Join the NVIDIA Triton and NVIDIA TensorRT community to stay current on the latest product updates, bug fixes, content, best practices, and more. Today, NVIDIA announced TensorRT 8.0 which brings BERT-Large inference latency down to 1.2 ms with new optimizations. This version also delivers 2x the accuracy for INT8 precision with Quantization Aware Training…

Source

]]>
Jay Rodge <![CDATA[Achieving FP32 Accuracy for INT8 Inference Using Quantization Aware Training with NVIDIA TensorRT]]> http://www.open-lab.net/blog/?p=34216 2023-06-12T21:09:34Z 2021-07-20T13:00:00Z Deep learning is revolutionizing the way that industries are delivering products and services. These services include object detection, classification, and...]]>

Join the NVIDIA Triton and NVIDIA TensorRT community to stay current on the latest product updates, bug fixes, content, best practices, and more. Deep learning is revolutionizing the way that industries are delivering products and services. These services include object detection, classification, and segmentation for computer vision, and text extraction, classification…
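The arithmetic at the heart of INT8 inference is simple: map floats to 8-bit integers with a scale factor, then map back. Quantization-aware training works by simulating exactly this round-trip during fine-tuning so the network learns weights that survive it. The sketch below shows symmetric INT8 quantization with a max-calibration scale on made-up weights; it is a toy illustration of the mechanism, not TensorRT's implementation.

```python
import numpy as np

def quantize(x: np.ndarray, scale: float) -> np.ndarray:
    """Map float values to int8 using a single symmetric scale."""
    return np.clip(np.round(x / scale), -127, 127).astype(np.int8)

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float values from int8."""
    return q.astype(np.float32) * scale

weights = np.array([-1.0, -0.5, 0.0, 0.25, 1.0], dtype=np.float32)
scale = float(np.abs(weights).max()) / 127  # max calibration
restored = dequantize(quantize(weights, scale), scale)
max_err = float(np.abs(weights - restored).max())
print(restored, max_err)
```

The round-trip error is bounded by half a quantization step, which is why choosing a good scale (and training the network to tolerate it) matters so much for preserving accuracy.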

Source

]]>
Jay Rodge <![CDATA[Accelerating Inference with Sparsity Using the NVIDIA Ampere Architecture and NVIDIA TensorRT]]> http://www.open-lab.net/blog/?p=34218 2023-06-12T21:09:10Z 2021-07-20T13:00:00Z This post was updated July 20, 2021 to reflect NVIDIA TensorRT 8.0 updates. When deploying a neural network, it's useful to think about how the network could be...]]>

This post was updated July 20, 2021 to reflect NVIDIA TensorRT 8.0 updates. Join the NVIDIA Triton and NVIDIA TensorRT community to stay current on the latest product updates, bug fixes, content, best practices, and more. When deploying a neural network, it’s useful to think about how the network could be made to run faster or take less space. A more efficient network can make better…
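The sparsity pattern the NVIDIA Ampere architecture accelerates is 2:4 fine-grained structured sparsity: in every contiguous group of four weights, two are zero, which Sparse Tensor Cores exploit for faster math at the same memory footprint per stored value. A common pruning heuristic, sketched below on an invented weight vector, zeroes the two smallest-magnitude entries in each group; this illustrates the pattern only, not NVIDIA's pruning tooling.

```python
import numpy as np

def prune_2_of_4(weights: np.ndarray) -> np.ndarray:
    """Zero the two smallest-magnitude entries in each group of four."""
    w = weights.reshape(-1, 4).copy()
    # Per row, indices of the two entries with the smallest |value|.
    drop = np.argsort(np.abs(w), axis=1)[:, :2]
    np.put_along_axis(w, drop, 0.0, axis=1)
    return w.reshape(weights.shape)

w = np.array([0.9, -0.1, 0.05, -0.7, 0.2, 0.3, -0.25, 0.4])
sparse = prune_2_of_4(w)
print(sparse)  # every group of four keeps exactly two nonzeros
```

After pruning, the network is typically fine-tuned so the surviving weights compensate for the removed ones.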

Source

]]>