Generative AI – NVIDIA Technical Blog News and tutorials for developers, data scientists, and IT admins 2025-04-08T22:16:15Z http://www.open-lab.net/blog/feed/ Chris Alexiuk <![CDATA[Build Enterprise AI Agents with Advanced Open NVIDIA Llama Nemotron Reasoning Models]]> http://www.open-lab.net/blog/?p=97155 2025-04-08T22:16:15Z 2025-04-08T22:05:00Z This updated post was originally published on March 18, 2025. Organizations are embracing AI agents to enhance productivity and streamline operations. To...]]>

This updated post was originally published on March 18, 2025. Organizations are embracing AI agents to enhance productivity and streamline operations. To maximize their impact, these agents need strong reasoning abilities to navigate complex problems, uncover hidden connections, and make logical decisions autonomously in dynamic environments. Due to their ability to tackle complex…

Source

]]>
Vinay Raman <![CDATA[Evaluating and Enhancing RAG Pipeline Performance Using Synthetic Data?]]> http://www.open-lab.net/blog/?p=97927 2025-04-07T18:39:10Z 2025-04-07T18:39:06Z As large language models (LLM) gain popularity in various question-answering systems, retrieval-augmented generation (RAG) pipelines have also become a focal...]]>

As large language models (LLM) gain popularity in various question-answering systems, retrieval-augmented generation (RAG) pipelines have also become a focal point. RAG pipelines combine the generation power of LLMs with external data sources and retrieval mechanisms, enabling models to access domain-specific information that may not have existed during fine-tuning.

Source

]]>
Elias Wolfberg <![CDATA[Startups Use AI to Deliver Better Maternal and Newborn Care]]> http://www.open-lab.net/blog/?p=98486 2025-04-07T20:32:01Z 2025-04-07T17:55:39Z Nearly 300,000 women across the globe die each year due to complications arising from pregnancy or childbirth. The number of stillborns and babies that die...]]>

Nearly 300,000 women across the globe die each year due to complications arising from pregnancy or childbirth. The number of stillborns and babies that die within their first month tops nearly 4M every year. April 7 marks World Health Day, which this year focuses on raising awareness about efforts to end preventable maternal and newborn deaths. Giving women and infants better access to…

Source

]]>
Sama Bali <![CDATA[Event: HP & NVIDIA Developer Challenge]]> http://www.open-lab.net/blog/?p=98487 2025-04-07T23:14:53Z 2025-04-07T17:54:00Z Join the hackathon to build open-source AI solutions, optimize models, enhance workflows, connect with peers, and win prizes.]]>

Join the hackathon to build open-source AI solutions, optimize models, enhance workflows, connect with peers, and win prizes.

Source

]]>
Anu Srivastava <![CDATA[NVIDIA Accelerates Inference on Meta Llama 4 Scout and Maverick]]> http://www.open-lab.net/blog/?p=98468 2025-04-06T02:18:37Z 2025-04-06T02:18:34Z The newest generation of the popular Llama AI models is here with Llama 4 Scout and Llama 4 Maverick. Accelerated by NVIDIA open-source software, they can...]]>

The newest generation of the popular Llama AI models is here with Llama 4 Scout and Llama 4 Maverick. Accelerated by NVIDIA open-source software, they can achieve over 40K output tokens per second on NVIDIA Blackwell B200 GPUs, and are available to try as NVIDIA NIM microservices. The Llama 4 models are now natively multimodal and multilingual using a mixture-of-experts (MoE) architecture.

Source

]]>
Ashraf Eassa <![CDATA[NVIDIA Blackwell Delivers Massive Performance Leaps in MLPerf Inference v5.0]]> http://www.open-lab.net/blog/?p=98367 2025-04-03T18:48:39Z 2025-04-02T18:14:48Z The compute demands for large language model (LLM) inference are growing rapidly, fueled by the combination of growing model sizes, real-time latency...]]>

The compute demands for large language model (LLM) inference are growing rapidly, fueled by the combination of growing model sizes, real-time latency requirements, and, most recently, AI reasoning. At the same time, as AI adoption grows, the ability of an AI factory to serve as many users as possible, all while maintaining good per-user experiences, is key to maximizing the value it generates.

Source

]]>
Vinh Nguyen <![CDATA[LLM Benchmarking: Fundamental Concepts]]> http://www.open-lab.net/blog/?p=98215 2025-04-03T18:44:20Z 2025-04-02T17:00:00Z The past few years have witnessed the rise in popularity of generative AI and large language models (LLMs), as part of a broad AI revolution. As LLM-based...]]>

The past few years have witnessed the rise in popularity of generative AI and large language models (LLMs), as part of a broad AI revolution. As LLM-based applications are rolled out across enterprises, there is a need to determine the cost efficiency of different AI serving solutions. The cost of an LLM application deployment depends on how many queries it can process per second while being…

Source

]]>
Arun Raman <![CDATA[Deploying the NVIDIA AI Blueprint for Cost-Efficient LLM Routing]]> http://www.open-lab.net/blog/?p=98006 2025-04-03T18:45:52Z 2025-03-26T22:01:20Z Since the release of ChatGPT in November 2022, the capabilities of large language models (LLMs) have surged, and the number of available models has grown...]]>

Since the release of ChatGPT in November 2022, the capabilities of large language models (LLMs) have surged, and the number of available models has grown exponentially. With this expansion, LLMs now vary widely in cost, performance, and specialization. For example, straightforward tasks like text summarization can be efficiently handled by smaller, general-purpose models. In contrast…

Source

]]>
Cole Swain <![CDATA[Spotlight: Tomorrow.io?Transforms Global Weather Resilience with NVIDIA AI]]> http://www.open-lab.net/blog/?p=98023 2025-04-03T18:46:17Z 2025-03-26T21:19:34Z From hyperlocal forecasts that guide daily operations to planet-scale models illuminating new climate insights, the world is entering a new frontier in weather...]]>

From hyperlocal forecasts that guide daily operations to planet-scale models illuminating new climate insights, the world is entering a new frontier in weather and climate resilience. The combination of space-based observations and GPU-accelerated AI delivers near-instant, context-rich insights to enterprises, governments, researchers, and solution providers worldwide. It also marks a rare…

Source

]]>
Wen Jie Ong <![CDATA[Accelerating the Future of Transportation with SES AI��s NVIDIA-Powered Innovation for Electric Vehicles]]> http://www.open-lab.net/blog/?p=97805 2025-03-25T17:36:45Z 2025-03-25T16:00:00Z Electric vehicles (EVs) are transforming transportation, but challenges such as cost, longevity, and range remain barriers to widespread adoption. At the heart...]]>

Electric vehicles (EVs) are transforming transportation, but challenges such as cost, longevity, and range remain barriers to widespread adoption. At the heart of these challenges lies battery technology—specifically, the electrolyte, a critical component that enables energy storage and delivery. The electrolyte’s properties directly impact a battery’s charging speed, power output, stability…

Source

]]>
1
Annamalai Chockalingam <![CDATA[Kickstart Your AI Journey on RTX AI PCs and Workstations with NVIDIA NIM Microservices]]> http://www.open-lab.net/blog/?p=97991 2025-04-03T18:47:34Z 2025-03-25T13:00:00Z With emerging use cases such as digital humans, agents, podcasts, images, and video generation, generative AI is changing the way we interact with PCs. This...]]>

With emerging use cases such as digital humans, agents, podcasts, images, and video generation, generative AI is changing the way we interact with PCs. This paradigm shift calls for new ways of interfacing with and programming generative AI models. However, getting started can be daunting for PC developers and AI enthusiasts. Today, NVIDIA released a suite of NVIDIA NIM microservices on…

Source

]]>
Uttara Kumar <![CDATA[Boost Llama Model Performance on Microsoft Azure AI Foundry with NVIDIA TensorRT-LLM]]> http://www.open-lab.net/blog/?p=97008 2025-03-20T18:45:42Z 2025-03-20T15:00:00Z Microsoft, in collaboration with NVIDIA, announced transformative performance improvements for the Meta Llama family of models on its Azure AI Foundry platform....]]>

Microsoft, in collaboration with NVIDIA, announced transformative performance improvements for the Meta Llama family of models on its Azure AI Foundry platform. These advancements, enabled by NVIDIA TensorRT-LLM optimizations, deliver significant gains in throughput, reduced latency, and improved cost efficiency, all while preserving the quality of model outputs. With these improvements…

Source

]]>
Dave Salvator <![CDATA[NVIDIA Blackwell Ultra for the Era of AI Reasoning]]> http://www.open-lab.net/blog/?p=96761 2025-03-20T22:34:30Z 2025-03-19T18:00:15Z For years, advancements in AI have followed a clear trajectory through pretraining scaling: larger models, more data, and greater computational resources lead...]]>

For years, advancements in AI have followed a clear trajectory through pretraining scaling: larger models, more data, and greater computational resources lead to breakthrough capabilities. In the last 5 years, pretraining scaling has increased compute requirements at an incredible rate of 50M times. However, building more intelligent systems is no longer just about pretraining bigger models.

Source

]]>
Jonathan Ferrer Mestres <![CDATA[NVIDIA Earth-2 Powers Regional AI Weather Forecasting in the United Arab Emirates]]> http://www.open-lab.net/blog/?p=97074 2025-03-31T20:49:26Z 2025-03-19T16:01:00Z In the United Arab Emirates (UAE), extreme weather events disrupt daily life, delaying flights, endangering transportation, and complicating urban planning....]]>

In the United Arab Emirates (UAE), extreme weather events disrupt daily life, delaying flights, endangering transportation, and complicating urban planning. High daytime temperatures limit human activity outdoors, while dense nighttime fog is a frequent cause of severe and often fatal car crashes. Meanwhile, 2024 saw the heaviest precipitation event in the country in 75 years…

Source

]]>
Michael Zephyr <![CDATA[MONAI Integrates Advanced Agentic Architectures to Establish Multimodal Medical AI Ecosystem]]> http://www.open-lab.net/blog/?p=97638 2025-03-20T17:09:39Z 2025-03-19T16:00:00Z The growing volume and complexity of medical data��and the pressing need for early disease diagnosis and improved healthcare efficiency��are driving...]]>

The growing volume and complexity of medical data—and the pressing need for early disease diagnosis and improved healthcare efficiency—are driving unprecedented advancements in medical AI. Among the most transformative innovations in this field are multimodal AI models that simultaneously process text, images, and video. These models offer a more comprehensive understanding of patient data than…

Source

]]>
Kyle Tretina <![CDATA[Guiding Generative Molecular Design with Experimental Feedback Using Oracles]]> http://www.open-lab.net/blog/?p=96966 2025-03-25T17:23:57Z 2025-03-19T15:00:00Z Generative chemistry with AI has the potential to revolutionize how scientists approach drug discovery and development, health, and materials science and...]]>

Generative chemistry with AI has the potential to revolutionize how scientists approach drug discovery and development, health, and materials science and engineering. Instead of manually designing molecules with “chemical intuition” or screening millions of existing chemicals, researchers can train neural networks to propose novel molecular structures tailored to the desired properties.

Source

]]>
TJ Chen <![CDATA[Shrink Genomics and Single-Cell Analysis Time to Minutes with NVIDIA Parabricks and NVIDIA AI Blueprints]]> http://www.open-lab.net/blog/?p=96979 2025-03-20T18:33:12Z 2025-03-19T15:00:00Z NVIDIA Parabricks is a scalable genomics analysis software suite that solves omics challenges with accelerated computing and deep learning to unlock new...]]>

NVIDIA Parabricks is a scalable genomics analysis software suite that solves omics challenges with accelerated computing and deep learning to unlock new scientific breakthroughs. Released at NVIDIA GTC 2025, NVIDIA Parabricks v4.5 supports the growing quantity of data by including support for the latest NVIDIA GPU architectures, and improved alignment and variant calling with the…

Source

]]>
Hao Wang <![CDATA[Petabyte-Scale Video Processing with NVIDIA NeMo Curator on NVIDIA DGX Cloud]]> http://www.open-lab.net/blog/?p=97031 2025-03-20T17:07:03Z 2025-03-18T19:22:51Z With the rise of physical AI, video content generation has surged exponentially. A single camera-equipped autonomous vehicle can generate more than 1 TB of...]]>

With the rise of physical AI, video content generation has surged exponentially. A single camera-equipped autonomous vehicle can generate more than 1 TB of video daily, while a robotics-powered manufacturing facility may produce 1 PB of data daily. To leverage this data for training and fine-tuning world foundation models (WFMs), you must first process it efficiently.

Source

]]>
3
Ruchika Kharwar <![CDATA[NVIDIA NeMo Retriever Delivers Accurate Multimodal PDF Data Extraction 15x Faster]]> http://www.open-lab.net/blog/?p=97161 2025-03-21T19:15:59Z 2025-03-18T19:20:51Z Enterprises are generating and storing more multimodal data than ever before, yet traditional retrieval systems remain largely text-focused. While they can...]]>

Enterprises are generating and storing more multimodal data than ever before, yet traditional retrieval systems remain largely text-focused. While they can surface insights from written content, they aren’t extracting critical information embedded in tables, charts, and infographics—often the most information-dense elements of a document. Without a multimodal retrieval system…

Source

]]>
Christian Munley <![CDATA[Improve AI Code Generation Using NVIDIA AgentIQ Open-Source Toolkit]]> http://www.open-lab.net/blog/?p=96937 2025-03-21T20:20:35Z 2025-03-18T19:07:50Z With the release of NVIDIA AgentIQ��an open-source library for connecting and optimizing teams of AI agents��developers, professionals, and researchers can...]]>

With the release of NVIDIA AgentIQ—an open-source library for connecting and optimizing teams of AI agents—developers, professionals, and researchers can create their own agentic AI applications. This tutorial shows you how to develop apps in AgentIQ through an example of AI code generation. We build a test-driven coding agent using LangGraph and reasoning models to scale test-time computation.

Source

]]>
Sylendran Arunagiri <![CDATA[Maximize AI Agent Performance with Data Flywheels Using NVIDIA NeMo Microservices]]> http://www.open-lab.net/blog/?p=97046 2025-03-20T17:02:45Z 2025-03-18T19:05:30Z As agentic AI systems evolve and become essential for optimizing business processes, it is crucial for developers to update them regularly to stay aligned with...]]>

As agentic AI systems evolve and become essential for optimizing business processes, it is crucial for developers to update them regularly to stay aligned with ever-changing business and user needs. Continuously refining these agents with AI and human feedback ensures that they remain effective and relevant. NVIDIA NeMo microservices is a fully accelerated, enterprise-grade solution designed…

Source

]]>
Amr Elmeleegy <![CDATA[Introducing NVIDIA Dynamo, A Low-Latency Distributed Inference Framework for Scaling Reasoning AI Models]]> http://www.open-lab.net/blog/?p=95274 2025-03-24T20:52:54Z 2025-03-18T17:50:00Z NVIDIA announced the release of NVIDIA Dynamo today at GTC 2025. NVIDIA Dynamo is a high-throughput, low-latency open-source inference serving framework for...]]>

NVIDIA announced the release of NVIDIA Dynamo today at GTC 2025. NVIDIA Dynamo is a high-throughput, low-latency open-source inference serving framework for deploying generative AI and reasoning models in large-scale distributed environments. The framework boosts the number of requests served by up to 30x, when running the open-source DeepSeek-R1 models on NVIDIA Blackwell.

Source

]]>
Ashraf Eassa <![CDATA[NVIDIA Blackwell Delivers World-Record DeepSeek-R1 Inference Performance]]> http://www.open-lab.net/blog/?p=97352 2025-03-24T18:58:36Z 2025-03-18T17:41:42Z NVIDIA announced world-record DeepSeek-R1 inference performance at NVIDIA GTC 2025. A single NVIDIA DGX system with eight NVIDIA Blackwell GPUs can achieve over...]]>

NVIDIA announced world-record DeepSeek-R1 inference performance at NVIDIA GTC 2025. A single NVIDIA DGX system with eight NVIDIA Blackwell GPUs can achieve over 250 tokens per second per user or a maximum throughput of over 30,000 tokens per second on the massive, state-of-the-art 671 billion parameter DeepSeek-R1 model. These rapid advancements in performance at both ends of the performance…

Source

]]>
1
Pranjali Joshi <![CDATA[Scale Synthetic Data and Physical AI Reasoning with NVIDIA Cosmos World Foundation Models]]> http://www.open-lab.net/blog/?p=97132 2025-03-20T22:36:58Z 2025-03-18T16:00:47Z The next generation of AI-driven robots like humanoids and autonomous vehicles depends on high-fidelity, physics-aware training data. Without diverse and...]]>

The next generation of AI-driven robots like humanoids and autonomous vehicles depends on high-fidelity, physics-aware training data. Without diverse and representative datasets, these systems don’t get proper training and face testing risks due to poor generalization, limited exposure to real-world variations, and unpredictable behavior in edge cases. Collecting massive real-world datasets for…

Source

]]>
Allyson Vasquez <![CDATA[NVIDIA RTX Advances with Neural Rendering and Digital Human Technologies at GDC 2025]]> http://www.open-lab.net/blog/?p=97390 2025-03-20T17:01:46Z 2025-03-18T00:00:00Z AI is transforming how we experience our favorite games. It is unlocking new levels of visuals, performance, and gameplay possibilities with neural rendering...]]>

AI is transforming how we experience our favorite games. It is unlocking new levels of visuals, performance, and gameplay possibilities with neural rendering and generative AI-powered characters. With game development becoming more complex, AI is also playing a role in helping artists and engineers realize their creative visions. At GDC 2025, NVIDIA is building upon NVIDIA RTX Kit…

Source

]]>
3
Anu Srivastava <![CDATA[Lightweight, Multimodal, Multilingual Gemma 3 Models Are Streamlined for Performance]]> http://www.open-lab.net/blog/?p=96770 2025-03-12T16:20:01Z 2025-03-12T08:45:00Z Building AI systems with foundation models requires a delicate balancing of resources such as memory, latency, storage, compute, and more. One size does not fit...]]>

Building AI systems with foundation models requires a delicate balancing of resources such as memory, latency, storage, compute, and more. One size does not fit all for developers managing cost and user experience when bringing generative AI capability to the rapidly growing ecosystem of AI-powered applications. You need options for high-quality, customizable models that can support large…

Source

]]>
Shubham Agrawal <![CDATA[Build Real-Time Multimodal XR Apps with NVIDIA AI Blueprint for Video Search and Summarization]]> http://www.open-lab.net/blog/?p=96842 2025-03-12T22:08:59Z 2025-03-11T17:30:00Z With the recent advancements in generative AI and vision foundational models, VLMs present a new wave of visual computing wherein the models are capable of...]]>

With the recent advancements in generative AI and vision foundational models, VLMs present a new wave of visual computing wherein the models are capable of highly sophisticated perception and deep contextual understanding. These intelligent solutions offer a promising means of enhancing semantic comprehension in XR settings. By integrating VLMs, developers can significantly improve how XR…

Source

]]>
Chen Fu <![CDATA[Streamline LLM Deployment for Autonomous Vehicle Applications with NVIDIA DriveOS LLM SDK]]> http://www.open-lab.net/blog/?p=96776 2025-03-07T20:13:46Z 2025-03-10T19:30:00Z Large language models (LLMs) have shown remarkable generalization capabilities in natural language processing (NLP). They are used in a wide range of...]]>

Large language models (LLMs) have shown remarkable generalization capabilities in natural language processing (NLP). They are used in a wide range of applications, including translation, digital assistants, recommendation systems, context analysis, code generation, cybersecurity, and more. In automotive applications, there is growing demand for LLM-based solutions for both autonomous driving and…

Source

]]>
2
Shelby Thomas <![CDATA[Ensuring Reliable Model Training on NVIDIA DGX Cloud]]> http://www.open-lab.net/blog/?p=96789 2025-03-24T18:36:43Z 2025-03-10T16:26:44Z Training AI models on massive GPU clusters presents significant challenges for model builders. Because manual intervention becomes impractical as job scale...]]>

Training AI models on massive GPU clusters presents significant challenges for model builders. Because manual intervention becomes impractical as job scale increases, automation is critical to maintaining high GPU utilization and training productivity. An exceptional training experience requires resilient systems that provide low-latency error attribution and automatic fail over based on root…

Source

]]>
Michelle Horton <![CDATA[Top Agentic AI Sessions at NVIDIA GTC 2025]]> http://www.open-lab.net/blog/?p=96836 2025-03-07T00:33:46Z 2025-03-07T00:33:44Z Learn from and connect with leading AI developers building the next generation of AI agents.]]>

Learn from and connect with leading AI developers building the next generation of AI agents.

Source

]]>
Tanay Varshney <![CDATA[How Using a Reranking Microservice Can Improve Accuracy and Costs of Information Retrieval]]> http://www.open-lab.net/blog/?p=96363 2025-03-06T20:05:47Z 2025-03-06T18:33:38Z Applications requiring high-performance information retrieval span a wide range of domains, including search engines, knowledge management systems, AI agents,...]]>

Applications requiring high-performance information retrieval span a wide range of domains, including search engines, knowledge management systems, AI agents, and AI assistants. These systems demand retrieval processes that are accurate and computationally efficient to deliver precise insights, enhance user experiences, and maintain scalability. Retrieval-augmented generation (RAG) is used to…

Source

]]>
Michelle Horton <![CDATA[Top Physical AI and Robotics Sessions at NVIDIA GTC 2025]]> http://www.open-lab.net/blog/?p=96765 2025-03-06T19:26:33Z 2025-03-06T00:59:23Z Join these sessions to learn how accelerated computing, generative AI, and physics-based world simulation are advancing physical and embodied AI.]]>

Join these sessions to learn how accelerated computing, generative AI, and physics-based world simulation are advancing physical and embodied AI.

Source

]]>
Michelle Horton <![CDATA[Top Generative AI Sessions at NVIDIA GTC 2025]]> http://www.open-lab.net/blog/?p=96689 2025-03-06T19:51:57Z 2025-03-03T23:45:42Z Discover cutting-edge AI and data science innovations from top generative AI teams at NVIDIA GTC 2025.]]>

Discover cutting-edge AI and data science innovations from top generative AI teams at NVIDIA GTC 2025.

Source

]]>
Aditi Bodhankar <![CDATA[Measuring the Effectiveness and Performance of AI Guardrails in Generative AI Applications]]> http://www.open-lab.net/blog/?p=96562 2025-03-06T19:26:38Z 2025-03-03T17:22:09Z Safeguarding AI agents and other conversational AI applications to ensure safe, on-brand and reliable behavior is essential for enterprises. NVIDIA NeMo...]]>

Safeguarding AI agents and other conversational AI applications to ensure safe, on-brand and reliable behavior is essential for enterprises. NVIDIA NeMo Guardrails offers robust protection with AI guardrails for content safety, topic control, jailbreak detection, and more to evaluate and optimize guardrail performance. In this post, we explore techniques for measuring and optimizing your AI…

Source

]]>
Mehran Maghoumi <![CDATA[Build an AI Agent with Expert Reasoning Capabilities Using the DeepSeek-R1 NIM]]> http://www.open-lab.net/blog/?p=96030 2025-03-06T19:52:48Z 2025-02-28T20:23:51Z AI agents are transforming business operations by automating processes, optimizing decision-making, and streamlining actions. Their effectiveness hinges on...]]>

AI agents are transforming business operations by automating processes, optimizing decision-making, and streamlining actions. Their effectiveness hinges on expert reasoning, enabling smarter planning and efficient execution. Agentic AI applications could benefit from the capabilities of models such as DeepSeek-R1. Built for solving problems that require advanced AI reasoning…

Source

]]>
Sangjune Park <![CDATA[Spotlight: NAVER Place Optimizes SLM-Based Vertical Services with NVIDIA TensorRT-LLM]]> http://www.open-lab.net/blog/?p=96279 2025-03-18T18:24:30Z 2025-02-28T17:57:49Z NAVER is a popular South Korean search engine company that offers Naver Place, a geo-based service that provides detailed information about millions of...]]>

As of 3/18/25, NVIDIA Triton Inference Server is now NVIDIA Dynamo. NAVER is a popular South Korean search engine company that offers Naver Place, a geo-based service that provides detailed information about millions of businesses and points of interest across Korea. Users can search about different places, leave reviews, and place bookings or orders in real time.

Source

]]>
Anu Srivastava <![CDATA[Latest Multimodal Addition to Microsoft Phi SLMs Trained on NVIDIA GPUs]]> http://www.open-lab.net/blog/?p=96519 2025-03-06T19:26:43Z 2025-02-26T22:05:00Z Large language models (LLMs) have permeated every industry and changed the potential of technology. However, due to their massive size they are not practical...]]>

Large language models (LLMs) have permeated every industry and changed the potential of technology. However, due to their massive size they are not practical for the current resource constraints that many companies have. The rise of small language models (SLMs) bridge quality and cost by creating models with a smaller resource footprint. SLMs are a subset of language models that tend to…

Source

]]>
Francesco Ciannella <![CDATA[Building a Simple VLM-Based Multimodal Information Retrieval System with NVIDIA NIM]]> http://www.open-lab.net/blog/?p=96151 2025-03-06T19:26:45Z 2025-02-26T17:00:00Z In today��s data-driven world, the ability to retrieve accurate information from even modest amounts of data is vital for developers seeking streamlined,...]]>

In today’s data-driven world, the ability to retrieve accurate information from even modest amounts of data is vital for developers seeking streamlined, effective solutions for quick deployments, prototyping, or experimentation. One of the key challenges in information retrieval is managing the diverse modalities in unstructured datasets, including text, PDFs, images, tables, audio, video…

Source

]]>
1
Yifan Wu <![CDATA[Accelerating Scientific Literature Reviews with NVIDIA NIM Microservices for LLMs]]> http://www.open-lab.net/blog/?p=96324 2025-03-06T19:26:43Z 2025-02-26T17:00:00Z A well-crafted systematic review is often the initial step for researchers exploring a scientific field. For scientists new to this field, it provides a...]]>

A well-crafted systematic review is often the initial step for researchers exploring a scientific field. For scientists new to this field, it provides a structured overview of the domain. For experts, it refines their understanding and sparks new ideas. In 2024 alone, 218,650 review articles were indexed in the Web of Science database, highlighting the importance of these resources in research.

Source

]]>
Shubham Agrawal <![CDATA[Vision Language Model Prompt Engineering Guide for Image and Video Understanding]]> http://www.open-lab.net/blog/?p=96229 2025-03-06T19:26:45Z 2025-02-26T16:25:34Z Vision language models (VLMs) are evolving at a breakneck speed. In 2020, the first VLMs revolutionized the generative AI landscape by bringing visual...]]>

Vision language models (VLMs) are evolving at a breakneck speed. In 2020, the first VLMs revolutionized the generative AI landscape by bringing visual understanding to large language models (LLMs) through the use of a vision encoder. These initial VLMs were limited in their abilities, only able to understand text and single image inputs. Fast-forward a few years and VLMs are now capable of…

Source

]]>
Mark Ren <![CDATA[Configurable Graph-Based Task Solving with the Marco Multi-AI Agent Framework for Chip Design]]> http://www.open-lab.net/blog/?p=96209 2025-03-06T19:26:47Z 2025-02-25T22:17:28Z Chip and hardware design presents numerous challenges stemming from its complexity and advancing technologies. These challenges result in longer turn-around...]]>

Chip and hardware design presents numerous challenges stemming from its complexity and advancing technologies. These challenges result in longer turn-around time (TAT) for optimizing performance, power, area, and cost (PPAC) during synthesis, verification, physical design, and reliability loops. Large language models (LLMs) have shown a remarkable capacity to comprehend and generate natural…

Source

]]>
Leon Derczynski <![CDATA[Defining LLM Red Teaming]]> http://www.open-lab.net/blog/?p=96239 2025-03-06T19:26:48Z 2025-02-25T18:49:26Z There is an activity where people provide inputs to generative AI technologies, such as large language models (LLMs), to see if the outputs can be made to...]]>

There is an activity where people provide inputs to generative AI technologies, such as large language models (LLMs), to see if the outputs can be made to deviate from acceptable standards. This use of LLMs began in 2023 and has rapidly evolved to become a common industry practice and a cornerstone of trustworthy AI. How can we standardize and define LLM red teaming?

Source

]]>
Rich Harang <![CDATA[Agentic Autonomy Levels and Security]]> http://www.open-lab.net/blog/?p=96341 2025-03-06T19:26:49Z 2025-02-25T18:45:05Z Agentic workflows are the next evolution in AI-powered tools. They enable developers to chain multiple AI models together to perform complex activities, enable...]]>

Agentic workflows are the next evolution in AI-powered tools. They enable developers to chain multiple AI models together to perform complex activities, enable AI models to use tools to access additional data or automate user actions, and enable AI models to operate autonomously, analyzing and performing complex tasks with a minimum of human involvement or interaction. Because of their power…

Source

]]>
Joe Bungo <![CDATA[NVIDIA Deep Learning Institute Releases New Generative AI Teaching Kit]]> http://www.open-lab.net/blog/?p=88388 2025-03-06T19:26:50Z 2025-02-25T17:47:49Z Generative AI, powered by advanced machine learning models and deep neural networks, is revolutionizing industries by generating novel content and driving...]]>

Generative AI, powered by advanced machine learning models and deep neural networks, is revolutionizing industries by generating novel content and driving innovation in fields like healthcare, finance, and entertainment. NVIDIA is leading this transformation with its cutting-edge GPU architectures and software ecosystems, such as the H100 Tensor Core GPU and CUDA platform…

Source

]]>
6
Charu Chaubal <![CDATA[NVIDIA AI Enterprise Adds Support for NVIDIA H200 NVL]]> http://www.open-lab.net/blog/?p=96424 2025-03-06T19:26:52Z 2025-02-24T22:37:47Z NVIDIA AI Enterprise is the cloud-native software platform for the development and deployment of production-grade AI solutions. The latest release of the NVIDIA...]]>

NVIDIA AI Enterprise is the cloud-native software platform for the development and deployment of production-grade AI solutions. The latest release of the NVIDIA AI Enterprise infrastructure software collection adds support for the latest NVIDIA data center GPU, NVIDIA H200 NVL, giving your enterprise new options for powering cutting-edge use cases such as agentic and generative AI with some of the…

Source

]]>
Sama Bali <![CDATA[Transforming Product Design Workflows in Manufacturing with Generative AI]]> http://www.open-lab.net/blog/?p=96242 2025-03-06T19:26:54Z 2025-02-20T19:32:11Z Traditional design and engineering workflows in the manufacturing industry have long been characterized by a sequential, iterative approach that is often...]]>

Traditional design and engineering workflows in the manufacturing industry have long been characterized by a sequential, iterative approach that is often time-consuming and resource intensive. These conventional methods typically involve stages such as requirement gathering, conceptual design, detailed design, analysis, prototyping, and testing, with each phase dependent on the results of previous…

Source

]]>
Sven Chilton <![CDATA[Deploying NVIDIA Riva Multilingual ASR with Whisper and Canary Architectures While Selectively Deactivating NMT]]> http://www.open-lab.net/blog/?p=95339 2025-03-06T19:26:55Z 2025-02-20T18:54:48Z NVIDIA has consistently developed automatic speech recognition (ASR) models that set the benchmark in the industry. Earlier versions of NVIDIA Riva, a...]]>

NVIDIA has consistently developed automatic speech recognition (ASR) models that set the benchmark in the industry. Earlier versions of NVIDIA Riva, a collection of GPU-accelerated speech and translation AI microservices for ASR, TTS, and NMT, support English-Spanish and English-Japanese code-switching ASR models based on the Conformer architecture, along with a model supporting multiple…

Source

]]>
Tanya Lenz <![CDATA[Upcoming Livestream: Using the NVIDIA AI Blueprint for PDF to Podcast?]]> http://www.open-lab.net/blog/?p=96307 2025-02-20T18:11:39Z 2025-02-20T18:11:37Z Join us on February 27 to learn how to transform PDFs into AI podcasts using the NVIDIA AI Blueprint.]]>

Join us on February 27 to learn how to transform PDFs into AI podcasts using the NVIDIA AI Blueprint.

Source

]]>
Allyson Vasquez <![CDATA[Bring NVIDIA ACE AI Characters to Games with the New In-Game Inferencing SDK]]> http://www.open-lab.net/blog/?p=96051 2025-02-20T21:43:57Z 2025-02-20T17:00:00Z NVIDIA ACE is a suite of digital human technologies that bring game characters and digital assistants to life with generative AI. ACE on-device models enable...]]>

Source

]]>
Nitzan Simchi <![CDATA[Spotlight: Drug Discovery Startup Protai Advances Complex Structure Prediction with AlphaFold, Proteomics, and NVIDIA NIM]]> http://www.open-lab.net/blog/?p=96107 2025-02-20T15:51:52Z 2025-02-19T17:30:00Z Generative AI, especially with breakthroughs like AlphaFold and RosettaFold, is transforming drug discovery and how biotech companies and research laboratories...]]>

Generative AI, especially with breakthroughs like AlphaFold and RosettaFold, is transforming drug discovery and how biotech companies and research laboratories study protein structures, unlocking groundbreaking insights into protein interactions. Proteins are dynamic entities. It has been postulated that a protein’s native state is known by its sequence of amino acids alone…

Source

]]>
Kyle Tretina <![CDATA[Understanding the Language of Life��s Biomolecules Across Evolution at a New Scale with Evo 2]]> http://www.open-lab.net/blog/?p=95589 2025-02-20T15:52:05Z 2025-02-19T17:14:51Z AI has evolved from an experimental curiosity to a driving force within biological research. The convergence of deep learning algorithms, massive omics...]]>

AI has evolved from an experimental curiosity to a driving force within biological research. The convergence of deep learning algorithms, massive omics datasets, and automated laboratory workflows has allowed scientists to tackle problems once thought intractable—from rapid protein structure prediction to generative drug design, increasing the need for AI literacy among scientists.

Source

]]>
Brad Nemire <![CDATA[Featured Sessions for Students at NVIDIA GTC 2025]]> http://www.open-lab.net/blog/?p=96181 2025-02-20T15:52:32Z 2025-02-15T02:00:58Z Learn from researchers, scientists, and industry leaders across a variety of topics including AI, robotics, and Data Science.]]>

Learn from researchers, scientists, and industry leaders across a variety of topics including AI, robotics, and Data Science.

Source

]]>
Anjali Shah <![CDATA[Optimizing Qwen2.5-Coder Throughput with NVIDIA TensorRT-LLM Lookahead Decoding]]> http://www.open-lab.net/blog/?p=96010 2025-02-20T15:52:43Z 2025-02-14T18:19:37Z Large language models (LLMs) that specialize in coding have been steadily adopted into developer workflows. From pair programming to self-improving AI agents,...]]>

Large language models (LLMs) that specialize in coding have been steadily adopted into developer workflows. From pair programming to self-improving AI agents, these models assist developers with various tasks, including enhancing code, fixing bugs, generating tests, and writing documentation. To promote the development of open-source LLMs, the Qwen team recently released Qwen2.5-Coder…

Source

]]>
Joanne Chang <![CDATA[Upcoming Webinar: Unlocking Video Analytics With AI Agents]]> http://www.open-lab.net/blog/?p=96135 2025-02-20T15:52:55Z 2025-02-13T22:05:57Z Master prompt engineering, fine-tuning, and customization to build video analytics AI agents.]]>

Master prompt engineering, fine-tuning, and customization to build video analytics AI agents.

Source

]]>
Terry Chen <![CDATA[Automating GPU Kernel Generation with DeepSeek-R1 and Inference Time Scaling]]> http://www.open-lab.net/blog/?p=95998 2025-02-20T15:56:57Z 2025-02-12T18:00:00Z As AI models extend their capabilities to solve more sophisticated challenges, a new scaling law known as test-time scaling or inference-time scaling is...]]>

As AI models extend their capabilities to solve more sophisticated challenges, a new scaling law known as test-time scaling or inference-time scaling is emerging. Also known as AI reasoning or long-thinking, this technique improves model performance by allocating additional computational resources during inference to evaluate multiple possible outcomes and then selecting the best one…

Source

]]>
2
Gomathy Venkata Krishnan <![CDATA[LLM Model Pruning and Knowledge Distillation with NVIDIA NeMo Framework]]> http://www.open-lab.net/blog/?p=93451 2025-02-20T15:54:00Z 2025-02-12T17:54:52Z Model pruning and knowledge distillation are powerful cost-effective strategies for obtaining smaller language models from an initial larger sibling. ...]]>

Model pruning and knowledge distillation are powerful cost-effective strategies for obtaining smaller language models from an initial larger sibling. The How to Prune and Distill Llama-3.1 8B to an NVIDIA Llama-3.1-Minitron 4B Model post discussed the best practices of using large language models (LLMs) that combine depth, width, attention, and MLP pruning with knowledge distillation…

Source

]]>
Emily Potyraj <![CDATA[NVIDIA DGX Cloud Introduces Ready-To-Use Templates to Benchmark AI Platform Performance]]> http://www.open-lab.net/blog/?p=95558 2025-02-20T15:54:23Z 2025-02-11T17:00:00Z In the rapidly evolving landscape of AI systems and workloads, achieving optimal model training performance extends far beyond chip speed. It requires a...]]>

In the rapidly evolving landscape of AI systems and workloads, achieving optimal model training performance extends far beyond chip speed. It requires a comprehensive evaluation of the entire stack, from compute to networking to model framework. Navigating the complexities of AI system performance can be difficult. There are many application changes that you can make…

Source

]]>
Brad Nemire <![CDATA[Featured Researcher and Educator Sessions at NVIDIA GTC 2025]]> http://www.open-lab.net/blog/?p=95817 2025-02-06T19:33:45Z 2025-02-05T23:03:06Z Explore the latest advancements in academia, including advanced research, innovative teaching methods, and the future of learning and technology.]]>

Explore the latest advancements in academia, including advanced research, innovative teaching methods, and the future of learning and technology.

Source

]]>
Cheng-Han (Hank) Du <![CDATA[Improving Translation Quality with Domain-Specific Fine-Tuning and NVIDIA NIM]]> http://www.open-lab.net/blog/?p=95756 2025-02-06T19:33:46Z 2025-02-05T21:30:00Z Translation plays an essential role in enabling companies to expand across borders, with requirements varying significantly in terms of tone, accuracy, and...]]>

Translation plays an essential role in enabling companies to expand across borders, with requirements varying significantly in terms of tone, accuracy, and technical terminology handling. The emergence of sovereign AI has highlighted critical challenges in large language models (LLMs), particularly their struggle to capture nuanced cultural and linguistic contexts beyond English-dominant…

Source

]]>
1
Pradeep Ramani <![CDATA[OpenAI Triton on NVIDIA Blackwell Boosts AI Performance and Programmability]]> http://www.open-lab.net/blog/?p=95388 2025-02-06T19:33:47Z 2025-02-05T18:00:00Z Matrix multiplication and attention mechanisms are the computational backbone of modern AI workloads. While libraries like NVIDIA cuDNN provide highly optimized...]]>

Matrix multiplication and attention mechanisms are the computational backbone of modern AI workloads. While libraries like NVIDIA cuDNN provide highly optimized implementations, and frameworks such as CUTLASS offer deep customization, many developers and researchers need a middle ground that combines performance with programmability. The open-source Triton compiler on the NVIDIA Blackwell…

Source

]]>
Shruthii Sathyanarayanan <![CDATA[Streamline Collaboration Across Local and Cloud Systems with NVIDIA AI Workbench]]> http://www.open-lab.net/blog/?p=95720 2025-02-06T19:35:45Z 2025-02-05T18:00:00Z NVIDIA AI Workbench is a free development environment manager to develop, customize, and prototype AI applications on your GPUs. AI Workbench provides a...]]>

NVIDIA AI Workbench is a free development environment manager to develop, customize, and prototype AI applications on your GPUs. AI Workbench provides a frictionless experience across PCs, workstations, servers, and cloud for AI, data science, and machine learning (ML) projects. The user experience includes: This post provides details about the January 2025 release of NVIDIA AI Workbench…

Source

]]>
Isabel Hulseman <![CDATA[New NVIDIA AI Blueprint: Build a Customizable RAG Pipeline]]> http://www.open-lab.net/blog/?p=95614 2025-02-13T20:44:16Z 2025-01-30T22:26:12Z Connect AI applications to enterprise data using embedding and reranking models for information retrieval.]]>

Connect AI applications to enterprise data using embedding and reranking models for information retrieval.

Source

]]>
Eric Phan <![CDATA[How to Integrate NVIDIA DLSS 4 into Your Game with NVIDIA Streamline]]> http://www.open-lab.net/blog/?p=95492 2025-02-06T19:33:58Z 2025-01-30T14:00:00Z NVIDIA DLSS 4 is the latest iteration of DLSS introduced with the NVIDIA GeForce RTX 50 Series GPUs. It includes several new features: DLSS Multi Frame...]]>

NVIDIA DLSS 4 is the latest iteration of DLSS introduced with the NVIDIA GeForce RTX 50 Series GPUs. It includes several new features: Here’s how you can get started with DLSS 4 in your integrations. This post focuses on the Streamline SDK, which provides a plug-and-play framework for simplified plugin integration. The NVIDIA Streamline SDK is an open-source framework that…

Source

]]>
Annamalai Chockalingam <![CDATA[New AI SDKs and Tools Released for NVIDIA Blackwell GeForce RTX 50 Series GPUs]]> http://www.open-lab.net/blog/?p=95526 2025-02-06T19:33:57Z 2025-01-30T14:00:00Z NVIDIA recently announced a new generation of PC GPUs��the GeForce RTX 50 Series��alongside new AI-powered SDKs and tools for developers. Powered by the...]]>

NVIDIA recently announced a new generation of PC GPUs—the GeForce RTX 50 Series—alongside new AI-powered SDKs and tools for developers. Powered by the NVIDIA Blackwell architecture, fifth-generation Tensor Cores and fourth-generation RT Cores, the GeForce RTX 50 Series delivers breakthroughs in AI-driven rendering, including neural shaders, digital human technologies, geometry and lighting.

Source

]]>
Amit Bleiweiss <![CDATA[Mastering LLM Techniques: Evaluation]]> http://www.open-lab.net/blog/?p=95447 2025-02-17T05:21:53Z 2025-01-29T20:44:06Z Evaluating large language models (LLMs) and retrieval-augmented generation (RAG) systems is a complex and nuanced process, reflecting the sophisticated and...]]>

Evaluating large language models (LLMs) and retrieval-augmented generation (RAG) systems is a complex and nuanced process, reflecting the sophisticated and multifaceted nature of these systems. Unlike traditional machine learning (ML) models, LLMs generate a wide range of diverse and often unpredictable outputs, making standard evaluation metrics insufficient. Key challenges include the…

Source

]]>
Edoardo Maria Ponti <![CDATA[Dynamic Memory Compression]]> http://www.open-lab.net/blog/?p=93500 2025-02-06T19:34:01Z 2025-01-24T17:43:42Z Despite the success of large language models (LLMs) as general-purpose AI tools, their high demand for computational resources make their deployment challenging...]]>

Despite the success of large language models (LLMs) as general-purpose AI tools, their high demand for computational resources make their deployment challenging in many real-world scenarios. The sizes of the model and conversation state are limited by the available high-bandwidth memory, limiting the number of users that can be served and the maximum conversation length. At present…

Source

]]>
Nick Comly <![CDATA[Optimize AI Inference Performance with NVIDIA Full-Stack Solutions]]> http://www.open-lab.net/blog/?p=95310 2025-03-18T18:18:44Z 2025-01-24T16:00:00Z The explosion of AI-driven applications has placed unprecedented demands on both developers, who must balance delivering cutting-edge performance with managing...]]>

As of 3/18/25, NVIDIA Triton Inference Server is now NVIDIA Dynamo. The explosion of AI-driven applications has placed unprecedented demands on both developers, who must balance delivering cutting-edge performance with managing operational complexity and cost, and AI infrastructure. NVIDIA is empowering developers with full-stack innovations—spanning chips, systems…

Source

]]>
Juana Nakfour <![CDATA[Horizontal Autoscaling of NVIDIA NIM Microservices on Kubernetes]]> http://www.open-lab.net/blog/?p=94972 2025-03-18T18:25:14Z 2025-01-22T17:34:51Z NVIDIA NIM microservices are model inference containers that can be deployed on Kubernetes. In a production environment, it��s important to understand the...]]>

As of 3/18/25, NVIDIA Triton Inference Server is now NVIDIA Dynamo. NVIDIA NIM microservices are model inference containers that can be deployed on Kubernetes. In a production environment, it’s important to understand the compute and memory profile of these microservices to set up a successful autoscaling plan. In this post, we describe how to set up and use Kubernetes Horizontal Pod…

Source

]]>
2
Chris Krapu <![CDATA[Lessons Learned from Building an AI Sales Assistant]]> http://www.open-lab.net/blog/?p=95231 2025-02-06T19:34:04Z 2025-01-21T20:34:41Z At NVIDIA, the Sales Operations team equips the Sales team with the tools and resources needed to bring cutting-edge hardware and software to market. Managing...]]>

At NVIDIA, the Sales Operations team equips the Sales team with the tools and resources needed to bring cutting-edge hardware and software to market. Managing this across NVIDIA’s diverse technology is a complex challenge shared by many enterprises. Through collaboration with our Sales team, we found that they rely on internal and external documentation…

Source

]]>
1
John Thomson <![CDATA[Introducing New KV Cache Reuse Optimizations in NVIDIA TensorRT-LLM]]> http://www.open-lab.net/blog/?p=95040 2025-02-06T19:34:05Z 2025-01-16T22:57:30Z Language models generate text by predicting the next token, given all the previous tokens including the input text tokens. Key and value elements of the...]]>

Language models generate text by predicting the next token, given all the previous tokens including the input text tokens. Key and value elements of the previous tokens are used as historical context in LLM serving for generation of the next set of tokens. Caching these key and value elements from previous tokens avoids expensive recomputation and effectively leads to higher throughput. However…

Source

]]>
Shashank Maheshwari <![CDATA[NVIDIA JetPack 6.2 Brings Super Mode to NVIDIA Jetson Orin Nano and Jetson Orin NX Modules]]> http://www.open-lab.net/blog/?p=95089 2025-02-06T19:38:26Z 2025-01-16T22:10:29Z The introduction of the NVIDIA Jetson Orin Nano Super Developer Kit sparked a new age of generative AI for small edge devices. The new Super Mode delivered an...]]>

The introduction of the NVIDIA Jetson Orin Nano Super Developer Kit sparked a new age of generative AI for small edge devices. The new Super Mode delivered an unprecedented generative AI performance boost of up to 1.7x on the developer kit, making it the most affordable generative AI supercomputer. JetPack 6.2 is now available to support Super Mode for Jetson Orin Nano and Jetson Orin NX…

Source

]]>
Aditi Bodhankar <![CDATA[How to Safeguard AI Agents for Customer Service with NVIDIA NeMo Guardrails]]> http://www.open-lab.net/blog/?p=94928 2025-02-04T19:53:15Z 2025-01-16T14:00:00Z AI agents present a significant opportunity for businesses to scale and elevate customer service and support interactions. By automating routine inquiries and...]]>

AI agents present a significant opportunity for businesses to scale and elevate customer service and support interactions. By automating routine inquiries and enhancing response times, these agents improve efficiency and customer satisfaction, helping organizations stay competitive. However, alongside these benefits, AI agents come with risks. Large language models (LLMs) are vulnerable to…

Source

]]>
Martin Cimmino <![CDATA[Continued Pretraining of State-of-the-Art LLMs for Sovereign AI and Regulated Industries with iGenius and NVIDIA DGX Cloud]]> http://www.open-lab.net/blog/?p=95012 2025-01-23T19:54:22Z 2025-01-16T12:00:00Z In recent years, large language models (LLMs) have achieved extraordinary progress in areas such as reasoning, code generation, machine translation, and...]]>

In recent years, large language models (LLMs) have achieved extraordinary progress in areas such as reasoning, code generation, machine translation, and summarization. However, despite their advanced capabilities, foundation models have limitations when it comes to domain-specific expertise such as finance or healthcare or capturing cultural and language nuances beyond English.

Source

]]>
Sama Bali <![CDATA[GPU Memory Essentials for AI Performance]]> http://www.open-lab.net/blog/?p=94979 2025-01-23T19:54:24Z 2025-01-15T16:00:00Z Generative AI has revolutionized how people bring ideas to life, and agentic AI represents the next leap forward in this technological evolution. By leveraging...]]>

Generative AI has revolutionized how people bring ideas to life, and agentic AI represents the next leap forward in this technological evolution. By leveraging sophisticated, autonomous reasoning and iterative planning, AI agents can tackle complex, multistep problems with remarkable efficiency. As AI continues to revolutionize industries, the demand for running AI models locally has surged.

Source

]]>
1
Harry Petty <![CDATA[Transforming Data Centers into AI Factories for the 5th Industrial Revolution]]> http://www.open-lab.net/blog/?p=94879 2025-01-23T19:54:25Z 2025-01-14T19:58:01Z In a recent DC Anti-Conference Live presentation, Wade Vinson, chief data center distinguished engineer at NVIDIA, shared insights based upon work by NVIDIA...]]>

In a recent DC Anti-Conference Live presentation, Wade Vinson, chief data center distinguished engineer at NVIDIA, shared insights based upon work by NVIDIA designing, building, and operating NVIDIA DGX SuperPOD multi-megawatt data centers since 2016. NVIDIA is helping make data centers more accessible, resource-efficient, energy-efficient, and business-efficient, as well as scalable to any…

Source

]]>
Nirmal Kumar Juluru <![CDATA[Enhancing Generative AI Model Accuracy with NVIDIA NeMo Curator]]> http://www.open-lab.net/blog/?p=94263 2025-01-23T19:54:27Z 2025-01-13T17:00:00Z In the rapidly evolving landscape of artificial intelligence, the quality of the data used for training models is paramount. High-quality data ensures that...]]>

In the rapidly evolving landscape of artificial intelligence, the quality of the data used for training models is paramount. High-quality data ensures that models are accurate, reliable, and capable of generalizing well across various applications. The recent NVIDIA webinar, Enhance Generative AI Model Accuracy with High-Quality Multimodal Data Processing, dove into the intricacies of data…

Source

]]>
Kyle Tretina <![CDATA[Evaluating GenMol as a Generalist Foundation Model for Molecular Generation]]> http://www.open-lab.net/blog/?p=94836 2025-01-23T19:54:29Z 2025-01-13T14:00:00Z Traditional computational drug discovery relies almost exclusively on highly task-specific computational models for hit identification and lead optimization....]]>

Traditional computational drug discovery relies almost exclusively on highly task-specific computational models for hit identification and lead optimization. Adapting these specialized models to new tasks requires substantial time, computational power, and expertise—challenges that grow when researchers simultaneously work across multiple targets or properties.

Source

]]>
Kyle Tretina <![CDATA[Accelerate Protein Engineering with the NVIDIA BioNeMo Blueprint for Generative Protein Binder Design]]> http://www.open-lab.net/blog/?p=94851 2025-01-23T19:54:28Z 2025-01-13T14:00:00Z Designing a therapeutic protein that specifically binds its target in drug discovery is a staggering challenge. Traditional workflows are often a painstaking...]]>

Designing a therapeutic protein that specifically binds its target in drug discovery is a staggering challenge. Traditional workflows are often a painstaking trial-and-error process—iterating through thousands of candidates, each synthesis and validation round taking months if not years. Considering the average human protein is 430 amino acids long, the number of possible designs translates to…

Source

]]>
Dan Su <![CDATA[Announcing Nemotron-CC: A Trillion-Token English Language Dataset for LLM Pretraining]]> http://www.open-lab.net/blog/?p=94818 2025-01-23T19:54:30Z 2025-01-09T19:20:16Z NVIDIA is excited to announce the release of Nemotron-CC, a 6.3-trillion-token English language Common Crawl dataset for pretraining highly accurate large...]]>

NVIDIA is excited to announce the release of Nemotron-CC, a 6.3-trillion-token English language Common Crawl dataset for pretraining highly accurate large language models (LLMs), including 1.9 trillion tokens of synthetically generated data. One of the keys to training state-of-the-art LLMs is a high-quality pretraining dataset, and recent top LLMs, such as the Meta Llama series…

Source

]]>
Brad Nemire <![CDATA[NVIDIA Project DIGITS, A Grace Blackwell AI Supercomputer On Your Desk]]> http://www.open-lab.net/blog/?p=94765 2025-01-23T19:54:30Z 2025-01-09T18:19:00Z Powered by the new GB10 Grace Blackwell Superchip, Project DIGITS can tackle large generative AI models of up to 200B parameters.]]>

Powered by the new GB10 Grace Blackwell Superchip, Project DIGITS can tackle large generative AI models of up to 200B parameters.

Source

]]>
5
Pranjali Joshi <![CDATA[Advancing Physical AI with NVIDIA Cosmos World Foundation Model Platform]]> http://www.open-lab.net/blog/?p=94577 2025-01-23T19:54:31Z 2025-01-09T17:42:06Z As robotics and autonomous vehicles advance, accelerating development of physical AI��which enables autonomous machines to perceive, understand, and perform...]]>

As robotics and autonomous vehicles advance, accelerating development of physical AI—which enables autonomous machines to perceive, understand, and perform complex actions in the physical world—has become essential. At the center of these systems are world foundation models (WFMs)—AI models that simulate physical states through physics-aware videos, enabling machines to make accurate decisions and…

Source

]]>
1
Brad Nemire <![CDATA[Upcoming Livestream: NVIDIA Developer Highlights from CES 2025]]> http://www.open-lab.net/blog/?p=94843 2025-01-23T19:54:32Z 2025-01-09T10:00:00Z Tune in January 16th at 9:00 AM PT for a live recap, followed by a Q&A of the latest developer announcements at CES 2025.]]>

Tune in January 16th at 9:00 AM PT for a live recap, followed by a Q&A of the latest developer announcements at CES 2025.

Source

]]>
Zeeshan Patel <![CDATA[Accelerate Custom Video Foundation Model Pipelines with New NVIDIA NeMo Framework Capabilities]]> http://www.open-lab.net/blog/?p=94541 2025-03-20T16:23:00Z 2025-01-07T16:00:00Z Generative AI has evolved from text-based models to multimodal models, with a recent expansion into video, opening up new potential uses across various...]]>

Generative AI has evolved from text-based models to multimodal models, with a recent expansion into video, opening up new potential uses across various industries. Video models can create new experiences for users or simulate scenarios for training autonomous agents at scale. They are helping revolutionize various industries including robotics, autonomous vehicles, and entertainment.

Source

]]>
Anish Maddipoti <![CDATA[One-Click Deployments for the Best of NVIDIA AI with NVIDIA Launchables]]> http://www.open-lab.net/blog/?p=94569 2025-01-23T19:54:34Z 2025-01-07T04:30:00Z AI development has become a core part of modern software engineering, and NVIDIA is committed to finding ways to bring optimized accelerated computing to every...]]>

AI development has become a core part of modern software engineering, and NVIDIA is committed to finding ways to bring optimized accelerated computing to every developer that wants to start experimenting with AI. To address this, we’ve been working on making the accelerated computing stack more accessible with NVIDIA Launchables: preconfigured GPU computing environments that enable you to…

Source

]]>
Samuel Ochoa <![CDATA[Build a Video Search and Summarization Agent with NVIDIA AI Blueprint]]> http://www.open-lab.net/blog/?p=86011 2025-02-13T20:44:57Z 2025-01-07T04:20:00Z This post was originally published July 29, 2024 but has been extensively revised with NVIDIA AI Blueprint information. Traditional video analytics applications...]]>

This post was originally published July 29, 2024 but has been extensively revised with NVIDIA AI Blueprint information. Traditional video analytics applications and their development workflow are typically built on fixed-function, limited models that are designed to detect and identify only a select set of predefined objects. With generative AI, NVIDIA NIM microservices…

Source

]]>
2
Akhil Docca <![CDATA[How to Build a Generative AI-Enabled Synthetic Data Pipeline for Perception-Based Physical AI]]> http://www.open-lab.net/blog/?p=86105 2025-01-09T19:23:08Z 2025-01-07T03:57:00Z Training physical AI models used to power autonomous machines, such as robots and autonomous vehicles, requires huge amounts of data. Acquiring large sets of...]]>

Training physical AI models used to power autonomous machines, such as robots and autonomous vehicles, requires huge amounts of data. Acquiring large sets of diverse training data can be difficult, time-consuming, and expensive. Data is often limited due to privacy restrictions or concerns, or simply may not exist for novel use cases. In addition, the available data may not apply to the full range…

Source

]]>
Chintan Patel <![CDATA[Llama Nemotron Models Accelerate Agentic AI Workflows with Accuracy and Efficiency]]> http://www.open-lab.net/blog/?p=94595 2025-01-09T19:23:09Z 2025-01-07T03:40:00Z Agentic AI, the next wave of generative AI, is a paradigm shift with the potential to revolutionize industries by enabling AI systems to act autonomously and...]]>

Agentic AI, the next wave of generative AI, is a paradigm shift with the potential to revolutionize industries by enabling AI systems to act autonomously and achieve complex goals. Agentic AI combines the power of large language models (LLMs) with advanced reasoning and planning capabilities, opening a world of possibilities across industries, from healthcare and finance to manufacturing and…

Source

]]>
Ike Nnoli <![CDATA[NVIDIA RTX Neural Rendering Introduces Next Era of AI-Powered Graphics Innovation]]> http://www.open-lab.net/blog/?p=94662 2025-02-03T21:14:21Z 2025-01-07T03:22:00Z NVIDIA today unveiled next-generation hardware for gamers, creators, and developers��the GeForce RTX 50 Series desktop and laptop GPUs. Alongside these GPUs,...]]>

NVIDIA today unveiled next-generation hardware for gamers, creators, and developers—the GeForce RTX 50 Series desktop and laptop GPUs. Alongside these GPUs, NVIDIA introduced NVIDIA RTX Kit, a suite of neural rendering technologies to ray trace games with AI, render scenes with immense geometry, and create game characters with lifelike visuals. RTX Kit enhances geometry, textures, materials…

Source

]]>
2
Katie Link <![CDATA[Build a Generative AI Medical Device Training Assistant with NVIDIA NIM Microservices]]> http://www.open-lab.net/blog/?p=94379 2024-12-20T19:55:30Z 2024-12-20T18:00:00Z Innovation in medical devices continues to accelerate, with a record number authorized by the FDA every year. When these new or updated devices are introduced...]]>

Innovation in medical devices continues to accelerate, with a record number authorized by the FDA every year. When these new or updated devices are introduced to clinicians and patients, they require training to use them properly and safely. Once in use, clinicians or patients may need help troubleshooting issues. Medical devices are often accompanied by lengthy and technically complex…

Source

]]>
Tom Balough <![CDATA[Enhance Your Training Data with New NVIDIA NeMo Curator Classifier Models]]> http://www.open-lab.net/blog/?p=94447 2024-12-19T23:08:12Z 2024-12-19T23:08:08Z Classifier models are specialized in categorizing data into predefined groups or classes, playing a crucial role in optimizing data processing pipelines for...]]>

Classifier models are specialized in categorizing data into predefined groups or classes, playing a crucial role in optimizing data processing pipelines for fine-tuning and pretraining generative AI models. Their value lies in enhancing data quality by filtering out low-quality or toxic data, ensuring only clean and relevant information feeds downstream processes. Beyond filtering…

Source

]]>
Sama Bali <![CDATA[Accelerating Film Production with Dell AI Factory and NVIDIA]]> http://www.open-lab.net/blog/?p=94350 2025-01-11T17:49:14Z 2024-12-19T18:26:03Z Filmmaking is an intricate and complex process that involves a diverse team of artists, writers, visual effects professionals, technicians, and countless other...]]>

Filmmaking is an intricate and complex process that involves a diverse team of artists, writers, visual effects professionals, technicians, and countless other specialists. Each member brings their unique expertise to the table, collaborating to transform a simple idea into a captivating cinematic experience. From the initial spark of a story to the final cut, every step requires creativity…

Source

]]>
Sama Bali <![CDATA[A Guide to Retrieval-Augmented Generation for AEC]]> http://www.open-lab.net/blog/?p=94305 2024-12-18T17:58:35Z 2024-12-18T21:00:00Z Large language models (LLMs) are rapidly changing the business landscape, offering new capabilities in natural language processing (NLP), content generation,...]]>

Large language models (LLMs) are rapidly changing the business landscape, offering new capabilities in natural language processing (NLP), content generation, and data analysis. These AI-powered tools have improved how companies operate, from streamlining customer service to enhancing decision-making processes. However, despite their impressive general knowledge, LLMs often struggle with…

Source

]]>
1
Rakib Hasan <![CDATA[NVIDIA TensorRT-LLM Now Supports Recurrent Drafting for Optimizing LLM Inference]]> http://www.open-lab.net/blog/?p=92963 2025-03-11T01:44:00Z 2024-12-18T17:31:01Z Recurrent drafting (referred to as ReDrafter) is a novel speculative decoding technique developed and open-sourced by Apple for large language model (LLM)...]]>

Recurrent drafting (referred to as ReDrafter) is a novel speculative decoding technique developed and open-sourced by Apple for large language model (LLM) inference now available with NVIDIA TensorRT-LLM. ReDrafter helps developers significantly boost LLM workload performance on NVIDIA GPUs. NVIDIA TensorRT-LLM is a library for optimizing LLM inference. It provides an easy-to-use Python API to…

Source

]]>
Anna Shors <![CDATA[Data-Efficient Knowledge Distillation for Supervised Fine-Tuning with NVIDIA NeMo-Aligner]]> http://www.open-lab.net/blog/?p=94082 2024-12-18T01:43:12Z 2024-12-18T01:43:09Z Knowledge distillation is an approach for transferring the knowledge of a much larger teacher model to a smaller student model, ideally yielding a compact,...]]>

Knowledge distillation is an approach for transferring the knowledge of a much larger teacher model to a smaller student model, ideally yielding a compact, easily deployable student with comparable accuracy to the teacher. Knowledge distillation has gained popularity in pretraining settings, but there are fewer resources available for performing knowledge distillation during supervised fine-tuning…

Source

]]>
Ike Nnoli <![CDATA[Deploy Agents, Assistants, and Avatars on NVIDIA RTX AI PCs with New Small Language Models]]> http://www.open-lab.net/blog/?p=92896 2024-12-17T03:32:15Z 2024-12-17T18:00:00Z NVIDIA just announced a series of small language models (SLMs) that increase the amount and type of information digital humans can use to augment their...]]>

NVIDIA just announced a series of small language models (SLMs) that increase the amount and type of information digital humans can use to augment their responses. This includes new large-context models that provide more relevant answers and new multi-modal models that allow images as inputs. These models are available now as part of NVIDIA ACE, a suite of digital human technologies that brings…

Source

]]>
Japinder Singh <![CDATA[Fine-Tuning Small Language Models to Optimize Code Review Accuracy]]> http://www.open-lab.net/blog/?p=94078 2025-02-17T05:13:45Z 2024-12-17T17:58:31Z Generative AI is transforming enterprises by driving innovation and boosting efficiency across numerous applications. However, adopting large foundational...]]>

Source

]]>
Anjali Shah <![CDATA[Boost Llama 3.3 70B Inference Throughput 3x with NVIDIA TensorRT-LLM Speculative Decoding]]> http://www.open-lab.net/blog/?p=94146 2024-12-19T23:03:40Z 2024-12-17T17:00:00Z Meta's Llama collection of open large language models (LLMs) continues to grow with the recent addition of Llama 3.3 70B, a text-only...]]>

Meta’s Llama collection of open large language models (LLMs) continues to grow with the recent addition of Llama 3.3 70B, a text-only instruction-tuned model. Llama 3.3 provides enhanced performance respective to the older Llama 3.1 70B model and can even match the capabilities of the larger, more computationally expensive Llama 3.1 405B model on several tasks including math, reasoning, coding…

Source

]]>
2
Ronay AK <![CDATA[Develop Multilingual and Cross-Lingual Information Retrieval Systems with Efficient Data Storage]]> http://www.open-lab.net/blog/?p=93638 2024-12-17T20:42:28Z 2024-12-17T16:00:00Z Efficient text retrieval is critical for a broad range of information retrieval applications such as search, question answering, semantic textual similarity,...]]>

Efficient text retrieval is critical for a broad range of information retrieval applications such as search, question answering, semantic textual similarity, summarization, and item recommendation. It also plays a pivotal role in retrieval-augmented generation (RAG), a technique that enables large language models (LLMs) to access external context without modifying underlying parameters.

Source

]]>
Suhas Hariharapura Sheshadri https://www.linkedin.com/in/suhassheshadri/ <![CDATA[NVIDIA Jetson Orin Nano Developer Kit Gets a ��Super�� Boost]]> http://www.open-lab.net/blog/?p=93942 2024-12-20T02:17:32Z 2024-12-17T14:00:00Z The generative AI landscape is rapidly evolving, with new large language models (LLMs), visual language models (VLMs), and vision language action (VLA) models...]]>

The generative AI landscape is rapidly evolving, with new large language models (LLMs), visual language models (VLMs), and vision language action (VLA) models emerging daily. To stay at the forefront of this transformative era, developers need a platform powerful enough to seamlessly deploy the latest models from the cloud to the edge with optimized inferencing and open ML frameworks using CUDA.

Source

]]>
1
Joseph Lucas <![CDATA[Sandboxing Agentic AI Workflows with WebAssembly]]> http://www.open-lab.net/blog/?p=93975 2024-12-16T21:06:56Z 2024-12-16T20:33:46Z Agentic AI workflows often involve the execution of large language model (LLM)-generated code to perform tasks like creating data visualizations. However, this...]]>

Agentic AI workflows often involve the execution of large language model (LLM)-generated code to perform tasks like creating data visualizations. However, this code should be sanitized and executed in a safe environment to mitigate risks from prompt injection and errors in the returned code. Sanitizing Python with regular expressions and restricted runtimes is insufficient…

Source

]]>
���˳���97caoporen����