AI Foundation Models – NVIDIA Technical Blog

News and tutorials for developers, data scientists, and IT admins 2025-04-29T22:44:15Z http://www.open-lab.net/blog/feed/

Asawaree Bhide <![CDATA[R2D2: Advancing Robot Mobility and Whole-Body Control with Novel Workflows and AI Foundation Models from NVIDIA Research]]> http://www.open-lab.net/blog/?p=98193 2025-04-03T18:45:30Z 2025-03-27T15:00:00Z

Welcome to the first edition of the NVIDIA Robotics Research and Development Digest (R2D2). This technical blog series will give developers and researchers...

0 Kalyan Meher Vadrevu <![CDATA[Accelerate Generalist Humanoid Robot Development with NVIDIA Isaac GR00T N1]]> http://www.open-lab.net/blog/?p=97016 2025-03-31T20:48:04Z 2025-03-18T17:40:00Z

Humanoid robots are designed to adapt to human workspaces, tackling repetitive or demanding tasks. However, creating general-purpose humanoid robots for...

0 Anu Srivastava <![CDATA[Lightweight, Multimodal, Multilingual Gemma 3 Models Are Streamlined for Performance]]> http://www.open-lab.net/blog/?p=96770 2025-04-23T00:33:31Z 2025-03-12T08:45:00Z

Building AI systems with foundation models requires a delicate balancing of resources such as memory, latency, storage, compute, and more. One size does not fit...

0 Anu Srivastava <![CDATA[Latest Multimodal Addition to Microsoft Phi SLMs Trained on NVIDIA GPUs]]> http://www.open-lab.net/blog/?p=96519 2025-04-23T02:39:30Z 2025-02-26T22:05:00Z

Large language models (LLMs) have permeated every industry and changed the potential of technology. However, due to their massive size they are not practical...

0 Emily Potyraj <![CDATA[NVIDIA DGX Cloud Introduces Ready-To-Use Templates to Benchmark AI Platform Performance]]> http://www.open-lab.net/blog/?p=95558 2025-04-23T02:52:54Z 2025-02-11T17:00:00Z

In the rapidly evolving landscape of AI systems and workloads, achieving optimal model training performance extends far beyond chip speed. It requires a...

0 Chintan Patel <![CDATA[Llama Nemotron Models Accelerate Agentic AI Workflows with Accuracy and Efficiency]]> http://www.open-lab.net/blog/?p=94595 2025-01-09T19:23:09Z 2025-01-07T03:40:00Z

Agentic AI, the next wave of generative AI, is a paradigm shift with the potential to revolutionize industries by enabling AI systems to act autonomously and...

0 Rakib Hasan <![CDATA[NVIDIA TensorRT-LLM Now Supports Recurrent Drafting for Optimizing LLM Inference]]> http://www.open-lab.net/blog/?p=92963 2025-03-11T01:44:00Z 2024-12-18T17:31:01Z

Recurrent drafting (referred to as ReDrafter) is a novel speculative decoding technique developed and open-sourced by Apple for large language model (LLM)...

0 Anna Shors <![CDATA[Data-Efficient Knowledge Distillation for Supervised Fine-Tuning with NVIDIA NeMo-Aligner]]> http://www.open-lab.net/blog/?p=94082 2024-12-18T01:43:12Z 2024-12-18T01:43:09Z

Knowledge distillation is an approach for transferring the knowledge of a much larger teacher model to a smaller student model, ideally yielding a compact,...

0 Bethann Noble <![CDATA[Deploying Fine-Tuned AI Models with NVIDIA NIM]]> http://www.open-lab.net/blog/?p=91696 2024-12-17T00:07:21Z 2024-11-21T22:04:57Z

For organizations adapting AI foundation models with domain-specific data, the ability to rapidly create and deploy fine-tuned models is key to efficiently...

0 David A. Smith <![CDATA[Spotlight: Advancing Autonomous Operations with AVEVA Dynamic Simulation and NVIDIA Raptor]]> http://www.open-lab.net/blog/?p=92360 2024-12-20T18:37:24Z 2024-11-21T17:28:26Z

Industrial engineers are turning to AI to build advanced process simulation solutions and accelerate progress toward fully autonomous operations in the energy,...

0 Ashraf Eassa <![CDATA[Llama 3.2 Full-Stack Optimizations Unlock High Performance on NVIDIA GPUs]]> http://www.open-lab.net/blog/?p=90142 2024-11-22T23:11:53Z 2024-11-19T16:00:00Z

Meta recently released its Llama 3.2 series of vision language models (VLMs), which come in 11B parameter and 90B parameter variants. These models are...

0 Maryam Ashoori <![CDATA[IBM's New Granite 3.0 Generative AI Models Are Small, Yet Highly Accurate and Efficient]]> http://www.open-lab.net/blog/?p=90636 2024-11-22T23:09:36Z 2024-10-21T19:15:35Z

Today, IBM released the third generation of IBM Granite, a collection of open language models and complementary tools. Prior generations of Granite focused on...

0 Chintan Patel <![CDATA[Develop Academic and Industrial Applications with a New Specialized Math Model]]> http://www.open-lab.net/blog/?p=89747 2024-10-17T18:19:06Z 2024-10-09T16:00:00Z

Mathstral, an advanced AI model developed from the ground up, can deliver superior performance for enhanced learning of math, engineering, and science.

0 Nick Comly <![CDATA[Boosting Llama 3.1 405B Throughput by Another 1.5x on NVIDIA H200 Tensor Core GPUs and NVLink Switch]]> http://www.open-lab.net/blog/?p=90040 2024-11-22T23:12:12Z 2024-10-09T15:00:00Z

The continued growth of LLM capabilities, fueled by increasing parameter counts and support for longer contexts, has led to their usage in a wide variety of...

1 Sharath Sreenivas <![CDATA[Mistral-NeMo-Minitron 8B Model Delivers Unparalleled Accuracy]]> http://www.open-lab.net/blog/?p=87739 2024-10-17T18:51:42Z 2024-10-08T19:20:54Z

This post was originally published August 21, 2024 but has been revised with current data. Recently, NVIDIA and Mistral AI unveiled Mistral NeMo 12B, a leading...

0 Jen Witsoe <![CDATA[Just Released: NVIDIA TensorRT-LLM 0.13.0]]> http://www.open-lab.net/blog/?p=89751 2024-10-17T19:06:58Z 2024-10-04T21:45:36Z

Updates include tensor parallel support for Mamba2, sparse mixer normalization for MoE models, and more.

0 Zhilin Wang <![CDATA[New Reward Model Helps Improve LLM Alignment with Human Preferences]]> http://www.open-lab.net/blog/?p=89655 2024-10-21T23:56:04Z 2024-10-03T16:00:00Z

Reinforcement learning from human feedback (RLHF) is essential for developing AI systems that are aligned with human values and preferences. RLHF enables the...

0 Annamalai Chockalingam <![CDATA[Accelerating LLMs with llama.cpp on NVIDIA RTX Systems]]> http://www.open-lab.net/blog/?p=89663 2024-11-22T23:11:17Z 2024-10-02T13:00:00Z

The NVIDIA RTX AI for Windows PCs platform offers a thriving ecosystem of thousands of open-source models for application developers to leverage and integrate...

0 Chintan Patel <![CDATA[Improve Reinforcement Learning from Human Feedback with Leaderboard-Topping Reward Model]]> http://www.open-lab.net/blog/?p=89583 2024-11-04T22:57:33Z 2024-09-30T19:21:18Z

Llama 3.1 Nemotron 70B Reward model helps generate high-quality training data that aligns with human preferences for finance, retail, healthcare, scientific...

0 Anjali Shah <![CDATA[Deploying Accelerated Llama 3.2 from the Edge to the Cloud]]> http://www.open-lab.net/blog/?p=89436 2024-11-07T05:08:12Z 2024-09-25T18:39:49Z

Expanding the open-source Meta Llama collection of models, the Llama 3.2 collection includes vision language models (VLMs), small language models (SLMs), and an...

0 Chintan Patel <![CDATA[Generate code with Abacus AI's Dracarys Large Language Model]]> http://www.open-lab.net/blog/?p=89091 2024-09-17T00:50:07Z 2024-09-17T00:50:04Z

Dracarys, fine-tuned from Llama 3.1 70B and available from NVIDIA NIM microservice, supports a variety of applications, including data analysis, text...

0 Ashraf Eassa <![CDATA[Low Latency Inference Chapter 1: Up to 1.9x Higher Llama 3.1 Performance with Medusa on NVIDIA HGX H200 with NVLink Switch]]> http://www.open-lab.net/blog/?p=88127 2024-11-29T21:06:37Z 2024-09-05T18:30:00Z

As large language models (LLMs) continue to grow in size and complexity, multi-GPU compute is a must-have to deliver the low latency and high throughput that...

0 Chintan Patel <![CDATA[New NIM Available: Mistral Large 2 Instruct LLM]]> http://www.open-lab.net/blog/?p=87308 2024-08-22T18:24:59Z 2024-08-13T20:37:24Z

The new model by Mistral excels at a variety of complex tasks including text summarization, multilingual translation and reasoning, programming, question and...

0 Amulya Vishwanath <![CDATA[Fast-Track Robot Learning in Simulation Using NVIDIA Isaac Lab]]> http://www.open-lab.net/blog/?p=86103 2024-08-08T19:23:57Z 2024-07-29T20:30:00Z

Originally published on July 29, 2024, this post was updated on October 8, 2024. Robots need to be adaptable, readily learning new skills and adjusting to their...

0 Anjali Shah <![CDATA[Power Text-Generation Applications with Mistral NeMo 12B Running on a Single GPU]]> http://www.open-lab.net/blog/?p=86123 2024-08-28T15:32:33Z 2024-07-26T21:03:15Z

NVIDIA collaborated with Mistral to co-build the next-generation language model that achieves leading performance across benchmarks in its class. With a growing...

3 Chintan Patel <![CDATA[Revolutionizing Code Completion with Codestral Mamba, the Next-Gen Coding LLM]]> http://www.open-lab.net/blog/?p=85101 2024-08-08T18:48:30Z 2024-07-25T19:57:14Z

In the rapidly evolving field of generative AI, coding models have become indispensable tools for developers, enhancing productivity and precision in software...

0 Anjali Shah <![CDATA[Supercharging Llama 3.1 across NVIDIA Platforms]]> http://www.open-lab.net/blog/?p=85678 2025-02-17T05:23:06Z 2024-07-23T15:15:00Z

Meta's Llama collection of large language models are the most popular foundation models in the open-source community today, supporting a variety of use cases....

13 Chintan Shah <![CDATA[Phi-3-Medium: Now Available on the NVIDIA API Catalog]]> http://www.open-lab.net/blog/?p=84759 2024-07-25T18:19:18Z 2024-07-02T16:42:36Z

Phi-3-Medium accelerates research with logic-rich features in both short (4K) and long (128K) context.

0 Chintan Patel <![CDATA[StarCoder2-15B: A Powerful LLM for Code Generation, Summarization, and Documentation]]> http://www.open-lab.net/blog/?p=84790 2024-07-25T18:19:19Z 2024-07-01T22:40:37Z

Trained on 600+ programming languages, StarCoder2-15B is now packaged as a NIM inference microservice available for free from the NVIDIA API catalog.

0 Hannah Simmons <![CDATA[Google's New Gemma 2 Model Now Optimized and Available on NVIDIA API Catalog]]> http://www.open-lab.net/blog/?p=84688 2024-07-25T18:19:20Z 2024-07-01T16:00:00Z

Gemma 2, the next generation of Google Gemma models, is now optimized with TensorRT-LLM and packaged as NVIDIA NIM inference microservice.

1 Guilherme Pombo <![CDATA[Transforming Financial Analysis with NVIDIA NIM]]> http://www.open-lab.net/blog/?p=84655 2024-08-28T16:46:12Z 2024-06-28T22:07:03Z

In financial services, portfolio managers and research analysts diligently sift through vast amounts of data to gain a competitive edge in investments. Making...

0 Hannah Simmons <![CDATA[Generate High-Quality, Context-Aware Responses for Chatbots and Search Engines with Llama 3-ChatQA]]> http://www.open-lab.net/blog/?p=84548 2024-07-10T15:28:34Z 2024-06-26T16:44:52Z

Experience and test Llama3-ChatQA models at scale with performance optimized NVIDIA NIM inference microservice using the NVIDIA API catalog.

0 Pengfei Guo <![CDATA[Addressing Medical Imaging Limitations with Synthetic Data Generation]]> http://www.open-lab.net/blog/?p=83468 2025-02-04T19:51:06Z 2024-06-24T17:50:59Z

Synthetic data in medical imaging offers numerous benefits, including the ability to augment datasets with diverse and realistic images where real data is...

0 Hannah Simmons <![CDATA[Simplify and Accelerate Programming Tasks with Mistral's Codestral GenAI Model]]> http://www.open-lab.net/blog/?p=84259 2024-06-27T18:17:58Z 2024-06-17T22:28:32Z

Experience Codestral, packaged as an NVIDIA NIM inference microservice for code completion, writing tests, and debugging in over 80 languages using the NVIDIA...

0 Hannah Simmons <![CDATA[SOLAR-10.7B: Optimized Model Tailored for Instruction Following, Reasoning, and Mathematical Tasks]]> http://www.open-lab.net/blog/?p=83828 2024-06-13T19:05:58Z 2024-06-10T15:00:00Z

Enhance efficiency and performance in instruction-based NLP tasks with SOLAR-10.7B, especially in following instructions, reasoning, and mathematical tasks.

0 Hannah Simmons <![CDATA[Breeze-7B: LLM Specialized for Traditional Chinese]]> http://www.open-lab.net/blog/?p=83334 2024-06-13T19:06:04Z 2024-06-03T17:00:00Z

The model demonstrates strong performance for tasks such as Q&A, multi-round chat, and summarization in both traditional Chinese and English.

0 Hannah Simmons <![CDATA[BGE-M3: Advanced Multilingual Text Retrieval Model]]> http://www.open-lab.net/blog/?p=83341 2024-06-13T19:06:03Z 2024-06-03T17:00:00Z

Experience the versatile embedding model designed for multilingual, multi-functional, and multi-granularity text retrieval tasks, excelling in dense,...

1 Hannah Simmons <![CDATA[Convert Natural Language to Code with CodeGemma]]> http://www.open-lab.net/blog/?p=83003 2024-06-13T19:11:39Z 2024-05-30T20:30:00Z

Experience the advanced LLM API for code generation, completion, mathematical reasoning, and instruction following with free cloud credits.

0 Nisanur Genc <![CDATA[Personalized Learning with Gipi, NVIDIA TensorRT-LLM, and AI Foundation Models]]> http://www.open-lab.net/blog/?p=82913 2024-05-30T19:55:44Z 2024-05-30T16:00:00Z

Over 1.2B people are actively learning new languages, with over 500M learners on digital learning platforms such as Duolingo. At the same time, a significant...

0 Chintan Patel <![CDATA[Create Content, Conversations, and Code with New Phi-3 and Granite Code Model Families]]> http://www.open-lab.net/blog/?p=82907 2024-05-30T19:55:45Z 2024-05-28T20:00:00Z

Generative AI is revolutionizing virtually every use case across every industry, thanks to the constant influx of groundbreaking foundation models capable of...

0 Hannah Simmons <![CDATA[Generate Text Responses from Visual and Text Inputs with Google's New PaliGemma Model]]> http://www.open-lab.net/blog/?p=82533 2024-06-07T21:15:13Z 2024-05-14T18:46:00Z

With free NVIDIA cloud credits, you can start testing the model at scale on the API Catalog.

0 Chintan Patel <![CDATA[Regional LLMs SEA-LION and SeaLLM Serve Languages and Cultures of Southeast Asia]]> http://www.open-lab.net/blog/?p=82014 2024-05-30T19:55:59Z 2024-05-13T17:00:00Z

At the recent World Governments Summit in Dubai, NVIDIA CEO Jensen Huang emphasized the importance of sovereign AI, which refers to a nation's capability to...

0 Amit Bleiweiss <![CDATA[Tips for Building a RAG Pipeline with NVIDIA AI LangChain AI Endpoints]]> http://www.open-lab.net/blog/?p=81895 2025-03-11T16:19:32Z 2024-05-08T16:00:00Z

Retrieval-augmented generation (RAG) is a technique that combines information retrieval with a set of carefully designed system prompts to provide more...

7 Chintan Patel <![CDATA[Leverage Mixture of Experts-Based DBRX for Superior LLM Performance on Diverse Tasks]]> http://www.open-lab.net/blog/?p=81586 2024-06-07T21:13:14Z 2024-04-30T17:21:35Z

This week's model release features DBRX, a state-of-the-art large language model (LLM) developed by Databricks. With demonstrated strength in programming and...

0 Anjali Shah <![CDATA[Turbocharging Meta Llama 3 Performance with NVIDIA TensorRT-LLM and NVIDIA Triton Inference Server]]> http://www.open-lab.net/blog/?p=81223 2024-11-14T15:54:32Z 2024-04-28T18:07:15Z

We're excited to announce support for the Meta Llama 3 family of models in NVIDIA TensorRT-LLM, accelerating and optimizing your LLM inference performance. You...

61 Chintan Patel <![CDATA[New LLM: Snowflake Arctic Model for SQL and Code Generation]]> http://www.open-lab.net/blog/?p=81484 2024-05-07T16:53:04Z 2024-04-27T00:42:50Z

Large language models (LLMs) have revolutionized natural language processing (NLP) in recent years, enabling a wide range of applications such as text...

0 Vishwesh Nath <![CDATA[Advancing Cell Segmentation and Morphology Analysis with NVIDIA AI Foundation Model VISTA-2D]]> http://www.open-lab.net/blog/?p=81250 2024-05-07T16:54:01Z 2024-04-22T18:30:00Z

Genomics researchers use different sequencing techniques to better understand biological systems, including single-cell and spatial omics. Unlike single-cell,...

0 Chintan Patel <![CDATA[Mistral Large and Mixtral 8x22B LLMs Now Powered by NVIDIA NIM and NVIDIA API]]> http://www.open-lab.net/blog/?p=80850 2024-06-06T14:50:14Z 2024-04-22T17:00:00Z

This week's model release features two new NVIDIA AI Foundation models, Mistral Large and Mixtral 8x22B, both developed by Mistral AI. These cutting-edge...

1 Erin Rapacki <![CDATA[Scale AI-Enabled Robotics Development Workloads with NVIDIA OSMO]]> http://www.open-lab.net/blog/?p=79317 2024-05-07T16:51:46Z 2024-03-18T23:00:00Z

Autonomous machine development is an iterative process of data generation and gathering, model training, and deployment characterized by complex multi-stage,...

0 Kyle Kranen <![CDATA[Applying Mixture of Experts in LLM Architectures]]> http://www.open-lab.net/blog/?p=79605 2024-06-06T14:53:24Z 2024-03-14T20:01:00Z

Mixture of experts (MoE) large language model (LLM) architectures have recently emerged, both in proprietary LLMs such as GPT-4, as well as in community models...

0 Amr Elmeleegy <![CDATA[Generate Stunning Images with Stable Diffusion XL on the NVIDIA AI Inference Platform]]> http://www.open-lab.net/blog/?p=78388 2025-03-18T18:31:44Z 2024-03-07T19:05:46Z

Diffusion models are transforming creative workflows across industries. These models generate stunning images based on simple text or image inputs by...

1 Chintan Patel <![CDATA[Solve Complex AI Tasks with Leaderboard-Topping Smaug 72B from NVIDIA AI Foundation Models]]> http://www.open-lab.net/blog/?p=78769 2024-05-07T16:50:32Z 2024-03-04T21:22:47Z

This week's model release features the NVIDIA-optimized language model Smaug 72B, which you can experience directly from your browser. NVIDIA AI Foundation...

0 Chia-Chih Chen <![CDATA[Unlock Your LLM Coding Potential with StarCoder2]]> http://www.open-lab.net/blog/?p=78552 2024-03-07T19:32:10Z 2024-02-28T14:00:00Z

Coding is essential in the digital age, but it can also be tedious and time-consuming. That's why many developers are looking for ways to automate and...

0 Chintan Patel <![CDATA[Unlock the Power of Small Language Model Phi-2 for Chat, Research, Coding, and More]]> http://www.open-lab.net/blog/?p=78402 2024-06-06T14:55:12Z 2024-02-27T18:00:39Z

This week's model release features the NVIDIA-optimized language model Phi-2, which can be used for a wide range of natural language processing (NLP) tasks....

0 Moon Chung <![CDATA[Experience NVIDIA cuOpt Accelerated Optimization to Boost Operational Efficiency]]> http://www.open-lab.net/blog/?p=77674 2024-09-16T16:25:19Z 2024-02-19T19:30:00Z

This week's model release features NVIDIA cuOpt, a world-record-breaking accelerated optimization engine that helps teams solve complex routing problems and...

0 Chintan Patel <![CDATA[Performance-Efficient Mamba-Chat from NVIDIA AI Foundation Models]]> http://www.open-lab.net/blog/?p=77766 2024-05-07T16:50:07Z 2024-02-12T21:24:04Z

This week's release features the NVIDIA-optimized Mamba-Chat model, which you can experience directly from your browser. This post is part of Model Mondays, a...

0 Chintan Patel <![CDATA[Generate Code, Answer Queries, and Translate Text with New NVIDIA AI Foundation Models]]> http://www.open-lab.net/blog/?p=77364 2024-05-07T19:14:10Z 2024-02-05T18:48:17Z

This week's Model Monday release features the NVIDIA-optimized Code Llama, Kosmos-2, and SeamlessM4T, which you can experience directly from your browser....

0 Shashank Verma <![CDATA[Query Graphs with Optimized DePlot Model]]> http://www.open-lab.net/blog/?p=77003 2024-05-07T16:48:52Z 2024-01-23T00:34:34Z

NVIDIA AI Foundation Models and Endpoints provides access to a curated set of community and NVIDIA-built generative AI models to experience, customize, and...
