AI Foundation Models – NVIDIA Technical BlogNews and tutorials for developers, data scientists, and IT admins2025-04-29T22:44:15Zhttp://www.open-lab.net/blog/feed/Asawaree Bhide<![CDATA[R2D2: Advancing Robot Mobility and Whole-Body Control with Novel Workflows and AI Foundation Models from NVIDIA Research]]>http://www.open-lab.net/blog/?p=981932025-04-03T18:45:30Z2025-03-27T15:00:00ZWelcome to the first edition of the NVIDIA Robotics Research and Development Digest (R2D2). This technical blog series will give developers and researchers...
]]>0Kalyan Meher Vadrevu<![CDATA[Accelerate Generalist Humanoid Robot Development with NVIDIA Isaac GR00T N1]]>http://www.open-lab.net/blog/?p=970162025-03-31T20:48:04Z2025-03-18T17:40:00ZHumanoid robots are designed to adapt to human workspaces, tackling repetitive or demanding tasks. However, creating general-purpose humanoid robots for...
]]>0Anu Srivastava<![CDATA[Lightweight, Multimodal, Multilingual Gemma 3 Models Are Streamlined for Performance]]>http://www.open-lab.net/blog/?p=967702025-04-23T00:33:31Z2025-03-12T08:45:00ZBuilding AI systems with foundation models requires a delicate balancing of resources such as memory, latency, storage, compute, and more. One size does not fit...
]]>0Anu Srivastava<![CDATA[Latest Multimodal Addition to Microsoft Phi SLMs Trained on NVIDIA GPUs]]>http://www.open-lab.net/blog/?p=965192025-04-23T02:39:30Z2025-02-26T22:05:00ZLarge language models (LLMs) have permeated every industry and changed the potential of technology. However, due to their massive size they are not practical...
]]>0Emily Potyraj<![CDATA[NVIDIA DGX Cloud Introduces Ready-To-Use Templates to Benchmark AI Platform Performance]]>http://www.open-lab.net/blog/?p=955582025-04-23T02:52:54Z2025-02-11T17:00:00ZIn the rapidly evolving landscape of AI systems and workloads, achieving optimal model training performance extends far beyond chip speed. It requires a...
]]>0Chintan Patel<![CDATA[Llama Nemotron Models Accelerate Agentic AI Workflows with Accuracy and Efficiency]]>http://www.open-lab.net/blog/?p=945952025-01-09T19:23:09Z2025-01-07T03:40:00ZAgentic AI, the next wave of generative AI, is a paradigm shift with the potential to revolutionize industries by enabling AI systems to act autonomously and...
]]>0Rakib Hasan<![CDATA[NVIDIA TensorRT-LLM Now Supports Recurrent Drafting for Optimizing LLM Inference]]>http://www.open-lab.net/blog/?p=929632025-03-11T01:44:00Z2024-12-18T17:31:01ZRecurrent drafting (referred to as ReDrafter) is a novel speculative decoding technique developed and open-sourced by Apple for large language model (LLM)...
]]>0Anna Shors<![CDATA[Data-Efficient Knowledge Distillation for Supervised Fine-Tuning with NVIDIA NeMo-Aligner]]>http://www.open-lab.net/blog/?p=940822024-12-18T01:43:12Z2024-12-18T01:43:09ZKnowledge distillation is an approach for transferring the knowledge of a much larger teacher model to a smaller student model, ideally yielding a compact,...
]]>0Bethann Noble<![CDATA[Deploying Fine-Tuned AI Models with NVIDIA NIM]]>http://www.open-lab.net/blog/?p=916962024-12-17T00:07:21Z2024-11-21T22:04:57ZFor organizations adapting AI foundation models with domain-specific data, the ability to rapidly create and deploy fine-tuned models is key to efficiently...
]]>0David A. Smith<![CDATA[Spotlight: Advancing Autonomous Operations with AVEVA Dynamic Simulation and NVIDIA Raptor]]>http://www.open-lab.net/blog/?p=923602024-12-20T18:37:24Z2024-11-21T17:28:26ZIndustrial engineers are turning to AI to build advanced process simulation solutions and accelerate progress toward fully autonomous operations in the energy,...
]]>0Ashraf Eassa<![CDATA[Llama 3.2 Full-Stack Optimizations Unlock High Performance on NVIDIA GPUs]]>http://www.open-lab.net/blog/?p=901422024-11-22T23:11:53Z2024-11-19T16:00:00ZMeta recently released its Llama 3.2 series of vision language models (VLMs), which come in 11B parameter and 90B parameter variants. These models are...
]]>0Maryam Ashoori<![CDATA[IBM's New Granite 3.0 Generative AI Models Are Small, Yet Highly Accurate and Efficient]]>http://www.open-lab.net/blog/?p=906362024-11-22T23:09:36Z2024-10-21T19:15:35ZToday, IBM released the third generation of IBM Granite, a collection of open language models and complementary tools. Prior generations of Granite focused on...
]]>0Chintan Patel<![CDATA[Develop Academic and Industrial Applications with a New Specialized Math Model]]>http://www.open-lab.net/blog/?p=897472024-10-17T18:19:06Z2024-10-09T16:00:00ZMathstral, an advanced AI model developed from the ground up, can deliver superior performance for enhanced learning of math, engineering, and science.
]]>0Nick Comly<![CDATA[Boosting Llama 3.1 405B Throughput by Another 1.5x on NVIDIA H200 Tensor Core GPUs and NVLink Switch]]>http://www.open-lab.net/blog/?p=900402024-11-22T23:12:12Z2024-10-09T15:00:00ZThe continued growth of LLM capability, fueled by increasing parameter counts and support for longer contexts, has led to their usage in a wide variety of...
]]>1Sharath Sreenivas<![CDATA[Mistral-NeMo-Minitron 8B Model Delivers Unparalleled Accuracy]]>http://www.open-lab.net/blog/?p=877392024-10-17T18:51:42Z2024-10-08T19:20:54ZThis post was originally published August 21, 2024 but has been revised with current data. Recently, NVIDIA and Mistral AI unveiled Mistral NeMo 12B, a leading...
]]>0Jen Witsoe<![CDATA[Just Released: NVIDIA TensorRT-LLM 0.13.0]]>http://www.open-lab.net/blog/?p=897512024-10-17T19:06:58Z2024-10-04T21:45:36ZUpdates include tensor parallel support for Mamba2, sparse mixer normalization for MoE models, and more.
]]>0Zhilin Wang<![CDATA[New Reward Model Helps Improve LLM Alignment with Human Preferences]]>http://www.open-lab.net/blog/?p=896552024-10-21T23:56:04Z2024-10-03T16:00:00ZReinforcement learning from human feedback (RLHF) is essential for developing AI systems that are aligned with human values and preferences. RLHF enables the...
]]>0Annamalai Chockalingam<![CDATA[Accelerating LLMs with llama.cpp on NVIDIA RTX Systems]]>http://www.open-lab.net/blog/?p=896632024-11-22T23:11:17Z2024-10-02T13:00:00ZThe NVIDIA RTX AI for Windows PCs platform offers a thriving ecosystem of thousands of open-source models for application developers to leverage and integrate...
]]>0Chintan Patel<![CDATA[Improve Reinforcement Learning from Human Feedback with Leaderboard-Topping Reward Model]]>http://www.open-lab.net/blog/?p=895832024-11-04T22:57:33Z2024-09-30T19:21:18ZLlama 3.1 Nemotron 70B Reward model helps generate high-quality training data that aligns with human preferences for finance, retail, healthcare, scientific...
]]>0Anjali Shah<![CDATA[Deploying Accelerated Llama 3.2 from the Edge to the Cloud]]>http://www.open-lab.net/blog/?p=894362024-11-07T05:08:12Z2024-09-25T18:39:49ZExpanding the open-source Meta Llama collection of models, the Llama 3.2 collection includes vision language models (VLMs), small language models (SLMs), and an...
]]>0Chintan Patel<![CDATA[Generate code with Abacus AI's Dracarys Large Language Model]]>http://www.open-lab.net/blog/?p=890912024-09-17T00:50:07Z2024-09-17T00:50:04ZDracarys, fine-tuned from Llama 3.1 70B and available from NVIDIA NIM microservice, supports a variety of applications, including data analysis, text...
]]>0Ashraf Eassa<![CDATA[Low Latency Inference Chapter 1: Up to 1.9x Higher Llama 3.1 Performance with Medusa on NVIDIA HGX H200 with NVLink Switch]]>http://www.open-lab.net/blog/?p=881272024-11-29T21:06:37Z2024-09-05T18:30:00ZAs large language models (LLMs) continue to grow in size and complexity, multi-GPU compute is a must-have to deliver the low latency and high throughput that...
]]>0Chintan Patel<![CDATA[New NIM Available: Mistral Large 2 Instruct LLM]]>http://www.open-lab.net/blog/?p=873082024-08-22T18:24:59Z2024-08-13T20:37:24ZThe new model by Mistral excels at a variety of complex tasks including text summarization, multilingual translation and reasoning, programming, question and...
]]>0Amulya Vishwanath<![CDATA[Fast-Track Robot Learning in Simulation Using NVIDIA Isaac Lab]]>http://www.open-lab.net/blog/?p=861032024-08-08T19:23:57Z2024-07-29T20:30:00ZOriginally published on July 29, 2024, this post was updated on October 8, 2024. Robots need to be adaptable, readily learning new skills and adjusting to their...
]]>0Anjali Shah<![CDATA[Power Text-Generation Applications with Mistral NeMo 12B Running on a Single GPU]]>http://www.open-lab.net/blog/?p=861232024-08-28T15:32:33Z2024-07-26T21:03:15ZNVIDIA collaborated with Mistral to co-build the next-generation language model that achieves leading performance across benchmarks in its class. With a growing...
]]>3Chintan Patel<![CDATA[Revolutionizing Code Completion with Codestral Mamba, the Next-Gen Coding LLM]]>http://www.open-lab.net/blog/?p=851012024-08-08T18:48:30Z2024-07-25T19:57:14ZIn the rapidly evolving field of generative AI, coding models have become indispensable tools for developers, enhancing productivity and precision in software...
]]>0Anjali Shah<![CDATA[Supercharging Llama 3.1 across NVIDIA Platforms]]>http://www.open-lab.net/blog/?p=856782025-02-17T05:23:06Z2024-07-23T15:15:00ZMeta's Llama collection of large language models is the most popular set of foundation models in the open-source community today, supporting a variety of use cases....
]]>13Chintan Shah<![CDATA[Phi-3-Medium: Now Available on the NVIDIA API Catalog]]>http://www.open-lab.net/blog/?p=847592024-07-25T18:19:18Z2024-07-02T16:42:36ZPhi-3-Medium accelerates research with logic-rich features in both short (4K) and long (128K) context.
]]>0Chintan Patel<![CDATA[StarCoder2-15B: A Powerful LLM for Code Generation, Summarization, and Documentation]]>http://www.open-lab.net/blog/?p=847902024-07-25T18:19:19Z2024-07-01T22:40:37ZTrained on 600+ programming languages, StarCoder2-15B is now packaged as a NIM inference microservice available for free from the NVIDIA API catalog.
]]>0Hannah Simmons<![CDATA[Google's New Gemma 2 Model Now Optimized and Available on NVIDIA API Catalog]]>http://www.open-lab.net/blog/?p=846882024-07-25T18:19:20Z2024-07-01T16:00:00ZGemma 2, the next generation of Google Gemma models, is now optimized with TensorRT-LLM and packaged as an NVIDIA NIM inference microservice.
]]>1Guilherme Pombo<![CDATA[Transforming Financial Analysis with NVIDIA NIM]]>http://www.open-lab.net/blog/?p=846552024-08-28T16:46:12Z2024-06-28T22:07:03ZIn financial services, portfolio managers and research analysts diligently sift through vast amounts of data to gain a competitive edge in investments. Making...
]]>0Hannah Simmons<![CDATA[Generate High-Quality, Context-Aware Responses for Chatbots and Search Engines with Llama 3-ChatQA]]>http://www.open-lab.net/blog/?p=845482024-07-10T15:28:34Z2024-06-26T16:44:52ZExperience and test Llama3-ChatQA models at scale with performance optimized NVIDIA NIM inference microservice using the NVIDIA API catalog.
]]>0Pengfei Guo<![CDATA[Addressing Medical Imaging Limitations with Synthetic Data Generation]]>http://www.open-lab.net/blog/?p=834682025-02-04T19:51:06Z2024-06-24T17:50:59ZSynthetic data in medical imaging offers numerous benefits, including the ability to augment datasets with diverse and realistic images where real data is...
]]>0Hannah Simmons<![CDATA[Simplify and Accelerate Programming Tasks with Mistral's Codestral GenAI Model]]>http://www.open-lab.net/blog/?p=842592024-06-27T18:17:58Z2024-06-17T22:28:32ZExperience Codestral, packaged as an NVIDIA NIM inference microservice for code completion, writing tests, and debugging in over 80 languages using the NVIDIA...
]]>0Hannah Simmons<![CDATA[SOLAR-10.7B: Optimized Model Tailored Instruction Following, Reasoning, and Mathematical Tasks]]>http://www.open-lab.net/blog/?p=838282024-06-13T19:05:58Z2024-06-10T15:00:00ZEnhance efficiency and performance in instruction-based NLP tasks with SOLAR-10.7B, especially in following instructions, reasoning, and mathematical tasks.
]]>0Hannah Simmons<![CDATA[Breeze-7B: LLM Specialized for Traditional Chinese]]>http://www.open-lab.net/blog/?p=833342024-06-13T19:06:04Z2024-06-03T17:00:00ZThe model demonstrates strong performance for tasks such as Q&A, multi-round chat, and summarization in both traditional Chinese and English.
]]>0Hannah Simmons<![CDATA[BGE-M3: Advanced Multilingual Text Retrieval Model]]>http://www.open-lab.net/blog/?p=833412024-06-13T19:06:03Z2024-06-03T17:00:00ZExperience the versatile embedding model designed for multilingual, multi-functional, and multi-granularity text retrieval tasks, excelling in dense,...
]]>1Hannah Simmons<![CDATA[Convert Natural Language to Code with CodeGemma]]>http://www.open-lab.net/blog/?p=830032024-06-13T19:11:39Z2024-05-30T20:30:00ZExperience the advanced LLM API for code generation, completion, mathematical reasoning, and instruction following with free cloud credits.
]]>0Nisanur Genc<![CDATA[Personalized Learning with Gipi, NVIDIA TensorRT-LLM, and AI Foundation Models]]>http://www.open-lab.net/blog/?p=829132024-05-30T19:55:44Z2024-05-30T16:00:00ZOver 1.2B people are actively learning new languages, with over 500M learners on digital learning platforms such as Duolingo. At the same time, a significant...
]]>0Chintan Patel<![CDATA[Create Content, Conversations, and Code with New Phi-3 and Granite Code Model Families]]>http://www.open-lab.net/blog/?p=829072024-05-30T19:55:45Z2024-05-28T20:00:00ZGenerative AI is revolutionizing virtually every use case across every industry, thanks to the constant influx of groundbreaking foundation models capable of...
]]>0Hannah Simmons<![CDATA[Generate Text Responses from Visual and Text Inputs with Google's New PaliGemma Model]]>http://www.open-lab.net/blog/?p=825332024-06-07T21:15:13Z2024-05-14T18:46:00ZWith free NVIDIA cloud credits, you can start testing the model at scale on the API Catalog.
]]>0Chintan Patel<![CDATA[Regional LLMs SEA-LION and SeaLLM Serve Languages and Cultures of Southeast Asia]]>http://www.open-lab.net/blog/?p=820142024-05-30T19:55:59Z2024-05-13T17:00:00ZAt the recent World Governments Summit in Dubai, NVIDIA CEO Jensen Huang emphasized the importance of sovereign AI, which refers to a nation's capability to...
]]>0Amit Bleiweiss<![CDATA[Tips for Building a RAG Pipeline with NVIDIA AI LangChain AI Endpoints]]>http://www.open-lab.net/blog/?p=818952025-03-11T16:19:32Z2024-05-08T16:00:00ZRetrieval-augmented generation (RAG) is a technique that combines information retrieval with a set of carefully designed system prompts to provide more...
]]>7Chintan Patel<![CDATA[Leverage Mixture of Experts-Based DBRX for Superior LLM Performance on Diverse Tasks]]>http://www.open-lab.net/blog/?p=815862024-06-07T21:13:14Z2024-04-30T17:21:35ZThis week's model release features DBRX, a state-of-the-art large language model (LLM) developed by Databricks. With demonstrated strength in programming and...
]]>0Anjali Shah<![CDATA[Turbocharging Meta Llama 3 Performance with NVIDIA TensorRT-LLM and NVIDIA Triton Inference Server]]>http://www.open-lab.net/blog/?p=812232024-11-14T15:54:32Z2024-04-28T18:07:15ZWe're excited to announce support for the Meta Llama 3 family of models in NVIDIA TensorRT-LLM, accelerating and optimizing your LLM inference performance. You...
]]>61Chintan Patel<![CDATA[New LLM: Snowflake Arctic Model for SQL and Code Generation]]>http://www.open-lab.net/blog/?p=814842024-05-07T16:53:04Z2024-04-27T00:42:50ZLarge language models (LLMs) have revolutionized natural language processing (NLP) in recent years, enabling a wide range of applications such as text...
]]>0Vishwesh Nath<![CDATA[Advancing Cell Segmentation and Morphology Analysis with NVIDIA AI Foundation Model VISTA-2D]]>http://www.open-lab.net/blog/?p=812502024-05-07T16:54:01Z2024-04-22T18:30:00ZGenomics researchers use different sequencing techniques to better understand biological systems, including single-cell and spatial omics. Unlike single-cell,...
]]>0Chintan Patel<![CDATA[Mistral Large and Mixtral 8x22B LLMs Now Powered by NVIDIA NIM and NVIDIA API]]>http://www.open-lab.net/blog/?p=808502024-06-06T14:50:14Z2024-04-22T17:00:00ZThis week's model release features two new NVIDIA AI Foundation models, Mistral Large and Mixtral 8x22B, both developed by Mistral AI. These cutting-edge...
]]>1Erin Rapacki<![CDATA[Scale AI-Enabled Robotics Development Workloads with NVIDIA OSMO]]>http://www.open-lab.net/blog/?p=793172024-05-07T16:51:46Z2024-03-18T23:00:00ZAutonomous machine development is an iterative process of data generation and gathering, model training, and deployment characterized by complex multi-stage,...
]]>0Kyle Kranen<![CDATA[Applying Mixture of Experts in LLM Architectures]]>http://www.open-lab.net/blog/?p=796052024-06-06T14:53:24Z2024-03-14T20:01:00ZMixture of experts (MoE) large language model (LLM) architectures have recently emerged, both in proprietary LLMs such as GPT-4, as well as in community models...
]]>0Amr Elmeleegy<![CDATA[Generate Stunning Images with Stable Diffusion XL on the NVIDIA AI Inference Platform]]>http://www.open-lab.net/blog/?p=783882025-03-18T18:31:44Z2024-03-07T19:05:46ZDiffusion models are transforming creative workflows across industries. These models generate stunning images based on simple text or image inputs by...
]]>1Chintan Patel<![CDATA[Solve Complex AI Tasks with Leaderboard-Topping Smaug 72B from NVIDIA AI Foundation Models]]>http://www.open-lab.net/blog/?p=787692024-05-07T16:50:32Z2024-03-04T21:22:47ZThis week's model release features the NVIDIA-optimized language model Smaug 72B, which you can experience directly from your browser. NVIDIA AI Foundation...
]]>0Chia-Chih Chen<![CDATA[Unlock Your LLM Coding Potential with StarCoder2]]>http://www.open-lab.net/blog/?p=785522024-03-07T19:32:10Z2024-02-28T14:00:00ZCoding is essential in the digital age, but it can also be tedious and time-consuming. That's why many developers are looking for ways to automate and...
]]>0Chintan Patel<![CDATA[Unlock the Power of Small Language Model Phi-2 for Chat, Research, Coding, and More]]>http://www.open-lab.net/blog/?p=784022024-06-06T14:55:12Z2024-02-27T18:00:39ZThis week's model release features the NVIDIA-optimized language model Phi-2, which can be used for a wide range of natural language processing (NLP) tasks....
]]>0Moon Chung<![CDATA[Experience NVIDIA cuOpt Accelerated Optimization to Boost Operational Efficiency]]>http://www.open-lab.net/blog/?p=776742024-09-16T16:25:19Z2024-02-19T19:30:00ZThis week's model release features NVIDIA cuOpt, a world-record-breaking accelerated optimization engine that helps teams solve complex routing problems and...
]]>0Chintan Patel<![CDATA[Performance-Efficient Mamba-Chat from NVIDIA AI Foundation Models]]>http://www.open-lab.net/blog/?p=777662024-05-07T16:50:07Z2024-02-12T21:24:04ZThis week's release features the NVIDIA-optimized Mamba-Chat model, which you can experience directly from your browser. This post is part of Model Mondays, a...
]]>0Chintan Patel<![CDATA[Generate Code, Answer Queries, and Translate Text with New NVIDIA AI Foundation Models]]>http://www.open-lab.net/blog/?p=773642024-05-07T19:14:10Z2024-02-05T18:48:17ZThis week's Model Monday release features the NVIDIA-optimized Code Llama, Kosmos-2, and SeamlessM4T, which you can experience directly from your browser....
]]>0Shashank Verma<![CDATA[Query Graphs with Optimized DePlot Model]]>http://www.open-lab.net/blog/?p=770032024-05-07T16:48:52Z2024-01-23T00:34:34ZNVIDIA AI Foundation Models and Endpoints provides access to a curated set of community and NVIDIA-built generative AI models to experience, customize, and...