AI Foundation Models

Mar 12, 2025

Lightweight, Multimodal, Multilingual Gemma 3 Models Are Streamlined for Performance

Building AI systems with foundation models requires a delicate balancing of resources such as memory, latency, storage, compute, and more. One size does not fit...

3 MIN READ

Feb 26, 2025

Latest Multimodal Addition to Microsoft Phi SLMs Trained on NVIDIA GPUs

Large language models (LLMs) have permeated every industry and changed the potential of technology. However, due to their massive size they are not practical...

4 MIN READ

Feb 11, 2025

NVIDIA DGX Cloud Introduces Ready-To-Use Templates to Benchmark AI Platform Performance

In the rapidly evolving landscape of AI systems and workloads, achieving optimal model training performance extends far beyond chip speed. It requires a...

7 MIN READ

Jan 06, 2025

Llama Nemotron Models Accelerate Agentic AI Workflows with Accuracy and Efficiency

Agentic AI, the next wave of generative AI, is a paradigm shift with the potential to revolutionize industries by enabling AI systems to act autonomously and...

8 MIN READ

Dec 18, 2024

NVIDIA TensorRT-LLM Now Supports Recurrent Drafting for Optimizing LLM Inference

Recurrent drafting (referred to as ReDrafter) is a novel speculative decoding technique developed and open-sourced by Apple for large language model (LLM)...

6 MIN READ

Dec 17, 2024

Data-Efficient Knowledge Distillation for Supervised Fine-Tuning with NVIDIA NeMo-Aligner

Knowledge distillation is an approach for transferring the knowledge of a much larger teacher model to a smaller student model, ideally yielding a compact,...

5 MIN READ

Nov 21, 2024

Deploying Fine-Tuned AI Models with NVIDIA NIM

For organizations adapting AI foundation models with domain-specific data, the ability to rapidly create and deploy fine-tuned models is key to efficiently...

6 MIN READ

Nov 21, 2024

Spotlight: Advancing Autonomous Operations with AVEVA Dynamic Simulation and NVIDIA Raptor

Industrial engineers are turning to AI to build advanced process simulation solutions and accelerate progress toward fully autonomous operations in the energy,...

6 MIN READ

Nov 19, 2024

Llama 3.2 Full-Stack Optimizations Unlock High Performance on NVIDIA GPUs

Meta recently released its Llama 3.2 series of vision language models (VLMs), which come in 11B parameter and 90B parameter variants. These models are...

6 MIN READ

Oct 21, 2024

IBM’s New Granite 3.0 Generative AI Models Are Small, Yet Highly Accurate and Efficient

Today, IBM released the third generation of IBM Granite, a collection of open language models and complementary tools. Prior generations of Granite focused on...

5 MIN READ

Oct 09, 2024

Develop Academic and Industrial Applications with a New Specialized Math Model

Mathstral, an advanced AI model developed from the ground up, can deliver superior performance for enhanced learning of math, engineering, and science.

1 MIN READ

Oct 09, 2024

Boosting Llama 3.1 405B Throughput by Another 1.5x on NVIDIA H200 Tensor Core GPUs and NVLink Switch

The continued growth of LLM capabilities, fueled by increasing parameter counts and support for longer contexts, has led to their usage in a wide variety of...

8 MIN READ

Oct 08, 2024

Mistral-NeMo-Minitron 8B Model Delivers Unparalleled Accuracy

This post was originally published August 21, 2024, but has been revised with current data. Recently, NVIDIA and Mistral AI unveiled Mistral NeMo 12B, a leading...

7 MIN READ

Oct 04, 2024

Just Released: NVIDIA TensorRT-LLM 0.13.0

Updates include tensor parallel support for Mamba2, sparse mixer normalization for MoE models, and more.

1 MIN READ

Oct 03, 2024

New Reward Model Helps Improve LLM Alignment with Human Preferences

Reinforcement learning from human feedback (RLHF) is essential for developing AI systems that are aligned with human values and preferences. RLHF enables the...

4 MIN READ

Oct 02, 2024

Accelerating LLMs with llama.cpp on NVIDIA RTX Systems

The NVIDIA RTX AI for Windows PCs platform offers a thriving ecosystem of thousands of open-source models for application developers to leverage and integrate...

5 MIN READ