AI Foundation Models

Mar 12, 2025
Lightweight, Multimodal, Multilingual Gemma 3 Models Are Streamlined for Performance
Building AI systems with foundation models requires delicately balancing resources such as memory, latency, storage, and compute. One size does not fit...
3 MIN READ

Feb 26, 2025
Latest Multimodal Addition to Microsoft Phi SLMs Trained on NVIDIA GPUs
Large language models (LLMs) have permeated every industry and changed the potential of technology. However, due to their massive size, they are not practical...
4 MIN READ

Feb 11, 2025
NVIDIA DGX Cloud Introduces Ready-To-Use Templates to Benchmark AI Platform Performance
In the rapidly evolving landscape of AI systems and workloads, achieving optimal model training performance extends far beyond chip speed. It requires a...
7 MIN READ

Jan 06, 2025
Llama Nemotron Models Accelerate Agentic AI Workflows with Accuracy and Efficiency
Agentic AI, the next wave of generative AI, is a paradigm shift with the potential to revolutionize industries by enabling AI systems to act autonomously and...
8 MIN READ

Dec 18, 2024
NVIDIA TensorRT-LLM Now Supports Recurrent Drafting for Optimizing LLM Inference
Recurrent drafting (referred to as ReDrafter) is a novel speculative decoding technique developed and open-sourced by Apple for large language model (LLM)...
6 MIN READ

Dec 17, 2024
Data-Efficient Knowledge Distillation for Supervised Fine-Tuning with NVIDIA NeMo-Aligner
Knowledge distillation is an approach for transferring the knowledge of a much larger teacher model to a smaller student model, ideally yielding a compact,...
5 MIN READ

Nov 21, 2024
Deploying Fine-Tuned AI Models with NVIDIA NIM
For organizations adapting AI foundation models with domain-specific data, the ability to rapidly create and deploy fine-tuned models is key to efficiently...
6 MIN READ

Nov 21, 2024
Spotlight: Advancing Autonomous Operations with AVEVA Dynamic Simulation and NVIDIA Raptor
Industrial engineers are turning to AI to build advanced process simulation solutions and accelerate progress toward fully autonomous operations in the energy,...
6 MIN READ

Nov 19, 2024
Llama 3.2 Full-Stack Optimizations Unlock High Performance on NVIDIA GPUs
Meta recently released its Llama 3.2 series of vision language models (VLMs), which come in 11B parameter and 90B parameter variants. These models are...
6 MIN READ

Oct 21, 2024
IBM’s New Granite 3.0 Generative AI Models Are Small, Yet Highly Accurate and Efficient
Today, IBM released the third generation of IBM Granite, a collection of open language models and complementary tools. Prior generations of Granite focused on...
5 MIN READ

Oct 09, 2024
Develop Academic and Industrial Applications with a New Specialized Math Model
Mathstral, an advanced AI model developed from the ground up, can deliver superior performance for enhanced learning of math, engineering, and science.
1 MIN READ

Oct 09, 2024
Boosting Llama 3.1 405B Throughput by Another 1.5x on NVIDIA H200 Tensor Core GPUs and NVLink Switch
The continued growth of LLM capabilities, fueled by increasing parameter counts and support for longer contexts, has led to their usage in a wide variety of...
8 MIN READ

Oct 08, 2024
Mistral-NeMo-Minitron 8B Model Delivers Unparalleled Accuracy
This post was originally published August 21, 2024, but has been revised with current data. Recently, NVIDIA and Mistral AI unveiled Mistral NeMo 12B, a leading...
7 MIN READ

Oct 04, 2024
Just Released: NVIDIA TensorRT-LLM 0.13.0
Updates include tensor parallel support for Mamba2, sparse mixer normalization for MoE models, and more.
1 MIN READ

Oct 03, 2024
New Reward Model Helps Improve LLM Alignment with Human Preferences
Reinforcement learning from human feedback (RLHF) is essential for developing AI systems that are aligned with human values and preferences. RLHF enables the...
4 MIN READ

Oct 02, 2024
Accelerating LLMs with llama.cpp on NVIDIA RTX Systems
The NVIDIA RTX AI for Windows PCs platform offers a thriving ecosystem of thousands of open-source models for application developers to leverage and integrate...
5 MIN READ