Getting Started With NVIDIA TAO

NVIDIA TAO is a low-code, open-source AI framework that accelerates vision AI model development for all skill levels, from beginners to expert data scientists. Using transfer learning, you can adapt and optimize pretrained models to achieve state-of-the-art accuracy and production-class throughput quickly.


Download TAO

NVIDIA TAO Version 5.5: What’s New

This latest release of NVIDIA TAO delivers several foundation and multi-modal models to accelerate your AI development. These models and features unlock new potential and unleash tremendous productivity gains in vision AI.


  • Explore new foundation and multi-modal models:
    • Grounding-DINO—Open vocabulary object detection with fine-tuning
    • Mask-Grounding-DINO—Open vocabulary instance segmentation with fine-tuning
    • NV-CLIP—Foundation model for image and text embedding
    • BEVFusion—Sensor fusion model combining image and lidar data for 3D understanding with fine-tuning
    • SEGIC—In-context segmentation on any object based on visual prompting
    • Foundation Pose—Six DoF object pose estimation for any novel objects
    • Mask2Former—State-of-the-art instance and panoptic segmentation model with fine-tuning
  • Automatically create labeled datasets for object detection and segmentation using text prompts.
  • Use knowledge distillation to create smaller, more efficient, and accurate networks by distilling knowledge from larger networks.
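
To make the knowledge distillation idea concrete, here is a minimal sketch of a classic soft-target distillation loss in plain Python. It is a generic illustration, not TAO's implementation: the function names, the `temperature` and `alpha` parameters, and the example logits are all assumptions chosen for clarity.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, label,
                      temperature=2.0, alpha=0.5):
    """Blend hard-label cross-entropy with a soft-target KL term.

    Hypothetical helper for illustration; not part of any TAO API.
    """
    # Hard-label term: cross-entropy of the student against the true class.
    student_probs = softmax(student_logits)
    hard = -math.log(student_probs[label])

    # Soft-target term: KL(teacher || student) at a raised temperature.
    teacher_soft = softmax(teacher_logits, temperature)
    student_soft = softmax(student_logits, temperature)
    soft = sum(t * math.log(t / s)
               for t, s in zip(teacher_soft, student_soft) if t > 0)

    # The T^2 factor keeps the soft-term gradients on a comparable scale.
    return alpha * hard + (1.0 - alpha) * (temperature ** 2) * soft

# Made-up logits: one student that tracks the teacher, one that does not.
teacher = [4.0, 1.0, 0.5]
close_student = [3.5, 1.2, 0.4]
far_student = [0.5, 3.0, 1.0]
print(distillation_loss(close_student, teacher, label=0))
print(distillation_loss(far_student, teacher, label=0))
```

A student whose outputs track the teacher's receives a lower loss than one that disagrees, which is the signal that lets a smaller network inherit behavior from a larger one.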

Production-Ready Vision AI

NVIDIA TAO is also available as a part of NVIDIA AI Enterprise, an end-to-end, secure, cloud-native AI software platform optimized to accelerate enterprises to the leading edge of AI.

With enterprise-grade security, stability, manageability, and support, NVIDIA AI Enterprise speeds time to value while mitigating the potential risks of open-source software. This ensures business continuity and a reliable platform for running mission-critical AI applications.

Benefits of using TAO with NVIDIA AI Enterprise include:


  • Access to exclusive foundation models for vision AI that can be fine-tuned for custom vision AI tasks
  • Validation and integration for NVIDIA AI open-source software
  • Access to AI solution workflows to speed time to production
  • Certifications to deploy AI everywhere
  • Enterprise-grade support, security, manageability, and API stability to mitigate the potential risks of open-source software

Helpful Resources

New Foundational Models and Training Capabilities

Learn about the new features in TAO 5.5, including multimodal sensor fusion models, auto-labeling with text prompts, open-vocabulary detection, and more.

Read the Blog

Prompt-Based Auto Labeling

Learn how to use the prompt-based auto-labeling tool for object detection and segmentation to significantly reduce the time you spend creating labeled datasets.

Watch the Video

AI Training With Multi-Modal Foundation Models

Learn how to create purpose-built AI using vision and language with multi-modal foundation models.

Watch the GTC Talk


Developer Starter Resources


Training Notebooks & Containers

Sample Deployment Applications

To convert a TAO model (.etlt) to an NVIDIA TensorRT engine for deployment with DeepStream, select the appropriate TAO converter for your hardware and software stack.

Additional Resources


Product Support

Ethical AI

NVIDIA platforms and application frameworks enable developers to build a wide array of AI applications. Consider potential algorithmic bias when choosing or creating the models being deployed. Also, work with the model’s developer to ensure that it meets the requirements for the relevant industry and use case; that the necessary instruction and documentation are provided to understand error rates, confidence intervals, and results; and that the model is being used under the conditions and in the manner intended.