• <xmp id="om0om">
  • <table id="om0om"><noscript id="om0om"></noscript></table>
  • NVIDIA DGX Cloud Serverless Inference

    NVIDIA DGX? Cloud Serverless Inference is a high-performance, serverless AI inference solution that accelerates AI innovation with auto-scaling, cost-efficient GPU utilization, multi-cloud flexibility, and seamless scalability.

    Sign UpDocumentation


    NVIDIA DGX Cloud Serverless Inference Demo Video

    Simplify and Scale Inference

    See how NVIDIA DGX Cloud Serverless Inference, accelerated by NVIDIA Cloud Functions (NVCF), simplifies AI workload deployment across multiple regions with seamless auto-scaling, load balancing, and event-driven execution. You can bring your own models, containers, or Helm charts and instantly integrate with NVIDIA GPUs in DGX Cloud or partner infrastructure.


    How NVIDIA DGX Cloud Serverless Inference Works

    AI builders can easily package and deploy inference pipelines or data preprocessing workflows in containers optimized for NVIDIA GPUs—without worrying about underlying infrastructure. With flexible deployment options via API, CLI, or UI and built-in features like autoscaling, monitoring, and secret management, DGX Cloud Serverless Inference with NVCF lets you focus on developing and fine-tuning your AI models while it handles resource management.

    NVCF can provision and deploy applications and containers on DGX Cloud or through NVIDIA cloud partners (NCPs).

    img-alt-text

    Introductory Blog

    Learn how NVCF deploys, manages, and serves GPU-powered containerized applications across multiple regions and in the data center.

    Quick-Start Guide

    Learn how to deploy an NVIDIA NIM? microservice in five minutes using NIM APIs across popular application frameworks on NVIDIA DGX Cloud Serverless Inference.

    Introductory Demo

    Discover intelligent, interactive avatars for customer service created by the NVIDIA Tokkio reference workflow and deployed with NVIDIA DGX Cloud Serverless Inference.


    NVIDIA DGX Cloud Serverless Inference Key Features

    Auto-Scaling to Zero

    With NVIDIA DGX Cloud Serverless Inference, you can scale down to zero instances during periods of inactivity to optimize resource utilization and reduce costs. There’s no extra cost for cold-boot start times, and the system is optimized to minimize them.

    BYO Observability

    NVIDIA DGX Cloud Serverless Inference is powered by NVCF, which offers robust observability features. It allows you to integrate your preferred monitoring tools, such as Splunk, for comprehensive insights into your AI workloads.

    Broad Workload Support

    NVCF offers flexible deployment options for NIM microservices, while allowing you to bring your own containers, models, and Helm charts. By hosting these assets within the NGC? Private Registry, you can seamlessly create and manage functions tailored to your specific AI workloads.

    Targeted Deployment

    NVCF supports targeted deployment, providing you with the flexibility to choose instance types with specific characteristics, such as number of GPUs, number of CPU cores, CPU architecture, storage, and geographical location.


    Get Started With NVIDIA DGX Cloud Serverless Inference

    NVIDIA Build on DGX Cloud

    Try

    Experience leading models on NVIDIA build, accelerated by NVIDIA DGX Cloud Serverless Inference.

    Try NVIDIA NIM APIs
    DGX Cloud Serverless Inference Trial

    Sign Up

    Get a 30-day preview of NVIDIA DGX Cloud Serverless Inference, powered by NVIDIA Cloud Functions.

    Request a 30-Day Preview

    NVIDIA DGX Cloud Serverless Inference Learning Library

    Research

    AI Inference

    What is AI Inference?

    Learn how AI inference generates outputs from a model based on inputs like images, text, or video to enable applications like weather forecasts or LLM conversations.

    Documentation

    Build Agentic AI With NVIDIA NIM

    Learn more about NVIDIA NIM.

    Explore technical documentation to start prototyping and building your enterprise AI applications with NVIDIA APIs, or scale on your own infrastructure with NVIDIA NIM.

    Documentation

    NVIDIA Cloud Functions (NVCF)

    Learn more about NVCF.

    Check out NVIDIA Cloud Functions (NVCF), a serverless API that deploys and manages AI workloads on GPUs, providing security, scale, and reliability to your workloads.

    Tech Blog

    NVIDIA-Optimized Code for Popular LLMs

    Learn more about NVIDIA AI Foundation models and endpoints.

    Get tips in this tech blog for generating code, answering queries, and translating text on Llama, Kosmos-2, and SeamlessM4T with NVIDIA AI Foundation models.

    Tech Blog

    NVIDIA Core SDKs With Direct Access to NVIDIA GPUs

    Explore the NVIDIA API catalog.

    Visit the NVIDIA API catalog to experience models optimized to deliver the best performance on NVIDIA-accelerated infrastructure directly from your browser or connect to NVIDIA-hosted endpoints.

    Tech Blog

    Gauge AI Workload Performance

    Learn about NVIDIA DGX Cloud benchmarking recipes.

    Evaluate performance of deep learning models across any GPU-based infrastructure—on premises or in the cloud—with featured recipes in the DGX Cloud Benchmarking Collection.


    DGX Cloud Serverless Inference Ecosystem


    More Resources

    NVIDIA Developer Forums

    NVIDIA DGX Cloud Serverless Inference FAQ

    NVIDIA Training and Certification

    Get Training and Certification

    NVIDIA Inception Program for Startups

    Join the NVIDIA Developer Program


      Ethical AI

        NVIDIA believes trustworthy AI is a shared responsibility, and we have established policies and practices to support the development of AI across a wide array of applications. When downloading or using this model in accordance with our terms of service, developers should work with their supporting model team to ensure this model meets requirements for the relevant industry and use case and addresses unforeseen product misuse.

        For more detailed information on ethical considerations for this model, please see the Model Card++ Explainability, Bias, Safety & Security, and Privacy Subcards. Please report security vulnerabilities or NVIDIA AI Concerns here.

        Get started with NVIDIA DGX Cloud Serverless Inference today.

        Request a POC

        人人超碰97caoporen国产