Multi-Instance GPU (MIG) is an important feature of NVIDIA H100, A100, and A30 Tensor Core GPUs, as it can partition a GPU into multiple instances. Each instance has its own compute cores, high-bandwidth memory, L2 cache, DRAM bandwidth, and media engines such as decoders. This enables multiple workloads or multiple users to run workloads simultaneously on one GPU to maximize the GPU…
For scalable data center performance, NVIDIA GPUs have become a must-have. NVIDIA GPU parallel processing capabilities, supported by thousands of computing cores, are essential to accelerating a wide variety of applications across different industries. The most compute-intensive applications across diverse industries use GPUs today: Different applications across this spectrum can…
Monitoring GPUs is critical for infrastructure or site reliability engineering (SRE) teams who manage large-scale GPU clusters for AI or HPC workloads. GPU metrics allow teams to understand workload behavior and thus optimize resource allocation and utilization, diagnose anomalies, and increase overall data center efficiency. Apart from infrastructure teams, you might also be interested in metrics…
The new NVIDIA A100 GPU based on the NVIDIA Ampere GPU architecture delivers the greatest generational leap in accelerated computing. The A100 GPU has revolutionary hardware capabilities and we’re excited to announce CUDA 11 in conjunction with A100. CUDA 11 enables you to leverage the new hardware capabilities to accelerate HPC, genomics, 5G, rendering, deep learning, data analytics…
Editor’s note: Interested in GPU Operator? Register for our upcoming webinar on January 20th, “How to Easily use GPUs with Kubernetes”. Over the last few years, NVIDIA has leveraged GPU containers in a variety of ways for testing, development and running AI workloads in production at scale. Containers optimized for NVIDIA GPUs and systems such as the DGX and OEM NGC-Ready servers are available…
NVIDIA uses containers to develop, test, benchmark, and deploy deep learning (DL) frameworks and HPC applications. We wrote about building and deploying GPU containers at scale using NVIDIA-Docker roughly two years ago. Since then, NVIDIA-Docker has been downloaded close to 2 million times. A variety of customers have used NVIDIA-Docker to containerize and run GPU-accelerated workloads.