As AI workloads grow in complexity and scale—from large language models (LLMs) to agentic AI reasoning and physical AI—the demand for faster, more scalable compute infrastructure has never been greater. Meeting these demands requires rethinking system architecture from the ground up. NVIDIA is advancing platform architecture with NVIDIA ConnectX-8 SuperNICs, the industry’s first SuperNIC to…
In the era of generative AI, accelerated networking is essential to build high-performance computing fabrics for massively distributed AI workloads. NVIDIA continues to lead in this space, offering state-of-the-art Ethernet and InfiniBand solutions that maximize the performance and efficiency of AI factories and cloud data centers. At the core of these solutions are NVIDIA SuperNICs, a new…
In the era of generative AI, where machines are not just learning from data but generating human-like text, images, video, and more, retrieval-augmented generation (RAG) stands out as a groundbreaking approach. A RAG workflow builds on large language models (LLMs), which can understand queries and generate responses. However, LLMs have limitations, including training complexity and a lack of…
A SuperNIC is a type of network accelerator for AI cloud data centers that delivers robust and seamless connectivity between GPU servers.
ChatGPT, Stable Diffusion, DALL-E, and similar applications have awakened the world to generative AI. ChatGPT is the fastest-growing application in history. Its ease of use and impressive capabilities attracted over a hundred million users in just a few months. Generative AI has created a sense of urgency for companies to reimagine their products and business models. As NVIDIA CEO Jensen…
NVIDIA BlueField-3 data processing units (DPUs) are now in full production, and have been selected by Oracle Cloud Infrastructure (OCI) to achieve higher performance, better efficiency, and stronger security, as announced at NVIDIA GTC 2023. As a 400 Gb/s infrastructure compute platform, BlueField-3 enables organizations to deploy and operate data centers at massive scale.
As enterprises continue to shift workloads to the cloud, some applications need to remain on-premises to minimize latency and meet security, data sovereignty, and compliance policies. Microsoft Azure Stack HCI is a hyperconverged infrastructure (HCI) stack delivered as an Azure service. Providing built-in security and manageability, Azure Stack HCI is ideally positioned to run…
NVIDIA Zero Touch RoCE (ZTR) enables data centers to seamlessly deploy RDMA over Converged Ethernet (RoCE) without requiring any special switch configuration. Until recently, ZTR was optimal for only small to medium-sized data centers. Meanwhile, large-scale deployments have traditionally relied on Explicit Congestion Notification (ECN) to enable RoCE network transport…
The growing prevalence of GPU-accelerated computing in the cloud, in the enterprise, and at the edge increasingly relies on robust and powerful network infrastructures. NVIDIA ConnectX SmartNICs and NVIDIA BlueField DPUs provide high-throughput, low-latency connectivity that enables the scaling of GPU resources across a fleet of nodes. To address the demand for cloud-native AI workloads…
NVIDIA accelerated switching and packet processing (ASAP²) technology is becoming ubiquitous for supercharging networking and security in the most demanding applications. Modern data center networks are increasingly becoming virtualized and provisioned as a service. These software-defined networks (SDN) deliver great flexibility and control, enabling you to easily scale from the premises of…
Cloud technologies are increasingly taking over the worldwide IT infrastructure market. With offerings that include elastic compute, storage, and networking, cloud service providers (CSPs) allow customers to rapidly scale their IT infrastructure up and down without having to build and manage it on their own. The increasing demand for differentiated and cost-effective cloud products and services is…
This post was originally published on the Mellanox blog. In my previous Kubernetes post, Provision Bare-Metal Kubernetes Like a Cloud Giant!, I discussed the benefits of using BlueField DPU-programmable SmartNICs to simplify provisioning of Kubernetes clusters in bare-metal infrastructures. A key takeaway from that post was the current rapid shift toward bare-metal Kubernetes…