As models grow larger and are trained on more data, they become more capable and more useful. Training these models quickly requires more performance, delivered at data center scale. The NVIDIA Blackwell platform, launched at GTC 2024 and now in full production, integrates seven types of chips, including GPU, CPU, DPU, NVLink Switch chip, InfiniBand Switch, and Ethernet Switch.
In the era of generative AI, accelerated networking is essential to build high-performance computing fabrics for massively distributed AI workloads. NVIDIA continues to lead in this space, offering state-of-the-art Ethernet and InfiniBand solutions that maximize the performance and efficiency of AI factories and cloud data centers. At the core of these solutions are NVIDIA SuperNICs, a new…
Inferencing for generative AI and AI agents will drive the need for AI compute infrastructure to be distributed from edge to central clouds. IDC predicts that "Business AI (consumer excluded) will contribute $19.9 trillion to the global economy and account for 3.5% of GDP by 2030." 5G networks must also evolve to serve this new incoming AI traffic. At the same time, there is an opportunity…
The NVIDIA DOCA SDK and acceleration framework empowers developers with extensive libraries, drivers, and APIs to create high-performance applications and services for NVIDIA BlueField DPUs and ConnectX SmartNICs. It fuels data center innovation, enabling rapid application deployment. With comprehensive features, NVIDIA DOCA serves as a one-stop shop for BlueField developers looking to accelerate…
The NVIDIA DOCA framework aims to simplify programming and application development for NVIDIA BlueField DPUs and ConnectX SmartNICs. It provides high-level abstraction building blocks relevant to network applications through an SDK, runtime binaries, and high-level APIs that enable developers to rapidly create applications and services. NVIDIA DOCA Flow is a newly updated set of software…
Real-time processing of network traffic can leverage the high degree of parallelism GPUs offer. Optimizing packet acquisition or transmission in these types of applications avoids bottlenecks and enables the overall execution to keep up with high-speed networks. In this context, DOCA GPUNetIO promotes the GPU as an independent component that can exercise network and compute tasks without…
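To make that GPU-driven model concrete, here is a minimal CUDA sketch of a persistent kernel that polls for work instead of waiting for the CPU to launch a kernel per batch. It illustrates the pattern DOCA GPUNetIO promotes rather than the GPUNetIO API itself: the WorkQueue structure, its flags, and the toy payload are hypothetical stand-ins for GPUNetIO receive queues and semaphores.

```cuda
#include <cuda_runtime.h>
#include <atomic>
#include <cstdio>

#define PKT_WORDS 256

// Hypothetical stand-in for a receive queue; real code would use DOCA GPUNetIO
// receive queues and semaphores instead of these flags.
struct WorkQueue {
    volatile int ready;       // producer sets this when a packet batch is available
    volatile int exit_flag;   // host sets this to stop the kernel
    int payload[PKT_WORDS];   // toy "packet" payload
};

// Persistent kernel: the GPU polls for work and processes it in place,
// removing the CPU from the per-batch critical path.
__global__ void persistent_worker(WorkQueue *q, int *result) {
    __shared__ int state;                        // 0 = idle, 1 = work, -1 = exit
    for (;;) {
        if (threadIdx.x == 0)
            state = q->exit_flag ? -1 : (q->ready ? 1 : 0);
        __syncthreads();
        if (state < 0) break;
        if (state > 0) {
            int sum = 0;                         // toy "processing": sum the payload
            for (int i = threadIdx.x; i < PKT_WORDS; i += blockDim.x)
                sum += q->payload[i];
            atomicAdd(result, sum);
            __syncthreads();
            if (threadIdx.x == 0) q->ready = 0;  // mark the batch consumed
        }
        __syncthreads();
    }
}

int main() {
    WorkQueue *q;
    int *d_result, h_result = 0;
    // Pinned, mapped host memory so host and device can exchange flags while the kernel runs.
    cudaHostAlloc((void **)&q, sizeof(WorkQueue), cudaHostAllocMapped);
    cudaMalloc((void **)&d_result, sizeof(int));
    cudaMemset(d_result, 0, sizeof(int));
    q->ready = 0; q->exit_flag = 0;

    persistent_worker<<<1, 128>>>(q, d_result);

    // The host stands in for the packet producer (the NIC, in the real design).
    for (int i = 0; i < PKT_WORDS; i++) q->payload[i] = 1;
    std::atomic_thread_fence(std::memory_order_seq_cst);  // publish payload before raising flag
    q->ready = 1;
    while (q->ready) { }                         // spin until the GPU consumes the batch

    q->exit_flag = 1;
    cudaDeviceSynchronize();
    cudaMemcpy(&h_result, d_result, sizeof(int), cudaMemcpyDeviceToHost);
    printf("sum = %d\n", h_result);              // expect PKT_WORDS
    cudaFreeHost(q); cudaFree(d_result);
    return 0;
}
```

In an actual DOCA GPUNetIO pipeline, packets land directly in GPU memory and kernels coordinate through GPUNetIO semaphores, so the host-side producer and spin loop above would not be needed.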
Oracle is one of the top cloud service providers in the world, supporting over 22,000 customers and reporting revenue of nearly $4 billion per quarter and annual growth of greater than 40%. Oracle Cloud Infrastructure (OCI) is growing at an even faster rate and offers a complete cloud infrastructure for every workload. Having added 11 regions in the last 18 months, OCI currently offers 41…
In MLPerf Inference v3.0, NVIDIA made its first submissions to the newly introduced Network division, which is now part of the MLPerf Inference Datacenter suite. The Network division is designed to simulate a real data center setup and strives to include the effect of networking (both hardware and software) in end-to-end inference performance. In the Network division…
The most exciting computing applications currently rely on training and running inference on complex AI models, often in demanding, real-time deployment scenarios. High-performance, accelerated AI platforms are needed to meet the demands of these applications and deliver the best user experiences. New AI models are constantly being invented to enable new capabilities…
As the GPU launches threads, dispatches kernels, and loads from memory, the CPU feeds it data asynchronously, accesses network communications, manages system resources, and more. This is just a snippet of the hardware activity needed to run an application: an orchestra of different components operating in perfect parallelism. As a developer, you are the conductor of an orchestra of hardware…
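As a rough illustration of that division of labor (not code from the post; the buffer sizes and the scale kernel are made up for the example), the CUDA sketch below has the CPU enqueue asynchronous copies and kernel launches on several streams and then step back, while the GPU overlaps the transfers for one chunk with the compute for another.

```cuda
#include <cuda_runtime.h>
#include <cstdio>

__global__ void scale(float *data, int n, float factor) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

int main() {
    const int n_chunks = 4, chunk = 1 << 20;
    const size_t bytes = (size_t)n_chunks * chunk * sizeof(float);
    float *h_buf, *d_buf;
    cudaHostAlloc((void **)&h_buf, bytes, cudaHostAllocDefault);  // pinned for async copies
    cudaMalloc((void **)&d_buf, bytes);
    for (int i = 0; i < n_chunks * chunk; i++) h_buf[i] = 1.0f;

    cudaStream_t streams[n_chunks];
    for (int s = 0; s < n_chunks; s++) cudaStreamCreate(&streams[s]);

    // The CPU "conducts": it enqueues copy/kernel/copy work for each chunk and returns
    // immediately, so copies for one chunk overlap kernel execution for another.
    for (int s = 0; s < n_chunks; s++) {
        float *h = h_buf + (size_t)s * chunk;
        float *d = d_buf + (size_t)s * chunk;
        cudaMemcpyAsync(d, h, (size_t)chunk * sizeof(float), cudaMemcpyHostToDevice, streams[s]);
        scale<<<(chunk + 255) / 256, 256, 0, streams[s]>>>(d, chunk, 2.0f);
        cudaMemcpyAsync(h, d, (size_t)chunk * sizeof(float), cudaMemcpyDeviceToHost, streams[s]);
    }

    // While the GPU works through the queued streams, the CPU is free for other tasks here.
    cudaDeviceSynchronize();
    printf("h_buf[0] = %f\n", h_buf[0]);  // expect 2.0

    for (int s = 0; s < n_chunks; s++) cudaStreamDestroy(streams[s]);
    cudaFreeHost(h_buf); cudaFree(d_buf);
    return 0;
}
```

Pinned host memory is what lets cudaMemcpyAsync overlap with kernel execution; with pageable memory the copies are staged and much of the overlap is lost.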
DPUs, or data processing units, are specialists in moving data within data centers. They are a new class of programmable processor that will join CPUs and GPUs as one of the three pillars of computing.
This post was updated May 8, 2023. A growing number of network applications need to perform real-time packet processing on the GPU in order to implement high-data-rate solutions: data filtering, data placement, network analysis, sensors' signal processing, and more. One primary motivation is the high degree of parallelism the GPU offers to process multiple packets in parallel while…
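A minimal sketch of that packet-level parallelism follows (illustrative only; the fixed-size Packet record, the port filter, and the synthetic traffic are assumptions made for the example, not the post's code): one CUDA thread classifies each packet in a batch, so thousands of packets are inspected concurrently.

```cuda
#include <cuda_runtime.h>
#include <cstdio>

#define BATCH_SIZE 4096
#define PKT_LEN    64           // simplified fixed-size packet records

// Simplified packet record; a real pipeline would parse Ethernet/IP/UDP headers
// from raw buffers delivered by the NIC.
struct Packet {
    unsigned short dst_port;
    unsigned char  payload[PKT_LEN];
};

// One thread per packet: mark packets destined to `port` and count the matches.
__global__ void filter_packets(const Packet *pkts, int n, unsigned short port,
                               unsigned char *match, int *n_matched) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        unsigned char hit = (pkts[i].dst_port == port);
        match[i] = hit;
        if (hit) atomicAdd(n_matched, 1);
    }
}

int main() {
    Packet *h_pkts, *d_pkts;
    unsigned char *d_match;
    int *d_count, h_count = 0;

    cudaHostAlloc((void **)&h_pkts, BATCH_SIZE * sizeof(Packet), cudaHostAllocDefault);
    cudaMalloc((void **)&d_pkts, BATCH_SIZE * sizeof(Packet));
    cudaMalloc((void **)&d_match, BATCH_SIZE);
    cudaMalloc((void **)&d_count, sizeof(int));
    cudaMemset(d_count, 0, sizeof(int));

    // Synthetic traffic: every fourth packet targets UDP port 4791 (RoCEv2), the rest port 80.
    for (int i = 0; i < BATCH_SIZE; i++)
        h_pkts[i].dst_port = (i % 4 == 0) ? 4791 : 80;

    cudaMemcpy(d_pkts, h_pkts, BATCH_SIZE * sizeof(Packet), cudaMemcpyHostToDevice);
    filter_packets<<<(BATCH_SIZE + 255) / 256, 256>>>(d_pkts, BATCH_SIZE, 4791, d_match, d_count);
    cudaMemcpy(&h_count, d_count, sizeof(int), cudaMemcpyDeviceToHost);

    printf("matched %d of %d packets\n", h_count, BATCH_SIZE);  // expect 1024
    cudaFreeHost(h_pkts); cudaFree(d_pkts); cudaFree(d_match); cudaFree(d_count);
    return 0;
}
```

In a real deployment the batch would arrive in GPU memory through DOCA GPUNetIO or GPUDirect rather than a cudaMemcpy, but the one-thread-per-packet (or one-block-per-packet) mapping is the same.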
A SmartNIC is a programmable accelerator that makes data center networking, security, and storage efficient and flexible.
The incredible increase in traffic within data centers, along with the increased adoption of virtualization, is placing strains on traditional data centers. Customarily, virtual machines rely on software interfaces such as VirtIO to connect with the hypervisor. Although VirtIO is significantly more flexible than SR-IOV, it can use up to 50% more compute power in the host…
Cloud computing is designed to be agile and resilient to deliver additional value for businesses. China Mobile (CMCC), one of China's largest telecom operators and cloud services providers, offers precisely this with its Bigcloud public cloud offering. Bigcloud provides PaaS and SaaS services tailored to the needs of enterprise cloud and hybrid-cloud solutions for mission-critical…
The inline processing of network packets using GPUs is a packet-analysis technique useful in a number of different application domains: signal processing, network security, information gathering, input reconstruction, and so on. The main requirement of these application types is to move received packets into GPU memory as soon as possible, to trigger the CUDA kernel responsible for executing…
The NVIDIA BlueField-2 data processing unit (DPU) delivers unmatched software-defined networking (SDN) performance, programmability, and scalability. It integrates eight Arm CPU cores, the secure and advanced ConnectX-6 Dx cloud network interface, and hardware accelerators that together offload, accelerate, and isolate SDN functions, performing connection tracking, flow matching…
This post was originally published on the Mellanox blog. Everyone is talking about data processing unit-based SmartNICs, but without answering one simple question: What is a SmartNIC, and what does it do? NIC stands for network interface card. Practically speaking, a NIC is a PCIe card that plugs into a server or storage box to enable connectivity to an Ethernet network.
The NVIDIA ConnectX NIC enables precise timekeeping for the social network's mission-critical distributed applications. Facebook is open-sourcing the Open Compute Project Time Appliance Project (OCP TAP), which provides very precise timekeeping and time synchronization across data centers in a cost-effective manner. The solution includes a Time Card that can turn almost any commercial off-the-shelf…
In 2020, many of us adopted a work-from-home routine, and this new norm has been stressing IT networks. It shouldn't be a surprise that the sudden boost in remote working drives the need for a more dynamic IT environment, one that can pull in resources on demand. Over the past few years, we've focused on the Media & Entertainment (M&E) market, supporting the global industry as it evolves from…
This post was originally published on the Mellanox blog. At Red Hat Summit 2018, NVIDIA Mellanox announced an open network functions virtualization infrastructure (NFVI) and cloud data center solution. The solution combined Red Hat Enterprise Linux cloud software with in-box support of NVIDIA Mellanox NIC hardware. Our close collaboration and joint validation with Red Hat yielded a fully…
This post was originally published on the Mellanox blog. XDP (eXpress Data Path) is a programmable data path in the Linux kernel network stack. It provides a framework for BPF programs and enables high-performance packet processing at runtime. XDP works in concert with the Linux network stack and is not a kernel bypass. Because XDP runs in the kernel network driver…
NVIDIA announced a new technology embedded in its NVIDIA Mellanox ConnectX-6 Dx SmartNIC and BlueField-2 I/O Processing Unit to optimize 5G networks. Referred to as 5T-for-5G, or time-triggered transmission technology for telco, this new technology delivers superbly accurate time synchronization across front-haul and mid-haul networks, providing telecommunications providers with higher…