AI factories rely on more than just compute fabrics. While the East-West network connecting the GPUs is critical to AI application performance, the storage fabric, which connects high-speed storage arrays, is equally important. Storage performance plays a key role across several stages of the AI lifecycle, including training checkpointing, inference techniques such as retrieval-augmented generation…
The growth of AI is driving exponential growth in computing power and a doubling of networking speeds every few years. Less well known is that it's also putting new demands on storage. Training new models typically requires high-bandwidth networked access to petabytes of data, while inference with the latest types of retrieval-augmented generation (RAG) requires low-latency access to…
As the demand for sophisticated AI capabilities escalates, VAST Data introduces the VAST Data Platform, now enhanced with NVIDIA BlueField DPUs. This innovation is tailored to meet the stringent demands of AI-driven data centers and to optimize AI workloads and data management. This post presents how BlueField DPUs give VAST a significant boost in both performance and efficiency to…
As AI becomes integral to organizational innovation and competitive advantage, the need for efficient and scalable infrastructure is more critical than ever. A partnership between NVIDIA and DDN Storage is setting new standards in this area. By integrating NVIDIA BlueField DPUs into DDN EXAScaler and DDN Infinia and using them innovatively, DDN Storage is transforming data-centric workloads.
Identity-based attacks are on the rise, with phishing remaining the most common and second-most expensive attack vector. Some attackers are using AI to craft more convincing phishing messages and deploying bots to get around automated defenses designed to spot suspicious behavior. At the same time, a continued increase in enterprise applications introduces challenges for IT teams who must…
As data generation continues to increase, linear performance scaling has become an absolute requirement for scale-out storage. Storage networks are like car roadway systems: if the road is not built for speed, the potential speed of a car does not matter. Even a Ferrari is slow on an unpaved dirt road full of obstacles. Scale-out storage performance can be hindered by the Ethernet fabric…
At COMPUTEX 2023, NVIDIA announced the NVIDIA DGX GH200, which marks another breakthrough in GPU-accelerated computing to power the most demanding giant AI workloads. In addition to describing critical aspects of the NVIDIA DGX GH200 architecture, this post discusses how NVIDIA Base Command enables rapid deployment, accelerates the onboarding of users, and simplifies system management.
Announced in March 2023, NVIDIA DOCA 2.0, the newest release of the NVIDIA SDK for BlueField DPUs, is now available. Together, NVIDIA DOCA and BlueField DPUs accelerate the development of applications that deliver breakthrough networking, security, and storage performance with a comprehensive, open development platform. NVIDIA DOCA 2.0 includes newly added support for the BlueField-3 Data…
There are many benefits of GPUs in scaling AI, ranging from faster model training to GPU-accelerated fraud detection. When planning AI models and deployed apps, scalability challenges, especially performance and storage, must be accounted for. Regardless of the use case, AI solutions have four elements in common. Of these elements, data storage is often the most neglected during…
Supercomputers are used to model and simulate the most complex processes in scientific computing, often for insight into new discoveries that otherwise would be impractical or impossible to demonstrate physically. The NVIDIA BlueField data processing unit (DPU) is transforming high-performance computing (HPC) resources into more efficient systems, while accelerating problem solving across a…
Artificial intelligence (AI) is becoming pervasive in the enterprise. Speech recognition, recommenders, and fraud detection are just a few applications among hundreds being driven by AI and deep learning (DL). To support these AI applications, businesses look toward optimizing AI servers and performance networks. Unfortunately, storage infrastructure requirements are often overlooked in the…
NVIDIA Zero Touch RoCE (ZTR) enables data centers to seamlessly deploy RDMA over Converged Ethernet (RoCE) without requiring any special switch configuration. Until recently, ZTR was optimal for only small to medium-sized data centers. Meanwhile, large-scale deployments have traditionally relied on Explicit Congestion Notification (ECN) to enable RoCE network transport…
This post is an update to an older post. Deep learning models require training with vast amounts of data to achieve accurate results. Raw data usually cannot be fed directly into a neural network for reasons such as differing storage formats, compression, varying data formats and sizes, and a limited amount of high-quality data. Addressing these issues requires extensive data preparation…
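The preparation the excerpt describes typically means decoding, resizing to a uniform shape, normalizing, and batching before samples reach the network. A minimal NumPy sketch of the normalize-and-batch step, assuming already-decoded images of varying sizes (this is an illustration only, not NVIDIA DALI's actual API; DALI runs these stages as an accelerated pipeline, often on the GPU):

```python
import numpy as np

def center_crop(img, size):
    """Crop an HxWxC array to size x size around its center."""
    h, w = img.shape[:2]
    top = (h - size) // 2
    left = (w - size) // 2
    return img[top:top + size, left:left + size]

def prepare_batch(images, size=224):
    """Crop to a common shape, scale to [0, 1], and stack into one NCHW batch.

    `images` is a list of uint8 HxWxC arrays of varying sizes, standing in
    for decoded JPEGs; a real pipeline also handles decoding, augmentation,
    and shuffling.
    """
    batch = np.stack([center_crop(img, size) for img in images])
    batch = batch.astype(np.float32) / 255.0   # normalize pixel range
    return batch.transpose(0, 3, 1, 2)         # NHWC -> NCHW for the framework

# Three "decoded images" of different sizes become one uniform batch.
imgs = [np.random.randint(0, 256, (s, s, 3), dtype=np.uint8) for s in (256, 300, 240)]
batch = prepare_batch(imgs, size=224)
print(batch.shape)  # (3, 3, 224, 224)
```

The point of the sketch is that every sample must end up with identical shape and dtype before batching, which is exactly the work that cannot be skipped when feeding a neural network.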
Does the switch matter? The network fabric is key to the performance of modern data centers. There are many requirements for data center switches, but the most basic is to provide equal amounts of bandwidth to all clients so that resources are shared evenly. Without fair networking, all workloads experience unpredictable performance due to throughput deterioration, delay…
DOCA is a software framework for developing applications on BlueField DPUs. By using DOCA, you can offload infrastructure workloads from the host CPU and accelerate them with the BlueField DPU. This enables an infrastructure that is software-defined yet hardware accelerated, maximizing both performance and flexibility in the data center. NVIDIA first introduced DOCA in October 2020.
Cloud technologies are increasingly taking over the worldwide IT infrastructure market. With offerings that include elastic compute, storage, and networking, cloud service providers (CSPs) allow customers to rapidly scale their IT infrastructure up and down without having to build and manage it on their own. The increasing demand for differentiated and cost-effective cloud products and services is…
As AI and HPC datasets continue to increase in size, the time spent loading data for a given application begins to place a strain on the total application's performance. When considering end-to-end application performance, fast GPUs are increasingly starved by slow I/O. I/O, the process of loading data from storage to GPUs for processing, has historically been controlled by the CPU.
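When I/O is CPU-mediated, a classic software mitigation is to overlap the next read with computation on the current chunk, so the processor is never idle waiting on storage. A minimal thread-based prefetch sketch (illustrative only, with hypothetical `load_chunk`/`compute` stand-ins; technologies like GPUDirect Storage go further by moving data directly from storage to GPU memory, bypassing the CPU bounce buffer):

```python
import concurrent.futures as cf

def load_chunk(i):
    """Stand-in for a slow storage read (the CPU-mediated I/O path)."""
    return list(range(i * 4, i * 4 + 4))

def compute(chunk):
    """Stand-in for GPU work performed on a loaded chunk."""
    return sum(chunk)

def pipeline(n_chunks):
    """Prefetch the next chunk on a worker thread while processing the current one."""
    results = []
    with cf.ThreadPoolExecutor(max_workers=1) as pool:
        pending = pool.submit(load_chunk, 0)       # kick off the first read
        for i in range(n_chunks):
            chunk = pending.result()               # wait for the in-flight read
            if i + 1 < n_chunks:
                pending = pool.submit(load_chunk, i + 1)  # issue the next read early
            results.append(compute(chunk))         # compute overlaps the pending I/O
    return results

print(pipeline(3))  # [6, 22, 38]
```

Overlap hides I/O latency but cannot raise I/O bandwidth; when the reads themselves are the bottleneck, the data path itself has to change, which is the motivation the excerpt builds toward.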
Editor's Note: This post has been updated. Here is the revised post. Training deep learning models with vast amounts of data is necessary to achieve accurate results. Data in the wild, or even prepared datasets, is usually not in a form that can be directly fed into a neural network. This is where NVIDIA DALI data preprocessing comes into play. There are various reasons for that…
When production systems are not delivering expected levels of performance, root-causing the issue can be a challenging and time-consuming task, especially in today's complex environments, where the workload comprises many software components and libraries and relies on virtually all of the underlying hardware subsystems (CPU, memory, disk I/O, network I/O) to deliver maximum throughput.