Magnum IO – NVIDIA Technical Blog News and tutorials for developers, data scientists, and IT admins 2024-10-25T21:20:44Z http://www.open-lab.net/blog/feed/ Giuseppe Congiu <![CDATA[Memory Efficiency, Faster Initialization, and Cost Estimation with NVIDIA Collective Communications Library 2.22]]> http://www.open-lab.net/blog/?p=87077 2024-09-19T19:30:36Z 2024-09-17T00:31:08Z For the past few months, the NVIDIA Collective Communications Library (NCCL) developers have been working hard on a set of new library features and bug fixes....]]>

For the past few months, the NVIDIA Collective Communications Library (NCCL) developers have been working hard on a set of new library features and bug fixes. In this post, we discuss the details of the NCCL 2.22 release and the pain points addressed. NVIDIA Magnum IO NCCL is a library designed to optimize inter-GPU and multi-node communication, crucial for efficient parallel computing…

Source

]]>
0
Akhil Langer <![CDATA[Enhancing Application Portability and Compatibility across New Platforms Using NVIDIA Magnum IO NVSHMEM 3.0]]> http://www.open-lab.net/blog/?p=88550 2024-09-19T19:34:01Z 2024-09-06T20:30:09Z NVSHMEM is a parallel programming interface that provides efficient and scalable communication for NVIDIA GPU clusters. Part of NVIDIA Magnum IO and based on...]]>

NVSHMEM is a parallel programming interface that provides efficient and scalable communication for NVIDIA GPU clusters. Part of NVIDIA Magnum IO and based on OpenSHMEM, NVSHMEM creates a global address space for data that spans the memory of multiple GPUs and can be accessed with fine-grained GPU-initiated operations, CPU-initiated operations, and operations on CUDA streams.
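The global address space described above can be pictured with a small host-side model. This is a toy Python illustration of the partitioned-global-address-space (PGAS) idea, not the NVSHMEM API: the `SymmetricHeap` class, PE counts, and offsets are all hypothetical names chosen for the sketch.

```python
# Toy model of a partitioned global address space (PGAS), the idea behind
# NVSHMEM: each processing element (PE) owns a partition of a symmetric
# heap, and any PE can put/get into a peer's partition by (pe, offset).
# In NVSHMEM each partition lives in that PE's GPU memory and the
# operations are one-sided, GPU- or CPU-initiated transfers.

class SymmetricHeap:
    def __init__(self, n_pes, words_per_pe):
        # One partition per PE, all the same size ("symmetric").
        self.partitions = [[0] * words_per_pe for _ in range(n_pes)]

    def put(self, dest_pe, offset, value):
        # One-sided write into a remote PE's partition.
        self.partitions[dest_pe][offset] = value

    def get(self, src_pe, offset):
        # One-sided read from a remote PE's partition.
        return self.partitions[src_pe][offset]

heap = SymmetricHeap(n_pes=4, words_per_pe=8)
heap.put(dest_pe=2, offset=0, value=42)   # PE 0 writes into PE 2's memory
print(heap.get(src_pe=2, offset=0))       # any PE can read it back: 42
```

The point of the model is that addressing is by (PE, offset) rather than by raw pointer, which is what lets fine-grained GPU-initiated operations target remote memory directly.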

Source

]]>
0
Stefan Maintz <![CDATA[Optimize Energy Efficiency of Multi-Node VASP Simulations with NVIDIA Magnum IO]]> http://www.open-lab.net/blog/?p=72724 2023-11-20T18:42:51Z 2023-11-13T16:00:00Z Computational energy efficiency has become a primary decision criterion for most supercomputing centers. Data centers, once built, are capped in terms of the...]]>

Computational energy efficiency has become a primary decision criterion for most supercomputing centers. Data centers, once built, are capped in terms of the amount of power they can use without expensive and time-consuming retrofits. Maximizing insight in the form of workload throughput then means maximizing workload per watt. NVIDIA products have, for several generations…

Source

]]>
0
Shruthii Sathyanarayanan <![CDATA[Optimizing Production AI Performance and Efficiency with NVIDIA AI Enterprise 3.0]]> http://www.open-lab.net/blog/?p=61145 2023-02-25T00:31:02Z 2023-02-22T18:30:00Z NVIDIA AI Enterprise is an end-to-end, secure, cloud-native suite of AI software. The recent release of NVIDIA AI Enterprise 3.0 introduces new features to help...]]>

NVIDIA AI Enterprise is an end-to-end, secure, cloud-native suite of AI software. The recent release of NVIDIA AI Enterprise 3.0 introduces new features to help optimize the performance and efficiency of production AI. This post provides details about the new features listed below and how they work. New AI workflows in the 3.0 release of NVIDIA AI Enterprise help reduce the…

Source

]]>
0
Pak Markthub <![CDATA[Improving Network Performance of HPC Systems Using NVIDIA Magnum IO NVSHMEM and GPUDirect Async]]> http://www.open-lab.net/blog/?p=57629 2022-12-01T19:52:29Z 2022-11-22T17:00:00Z Today's leading-edge high performance computing (HPC) systems contain tens of thousands of GPUs. In NVIDIA systems, GPUs are connected on nodes through the...]]>

Today's leading-edge high performance computing (HPC) systems contain tens of thousands of GPUs. In NVIDIA systems, GPUs are connected on nodes through the NVLink scale-up interconnect, and across nodes through a scale-out network like InfiniBand. The software libraries that GPUs use to communicate, share work, and efficiently operate in parallel are collectively called NVIDIA Magnum IO…

Source

]]>
2
Stefan Maintz <![CDATA[Scaling VASP with NVIDIA Magnum IO]]> http://www.open-lab.net/blog/?p=57394 2023-04-17T02:20:16Z 2022-11-15T21:42:07Z You could make an argument that the history of civilization and technological advancement is the history of the search and discovery of materials. Ages are...]]>

You could make an argument that the history of civilization and technological advancement is the history of the search and discovery of materials. Ages are named not for leaders or civilizations but for the materials that defined them: Stone Age, Bronze Age, and so on. The current digital or information age could be renamed the Silicon or Semiconductor Age and retain the same meaning.

Source

]]>
1
Karthik Mandakolathur <![CDATA[Doubling all2all Performance with NVIDIA Collective Communication Library 2.12]]> http://www.open-lab.net/blog/?p=44338 2024-08-16T14:36:30Z 2022-02-28T17:00:00Z Collective communications are a performance-critical ingredient of modern distributed AI training workloads such as recommender systems and natural language...]]>

Collective communications are a performance-critical ingredient of modern distributed AI training workloads such as recommender systems and natural language processing. NVIDIA Collective Communication Library (NCCL), a Magnum IO Library, implements GPU-accelerated collective operations: NCCL is topology-aware and is optimized to achieve high bandwidth and low latency over PCIe…
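The all2all collective named in the post title has a simple data-movement pattern worth seeing concretely: rank i sends its j-th chunk to rank j, so the result is a transpose of the send buffers. The sketch below is a pure-Python model of that semantics, not the NCCL API (on GPUs this would be expressed with grouped NCCL point-to-point calls); the `all2all` function name and buffer layout are illustrative assumptions.

```python
# All-to-all ("all2all") semantics in miniature: with n ranks, rank i's
# send buffer is split into n chunks, and chunk j goes to rank j. The
# receive buffers are therefore the transpose of the send buffers.
# NCCL accelerates exactly this exchange over NVLink/PCIe/InfiniBand.

def all2all(send_buffers):
    n = len(send_buffers)
    # recv_buffers[j][i] receives send_buffers[i][j]
    return [[send_buffers[i][j] for i in range(n)] for j in range(n)]

# 3 ranks, each holding one chunk destined for every rank.
send = [["a0", "a1", "a2"],
        ["b0", "b1", "b2"],
        ["c0", "c1", "c2"]]
recv = all2all(send)
print(recv)  # [['a0','b0','c0'], ['a1','b1','c1'], ['a2','b2','c2']]
```

Because every rank exchanges data with every other rank, all2all stresses bisection bandwidth more than most collectives, which is why topology-aware scheduling matters for its performance.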

Source

]]>
0
CJ Newburn <![CDATA[Accelerating IO in the Modern Data Center: Magnum IO Storage Partnerships]]> http://www.open-lab.net/blog/?p=39968 2023-03-22T01:16:55Z 2021-11-09T09:30:00Z With computation shifting from the CPU to faster GPUs for AI, ML and HPC applications, IO into and out of the GPU can become the primary bottleneck to the...]]>

With computation shifting from the CPU to faster GPUs for AI, ML and HPC applications, IO into and out of the GPU can become the primary bottleneck to the overall application performance. NVIDIA created Magnum IO GPUDirect Storage (GDS) to streamline data movement between storage and GPU memory and remove performance bottlenecks in the platform, like being forced to store and forward data…
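The store-and-forward bottleneck mentioned above comes down to copy count: without GDS, data is staged through a CPU bounce buffer on its way to the GPU; with a direct DMA path it moves in one hop. The following is a toy copy-count model of that difference, not the cuFile API, and the function names are invented for illustration.

```python
# Why GPUDirect Storage helps: the baseline path store-and-forwards data
# through a CPU bounce buffer (two copies and a trip through system
# memory), while GDS enables a direct DMA between storage and GPU memory
# (one copy). This model only counts hops; real gains also come from
# freeing CPU cycles and system-memory bandwidth.

def hops_via_bounce_buffer():
    return ["storage->cpu_bounce_buffer", "cpu_bounce_buffer->gpu_memory"]

def hops_via_gds():
    return ["storage->gpu_memory"]

print(len(hops_via_bounce_buffer()))  # 2
print(len(hops_via_gds()))            # 1
```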

Source

]]>
1
Scot Schultz <![CDATA[Accelerating Cloud-Native Supercomputing with Magnum IO]]> http://www.open-lab.net/blog/?p=40232 2023-03-22T01:16:54Z 2021-11-09T09:30:00Z Supercomputers are significant investments. However, they are extremely valuable tools for researchers and scientists. To effectively and securely share the...]]>

Supercomputers are significant investments. However, they are extremely valuable tools for researchers and scientists. To effectively and securely share the computational might of these data centers, NVIDIA introduced the Cloud-Native Supercomputing architecture. It combines bare metal performance, multitenancy, and performance isolation for supercomputing. Magnum IO, the I/O…

Source

]]>
2
CJ Newburn <![CDATA[Accelerating IO in the Modern Data Center: Magnum IO Storage]]> http://www.open-lab.net/blog/?p=35783 2022-08-21T23:52:25Z 2021-08-23T18:02:00Z This is the fourth post in the Accelerating IO series. It addresses storage issues and shares recent results and directions with our partners. We cover the new...]]>

This is the fourth post in the Accelerating IO series. It addresses storage issues and shares recent results and directions with our partners. We cover the new GPUDirect Storage release, benefits, and implementation. Accelerated computing needs accelerated IO. Otherwise, computing resources get starved for data. Given that the fraction of all workflows for which data fits in memory is…

Source

]]>
1
Kushal Datta <![CDATA[Optimizing Data Movement in GPU Applications with the NVIDIA Magnum IO Developer Environment]]> http://www.open-lab.net/blog/?p=30198 2023-03-22T01:12:00Z 2021-04-12T17:00:00Z Magnum IO is the collection of IO technologies from NVIDIA and Mellanox that make up the IO subsystem of the modern data center and enable applications at...]]>

Magnum IO is the collection of IO technologies from NVIDIA and Mellanox that make up the IO subsystem of the modern data center and enable applications at scale. If you are trying to scale up your application to multiple GPUs, or scaling it out across multiple nodes, you are probably using some of the libraries in Magnum IO. NVIDIA is now publishing the Magnum IO Developer Environment 21.04…

Source

]]>
0
CJ Newburn <![CDATA[Accelerating IO in the Modern Data Center: Network IO]]> http://www.open-lab.net/blog/?p=21733 2022-08-21T23:40:44Z 2020-10-20T19:13:11Z This is the second post in the Accelerating IO series, which describes the architecture, components, and benefits of Magnum IO, the IO subsystem of the modern...]]>

This is the second post in the Accelerating IO series, which describes the architecture, components, and benefits of Magnum IO, the IO subsystem of the modern data center. The first post in this series introduced the Magnum IO architecture and positioned it in the broader context of CUDA, CUDA-X, and vertical application domains. Of the four major components of the architecture…

Source

]]>
1
CJ Newburn <![CDATA[Accelerating IO in the Modern Data Center: Magnum IO Architecture]]> http://www.open-lab.net/blog/?p=21121 2023-03-22T01:09:09Z 2020-10-05T13:00:00Z This is the first post in the Accelerating IO series, which describes the architecture, components, storage, and benefits of Magnum IO, the IO subsystem of the...]]>

This is the first post in the Accelerating IO series, which describes the architecture, components, storage, and benefits of Magnum IO, the IO subsystem of the modern data center. Sheet metal, previously the boundary of the unit of computing, no longer constrains the resources that can be applied to a single problem or the data set that can be housed. The new unit is the data center.

Source

]]>
3