Jim Dinan – NVIDIA Technical Blog News and tutorials for developers, data scientists, and IT admins 2023-02-13T17:45:04Z http://www.open-lab.net/blog/feed/ Jim Dinan <![CDATA[Improving Network Performance of HPC Systems Using NVIDIA Magnum IO NVSHMEM and GPUDirect Async]]> http://www.open-lab.net/blog/?p=57629 2022-12-01T19:52:29Z 2022-11-22T17:00:00Z Today’s leading-edge high performance computing (HPC) systems contain tens of thousands of GPUs. In NVIDIA systems, GPUs are connected on nodes through the...]]>

Today’s leading-edge high performance computing (HPC) systems contain tens of thousands of GPUs. In NVIDIA systems, GPUs are connected on nodes through the NVLink scale-up interconnect, and across nodes through a scale-out network like InfiniBand. The software libraries that GPUs use to communicate, share work, and efficiently operate in parallel are collectively called NVIDIA Magnum IO…


]]>
Jim Dinan <![CDATA[Accelerating NVSHMEM 2.0 Team-Based Collectives Using NCCL]]> http://www.open-lab.net/blog/?p=22803 2022-08-21T23:40:50Z 2021-01-22T21:47:31Z NVSHMEM 2.0 is introducing a new API for performing collective operations based on the Team Management feature of the OpenSHMEM 1.5 specification. A team is a...]]>

NVSHMEM 2.0 introduces a new API for performing collective operations, based on the Team Management feature of the OpenSHMEM 1.5 specification. A team is a subset of the processing elements (PEs) in an OpenSHMEM job, analogous to a communicator in MPI. The new Teams API replaces the active-set-based API for collective operations in the OpenSHMEM specification that was…
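As a rough illustration of the Teams API, the host-side sketch below splits `NVSHMEM_TEAM_WORLD` into a team of the even-numbered PEs and performs a sum reduction scoped to that team. This is a minimal sketch, not a complete program: it assumes NVSHMEM 2.0 or later, a CUDA-capable multi-GPU launch (e.g. via `nvshmrun`), and omits error checking.

```c
#include <nvshmem.h>
#include <cuda_runtime.h>
#include <stdio.h>

int main(void) {
    nvshmem_init();
    int mype = nvshmem_my_pe();
    int npes = nvshmem_n_pes();

    /* Collectively split the world team: even PEs only
     * (start at PE 0, stride 2, (npes+1)/2 members). */
    nvshmem_team_t even_team;
    nvshmem_team_split_strided(NVSHMEM_TEAM_WORLD, 0, 2, (npes + 1) / 2,
                               NULL, 0, &even_team);

    /* Symmetric (device) buffers for the reduction. */
    int *src = (int *) nvshmem_malloc(sizeof(int));
    int *dst = (int *) nvshmem_malloc(sizeof(int));

    if (even_team != NVSHMEM_TEAM_INVALID) {
        cudaMemcpy(src, &mype, sizeof(int), cudaMemcpyHostToDevice);
        /* Team-based collective: only members of even_team participate. */
        nvshmem_int_sum_reduce(even_team, dst, src, 1);
        int sum;
        cudaMemcpy(&sum, dst, sizeof(int), cudaMemcpyDeviceToHost);
        printf("PE %d: sum over even team = %d\n", mype, sum);
        nvshmem_team_destroy(even_team);
    }

    nvshmem_free(src);
    nvshmem_free(dst);
    nvshmem_finalize();
    return 0;
}
```

PEs outside the new team receive `NVSHMEM_TEAM_INVALID` from the split and simply skip the collective, which is the main ergonomic gain over specifying an active set at every call site.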


]]>
Jim Dinan <![CDATA[Scaling Scientific Computing with NVSHMEM]]> http://www.open-lab.net/blog/?p=18979 2023-02-13T17:45:04Z 2020-08-25T17:23:15Z Figure 1. In the NVSHMEM memory model, each process (PE) has private memory, as well as symmetric memory that forms a partition of the partitioned global...]]>

When you double the number of processors used to solve a given problem, you expect the solution time to be cut in half. However, most programmers know from experience that applications tend to reach a point of diminishing returns when increasing the number of processors being used to solve a fixed-size problem. How efficiently an application can use more processors is called parallel…


]]>