Fast Multi-GPU collectives with NCCL – NVIDIA Technical Blog News and tutorials for developers, data scientists, and IT admins 2025-03-20T16:50:52Z http://www.open-lab.net/blog/feed/ Nathan Luehr <![CDATA[Fast Multi-GPU collectives with NCCL]]> http://www.open-lab.net/blog/parallelforall/?p=6598 2022-08-21T23:37:50Z 2016-04-07T15:27:54Z Today many servers contain 8 or more GPUs. In principle then, scaling an application from one to many GPUs should provide a tremendous performance boost. But in...]]> Today many servers contain 8 or more GPUs. In principle then, scaling an application from one to many GPUs should provide a tremendous performance boost. But in...Figure 5: Ring order of GPUs in PCIe tree.

Today many servers contain 8 or more GPUs. In principle then, scaling an application from one to many GPUs should provide a tremendous performance boost. But in practice, this benefit can be difficult to obtain. There are two common culprits behind poor multi-GPU scaling. The first is that enough parallelism has not been exposed to efficiently saturate the processors. The second reason for poor��

Source

]]>
14
���˳���97caoporen����