Many GPU-accelerated HPC applications spend a substantial portion of their time in non-uniform GPU-to-GPU communication, which increases time to solution. To keep GPU-to-GPU communication as efficient as possible, it is crucial that applications with non-uniform communication make informed decisions when assigning MPI processes to GPUs.