Writing high-performance software is no simple task. After you have code that can compile and run, a new challenge is introduced when you try and understand how it is performing on the available hardware. Different platforms, whether they are CPUs, GPUs, or something else, will have different hardware limitations like available memory bandwidth and theoretical compute limits.
]]>