In part 1, I introduced the code for profiling, covered the basic ideas of analysis-driven optimization (ADO), and got you started with the Nsight Compute profiler. In part 2, you apply what you learned to improve the performance of the code and then continue the analysis and optimization process. To refactor the code based on the previous analysis in part 1, you observe that your kernel…
]]>By now, hopefully you read the first two blogs in this series “Migrating to NVIDIA Nsight Tools from NVVP and Nvprof” and “Transitioning to Nsight Systems from NVIDIA Visual Profiler / nvprof,” and you’ve discovered NVIDIA added a few new tools, both Nsight Compute and Nsight Systems, to the repertoire of CUDA tools available for developers. The tools become more and more important when using…
]]>