Massimiliano Fatica – NVIDIA Technical Blog
News and tutorials for developers, data scientists, and IT admins
2023-07-05T19:43:37Z http://www.open-lab.net/blog/feed/

Customize CUDA Fortran Profiling with NVTX
Massimiliano Fatica http://www.open-lab.net/blog/parallelforall/?p=5951 2022-08-21T23:37:38Z 2015-09-30T01:53:40Z

The NVIDIA Tools Extension (NVTX) library lets developers annotate custom events and ranges within the profiling timelines generated by tools such as the NVIDIA Visual Profiler (NVVP) and Nsight. In my own optimization work, I rely heavily on NVTX to better understand internal and customer codes and to spot opportunities for better interaction between the CPU and the GPU.


Optimizing the High Performance Conjugate Gradient Benchmark on GPUs
Massimiliano Fatica http://www.open-lab.net/blog/parallelforall/?p=3357 2023-07-05T19:43:37Z 2014-10-23T19:01:07Z

[This post was co-written by Everett Phillips and Massimiliano Fatica.] The High Performance Conjugate Gradient Benchmark (HPCG) is a new benchmark intended to complement the High-Performance Linpack (HPL) benchmark currently used to rank supercomputers in the TOP500 list. This new benchmark solves a large sparse linear system using a multigrid preconditioned conjugate gradient (PCG) algorithm.
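As a rough illustration of the preconditioned conjugate gradient iteration mentioned above, here is a minimal pure-Python sketch. It is not the HPCG reference code: the multigrid preconditioner is swapped for a much simpler Jacobi (diagonal) preconditioner, the sparse matrix is an assumed 1D Laplacian test problem applied matrix-free, and the function names (`apply_laplacian_1d`, `jacobi_preconditioner`, `pcg`) are hypothetical.

```python
import math


def apply_laplacian_1d(x):
    """Matrix-free product y = A x, where A is the tridiagonal matrix with
    2 on the diagonal and -1 on the off-diagonals (assumed test problem)."""
    n = len(x)
    y = [0.0] * n
    for i in range(n):
        y[i] = 2.0 * x[i]
        if i > 0:
            y[i] -= x[i - 1]
        if i < n - 1:
            y[i] -= x[i + 1]
    return y


def jacobi_preconditioner(r):
    """Apply M^{-1} r with M = diag(A). The diagonal of A is 2 everywhere.
    This stands in for the multigrid preconditioner named in the text."""
    return [ri / 2.0 for ri in r]


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def pcg(apply_a, apply_minv, b, tol=1e-10, max_iter=1000):
    """Preconditioned conjugate gradient for a symmetric positive definite
    operator. Returns the approximate solution and the iteration count."""
    x = [0.0] * len(b)
    r = list(b)                 # residual b - A x for the initial guess x = 0
    z = apply_minv(r)           # preconditioned residual
    p = list(z)                 # first search direction
    rz = dot(r, z)
    b_norm = math.sqrt(dot(b, b))
    for k in range(max_iter):
        # Stop once the relative residual norm is small enough.
        if math.sqrt(dot(r, r)) <= tol * b_norm:
            return x, k
        ap = apply_a(p)
        alpha = rz / dot(p, ap)                  # step length along p
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        z = apply_minv(r)
        rz_new = dot(r, z)
        beta = rz_new / rz                       # direction update factor
        p = [zi + beta * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter


# Usage: build a right-hand side from a known solution and recover it.
n = 50
x_true = [1.0] * n
b = apply_laplacian_1d(x_true)
x, iterations = pcg(apply_laplacian_1d, jacobi_preconditioner, b)
max_error = max(abs(xi - ti) for xi, ti in zip(x, x_true))
```

Passing the operator and the preconditioner as functions keeps the solver independent of how the sparse matrix is stored, so a stronger preconditioner could be dropped in without changing the loop.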

