Profilers / Debuggers / Code Analysis – NVIDIA Technical Blog

News and tutorials for developers, data scientists, and IT admins 2025-05-19T06:00:00Z http://www.open-lab.net/blog/feed/

Rob Van der Wijngaart <![CDATA[Improving GPU Performance by Reducing Instruction Cache Misses]]> http://www.open-lab.net/blog/?p=86868 2025-01-22T17:57:59Z 2024-08-08T16:30:00Z

GPUs are specially designed to crunch through massive amounts of data at high speed. They have a large amount of compute resources, called streaming...]]>

Decorative image of light fields in green, purple, and blue.

GPUs are specially designed to crunch through massive amounts of data at high speed. They have a large amount of compute resources, called streaming multiprocessors (SMs), and an array of facilities to keep them fed with data: high bandwidth to memory, sizable data caches, and the capability to switch to other teams of workers (warps) without any overhead if an active team has run out of data.

]]> 6 Steven Gurfinkel <![CDATA[Checkpointing CUDA Applications with CRIU]]> http://www.open-lab.net/blog/?p=84236 2024-07-25T18:19:18Z 2024-07-02T16:00:00Z

Checkpoint and restore functionality for CUDA is exposed through a command-line utility called cuda-checkpoint. This utility can be used to transparently...]]>

]]> 1 Paul Graham <![CDATA[Efficient CUDA Debugging: Using NVIDIA Compute Sanitizer with NVIDIA Tools Extension and Creating Custom Tools]]> http://www.open-lab.net/blog/?p=80383 2024-08-28T17:30:34Z 2024-03-27T20:29:15Z

NVIDIA Compute Sanitizer is a powerful tool that can save you time and effort while improving the reliability and performance of your CUDA applications....]]>

Decorative image of bugs crawling over a computer chip.

]]> 1 Louis Bavoil <![CDATA[Powerful Shader Insights: Using Shader Debug Info with NVIDIA Nsight Graphics]]> http://www.open-lab.net/blog/?p=79026 2024-12-09T16:54:30Z 2024-03-14T20:00:00Z

As ray tracing becomes the predominant rendering technique in modern game engines, a single GPU RayGen shader can now perform most of the light simulation of a...]]>

As ray tracing becomes the predominant rendering technique in modern game engines, a single GPU RayGen shader can now perform most of the light simulation of a frame. To manage this level of complexity, it becomes necessary to observe a decomposition of shader performance at the HLSL or GLSL source-code level. As a result, shader profilers are now a must-have tool for optimizing ray tracing.

]]> 0 Mozhgan Kabiri Chimeh <![CDATA[Efficient CUDA Debugging: Memory Initialization and Thread Synchronization with NVIDIA Compute Sanitizer]]> http://www.open-lab.net/blog/?p=71925 2024-03-21T22:25:40Z 2023-10-24T16:00:00Z

NVIDIA Compute Sanitizer is a powerful tool that can save you time and effort while improving the reliability and performance of your CUDA applications. In...]]>

]]> 1 Robert Jensen <![CDATA[Speed Up GPU Crash Debugging with NVIDIA Nsight Aftermath]]> http://www.open-lab.net/blog/?p=65864 2024-08-28T18:10:13Z 2023-08-09T19:00:00Z

NVIDIA Nsight Developer Tools provide comprehensive access to NVIDIA GPUs and graphics APIs for performance analysis, optimization, and debugging activities....]]>

A spinning GIF showing a tree on fire in the dark with other trees surrounding it.

NVIDIA Nsight Developer Tools provide comprehensive access to NVIDIA GPUs and graphics APIs for performance analysis, optimization, and debugging activities. When using advanced rendering techniques like ray tracing or path tracing, Nsight tools are your companion for creating a smooth and polished experience. At SIGGRAPH 2023, NVIDIA hosted a lab exploring how to use NVIDIA Nsight Tools to...

]]> 0 Rob Armstrong <![CDATA[CUDA Toolkit 12.2 Unleashes Powerful Features for Boosting Applications]]> http://www.open-lab.net/blog/?p=67705 2024-08-28T17:39:00Z 2023-07-06T19:16:56Z

The latest release of CUDA Toolkit 12.2 introduces a range of essential new features, modifications to the programming model, and enhanced support for hardware...]]>

CUDA abstract image.

The latest release of CUDA Toolkit 12.2 introduces a range of essential new features, modifications to the programming model, and enhanced support for hardware capabilities accelerating CUDA applications. Now out through general availability from NVIDIA, CUDA Toolkit 12.2 includes many new capabilities, both major and minor. The following post offers an overview of many of the key...

]]> 0 Paul Graham <![CDATA[Efficient CUDA Debugging: How to Hunt Bugs with NVIDIA Compute Sanitizer]]> http://www.open-lab.net/blog/?p=66915 2024-03-21T22:32:29Z 2023-06-29T18:21:00Z

Debugging code is a crucial aspect of software development but can be both challenging and time-consuming. Parallel programming with thousands of threads can...]]>

Stylized image of a beetle on lines of code.

]]> 7 Holly Wilper <![CDATA[Optimizing CUDA Memory Transfers with NVIDIA Nsight Systems]]> http://www.open-lab.net/blog/?p=67259 2024-08-28T17:39:05Z 2023-06-28T18:49:01Z

NVIDIA Nsight Systems is a comprehensive tool for tracking application performance across CPU and GPU resources. It helps ensure that hardware is being...]]>

NVIDIA Nsight Systems is a comprehensive tool for tracking application performance across CPU and GPU resources. It helps ensure that hardware is being efficiently used, traces API calls, and gives insight into inter-node network communication by describing how low-level metrics sum to application performance and finding where it can be improved. Nsight Systems can scale to cluster-size...

]]> 1 Jackson Marusarz <![CDATA[Improve Guidance and Performance Visualization with the New Nsight Compute]]> http://www.open-lab.net/blog/?p=48546 2024-08-28T17:45:26Z 2022-05-31T16:00:00Z

NVIDIA Nsight Compute is an interactive kernel profiler for CUDA applications. It provides detailed performance metrics and API debugging through a user...]]>

NVIDIA Nsight Compute is an interactive kernel profiler for CUDA applications. It provides detailed performance metrics and API debugging through a user interface and a command-line tool. Nsight Compute 2022.2 includes features to expand the supported environments and workflows for CUDA kernel profiling and optimization. Download now. The following outlines the feature highlights of...

]]> 0 Ingo Esser <![CDATA[Record, Edit, and Rewind in Virtual Reality with NVIDIA VR Capture and Replay]]> http://www.open-lab.net/blog/?p=45290 2023-06-12T20:55:28Z 2022-03-24T16:00:00Z

Developers and early access users can now accurately capture and replay VR sessions for performance testing, scene troubleshooting, and more with NVIDIA Virtual...]]>

Developers and early access users can now accurately capture and replay VR sessions for performance testing, scene troubleshooting, and more with NVIDIA Virtual Reality Capture and Replay (VCR). The potential of virtual worlds is limitless, but working with VR content poses challenges, especially when it comes to recording or recreating a virtual experience. Unlike the real world...

]]> 0 Chaitrali Joshi <![CDATA[NVIDIA Nsight Systems 2022.1 Introduces Vulkan 1.3 and Linux Backtrace Sampling and Profiling Improvements]]> http://www.open-lab.net/blog/?p=43565 2024-08-28T18:14:17Z 2022-01-25T22:07:42Z

The latest update to NVIDIA Nsight Systems, a performance analysis tool designed to help developers tune and scale software across CPUs and GPUs, is now...]]>

The latest update to NVIDIA Nsight Systems, a performance analysis tool designed to help developers tune and scale software across CPUs and GPUs, is now available for download. Nsight Systems 2022.1 introduces several improvements aimed at enhancing the profiling experience. Nsight Systems is part of the powerful debugging and profiling NVIDIA Nsight Tools Suite. A developer can start with...

]]> 0 Aurelio Reis <![CDATA[NVIDIA Nsight Graphics 2022.1 Supports Latest Vulkan Ray Tracing Extension]]> http://www.open-lab.net/blog/?p=43518 2024-08-28T18:16:12Z 2022-01-25T16:00:00Z

Today, NVIDIA announced the latest Nsight Graphics 2022.1, which supports Direct3D (11, 12, DXR), Vulkan 1.3 ray tracing extension, OpenGL, OpenVR, and the...]]>

Today, NVIDIA announced the latest Nsight Graphics 2022.1, which supports Direct3D (11, 12, DXR), Vulkan 1.3 ray tracing extension, OpenGL, OpenVR, and the Oculus SDK. NVIDIA Nsight Graphics is a standalone developer tool that enables you to debug, profile, and export frames built with high-fidelity, 3D-graphic applications. Download NVIDIA Nsight Graphics now.

]]> 0 Arthy Sundaram <![CDATA[Boosting Productivity and Performance with the NVIDIA CUDA 11.2 C++ Compiler]]> http://www.open-lab.net/blog/?p=23916 2022-08-21T23:41:02Z 2021-02-13T02:30:28Z

The 11.2 CUDA C++ compiler incorporates features and enhancements aimed at improving developer productivity and the performance of GPU-accelerated applications....]]>

The 11.2 CUDA C++ compiler incorporates features and enhancements aimed at improving developer productivity and the performance of GPU-accelerated applications. The compiler toolchain gets an LLVM upgrade to 7.0, which enables new features and can help improve compiler code generation for NVIDIA GPUs. Link-time optimization (LTO) for device code (also known as device LTO)...

]]> 0 Greg Ruetsch <![CDATA[Pro Tip: Pinpointing Runtime Errors in CUDA Fortran]]> http://www.open-lab.net/blog/parallelforall/?p=8590 2022-08-21T23:38:33Z 2017-11-17T02:03:48Z

CUDA Fortran for Scientists and Engineers shows how high-performance application developers can...]]>

CUDA Fortran for Scientists and Engineers shows how high-performance application developers can leverage the power of GPUs using Fortran.

We've all been there. Your CUDA Fortran code is humming along and suddenly you get a runtime error: , , usually accompanied by in all caps. In many cases, the error message gives you enough information to find where the problem is in your source code: you have a runtime error and you only perform a few host-to-device transfers, or your code ran fine before you added that block of code earlier...

]]> 2 Kudbudeen Jalaludeen <![CDATA[CUDA Development for Jetson with NVIDIA Nsight Eclipse Edition]]> http://www.open-lab.net/blog/parallelforall/?p=7632 2024-08-28T17:59:42Z 2017-03-20T03:19:11Z

Figure 1: NVIDIA Jetson TX2 Developer Kit. NVIDIA Nsight Eclipse Edition is a...]]>

NVIDIA Nsight Eclipse Edition is a full-featured, integrated development environment that lets you easily develop CUDA applications for either your local (x86) system or a remote (x86 or Arm) target. In this post, I will walk you through the process of remote-developing CUDA applications for the NVIDIA Jetson TX2, an Arm-based development kit. Note that this how-to also applies to Jetson TX1 and...

]]> 15 Tim Dettmers <![CDATA[Deep Learning in a Nutshell: Sequence Learning]]> http://www.open-lab.net/blog/parallelforall/?p=6437 2022-08-21T23:37:46Z 2016-03-08T06:26:00Z

This series of blog posts aims to provide an intuitive and gentle introduction to deep learning that does not rely heavily on math or theoretical...]]>

CUDA AI Cube

This series of blog posts aims to provide an intuitive and gentle introduction to deep learning that does not rely heavily on math or theoretical constructs. The first part of this series provided an overview of the field of deep learning, covering fundamental and core concepts. The second part of the series provided an overview of training neural networks efficiently and gave a background on the...

]]> 3 Justin Luitjens <![CDATA[CUDA Pro Tip: Always Set the Current Device to Avoid Multithreading Bugs]]> http://www.open-lab.net/blog/parallelforall/?p=3619 2022-08-21T23:37:08Z 2014-09-05T00:07:17Z

We often say that to reach high performance on GPUs you should expose as much parallelism in your code as possible, and we don't mean just parallelism...]]>

GPU Pro Tip

We often say that to reach high performance on GPUs you should expose as much parallelism in your code as possible, and we don't mean just parallelism within one GPU, but also across multiple GPUs and CPUs. It's common for high-performance software to parallelize across multiple GPUs by assigning one or more CPU threads to each GPU. In this post I'll cover a common but subtle bug and a simple rule...
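
The excerpt above is cut off before it states the rule, but the post title names it: always set the current device. As a rough illustration only, the toy model below mimics in plain Python the fact that the CUDA runtime tracks the current device per host thread, with device 0 as the default. `ToyRuntime`, `set_device`, `get_device`, and `run_worker` are hypothetical names invented for this sketch; none of them is a CUDA API, and no GPU is involved.

```python
import threading


class ToyRuntime:
    """Hypothetical stand-in for a GPU runtime with a per-thread current device."""

    def __init__(self):
        # Each host thread gets its own independent view of this state.
        self._tls = threading.local()

    def set_device(self, device_id):
        # Assumption for the sketch: selecting a device only affects the calling thread.
        self._tls.device = device_id

    def get_device(self):
        # A thread that never selected a device falls back to device 0.
        return getattr(self._tls, "device", 0)


def run_worker(runtime, device_id, set_first):
    """Run one worker thread and report which device it ended up using."""
    result = {}

    def body():
        if set_first:
            runtime.set_device(device_id)
        result["device"] = runtime.get_device()

    worker = threading.Thread(target=body)
    worker.start()
    worker.join()
    return result["device"]


runtime = ToyRuntime()
runtime.set_device(1)  # the main thread selects device 1

# The worker does not inherit the main thread's choice.
inherited = run_worker(runtime, 1, set_first=False)
# The worker selects its device explicitly before doing any work.
explicit = run_worker(runtime, 1, set_first=True)

print(inherited, explicit)
```

In this model the first worker silently lands on the default device even though the main thread chose device 1, which is the kind of subtle mismatch that setting the device at the start of every thread avoids.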

]]> 4 Satish Salian <![CDATA[Remote Application Development using NVIDIA Nsight Eclipse Edition]]> http://www.open-lab.net/blog/parallelforall/?p=3483 2024-08-28T18:00:05Z 2014-08-26T01:11:51Z

NVIDIA Nsight Eclipse Edition (NSEE) is a full-featured unified CPU+GPU integrated development environment (IDE) that lets you easily develop CUDA applications...]]>

NVIDIA Nsight Eclipse Edition (NSEE) is a full-featured unified CPU+GPU integrated development environment (IDE) that lets you easily develop CUDA applications for either your local (x86_64) system or a remote (x86_64 or ARM) target system. In my last post on remote development of CUDA applications, I covered NSEE's cross compilation mode. In this post I will focus on using NSEE's synchronized...

]]> 65 Satish Salian <![CDATA[NVIDIA Nsight Eclipse Edition for Jetson TK1]]> http://www.open-lab.net/blog/parallelforall/?p=3255 2024-08-28T18:00:33Z 2014-05-27T17:50:42Z

NVIDIA Nsight Eclipse Edition is a full-featured, integrated development environment that lets you easily develop CUDA applications for either your local (x86)...]]>

NVIDIA Nsight Eclipse Edition is a full-featured, integrated development environment that lets you easily develop CUDA applications for either your local (x86) system or a remote (x86 or Arm) target. In this post, I will walk you through the process of remote-developing CUDA applications for the NVIDIA Jetson TK1, an Arm-based development kit. Nsight supports two remote development modes: cross...

]]> 103 Mark Ebersole http://www.open-lab.net/blog/parallelforall <![CDATA[CUDACasts Episode #9: Explore GPU device memory with Nsight Eclipse Edition]]> http://www.open-lab.net/blog/parallelforall/?p=2021 2024-08-28T18:00:38Z 2013-09-10T02:00:42Z

Visual tools offer a very efficient method for developing and debugging applications. When working on massively parallel codes built on the CUDA Platform, this...]]>

Visual tools offer a very efficient method for developing and debugging applications. When working on massively parallel codes built on the CUDA Platform, this visual approach is even more important because you could be dealing with tens of thousands of parallel threads. With the free NVIDIA Nsight Eclipse Edition IDE, you can quickly and easily examine the GPU memory state in a running CUDA C...

]]> 0