HPC SDK – NVIDIA Technical Blog

News and tutorials for developers, data scientists, and IT admins 2025-03-21T20:30:26Z http://www.open-lab.net/blog/feed/ Graham Lopez <![CDATA[Just Released: NVIDIA HPC SDK v24.11]]> http://www.open-lab.net/blog/?p=91930 2024-11-14T17:10:32Z 2024-11-14T15:11:24Z

The new release includes several enhancements to the Math Libraries and improvements for C++ programming.]]>

]]> 0 Graham Lopez <![CDATA[Just Released: NVIDIA HPC SDK v24.9]]> http://www.open-lab.net/blog/?p=89509 2024-10-17T19:07:11Z 2024-09-27T15:44:55Z

The new release includes several new features including improved stdpar programming and Arm processor support.]]>

]]> 0 Houston Hoffman <![CDATA[Constant Time Launch for Straight-Line CUDA Graphs and Other Performance Enhancements]]> http://www.open-lab.net/blog/?p=88631 2024-09-19T19:32:10Z 2024-09-11T16:00:00Z

CUDA Graphs are a way to define and batch GPU operations as a graph rather than a sequence of stream launches. A CUDA Graph groups a set of CUDA kernels and...]]>

CUDA Graphs are a way to define and batch GPU operations as a graph rather than a sequence of stream launches. A CUDA Graph groups a set of CUDA kernels and other CUDA operations together and executes them with a specified dependency tree. It speeds up the workflow by combining the driver activities associated with CUDA kernel launches and CUDA API calls. It also enforces the dependencies with...

]]> 1 Ioana Boier <![CDATA[Profit and Loss Modeling on GPUs with ISO C++ Language Parallelism]]> http://www.open-lab.net/blog/?p=85106 2024-08-22T18:25:37Z 2024-08-07T16:30:00Z

The previous post How to Accelerate Quantitative Finance with ISO C++ Standard Parallelism demonstrated how to write a Black-Scholes simulation using ISO C++...]]>

The previous post How to Accelerate Quantitative Finance with ISO C++ Standard Parallelism demonstrated how to write a Black-Scholes simulation using ISO C++ standard parallelism with the code found in the /NVIDIA/accelerated-quant-finance GitHub repo. This approach enables you to productively write code that is both concise and portable. Using solely standard C++, it's possible to write an...

]]> 0 Jay Gould <![CDATA[Just Released: NVIDIA HPC SDK v24.7]]> http://www.open-lab.net/blog/?p=85774 2024-08-22T18:25:46Z 2024-08-01T16:33:42Z

The new release delivers support for Ubuntu 24.04, new Fortran interfaces for CUDA Graphs, and a major version NVSHMEM API update. It is the last release to...]]>

The new release delivers support for Ubuntu 24.04, new Fortran interfaces for CUDA Graphs, and a major version NVSHMEM API update. It is the last release to support RHEL 7.

]]> 0 Jay Gould <![CDATA[Just Released: NVIDIA HPC SDK 24.5]]> http://www.open-lab.net/blog/?p=82822 2024-05-30T19:55:48Z 2024-05-22T19:30:00Z

NVIDIA HPC SDK 24.5 updates include support for new NVPL components and CUDA 12.4.]]>

]]> 0 Paul Graham <![CDATA[Efficient CUDA Debugging: Using NVIDIA Compute Sanitizer with NVIDIA Tools Extension and Creating Custom Tools]]> http://www.open-lab.net/blog/?p=80383 2024-08-28T17:30:34Z 2024-03-27T20:29:15Z

NVIDIA Compute Sanitizer is a powerful tool that can save you time and effort while improving the reliability and performance of your CUDA applications....]]>

]]> 1 Robert Jensen <![CDATA[Building High-Performance Applications in the Era of Accelerated Computing]]> http://www.open-lab.net/blog/?p=80067 2024-08-28T17:32:20Z 2024-03-25T16:00:00Z

AI is augmenting high-performance computing (HPC) with novel approaches to data processing, simulation, and modeling. Because of the computational requirements...]]>

AI is augmenting high-performance computing (HPC) with novel approaches to data processing, simulation, and modeling. Because of the computational requirements of these new AI workloads, HPC is scaling up at a rapid pace. To enable applications to scale to multi-GPU and multi-node platforms, HPC tools and libraries must support that growth. NVIDIA provides a comprehensive ecosystem of...

]]> 0 Jay Gould <![CDATA[Just Released: NVIDIA HPC SDK v24.1]]> http://www.open-lab.net/blog/?p=77283 2024-02-22T19:59:05Z 2024-02-01T16:36:12Z

This NVIDIA HPC SDK update includes the cuBLASMp preview library, along with minor bug fixes and enhancements.]]>

]]> 0 Michelle Horton <![CDATA[Webinar: Quantum ESPRESSO on GPUs: Porting Strategy and Results]]> http://www.open-lab.net/blog/?p=76593 2024-02-08T18:52:02Z 2024-01-18T18:00:00Z

Explore the status of Quantum ESPRESSO porting strategies that enable state-of-the-art performance on HPC systems.]]>

]]> 0 Tanya Lenz <![CDATA[Webinar: Analysis of OpenACC Validation and Verification Testsuite]]> http://www.open-lab.net/blog/?p=74475 2023-12-14T19:29:46Z 2023-12-01T21:00:00Z

On December 7, learn how to verify OpenACC implementations across compilers and system architectures with the validation testsuite.]]>

]]> 0 Graham Lopez <![CDATA[Unlock the Power of NVIDIA Grace and NVIDIA Hopper Architectures with Foundational HPC Software]]> http://www.open-lab.net/blog/?p=72977 2024-08-28T17:33:20Z 2023-11-16T19:07:51Z

High-performance computing (HPC) powers applications in simulation and modeling, healthcare and life sciences, industry and engineering, and more. In the modern...]]>

High-performance computing (HPC) powers applications in simulation and modeling, healthcare and life sciences, industry and engineering, and more. In the modern data center, HPC synergizes with AI, harnessing data in transformative new ways. The performance and throughput demands of next-generation HPC applications call for an accelerated computing platform that can handle diverse workloads...

]]> 0 Graham Lopez <![CDATA[Simplifying GPU Programming for HPC with NVIDIA Grace Hopper Superchip]]> http://www.open-lab.net/blog/?p=72720 2023-11-16T19:16:39Z 2023-11-13T17:13:02Z

The new hardware developments in NVIDIA Grace Hopper Superchip systems enable some dramatic changes to the way developers approach GPU programming. Most...]]>

The new hardware developments in NVIDIA Grace Hopper Superchip systems enable some dramatic changes to the way developers approach GPU programming. Most notably, the bidirectional, high-bandwidth, and cache-coherent connection between CPU and GPU memory means that the user can develop their application for both processors while using a single, unified address space.

]]> 1 Tanya Lenz <![CDATA[Just Released: NVIDIA HPC SDK 23.9]]> http://www.open-lab.net/blog/?p=71163 2023-11-02T18:14:44Z 2023-10-05T20:00:00Z

This NVIDIA HPC SDK 23.9 update expands platform support and provides minor updates.]]>

]]> 0 Jay Gould <![CDATA[Just Released: NVIDIA HPC SDK v23.7]]> http://www.open-lab.net/blog/?p=68650 2024-08-28T17:38:15Z 2023-07-31T19:00:00Z

NVIDIA HPC SDK version 23.7 is now available and provides minor updates and enhancements.]]>

]]> 0 Paul Graham <![CDATA[Efficient CUDA Debugging: How to Hunt Bugs with NVIDIA Compute Sanitizer]]> http://www.open-lab.net/blog/?p=66915 2024-03-21T22:32:29Z 2023-06-29T18:21:00Z

Debugging code is a crucial aspect of software development but can be both challenging and time-consuming. Parallel programming with thousands of threads can...]]>

]]> 7 Jay Gould <![CDATA[Just Released: NVIDIA HPC SDK v23.5]]> http://www.open-lab.net/blog/?p=65459 2023-06-09T20:20:37Z 2023-05-25T19:00:00Z

This update expands platform support and provides minor updates.]]>

]]> 0 Bhoomi Gadhia <![CDATA[Just Released: NVIDIA PhysicsNeMo v23.05]]> http://www.open-lab.net/blog/?p=63700 2023-06-13T17:11:43Z 2023-05-09T16:00:00Z

This version 23.05 update to the NVIDIA PhysicsNeMo platform expands support for physics-ML and provides minor updates.]]>

]]> 0 Jay Gould <![CDATA[Just Released: NVIDIA HPC SDK v23.3]]> http://www.open-lab.net/blog/?p=62843 2023-06-09T22:32:35Z 2023-04-03T17:15:35Z

Version 23.3 expands platform support and provides minor updates to the NVIDIA HPC SDK.]]>

]]> 0 Jay Gould <![CDATA[New Asynchronous Programming Model Library Now Available with NVIDIA HPC SDK v22.11]]> http://www.open-lab.net/blog/?p=57499 2023-05-24T00:18:31Z 2022-11-17T15:00:00Z

Celebrating the SuperComputing 2022 international conference, NVIDIA announces the release of HPC Software Development Kit (SDK) v22.11. Members of the NVIDIA...]]>

Celebrating the SuperComputing 2022 international conference, NVIDIA announces the release of HPC Software Development Kit (SDK) v22.11. Members of the NVIDIA Developer Program can download the release now for free. The NVIDIA HPC SDK is a comprehensive suite of compilers, libraries, and tools for high performance computing (HPC) developers. It provides everything developers need to...

]]> 0 Jay Gould <![CDATA[Just Released: HPC SDK v22.9]]> http://www.open-lab.net/blog/?p=54598 2023-06-12T08:56:53Z 2022-10-12T19:00:00Z

This version 22.9 update to the NVIDIA HPC SDK includes fixes and minor enhancements.]]>

]]> 0 John Linford <![CDATA[Accelerating NVIDIA HPC Software with SVE on AWS Graviton3]]> http://www.open-lab.net/blog/?p=54622 2023-03-22T01:21:33Z 2022-09-19T19:00:00Z

The latest NVIDIA HPC SDK update expands portability and now supports the Arm-based AWS Graviton3 processor. In this post, you learn how to enable Scalable...]]>

The latest NVIDIA HPC SDK update expands portability and now supports the Arm-based AWS Graviton3 processor. In this post, you learn how to enable Scalable Vector Extension (SVE) auto-vectorization with the NVIDIA compilers to maximize the performance of HPC applications running on the AWS Graviton3 CPU. The NVIDIA HPC SDK includes the proven compilers, libraries...

]]> 2 Jay Gould <![CDATA[Just Released: New Arm CPU Support and Advancements in HPC SDK 22.7]]> http://www.open-lab.net/blog/?p=50923 2022-09-09T16:10:22Z 2022-07-27T20:00:00Z

This release includes enhancements, fixes, and new support for Arm SVE, Rocky Linux OS, and Amazon EC2 C7g instances, powered by the latest generation AWS...]]>

This release includes enhancements, fixes, and new support for Arm SVE, Rocky Linux OS, and Amazon EC2 C7g instances, powered by the latest generation AWS Graviton3 processors.

]]> 0 Ashraf Eassa <![CDATA[Fueling High-Performance Computing with Full-Stack Innovation]]> http://www.open-lab.net/blog/?p=48769 2023-07-05T19:27:52Z 2022-06-02T18:45:00Z

High-performance computing (HPC) has become the essential instrument of scientific discovery. Whether it is discovering new, life-saving drugs, battling...]]>

High-performance computing (HPC) has become the essential instrument of scientific discovery. Whether it is discovering new, life-saving drugs, battling climate change, or creating accurate simulations of our world, these solutions demand an enormous (and rapidly growing) amount of processing power. They are increasingly out of reach of traditional computing approaches.

]]> 1 Jonas Latt <![CDATA[Multi-GPU Programming with Standard Parallel C++, Part 2]]> http://www.open-lab.net/blog/?p=44906 2023-12-05T21:52:40Z 2022-04-18T23:20:23Z

It may seem natural to expect that the performance of your CPU-to-GPU port will range below that of a dedicated HPC code. After all, you are limited by the...]]>

It may seem natural to expect that the performance of your CPU-to-GPU port will range below that of a dedicated HPC code. After all, you are limited by the constraints of the software architecture, the established API, and the need to account for sophisticated extra features expected by the user base. Not only that, the simplistic programming model of C++ standard parallelism allows for less...

]]> 0 Jonas Latt <![CDATA[Multi-GPU Programming with Standard Parallel C++, Part 1]]> http://www.open-lab.net/blog/?p=44904 2023-12-05T21:52:55Z 2022-04-18T23:18:13Z

The difficulty of porting an application to GPUs varies from one case to another. In the best-case scenario, you can accelerate critical code sections by...]]>

The difficulty of porting an application to GPUs varies from one case to another. In the best-case scenario, you can accelerate critical code sections by calling into an existing GPU-optimized library. This is, for example, when the building blocks of your simulation software consist of BLAS linear algebra functions, which can be accelerated using cuBLAS. This is the second post in the...

]]> 0 Michelle Horton <![CDATA[Latest Releases and Resources: NVIDIA GTC 2022]]> http://www.open-lab.net/blog/?p=45772 2025-02-25T19:38:29Z 2022-03-24T17:34:00Z

Our weekly roundup covers the most recent software updates, learning resources, events, and notable news. This week we have several software releases. Software...]]>

Our weekly roundup covers the most recent software updates, learning resources, events, and notable news. This week we have several software releases. Software releases: The NVIDIA HPC SDK is a comprehensive suite of compilers, libraries, and tools for developing accelerated HPC applications. With a breadth of flexible support options, users can create applications with a...

]]> 0 Jeff Larkin http://jefflarkin.com <![CDATA[Developing Accelerated Code with Standard Language Parallelism]]> http://www.open-lab.net/blog/?p=43006 2025-02-25T19:38:50Z 2022-01-12T17:14:46Z

The NVIDIA platform is the most mature and complete platform for accelerated computing. In this post, I address the simplest, most productive, and most portable...]]>

The NVIDIA platform is the most mature and complete platform for accelerated computing. In this post, I address the simplest, most productive, and most portable approach to accelerated computing. This is the first post in the Standard Parallel Programming series, which aims to instruct developers on the advantages of using parallelism in standard languages for accelerated computing...

]]> 0 Jay Gould <![CDATA[Maximize Performance of HPC Apps with HPC SDK 21.11, Available Now]]> http://www.open-lab.net/blog/?p=42169 2022-08-21T23:53:13Z 2021-12-13T18:00:00Z

At the Supercomputing Conference (SC21) NVIDIA preannounced the next update to the HPC SDK. Today, the HPC SDK 21.11 release was posted for free download to...]]>

At the Supercomputing Conference (SC21) NVIDIA preannounced the next update to the HPC SDK. Today, the HPC SDK 21.11 release was posted for free download to Developer Program members. The NVIDIA HPC SDK is a comprehensive suite of compilers and libraries for high performance computing development. It includes a wide variety of tools proven to maximize developer productivity, as well as the...

]]> 0 Jay Gould <![CDATA[Maximize Performance and Portability of HPC Apps with HPC SDK v21.11]]> http://www.open-lab.net/blog/?p=41136 2022-08-21T23:53:06Z 2021-11-16T20:00:00Z

Today, NVIDIA announced the upcoming HPC SDK 21.11 release with new Library enhancements. This software will be available free of charge in the coming weeks....]]>

Today, NVIDIA announced the upcoming HPC SDK 21.11 release with new Library enhancements. This software will be available free of charge in the coming weeks. The NVIDIA HPC SDK is a comprehensive suite of compilers and libraries for high-performance computing development. It includes a wide variety of tools proven to maximize developer productivity, as well as the performance and portability...

]]> 0 Neeraj Srivastava <![CDATA[Develop the Next Generation of HPC Applications with the NVIDIA Arm HPC Developer Kit]]> http://www.open-lab.net/blog/?p=41190 2022-08-21T23:53:06Z 2021-11-15T23:30:00Z

In July of 2021, NVIDIA announced the availability of the NVIDIA Arm HPC Developer Kit for preordering, along with the NVIDIA HPC SDK. Since then NVIDIA and its...]]>

In July of 2021, NVIDIA announced the availability of the NVIDIA Arm HPC Developer Kit for preordering, along with the NVIDIA HPC SDK. Since then NVIDIA and its partners have been working hard to get units into the hands of developers, to increase global availability, and enhance the software stack. The NVIDIA Arm HPC Developer Kit is based on the GIGABYTE G242-P32 2U server.

]]> 0 Jay Gould <![CDATA[NVIDIA Announces Availability for Arm HPC Developer Kit with New HPC SDK v21.7]]> http://www.open-lab.net/blog/?p=34947 2022-08-21T23:52:18Z 2021-07-22T21:18:24Z

Today NVIDIA announced the availability of the NVIDIA Arm HPC Developer Kit with the NVIDIA HPC SDK version 21.7. The DevKit is an integrated...]]>

Today NVIDIA announced the availability of the NVIDIA Arm HPC Developer Kit with the NVIDIA HPC SDK version 21.7. The DevKit is an integrated hardware-software platform for creating, evaluating, and benchmarking HPC, AI, and scientific computing applications for Arm server based accelerated platforms. The HPC SDK v21.7 is the latest update of the software development kit, and fully supports the...

]]> 5 Greg Ruetsch <![CDATA[Using Tensor Cores in CUDA Fortran]]> http://www.open-lab.net/blog/?p=24627 2023-03-22T01:11:50Z 2021-04-15T21:00:20Z

Tensor Cores, which are programmable matrix multiply and accumulate units, were first introduced in the V100 GPUs where they operated on half-precision (16-bit)...]]>

Tensor Cores, which are programmable matrix multiply and accumulate units, were first introduced in the V100 GPUs where they operated on half-precision (16-bit) multiplicands. Tensor Core functionality has been expanded in the following architectures, and in the Ampere A100 GPUs (compute capability 8.0) support for other data types was added, including double precision.

]]> 1 Dhruv Singal <![CDATA[N Ways to SAXPY: Demonstrating the Breadth of GPU Programming Options]]> http://www.open-lab.net/blog/?p=25483 2023-02-13T17:23:38Z 2021-04-06T21:11:00Z

Back in 2012, NVIDIAN Mark Harris wrote Six Ways to Saxpy, demonstrating how to perform the SAXPY operation on a GPU in multiple ways, using different languages...]]>

Back in 2012, NVIDIAN Mark Harris wrote Six Ways to Saxpy, demonstrating how to perform the SAXPY operation on a GPU in multiple ways, using different languages and libraries. Since then, programming paradigms have evolved and so has the NVIDIA HPC SDK. In this post, I demonstrate five ways to implement a simple SAXPY computation using NVIDIA GPUs. Why is this interesting?
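
For readers unfamiliar with the operation: SAXPY computes y = a*x + y element by element in single precision. The excerpt above does not show any of the post's implementations, so the following is only a minimal sketch in plain standard C++; the function name `saxpy` and its signature are illustrative assumptions, not code taken from the post.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper (not from the post): SAXPY over two std::vector<float>.
// Computes y[i] = a * x[i] + y[i] for every element, updating y in place.
void saxpy(float a, const std::vector<float>& x, std::vector<float>& y) {
    assert(x.size() == y.size());  // both vectors must have the same length
    // Binary std::transform reads x[i] and y[i] and writes the result back to y[i].
    std::transform(x.begin(), x.end(), y.begin(), y.begin(),
                   [a](float xi, float yi) { return a * xi + yi; });
}
```

With a = 2, x = {1, 2, 3} and y = {10, 20, 30}, the call `saxpy(2.0f, x, y)` leaves y holding {12, 24, 36}. This version runs serially on the CPU. In C++17, passing an execution policy such as `std::execution::par_unseq` (from `<execution>`) as the first argument to `std::transform` requests parallel execution; whether that work is offloaded to a GPU depends on the compiler and the flags used.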

]]> 1 Michael Wolfe <![CDATA[Detecting Divergence Using PCAST to Compare GPU to CPU Results]]> http://www.open-lab.net/blog/?p=22165 2022-08-21T23:40:47Z 2020-11-18T16:00:00Z

Parallel Compiler Assisted Software Testing (PCAST) is a feature available in the NVIDIA HPC Fortran, C++, and C compilers. PCAST has two use cases. The first...]]>

Parallel Compiler Assisted Software Testing (PCAST) is a feature available in the NVIDIA HPC Fortran, C++, and C compilers. PCAST has two use cases. The first is testing changes to parts of a program, new compile-time flags, or a port to a new compiler or to a new processor. You might want to test whether a new library gives the same result, or test the safety of adding OpenMP parallelism...

]]> 0 Wayne Gaudin <![CDATA[Building and Deploying HPC Applications using NVIDIA HPC SDK from the NVIDIA NGC Catalog]]> http://www.open-lab.net/blog/?p=22228 2022-08-21T23:40:47Z 2020-11-16T16:00:42Z

HPC development environments are typically complex configurations composed of multiple software packages, each providing unique capabilities. In addition to the...]]>

HPC development environments are typically complex configurations composed of multiple software packages, each providing unique capabilities. In addition to the core set of compilers used for building software from source code, they often include a number of specialty packages covering a broad range of operations such as communications, data structures, mathematics, I/O control...

]]> 0 Guray Ozen <![CDATA[Accelerating Fortran DO CONCURRENT with GPUs and the NVIDIA HPC SDK]]> http://www.open-lab.net/blog/?p=22198 2023-06-12T21:13:52Z 2020-11-16T16:00:00Z

Fortran developers have long been able to accelerate their programs using CUDA Fortran or OpenACC. For more up-to-date information, please read Using Fortran...]]>

Fortran developers have long been able to accelerate their programs using CUDA Fortran or OpenACC. For more up-to-date information, please read Using Fortran Standard Parallel Programming for GPU Acceleration, which aims to instruct developers on the advantages of using parallelism in standard languages for accelerated computing. Now with the latest 20.11 release of the NVIDIA HPC SDK...

]]> 28 Ashwin Srinath <![CDATA[Accelerating Python on GPUs with nvc++ and Cython]]> http://www.open-lab.net/blog/?p=21995 2022-08-21T23:40:46Z 2020-11-10T17:53:47Z

The C++ standard library contains a rich collection of containers, iterators, and algorithms that can be composed to produce elegant solutions to complex...]]>

The C++ standard library contains a rich collection of containers, iterators, and algorithms that can be composed to produce elegant solutions to complex problems. Most importantly, they are fast, making C++ an attractive choice for writing highly performant code. NVIDIA recently introduced stdpar: a way to automatically accelerate the execution of C++ standard library algorithms on GPUs...

]]> 9 David Olsen <![CDATA[Accelerating Standard C++ with GPUs Using stdpar]]> http://www.open-lab.net/blog/?p=18511 2023-12-05T23:58:18Z 2020-08-04T23:30:00Z

Historically, accelerating your C++ code with GPUs has not been possible in Standard C++ without using language extensions or additional libraries: CUDA C++...]]>

Historically, accelerating your C++ code with GPUs has not been possible in Standard C++ without using language extensions or additional libraries: In many cases, the results of these ports are worth the effort. But what if you could get the same effect without that cost? What if you could take your Standard C++ code and accelerate on a GPU? Now you can!

]]> 7