Gonzalo Brito – NVIDIA Technical Blog News and tutorials for developers, data scientists, and IT admins 2023-12-05T21:52:55Z http://www.open-lab.net/blog/feed/
Gonzalo Brito <![CDATA[Simplifying GPU Application Development with Heterogeneous Memory Management]]> http://www.open-lab.net/blog/?p=69542 2023-09-13T17:07:34Z 2023-08-22T17:00:00Z Heterogeneous Memory Management (HMM) is a CUDA memory management feature that extends the simplicity and productivity of the CUDA Unified Memory programming...]]>

Gonzalo Brito <![CDATA[NVIDIA Grace Hopper Superchip Architecture In-Depth]]> http://www.open-lab.net/blog/?p=57192 2022-11-18T11:48:05Z 2022-11-10T19:00:00Z The NVIDIA Grace Hopper Superchip Architecture is the first true heterogeneous accelerated platform for high-performance computing (HPC) and AI workloads. It...]]>

The NVIDIA Grace Hopper Superchip Architecture is the first true heterogeneous accelerated platform for high-performance computing (HPC) and AI workloads. It accelerates applications with the strengths of both GPUs and CPUs while providing the simplest and most productive distributed heterogeneous programming model to date. Scientists and engineers can focus on solving the world’s most important…

Gonzalo Brito <![CDATA[Multi-GPU Programming with Standard Parallel C++, Part 2]]> http://www.open-lab.net/blog/?p=44906 2023-12-05T21:52:40Z 2022-04-18T23:20:23Z It may seem natural to expect that the performance of your CPU-to-GPU port will range below that of a dedicated HPC code. After all, you are limited by the...]]>

It may seem natural to expect that the performance of your CPU-to-GPU port will range below that of a dedicated HPC code. After all, you are limited by the constraints of the software architecture, the established API, and the need to account for sophisticated extra features expected by the user base. Not only that, the simplistic programming model of C++ standard parallelism allows for less…

Gonzalo Brito <![CDATA[Multi-GPU Programming with Standard Parallel C++, Part 1]]> http://www.open-lab.net/blog/?p=44904 2023-12-05T21:52:55Z 2022-04-18T23:18:13Z The difficulty of porting an application to GPUs varies from one case to another. In the best-case scenario, you can accelerate critical code sections by...]]>

The difficulty of porting an application to GPUs varies from one case to another. In the best-case scenario, you can accelerate critical code sections by calling into an existing GPU-optimized library. This is, for example, when the building blocks of your simulation software consist of BLAS linear algebra functions, which can be accelerated using cuBLAS. This is the second post in the…

Gonzalo Brito <![CDATA[NVIDIA Hopper Architecture In-Depth]]> http://www.open-lab.net/blog/?p=45555 2023-10-25T23:51:26Z 2022-03-22T18:00:00Z Today during the 2022 NVIDIA GTC Keynote address, NVIDIA CEO Jensen Huang introduced the new NVIDIA H100 Tensor Core GPU based on the new NVIDIA Hopper GPU...]]>

Today during the 2022 NVIDIA GTC Keynote address, NVIDIA CEO Jensen Huang introduced the new NVIDIA H100 Tensor Core GPU based on the new NVIDIA Hopper GPU architecture. This post gives you a look inside the new H100 GPU and describes important new features of NVIDIA Hopper architecture GPUs. The NVIDIA H100 Tensor Core GPU is our ninth-generation data center GPU designed to deliver an…
