Jake Hemstad – NVIDIA Technical Blog

News and tutorials for developers, data scientists, and IT admins
Feed updated: 2023-05-23T23:50:12Z
Feed URL: http://www.open-lab.net/blog/feed/

Maximizing Performance with Massively Parallel Hash Maps on GPUs
Author: Jake Hemstad
URL: http://www.open-lab.net/blog/?p=61480
Updated: 2023-05-23T23:50:12Z
Published: 2023-03-06T17:30:00Z

Decades of computer science history have been devoted to devising solutions for efficient storage and retrieval of information. Hash maps (or hash tables) are a popular data structure for information storage given their amortized, constant-time guarantees for the insertion and retrieval of elements. However, despite their prevalence, hash maps are seldom discussed in the context of GPU…
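To make the insertion and retrieval behavior concrete, here is a minimal CPU-side sketch of one common hash map design, open addressing with linear probing. It is an illustration only: the excerpt does not say which design the post uses, and the class name `LinearProbingMap`, its methods, and the integer mixing function are all hypothetical choices made for this example. The table has a fixed capacity and never resizes, so it shows the probing logic rather than the amortized growth behavior.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical illustration (not from the post): a fixed-capacity
// open-addressing hash map with linear probing, 32-bit keys and values.
class LinearProbingMap {
    struct Slot {
        std::uint32_t key;
        std::uint32_t value;
        bool occupied;
    };

    std::vector<Slot> slots_;
    std::size_t size_;

    // Integer bit mixer used to spread keys across slots.
    static std::size_t hash(std::uint32_t k) {
        k ^= k >> 16;
        k *= 0x85ebca6bu;
        k ^= k >> 13;
        k *= 0xc2b2ae35u;
        k ^= k >> 16;
        return k;
    }

public:
    explicit LinearProbingMap(std::size_t capacity)
        : slots_(capacity, Slot{0, 0, false}), size_(0) {
        // Capacity must be a nonzero power of two so a bit mask can replace modulo.
        assert(capacity != 0 && (capacity & (capacity - 1)) == 0);
    }

    // Inserts a new key or overwrites an existing one.
    // Returns false only when the key is absent and every slot is occupied.
    bool insert(std::uint32_t key, std::uint32_t value) {
        const std::size_t mask = slots_.size() - 1;
        std::size_t i = hash(key) & mask;
        for (std::size_t probes = 0; probes < slots_.size(); ++probes) {
            Slot& s = slots_[i];
            if (!s.occupied) {
                s.key = key;
                s.value = value;
                s.occupied = true;
                ++size_;
                return true;
            }
            if (s.key == key) {
                s.value = value;
                return true;
            }
            i = (i + 1) & mask;  // linear probing: try the next slot
        }
        return false;
    }

    // Writes the stored value to `out` and returns true if the key is present.
    // An empty slot ends the search because this sketch never deletes entries.
    bool find(std::uint32_t key, std::uint32_t& out) const {
        const std::size_t mask = slots_.size() - 1;
        std::size_t i = hash(key) & mask;
        for (std::size_t probes = 0; probes < slots_.size(); ++probes) {
            const Slot& s = slots_[i];
            if (!s.occupied) return false;
            if (s.key == key) {
                out = s.value;
                return true;
            }
            i = (i + 1) & mask;
        }
        return false;
    }

    std::size_t size() const { return size_; }
};
```

A caller would construct the map with a power-of-two capacity, call `insert(key, value)` to store pairs, and call `find(key, out)` to read them back. While the table is lightly loaded, each operation is expected to touch only a few slots, which is the intuition behind the constant-time behavior the excerpt mentions.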

Using the NVIDIA CUDA Stream-Ordered Memory Allocator, Part 2
Author: Jake Hemstad
URL: http://www.open-lab.net/blog/?p=35152
Updated: 2022-08-21T23:52:21Z
Published: 2021-07-27T20:47:33Z

In part 1 of this series, we introduced new API functions, cudaMallocAsync and cudaFreeAsync, that enable memory allocation and deallocation to be stream-ordered operations. In this post, we highlight the benefits of this new capability by sharing some big data benchmark results and provide a code migration guide for modifying your existing applications. We also cover advanced topics to take advantage of stream-ordered…

Using the NVIDIA CUDA Stream-Ordered Memory Allocator, Part 1
Author: Jake Hemstad
URL: http://www.open-lab.net/blog/?p=35109
Updated: 2022-08-21T23:52:19Z
Published: 2021-07-27T20:46:43Z

Most CUDA developers are familiar with the cudaMalloc and cudaFree API functions to allocate GPU-accessible memory. However, there has long been an obstacle with these API functions: they aren’t stream ordered. In this post, we introduce new API functions, cudaMallocAsync and cudaFreeAsync, that enable memory allocation and deallocation to be stream-ordered operations. In part 2 of this series, we highlight the benefits of this new…
