This post has been updated from the original 2016 version to add information about new intrinsics and cross-vendor APIs in DirectX and Vulkan. The NVIDIA GPU instruction set includes some useful intrinsic functions that are not exposed in standard graphics APIs. For example, a shader can use warp shuffle instructions to exchange data between threads in a warp without going through shared memory…
]]>