Scott Yokim – NVIDIA Technical Blog
News and tutorials for developers, data scientists, and IT admins
Updated: 2024-05-17T17:25:34Z
http://www.open-lab.net/blog/feed/

Tensor Ops Made Easier in cuDNN
Scott Yokim
http://www.open-lab.net/blog/?p=11502
Updated: 2022-08-21T23:38:58Z
Published: 2018-08-20T21:00:23Z

Neural network models have quickly taken advantage of NVIDIA Tensor Cores for deep learning since their introduction in the Tesla V100 GPU last year. For example, new performance records for ResNet50 training were announced recently with Tensor Core-based solutions. (See the NVIDIA developer post on new performance milestones for additional details.) NVIDIA’s cuDNN library enables CUDA…

Programming Tensor Cores in CUDA 9
Scott Yokim
http://www.open-lab.net/blog/parallelforall/?p=8496
Updated: 2024-05-17T17:25:34Z
Published: 2017-10-17T09:29:09Z

A defining feature of the new NVIDIA Volta GPU architecture is Tensor Cores, which give the NVIDIA V100 accelerator a peak throughput that is 12x the 32-bit floating-point throughput of the previous-generation NVIDIA P100. Tensor Cores let you use mixed precision for higher throughput without sacrificing accuracy, and they provide a large speedup for convolutions and matrix operations.
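To make the mixed-precision idea concrete, the following is a minimal CPU-side sketch in Python, not Tensor Core code. It assumes the usual mixed-precision pattern of half-precision (FP16) inputs with products accumulated in single precision (FP32), and compares it against accumulating in FP16 as well. The helper names (`to_fp16`, `to_fp32`, `dot_mixed`, `dot_fp16`) are hypothetical and exist only for this illustration; rounding is emulated with the standard `struct` module.

```python
import struct

def to_fp16(x):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("e", struct.pack("e", x))[0]

def to_fp32(x):
    """Round a Python float to the nearest IEEE 754 single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def dot_mixed(a, b):
    """Dot product with FP16 inputs and an FP32 accumulator (emulated)."""
    acc = 0.0
    for x, y in zip(a, b):
        # The product of two FP16 values fits exactly in FP32.
        acc = to_fp32(acc + to_fp16(x) * to_fp16(y))
    return acc

def dot_fp16(a, b):
    """Dot product with FP16 inputs and an FP16 accumulator, for comparison."""
    acc = 0.0
    for x, y in zip(a, b):
        prod = to_fp16(to_fp16(x) * to_fp16(y))
        acc = to_fp16(acc + prod)
    return acc

# Hypothetical test data: 4096 ones, so the exact dot product is 4096.
n = 4096
a = [1.0] * n
b = [1.0] * n

mixed_result = dot_mixed(a, b)
fp16_result = dot_fp16(a, b)
print("FP16 inputs, FP32 accumulate:", mixed_result)
print("FP16 inputs, FP16 accumulate:", fp16_result)
```

With these inputs the FP16 accumulator stops growing once it reaches 2048, because FP16 values between 2048 and 4096 are spaced 2 apart and adding 1 rounds back to the same value. The FP32 accumulator reaches the exact sum. This is one reason a wider accumulator can preserve accuracy while the inputs stay in a narrower format.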
