Nikolay Markovskiy – NVIDIA Technical Blog

News and tutorials for developers, data scientists, and IT admins
Feed: http://www.open-lab.net/blog/feed/ (updated 2023-07-05T19:23:50Z)

Setting New Records in MLPerf Inference v3.0 with Full-Stack Optimizations for AI
Nikolay Markovskiy, http://www.open-lab.net/blog/?p=62958
Updated 2023-07-05T19:23:50Z, published 2023-04-05T19:10:55Z

The most exciting computing applications currently rely on training and running inference on complex AI models, often in demanding, real-time deployment scenarios. High-performance, accelerated AI platforms are needed to meet the demands of these applications and deliver the best user experiences. New AI models are constantly being invented to enable new capabilities…

Comments: 0

Neural Machine Translation Inference with TensorRT 4
Nikolay Markovskiy, http://www.open-lab.net/blog/?p=17146
Updated 2023-03-14T19:00:03Z, published 2018-07-18T19:00:00Z

Neural machine translation exists across a wide variety of consumer applications, including web sites, road signs, generating subtitles in foreign languages, and more. TensorRT, NVIDIA’s programmable inference accelerator, helps optimize and generate runtime engines for deploying deep learning inference apps to production environments. NVIDIA released TensorRT 4 with new features to accelerate…

Comments: 2

Drop-in Acceleration of GNU Octave
Nikolay Markovskiy, http://www.open-lab.net/blog/parallelforall/?p=3188
Updated 2022-08-21T23:37:04Z, published 2014-06-05T16:20:10Z

cuBLAS is an implementation of the BLAS library that leverages the teraflops of performance provided by NVIDIA GPUs. However, cuBLAS cannot be used as a direct BLAS replacement for applications originally intended to run on the CPU. In order to use the cuBLAS API: Such an API permits the fine tuning required to minimize redundant data copies to and from the GPU in arbitrarily complicated…

Comments: 10