In March of this year, we presented retinanet-examples, an open-source example of how to accelerate the training and deployment of an object detection pipeline on GPUs, at NVIDIA’s GPU Technology Conference in San Jose. This post discusses the motivation for this work, gives a high-level description of the architecture, and takes a brief look under the hood at the optimizations we…
NVIDIA TensorRT is a high-performance deep learning inference optimizer and runtime that delivers low-latency, high-throughput inference for deep learning applications. NVIDIA released TensorRT last year with the goal of accelerating deep learning inference for production deployment. In this post, we’ll introduce TensorRT 3, which improves performance over previous versions and includes new…