TensorRT? ??? Deep Learning ??? ????? NVIDIA GPU ???? Inference ??? ?? ~ ??? ?? ???? Deep Learning ??? TCO (Total Cost of Ownership) ? ????? ??? ? ? ?? ?? ??? ?????. ? ?????? Deep Learning ??? ?? ? ??? ???? ?? ?? ??? ? ?????? ?? TensorRT? ??? ????, ?? ??? TensorRT ?? ??? ?? ???? ???? ??? ???? ???. TensorRT? project? ????? ????, ?? ? ????? ???? ??? ??? ??, NVIDIA? Deep Learning ?? ???? Solutions Architect ? Data Scientist?? ??? ????? ????.
- Introduction
TensorRT? NVIDIA GPU ??? ??? ??? ???? ???? ??? ????? Optimizer? ??? GPU?? ????? ???? Runtime Engine? ?????. TensorRT? ???? Deep Learning Frameworks (TensorFlow, PyTorch ?) ?? ??? ??? ????, NVIDIA Datacenter, Automotive, Embedded ??? ? ???? NVIDIA GPU ???? ??? ???? ?? ????, ??? Deep Learning model Inference ??? ?????.

- TensorRT developments
TensorRT? C++ ? Python ??? API ???? ???? ?? ???, GPU programming language? CUDA ??? ??? ???? Deep Learning ??? ????? ?? ??? ? ???, Figure 2.? ??? ?? ??? ? ????.

TensorRT? GPU? ???? ?? ??? ??? ?? ??? ???? ??? ? ??? Runtime binary? ????? ???, Latency ? Throughput? ?? ???? ? ??, ?? ?? Deep Learning ?? ???? ? ???? ???? ??? ?????. ?? ?? ??? NVIDIA? Datacenter, Automotive, Embedded ?? ?? ???? ?? ??? Kernel? ???? ? ??, ? ????? ?? ??? ???? ?????. ??? ??? Deep Learning layer ? ??? ??? customization? ? ?? ???? ???? ?? ?? ????? ???? TensorRT? ??? ? ??? ??? ????. ?? ??? ??? ????? ?? ???? ???? ??? ?? ??? ?????.
- TensorRT optimizations
TensorRT? NVIDIA platform?? ??? Inference ??? ? ? ??? Network compression, Network optimization ??? GPU ??? ???? ?? Deep Learning ??? ???? ?????. ????, TensorRT?? Inference ??? ?? ???? ???? ???? ??? ????? ?????.

- Quantization & Precision Calibration
?? Deep Learning Training ? Inference??? Precision reduction ? ?? ???? ??? ?????. ?? Precision? Network? ?? data? ?? ? weight?? bit?? ?? ??? ? ??? ???? ??? ?????. ?? ?? Quantization ??? ?, TensorRT? Symmetric Linear Quantization(Figure 4.)? ???? ???, ?? ??? Deep Learning Framework? ???? FP32? data? FP16 ? INT8? data type?? precision? ?? ? ????.

????? FP16??? precision down-scale? Network? accuracy drop? ? ??? ??? ???, INT8?? down-scale? accuracy drop? ??? ? ??? Network? ????? ???? calibration ???? ?????. (Figure 5. ??) ?? ??? TensorRT???EntronpyCalibrator, EntropyCalibrator2 ??? MinMaxCalibrator? ???? ???, ?? ???? quantization? weight ? intermediate tensor?? ??? ??? ??? ? ? ????.

- Graph Optimization
????? Graph Optimization? Deep Learning Network?? ???? primitive ?? ??, compound ?? ??? graph node?? ? platform? ???? code? ???? ??? ?????. TensorRT??? ?? ???? Layer Fusion ??? Tensor Fusion ??? ??? ???? ????. Figure 6? ?? Layer Fusion? Vertical Layer Fusion, Horizontal Layer Fusion ??? Tensor Fusion? ???? model graph? ??? ???? ?? ??? model? layer ??? ?? ???? ???.

- Kernel Auto-tuning
?? ??? ?? ?? TensorRT? NVIDIA? ??? platform ? architecture? ?? Runtime ??? ?? ???. ? ???? CUDA engine? ??, architecture, memory ??? specialized engine ?? ??? ?? optimize? kernel ? ??? ??? ?? TensorRT Runtime engine build?? ????? ???? ??? engine binary ??? ????.
- Dynamic Tensor Memory & Multi-stream execution
? ??? Memory management? ??? footprint? ?? ???? ? ? ??? ???? Dynamic tensor memory ??? ?? ??, CUDA stream ??? ???? multiple input stream? scheduling? ?? ?? ??? ??? ? ? ?? Multi-stream execution?? ?? ?????.
- TensorRT performances
?? ??? TenosrRT? ???? ??? ?? ? ?? ?? ??? ????? ?????. ???? ResNet50 ???? ?? ??? GPU?? TensorRT? ???? ????? ?? 8? ??? ?? ?? ??? ????. (Figure 7. ??) ??? ?? ?? ?? ??? ??? ??? TensorRT Inference? Deep Learning ???? ???? ??? ??? ???? ??? ??? ? ??? ?? ?? ? ????. ?? ??? ?? ?? ??? ??? URL (http://www.open-lab.net/deep-learning-performance-training-Inference#deeplearningperformance_Inference) ?? NVIDIA platform ? Network?? benchmark?? ??? ?? ? ? ????.

- ??
???? NVIDIA? Deep Learning Inference ??? ?? solution? TensorRT? ??? ???????. TensorRT? ??? Deep Learning Framework? ???? ?? training ? Neural Network?? ? domain? ?? NVIDIA? GPU ????? ????? Inference? ?? ?? Toolkit ?? library???. ?? ???? ??? ? ?? ??? ?? Deep Learning ?? ??? ?? ?? ??? ?? ??? ??? ???? ????. ???? ?? ?? ?? ? ?? ?? ?? ?? ??? NVIDIA?? ??? ??? ???? ???? ?? ??? ????.