The NVIDIA Triton Inference Server, previously known as the TensorRT Inference Server, is now available from NVIDIA NGC or via GitHub. The NVIDIA Triton Inference Server helps developers and IT/DevOps teams easily deploy a high-performance inference server in the cloud, in an on-premises data center, or at the edge. The server provides an inference service via an HTTP/REST or gRPC endpoint.
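As a minimal illustration of the HTTP/REST endpoint, the sketch below pings a running server's readiness route; it assumes Triton's v2 (KServe-compatible) HTTP API on the default port 8000, and `localhost` is a placeholder host.

```python
import requests

# Placeholder host; Triton's HTTP endpoint listens on port 8000 by default.
TRITON_URL = "http://localhost:8000"

# Query the v2 readiness route; a 200 response means the server is ready
# to accept inference requests.
resp = requests.get(f"{TRITON_URL}/v2/health/ready")
print("Server ready:", resp.status_code == 200)
```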