Author: J Wyman | NVIDIA Technical Blog

J Wyman

J (Jeremy) Wyman is a senior system software architect at NVIDIA specializing in AI and distributed systems. His work focuses on NVIDIA Triton Inference Server and on NVIDIA's next generation of inference serving products and solutions.

Posts by J Wyman

Conversational AI Oct 22, 2024

Scaling LLMs with NVIDIA Triton and NVIDIA TensorRT-LLM Using Kubernetes

Large language models (LLMs) have been widely used for chatbots, content generation, summarization, classification, translation, and more. State-of-the-art LLMs...

16 MIN READ