J Wyman

J (Jeremy) Wyman is a senior system software architect at NVIDIA specializing in AI and distributed systems. His work focuses on NVIDIA Triton Inference Server and the next generation of NVIDIA inference serving products and solutions.

Posts by J Wyman

Conversational AI

Scaling LLMs with NVIDIA Triton and NVIDIA TensorRT-LLM Using Kubernetes

Large language models (LLMs) have been widely used for chatbots, content generation, summarization, classification, translation, and more. State-of-the-art LLMs...

16 MIN READ