Approximately 220 teams gathered at the Open Data Science Conference (ODSC) West this year to compete in the NVIDIA hackathon, a 24-hour machine learning (ML) competition. Data scientists and engineers designed models that were evaluated based on accuracy and processing speed. The top three teams walked away with prize packages that included NVIDIA RTX Ada Generation GPUs, Google Colab credits…
]]>Includes C++ runtime support in Windows Support, Enhanced Dynamic Shape support in Converters, PyTorch 2.4, CUDA 12.4, TensorRT 10.1, Python 3.12.
]]>As large language models (LLMs) continue to grow in size and complexity, the performance requirements for serving them quickly and cost-effectively continue to grow. Delivering high LLM inference performance requires an efficient parallel computing architecture and a flexible and highly optimized software stack. Recently, NVIDIA Hopper GPUs running NVIDIA TensorRT-LLM inference software set…
]]>At Google I/O 2024, Google announced Firebase Genkit, a new open-source framework for developers to add generative AI to web and mobile applications using models like Google Gemini, Google Gemma. With Firebase Genkit, you can build apps that integrate intelligent agents, automate customer support, use semantic search, and convert unstructured data into insights. Genkit also includes a developer UI…
]]>NVIDIA today announced the latest release of NVIDIA TensorRT, an ecosystem of APIs for high-performance deep learning inference. TensorRT includes inference runtimes and model optimizations that deliver low latency and high throughput for production applications. This post outlines the key features and upgrades of this release, including easier installation, increased usability…
]]>