CUDA

Oct 09, 2024

Just Released: Updated Math Libraries in CUDA Toolkit 12.6.2

CUDA Toolkit 12.6.2 improves performance and provides new features in cuBLAS, cuSOLVER, and cuFFT LTO libraries.

1 MIN READ

Oct 07, 2024

Accelerating Reality Capture Workflows with AI and NVIDIA RTX GPUs

Reality capture creates highly accurate, detailed, and immersive digital representations of environments. Innovations in site scanning and accelerated data...

10 MIN READ

Oct 02, 2024

Webinar: Accelerating Python with GPUs

Join us on October 9 to learn how your applications can benefit from NVIDIA CUDA Python software initiatives.

1 MIN READ

Oct 02, 2024

Accelerating LLMs with llama.cpp on NVIDIA RTX Systems

The NVIDIA RTX AI for Windows PCs platform offers a thriving ecosystem of thousands of open-source models for application developers to leverage and integrate...

5 MIN READ

Sep 30, 2024

Advancing Quantum Algorithm Design with GPTs

AI techniques like large language models (LLMs) are rapidly transforming many scientific disciplines. Quantum computing is no exception. A collaboration between...

8 MIN READ

Sep 24, 2024

Accelerating Leaderboard-Topping ASR Models 10x with NVIDIA NeMo

NVIDIA NeMo has consistently developed automatic speech recognition (ASR) models that set the benchmark in the industry, particularly those topping the Hugging...

13 MIN READ

Sep 19, 2024

Just Released: Torch-TensorRT v2.4.0

Includes C++ runtime support on Windows, enhanced dynamic shape support in converters, and support for PyTorch 2.4, CUDA 12.4, TensorRT 10.1, and Python 3.12.

1 MIN READ

Sep 17, 2024

Accelerating Oracle Database Generative AI Workloads with NVIDIA NIM and NVIDIA cuVS

The vast majority of the world's data remains untapped, and enterprises are looking to generate value from this data by creating the next wave of generative AI...

6 MIN READ

Sep 11, 2024

Advanced Strategies for High-Performance GPU Programming with NVIDIA CUDA

Stephen Jones, a leading expert and distinguished NVIDIA CUDA architect, offers his guidance and insights with a deep dive into the complexities of mapping...

2 MIN READ

Sep 11, 2024

Constant Time Launch for Straight-Line CUDA Graphs and Other Performance Enhancements

CUDA Graphs are a way to define and batch GPU operations as a graph rather than a sequence of stream launches. A CUDA Graph groups a set of CUDA kernels and...

8 MIN READ

Sep 10, 2024

Accelerating the HPCG Benchmark with NVIDIA Math Sparse Libraries

In the realm of high-performance computing (HPC), NVIDIA has continually advanced HPC by offering its highly optimized NVIDIA High-Performance Conjugate...

9 MIN READ

Sep 06, 2024

Enhancing Application Portability and Compatibility across New Platforms Using NVIDIA Magnum IO NVSHMEM 3.0

NVSHMEM is a parallel programming interface that provides efficient and scalable communication for NVIDIA GPU clusters. Part of NVIDIA Magnum IO and based on...

7 MIN READ

Aug 29, 2024

Spotlight: clicOH Accelerates Last-Mile Delivery 20x with NVIDIA cuOpt

Driven by shifts in consumer behavior and the pandemic, e-commerce continues its explosive growth and transformation. As a result, logistics and transportation...

3 MIN READ

Aug 29, 2024

Boosting CUDA Efficiency with Essential Techniques for New Developers

To fully harness the capabilities of NVIDIA GPUs, optimizing NVIDIA CUDA performance is essential, particularly for developers new to GPU programming. This talk...

2 MIN READ

Aug 08, 2024

Improving GPU Performance by Reducing Instruction Cache Misses

GPUs are specially designed to crunch through massive amounts of data at high speed. They have a large amount of compute resources, called streaming...

11 MIN READ

Aug 07, 2024

Optimizing llama.cpp AI Inference with CUDA Graphs

The open-source llama.cpp code base was originally released in 2023 as a lightweight but efficient framework for performing inference on Meta Llama models....

8 MIN READ