Riva

Mar 11, 2025

Build Real-Time Multimodal XR Apps with NVIDIA AI Blueprint for Video Search and Summarization

With the recent advancements in generative AI and vision foundational models, VLMs present a new wave of visual computing wherein the models are capable of...

9 MIN READ

Feb 20, 2025

Deploying NVIDIA Riva Multilingual ASR with Whisper and Canary Architectures While Selectively Deactivating NMT

NVIDIA has consistently developed automatic speech recognition (ASR) models that set the benchmark in the industry. Earlier versions of NVIDIA Riva, a...

12 MIN READ

Dec 16, 2024

An Easy Introduction to Multimodal Retrieval-Augmented Generation for Video and Audio

Building a multimodal retrieval-augmented generation (RAG) system is challenging. The difficulty comes from capturing and indexing information from across...

12 MIN READ

Nov 13, 2024

Expanding AI Agent Interface Options with 2D and 3D Digital Human Avatars

When interfacing with generative AI applications, users have multiple communication options—text, voice, or through digital avatars. Traditional chatbot...

5 MIN READ

Sep 25, 2024

Build a Digital Human Interface for AI Apps with an NVIDIA NIM Agent Blueprint

Providing customers with quality service remains a top priority for businesses across industries, from answering questions and troubleshooting issues to...

5 MIN READ

Sep 18, 2024

Quickly Voice Your Apps with NVIDIA NIM Microservices for Speech and Translation

NVIDIA NIM, part of NVIDIA AI Enterprise, provides containers to self-host GPU-accelerated inferencing microservices for pretrained and customized AI models...

11 MIN READ

Mar 19, 2024

NVIDIA Speech and Translation AI Models Set Records for Speed and Accuracy

Speech and translation AI models developed at NVIDIA are pushing the boundaries of performance and innovation. The NVIDIA Parakeet automatic speech recognition...

8 MIN READ

Mar 18, 2024

How to Take a RAG Application from Pilot to Production in Four Steps

Generative AI has the potential to transform every industry. Human workers are already using large language models (LLMs) to explain, reason about, and solve...

8 MIN READ

Jan 09, 2024

Enhancing Phone Customer Service with ASR Customization

At the core of understanding people correctly and having natural conversations is automatic speech recognition (ASR). To make customer-led voice assistants and...

7 MIN READ

Jan 08, 2024

Spotlight: Convai Reinvents Non-Playable Character Interactions

Convai is a versatile developer platform for designing characters with advanced multimodal perception abilities. These characters are designed to integrate...

5 MIN READ

Dec 04, 2023

Create Lifelike Avatars with AI Animation and Speech Features in NVIDIA ACE

NVIDIA today unveiled major upgrades to the NVIDIA Avatar Cloud Engine (ACE) suite of technologies, bringing enhanced realism and accessibility to AI-powered...

3 MIN READ

Nov 29, 2023

Boost Meeting Productivity with AI-Powered Note-Taking and Summarization

Meetings are the lifeblood of an organization. They foster collaboration and informed decision-making. They eliminate silos through brainstorming and...

6 MIN READ

Nov 07, 2023

Video: Exploring Speech AI from Research to Practical Production Applications

The integration of speech and translation AI into our daily lives is rapidly reshaping our interactions, from virtual assistants to call centers and augmented...

2 MIN READ

Sep 20, 2023

Workshop: Building Conversational AI Applications

Learn how to build and deploy production-quality conversational AI apps with real-time transcription and NLP.

1 MIN READ

Aug 29, 2023

How to Deploy NVIDIA Riva Speech and Translation AI in the Public Cloud

From start-ups to large enterprises, businesses use cloud marketplaces to find the new solutions needed to quickly transform their businesses. Cloud...

16 MIN READ

Jun 23, 2023

Speech AI Spotlight: Visualizing Spoken Language and Sounds on AR Glasses

Audio can include a wide range of sounds, from human speech to non-speech sounds like barking dogs and sirens. When designing accessible applications for people...

4 MIN READ