Riva – NVIDIA Technical Blog

News and tutorials for developers, data scientists, and IT admins 2025-03-27T16:00:00Z http://www.open-lab.net/blog/feed/

Shubham Agrawal <![CDATA[Build Real-Time Multimodal XR Apps with NVIDIA AI Blueprint for Video Search and Summarization]]> http://www.open-lab.net/blog/?p=96842 2025-03-12T22:08:59Z 2025-03-11T17:30:00Z

With the recent advancements in generative AI and vision foundational models, VLMs present a new wave of visual computing wherein the models are capable of highly sophisticated perception and deep contextual understanding. These intelligent solutions offer a promising means of enhancing semantic comprehension in XR settings. By integrating VLMs, developers can significantly improve how XR...

]]> 0 Sven Chilton <![CDATA[Deploying NVIDIA Riva Multilingual ASR with Whisper and Canary Architectures While Selectively Deactivating NMT]]> http://www.open-lab.net/blog/?p=95339 2025-03-06T19:26:55Z 2025-02-20T18:54:48Z

NVIDIA has consistently developed automatic speech recognition (ASR) models that set the benchmark in the industry. Earlier versions of NVIDIA Riva, a collection of GPU-accelerated speech and translation AI microservices for ASR, TTS, and NMT, support English-Spanish and English-Japanese code-switching ASR models based on the Conformer architecture, along with a model supporting multiple...

]]> 0 Tanay Varshney <![CDATA[An Easy Introduction to Multimodal Retrieval-Augmented Generation for Video and Audio]]> http://www.open-lab.net/blog/?p=93893 2024-12-16T21:53:48Z 2024-12-16T17:00:00Z

Building a multimodal retrieval-augmented generation (RAG) system is challenging. The difficulty comes from capturing and indexing information from across multiple modalities, including text, images, tables, audio, video, and more. In our previous post, An Easy Introduction to Multimodal Retrieval-Augmented Generation, we discussed how to tackle text and images. This post extends this conversation...

]]> 0 Trisha Tripathi <![CDATA[Expanding AI Agent Interface Options with 2D and 3D Digital Human Avatars]]> http://www.open-lab.net/blog/?p=91882 2024-11-14T17:10:33Z 2024-11-14T00:53:23Z

When interfacing with generative AI applications, users have multiple communication options: text, voice, or through digital avatars. Traditional chatbot or copilot applications have text interfaces where users type in queries and receive text-based responses. For hands-free communication, speech AI technologies like automatic speech recognition (ASR) and text-to-speech (TTS) facilitate...

]]> 1 Vinay Bagade <![CDATA[Build a Digital Human Interface for AI Apps with an NVIDIA NIM Agent Blueprint]]> http://www.open-lab.net/blog/?p=89345 2024-10-22T20:34:33Z 2024-09-25T20:30:00Z

Providing customers with quality service remains a top priority for businesses across industries, from answering questions and troubleshooting issues to facilitating online orders. As businesses scale operations and expand offerings globally to compete, the demand for seamless customer service grows exponentially. Searching knowledge base articles or navigating complex phone trees can be a...

]]> 0 Sven Chilton <![CDATA[Quickly Voice Your Apps with NVIDIA NIM Microservices for Speech and Translation]]> http://www.open-lab.net/blog/?p=89142 2024-09-19T20:17:19Z 2024-09-18T22:48:43Z

NVIDIA NIM, part of NVIDIA AI Enterprise, provides containers to self-host GPU-accelerated inferencing microservices for pretrained and customized AI models across clouds, data centers, and workstations. NIM microservices for speech and translation are now available. The new speech and translation microservices leverage NVIDIA Riva and provide automatic speech recognition (ASR)...

]]> 0 Gordana Neskovic <![CDATA[NVIDIA Speech and Translation AI Models Set Records for Speed and Accuracy]]> http://www.open-lab.net/blog/?p=79365 2024-08-12T16:09:12Z 2024-03-19T16:00:00Z

Speech and translation AI models developed at NVIDIA are pushing the boundaries of performance and innovation. The NVIDIA Parakeet automatic speech recognition (ASR) family of models and the NVIDIA Canary multilingual, multitask ASR and translation model currently top the Hugging Face Open ASR Leaderboard. In addition, a multilingual P-Flow-based text-to-speech (TTS) model won the LIMMITS '24...

]]> 0 Jacob Liberman <![CDATA[How to Take a RAG Application from Pilot to Production in Four Steps]]> http://www.open-lab.net/blog/?p=79558 2024-10-28T21:58:37Z 2024-03-18T22:00:00Z

Generative AI has the potential to transform every industry. Human workers are already using large language models (LLMs) to explain, reason about, and solve difficult cognitive tasks. Retrieval-augmented generation (RAG) connects LLMs to data, expanding the usefulness of LLMs by giving them access to up-to-date and accurate information. Many enterprises have already started to explore how...

]]> 0 Paweł Budzianowski <![CDATA[Enhancing Phone Customer Service with ASR Customization]]> http://www.open-lab.net/blog/?p=75584 2024-01-25T18:17:37Z 2024-01-09T17:00:00Z

At the core of understanding people correctly and having natural conversations is automatic speech recognition (ASR). To make customer-led voice assistants and automate customer service interactions over the phone, companies must solve the unique challenge of gaining a caller's trust through qualities such as understanding, empathy, and clarity. Telephony-bound voice is inherently challenging...

]]> 0 Yasmina Benkhoui <![CDATA[Spotlight: Convai Reinvents Non-Playable Character Interactions]]> http://www.open-lab.net/blog/?p=76184 2024-01-25T18:17:40Z 2024-01-08T16:30:00Z

Convai is a versatile developer platform for designing characters with advanced multimodal perception abilities. These characters are designed to integrate seamlessly into both the virtual and real worlds. Whether you're a creator, game designer, or developer, Convai enables you to quickly modify a non-playable character (NPC), from backstory and knowledge to voice and personality.

]]> 0 Ike Nnoli <![CDATA[Create Lifelike Avatars with AI Animation and Speech Features in NVIDIA ACE]]> http://www.open-lab.net/blog/?p=74159 2024-11-20T23:02:47Z 2023-12-04T22:00:00Z

NVIDIA today unveiled major upgrades to the NVIDIA Avatar Cloud Engine (ACE) suite of technologies, bringing enhanced realism and accessibility to AI-powered avatars and digital humans. These latest animation and speech capabilities enable more natural conversations and emotional expressions. Developers can now easily implement and scale intelligent avatars across applications using new...

]]> 0 Mohamed Elshenawy <![CDATA[Boost Meeting Productivity with AI-Powered Note-Taking and Summarization]]> http://www.open-lab.net/blog/?p=73964 2023-12-14T19:27:34Z 2023-11-29T21:00:00Z

Meetings are the lifeblood of an organization. They foster collaboration and informed decision-making. They eliminate silos through brainstorming and problem-solving. And they further strategic goals and planning. Yet, leading meetings that accomplish these goals, especially those involving cross-functional teams and external participants, can be challenging. A unique blend of people...

]]> 0 Belen Tegegn <![CDATA[Video: Exploring Speech AI from Research to Practical Production Applications]]> http://www.open-lab.net/blog/?p=72433 2023-11-16T19:16:46Z 2023-11-07T16:07:22Z

The integration of speech and translation AI into our daily lives is rapidly reshaping our interactions, from virtual assistants to call centers and augmented reality experiences. Speech AI Day provided valuable insights into the latest advancements in speech AI, showcasing how this technology addresses real-world challenges. In this first of three Speech AI Day sessions...

]]> 0 Tanya Lenz <![CDATA[Workshop: Building Conversational AI Applications]]> http://www.open-lab.net/blog/?p=70919 2023-11-03T07:14:57Z 2023-09-20T17:00:00Z

Learn how to build and deploy production-quality conversational AI apps with real-time transcription and NLP.

]]> 0 Sven Chilton <![CDATA[How to Deploy NVIDIA Riva Speech and Translation AI in the Public Cloud]]> http://www.open-lab.net/blog/?p=69702 2023-10-20T18:13:34Z 2023-08-29T17:00:00Z

From start-ups to large enterprises, businesses use cloud marketplaces to find the new solutions needed to quickly transform their businesses. Cloud marketplaces are online storefronts where customers can purchase software and services with flexible billing models, including pay-as-you-go, subscriptions, and privately negotiated offers. Businesses further benefit from committed spending at...

]]> 0 Sirisha Rella <![CDATA[Speech AI Spotlight: Visualizing Spoken Language and Sounds on AR Glasses]]> http://www.open-lab.net/blog/?p=66701 2023-07-13T19:00:30Z 2023-06-23T15:00:00Z

Audio can include a wide range of sounds, from human speech to non-speech sounds like barking dogs and sirens. When designing accessible applications for people with hearing difficulties, the application should be able to recognize sounds and understand speech. Such technology would help deaf or hard-of-hearing individuals with visualizing speech, like human conversations and non-speech...

]]> 1 Caroline Gottlieb <![CDATA[Unlocking Speech AI Technology for Global Language Users: Top Q&As]]> http://www.open-lab.net/blog/?p=66216 2023-11-03T07:15:00Z 2023-06-06T17:00:00Z

Voice-enabled technology is becoming ubiquitous. But many are being left behind by an anglocentric and demographically biased algorithmic world. Mozilla Common Voice (MCV) and NVIDIA are collaborating to change that by partnering on a public crowdsourced multilingual speech corpus (now the largest of its kind in the world) and open-source pretrained models. It is now easier than ever before to...

]]> 0 Vishal Manchanda <![CDATA[How Language Neutralization Is Transforming Customer Service Contact Centers]]> http://www.open-lab.net/blog/?p=65761 2023-10-30T23:18:55Z 2023-05-30T22:58:34Z

According to Gartner, "Nearly half of digital workers struggle to find the data they need to do their jobs, and close to one-third have made a wrong business decision due to lack of information awareness."[1] To address this challenge, more and more enterprises are deploying AI in customer service, as it helps to provide more efficient and information-based personalized services.

]]> 0 Swaroop Kumar <![CDATA[Enhancing Customer Experience in Telecom with NVIDIA Customized Speech AI]]> http://www.open-lab.net/blog/?p=65421 2023-10-30T23:24:55Z 2023-05-30T15:00:00Z

The telecom sector is transforming how communication happens. Striving to provide reliable, uninterrupted service, businesses are tackling the challenge of delivering an optimal customer experience. This optimal customer experience is something many long-time customers of large telecom service providers do not have. Take Jack, for example. His call was on hold for 10 minutes...

]]> 0 Ike Nnoli <![CDATA[Generative AI Sparks Life into Virtual Characters with NVIDIA ACE for Games]]> http://www.open-lab.net/blog/?p=65490 2024-11-20T23:04:23Z 2023-05-29T03:30:00Z

Generative AI technologies are revolutionizing how games are conceived, produced, and played. Game developers are exploring how these technologies impact 2D and 3D content-creation pipelines during production. Part of the excitement comes from the ability to create gaming experiences at runtime that would have been impossible using earlier solutions. The creation of non-playable characters...

]]> 0 Michelle Horton <![CDATA[Webinar: Empower Telco Contact Center Agents with Multi-Language Speech-AI-Customized Agent Assists]]> http://www.open-lab.net/blog/?p=64106 2023-08-18T20:52:27Z 2023-05-16T16:00:00Z

Join Infosys, NVIDIA, and Quantiphi on June 7 to learn how to use speech and translation AI to improve agent-assist solutions in multiple languages.

]]> 0 Angie Lee <![CDATA[Explainer: What Is Agent Assist?]]> http://www.open-lab.net/blog/?p=64129 2023-06-13T17:35:33Z 2023-05-04T18:00:00Z

Agent-assist technology uses AI and ML to provide facts and make real-time suggestions that help human agents across retail, telecom, and other industries conduct conversations with customers.

]]> 0 Kristen Rumley <![CDATA[How Speech Recognition Improves Customer Service in Telecommunications]]> http://www.open-lab.net/blog/?p=63789 2023-11-03T07:15:01Z 2023-05-02T16:00:00Z

The telecommunication industry has seen a proliferation of AI-powered technologies in recent years, with speech recognition and translation leading the charge. Multi-lingual AI virtual assistants, digital humans, chatbots, agent assists, and audio transcription are technologies that are revolutionizing the telco industry. Businesses are implementing AI in call centers to address incoming requests...

]]> 0 Michelle Horton <![CDATA[Webinar: How Telcos Transform Customer Experiences with Conversational AI]]> http://www.open-lab.net/blog/?p=64118 2023-08-18T20:52:43Z 2023-05-02T16:00:00Z

Join Infosys, Quantiphi, Talkmap, and NVIDIA on May 31 for a live webinar to learn how telecommunications companies are using AI to improve operational efficiency and enhance customer engagement.

]]> 0 Sirisha Rella <![CDATA[Exploring Unique Applications of Text-to-Speech Technology]]> http://www.open-lab.net/blog/?p=62914 2023-06-09T22:32:18Z 2023-04-19T18:29:00Z

When interacting with a virtual assistant, you give a command and receive a verbal response. The technology powering this generated voice response is known as text-to-speech (TTS). TTS applications are highly useful as they enable greater content accessibility for those who use assistive devices. With the latest TTS techniques, you can generate a synthetic voice from only a few minutes of...

]]> 0 Kristen Rumley <![CDATA[Workshop: How to Enable Your Product with Voice Interface]]> http://www.open-lab.net/blog/?p=63525 2023-08-18T20:50:31Z 2023-04-19T16:29:08Z

This hands-on workshop guides you through the process of voice-enabling your product, from familiarizing yourself with NVIDIA Riva to assessing the costs and resources required for your project.

]]> 0 Sean Wagstaff <![CDATA[Create XR Experiences Using Natural-Language Voice Commands: Test Project Mellon]]> http://www.open-lab.net/blog/?p=62285 2023-11-03T07:15:05Z 2023-03-23T15:00:00Z

Project Mellon is a lightweight Python package capable of harnessing the heavyweight power of speech AI (NVIDIA Riva) and large language models (LLMs) (NVIDIA NeMo service) to simplify user interactions in immersive environments. NVIDIA announced at NVIDIA GTC 2023 that developers can start testing Project Mellon to explore creating hands-free extended reality (XR) experiences controlled by...

]]> 1 Nicola Sessions <![CDATA[ICYMI: New and Updated AI Workflows Announced at NVIDIA GTC 2023]]> http://www.open-lab.net/blog/?p=62115 2023-03-30T17:43:51Z 2023-03-22T15:00:00Z

NVIDIA showed how AI workflows can be leveraged to help you accelerate the development of AI solutions to address a range of use cases at NVIDIA GTC 2023. AI workflows are cloud-native, packaged reference examples showing how NVIDIA AI frameworks can be used to efficiently build AI solutions such as intelligent virtual assistants, digital fingerprinting for cybersecurity...

]]> 0 Michelle Horton <![CDATA[Top Video Streaming and Conferencing Sessions at NVIDIA GTC 2023]]> http://www.open-lab.net/blog/?p=61472 2023-11-02T20:17:49Z 2023-03-03T17:00:00Z

Learn about advancements in video conferencing that have transformed how we communicate.

]]> 0 Michelle Horton <![CDATA[Top Conversational AI Sessions at NVIDIA GTC 2023]]> http://www.open-lab.net/blog/?p=61425 2023-03-09T19:18:45Z 2023-02-28T19:30:27Z

Learn about the latest tools, trends, and technologies for building and deploying conversational AI.

]]> 0 Michelle Horton <![CDATA[Top Speech AI Developer Day Sessions at NVIDIA GTC 2023]]> http://www.open-lab.net/blog/?p=60997 2023-03-14T19:01:01Z 2023-02-14T22:00:00Z

Explore the latest advances in accurate and customizable automatic speech recognition, multi-language translation, and text-to-speech.

]]> 0 David Taubenheim <![CDATA[Speech AI Spotlight: How Pendulum Nabs Harmful Narratives Online]]> http://www.open-lab.net/blog/?p=60694 2023-11-03T07:15:05Z 2023-02-08T17:00:00Z

Over 55% of the global population uses social media, easily sharing online content with just one click. While connecting with others and consuming entertaining content, you can also spot harmful narratives posing real-life threats. That's why VP of Engineering at Pendulum, Ammar Haris, wants his company's AI to help clients to gain deeper insight into the harmful content being generated...

]]> 1 Michelle Horton <![CDATA[New Hands-on Lab: Intelligent Virtual Assistant]]> http://www.open-lab.net/blog/?p=60074 2023-06-12T07:58:15Z 2023-01-24T21:00:00Z

Learn to build an engaging and intelligent virtual assistant with NVIDIA AI workflows powered by NVIDIA Riva in this free hands-on lab from NVIDIA LaunchPad...

]]> 0 Maggie Zhang <![CDATA[Autoscaling NVIDIA Riva Deployment with Kubernetes for Speech AI in Production]]> http://www.open-lab.net/blog/?p=59514 2023-10-20T18:16:30Z 2023-01-12T17:30:00Z

Speech AI applications, from call centers to virtual assistants, rely heavily on automatic speech recognition (ASR) and text-to-speech (TTS). ASR can process the audio signal and transcribe the audio to text. Speech synthesis or TTS can generate high-quality, natural-sounding audio from the text in real time. The challenge of Speech AI is to achieve high accuracy and meet the latency requirements...

]]> 0 Gordana Neskovic <![CDATA[Upcoming Webinar: Building an Intelligent Virtual Assistant for Financial Services]]> http://www.open-lab.net/blog/?p=59470 2023-08-18T20:43:13Z 2023-01-11T17:00:00Z

Join this webinar on January 25 and learn how to build a voice-enabled intelligent virtual assistant to improve customer experiences at contact centers.

]]> 0 Sirisha Rella <![CDATA[Speech AI Technology Enables Natural Interactions with Service Robots]]> http://www.open-lab.net/blog/?p=59175 2023-06-12T08:17:08Z 2022-12-17T00:23:07Z

From taking your order and serving you food in a restaurant to playing poker with you, service robots are becoming increasingly prevalent. Globally, you can...]]>

From taking your order and serving you food in a restaurant to playing poker with you, service robots are becoming increasingly prevalent. Globally, you can find these service robots at hospitals, airports, and retail stores. According to Gartner, by 2030, 80% of humans will engage with smart robots daily, due to smart robot advancements in intelligence, social interactions...

]]> 0 Sven Chilton <![CDATA[Reducing Development Time for Intelligent Virtual Assistants in Contact Centers]]> http://www.open-lab.net/blog/?p=58450 2023-08-22T20:30:40Z 2022-12-15T16:00:00Z

As the global service economy grows, companies rely increasingly on contact centers to drive better customer experiences, increase customer satisfaction, and...]]>

NVIDIA Speech AI Riva

As the global service economy grows, companies rely increasingly on contact centers to drive better customer experiences, increase customer satisfaction, and lower costs with increased efficiencies. Customer demand has increased far more rapidly than contact center employment ever could. Combined with the high agent churn rate, customer demand creates a need for more automated real-time customer...

]]> 0 Davide Onofrio <![CDATA[Introducing NVIDIA Riva: A GPU-Accelerated SDK for Developing Speech AI Applications]]> http://www.open-lab.net/blog/?p=17451 2023-05-22T22:12:28Z 2022-12-08T23:37:19Z

This post was updated in March 2023. Sign up for the latest Speech AI news from NVIDIA. Speech AI is used in a variety of applications, including contact...]]>

This post was updated in March 2023. Sign up for the latest Speech AI news from NVIDIA. Speech AI is used in a variety of applications, including contact centers' agent assists for empowering human agents, voice interfaces for intelligent virtual assistants (IVAs), and live captioning in video conferencing. To support these features, speech AI technology includes automatic speech recognition...

]]> 3 Vinh Nguyen <![CDATA[Making an NVIDIA Riva ASR Service for a New Language]]> http://www.open-lab.net/blog/?p=50426 2024-08-28T14:49:34Z 2022-10-28T17:00:00Z

Speech AI is the ability of intelligent systems to communicate with users using a voice-based interface, which has become ubiquitous in everyday life. People...]]>

Speech AI is the ability of intelligent systems to communicate with users using a voice-based interface, which has become ubiquitous in everyday life. People regularly interact with smart home devices, in-car assistants, and phones through speech. Speech interface quality has improved by leaps and bounds in recent years, making them a much more pleasant, practical, and natural experience than just a...

]]> 4 Tanya Lenz <![CDATA[New Course: Get Started with Highly Accurate Custom ASR for Speech AI]]> http://www.open-lab.net/blog/?p=55846 2023-11-03T07:15:07Z 2022-10-24T16:30:00Z

Learn how to build, train, customize, and deploy a GPU-accelerated automatic speech recognition service with NVIDIA Riva in this self-paced course.]]>

Learn how to build, train, customize, and deploy a GPU-accelerated automatic speech recognition service with NVIDIA Riva in this self-paced course.

]]> 0 Gordana Neskovic <![CDATA[Just Released: New Updates to NVIDIA Riva]]> http://www.open-lab.net/blog/?p=54741 2023-06-12T08:56:12Z 2022-09-26T17:00:00Z

Build better GPU-accelerated Speech AI applications with the latest NVIDIA Riva updates, including enterprise support.]]>

Build better GPU-accelerated Speech AI applications with the latest NVIDIA Riva updates, including enterprise support.

]]> 0 Dave Niewinski <![CDATA[Low-Code Building Blocks for Speech AI Robotics]]> http://www.open-lab.net/blog/?p=55065 2023-11-03T07:15:08Z 2022-09-22T18:33:00Z

When examining an intricate speech AI robotic system, it's easy for developers to feel intimidated by its complexity. Arthur C. Clarke claimed, "Any...]]>

When examining an intricate speech AI robotic system, it's easy for developers to feel intimidated by its complexity. Arthur C. Clarke claimed, "Any sufficiently advanced technology is indistinguishable from magic." From accepting natural-language commands to safely interacting in real-time with its environment and the humans around it, today's speech AI robotics systems can perform tasks to...

]]> 0 Erik Pounds <![CDATA[New Languages, Enhanced Cybersecurity, and Medical AI Frameworks Unveiled at GTC]]> http://www.open-lab.net/blog/?p=54868 2023-05-23T23:52:54Z 2022-09-21T15:18:00Z

At GTC 2022, NVIDIA introduced enhancements to AI frameworks for building real-time speech AI applications, designing high-performing recommenders at scale,...]]>

At GTC 2022, NVIDIA introduced enhancements to AI frameworks for building real-time speech AI applications, designing high-performing recommenders at scale, applying AI to cybersecurity challenges, creating AI-powered medical devices, and more. Showcased real-world, end-to-end AI frameworks highlighted the customers and partners leading the way in their industries and domains.

]]> 0 Sirisha Rella <![CDATA[Developing the Next Generation of Extended Reality Applications with Speech AI]]> http://www.open-lab.net/blog/?p=54831 2023-11-03T07:15:10Z 2022-09-14T16:00:00Z

Virtual reality (VR), augmented reality (AR), and mixed reality (MR) environments can feel incredibly real due to the physically immersive experience. Adding a...]]>

]]> 0 Sunil Kumar Jang Bahadur <![CDATA[Solving Automatic Speech Recognition Deployment Challenges]]> http://www.open-lab.net/blog/?p=54238 2023-06-12T09:00:47Z 2022-08-31T16:00:00Z

Successfully deploying an automatic speech recognition (ASR) application can be a frustrating experience. For example, it is difficult for an ASR system to...]]>

Successfully deploying an automatic speech recognition (ASR) application can be a frustrating experience. For example, it is difficult for an ASR system to correctly identify words while maintaining low latency, considering the many different dialects and pronunciations that exist.

]]> 0 Rohil Bhargava <![CDATA[Building a Speech-Enabled AI Virtual Assistant with NVIDIA Riva on Amazon EC2]]> http://www.open-lab.net/blog/?p=50606 2023-03-14T18:54:05Z 2022-07-28T15:30:00Z

Speech AI can assist human agents in contact centers, power virtual assistants and digital avatars, generate live captioning in video conferencing, and much...]]>

Figure illustrating a screenshot of an NVIDIA Riva sample virtual assistant application running on a GPU-powered AWS EC2 instance through a web browser.

Speech AI can assist human agents in contact centers, power virtual assistants and digital avatars, generate live captioning in video conferencing, and much more. Under the hood, these voice-based technologies orchestrate a network of automatic speech recognition (ASR) and text-to-speech (TTS) pipelines to deliver intelligent, real-time responses.

]]> 3 Michelle Horton <![CDATA[Minerva CQ Deploys NVIDIA Riva Enterprise in the Energy Sector]]> http://www.open-lab.net/blog/?p=49708 2023-06-12T09:23:29Z 2022-06-30T21:00:00Z

Learn how NVIDIA Inception member Minerva CQ is using NVIDIA Riva to deliver faster, personalized experiences within a global EV charging and electric mobility...]]>

Learn how NVIDIA Inception member Minerva CQ is using NVIDIA Riva to deliver faster, personalized experiences within a global EV charging and electric mobility company.

]]> 0 Siddharth Sharma <![CDATA[Build Speech AI in Multiple Languages and Train Large Language Models with the Latest from Riva and NeMo Framework]]> http://www.open-lab.net/blog/?p=45648 2023-06-12T20:54:30Z 2022-03-28T16:00:00Z

Major updates to Riva, an SDK for building speech AI applications, and a paid Riva Enterprise offering were announced at NVIDIA GTC 2022 last week. Several key...]]>

Graphical representation of automatic speech recognition for transcription, controllable text-to-speech, and natural language processing in a chatbot.

Major updates to Riva, an SDK for building speech AI applications, and a paid Riva Enterprise offering were announced at NVIDIA GTC 2022 last week. Several key updates to the NeMo framework, a framework for training Large Language Models, were also announced. Riva offers world-class accuracy for real-time automatic speech recognition (ASR) and text-to-speech (TTS) skills across multiple...

]]> 0 Michelle Horton <![CDATA[Insider's Guide to GTC: AR/VR, Rendering, Simulation, and Video Streaming]]> http://www.open-lab.net/blog/?p=45016 2024-08-28T18:14:10Z 2022-03-10T16:13:18Z

Join us at GTC, March 21-24, to explore the latest technology and research across AI, computer vision, data science, robotics, and more! With over 900...]]>

Join us at GTC, March 21-24, to explore the latest technology and research across AI, computer vision, data science, robotics, and more! With over 900 options to choose from, our NVIDIA experts put together some can't-miss sessions to help get you started: How to Design Collaborative AR and VR worlds in Omniverse Omer Shapira, Senior Engineer, Omniverse...

]]> 0 Michelle Horton <![CDATA[Insider's Guide to GTC: Computer Vision, NLP, Recommenders, and Robotics]]> http://www.open-lab.net/blog/?p=44927 2023-12-30T01:20:40Z 2022-03-09T19:32:53Z

Join us at GTC, March 21-24, to explore the latest technology and research across AI, computer vision, data science, robotics, and more! With over 900...]]>

Join us at GTC, March 21-24, to explore the latest technology and research across AI, computer vision, data science, robotics, and more! With over 900 options to choose from, our NVIDIA experts put together some can't-miss sessions to help get you started: Creating the Future: Creating the World's Largest Synthetic Object Recognition Dataset for Industry...

]]> 0 Michelle Horton <![CDATA[Latest Releases and Resources: Feb. 3-10]]> http://www.open-lab.net/blog/?p=43737 2023-08-18T19:35:45Z 2022-02-10T19:11:50Z

Our weekly roundup covers the most recent software updates, learning resources, events, and notable news. Software releases Courses Webinars Software...]]>

Our weekly roundup covers the most recent software updates, learning resources, events, and notable news. Software releases The redesigned nvCOMP 2.2.0 interface provides a single nvcompManagerBase object that can do compression and decompression. Users can now decompress nvcomp-compressed files without knowing how they were compressed. The interface also can...

]]> 0 Dana Sheahen <![CDATA[Get Started on NLP and Conversational AI with NVIDIA DLI Courses]]> http://www.open-lab.net/blog/?p=43575 2023-03-14T18:54:56Z 2022-02-07T20:27:15Z

This past year, NVIDIA announced several major breakthroughs in conversational AI for building and deploying automatic speech recognition (ASR), natural...]]>

Illustrated diagram showcasing conversational AI capabilities

This past year, NVIDIA announced several major breakthroughs in conversational AI for building and deploying automatic speech recognition (ASR), natural language processing (NLP), and text-to-speech (TTS) applications. To get developers started with some quick examples in a cloud GPU-accelerated environment, NVIDIA Deep Learning Institute (DLI) is offering three fast, free, self-paced courses.

]]> 0 Gordana Neskovic <![CDATA[Create Speech AI Applications in Multiple Languages and Customize Text-to-Speech with Riva]]> http://www.open-lab.net/blog/?p=43993 2023-03-14T18:55:13Z 2022-02-07T17:00:00Z

This month, NVIDIA released world-class speech-to-text models for Spanish, German, and Russian in Riva, powering enterprises to deploy speech AI applications...]]>

This month, NVIDIA released world-class speech-to-text models for Spanish, German, and Russian in Riva, powering enterprises to deploy speech AI applications globally. In addition, enterprises can now create expressive speech interfaces using Riva's customizable text-to-speech pipeline. NVIDIA Riva is a GPU-accelerated speech AI SDK for developing real-time applications like live captioning...

]]> 6 Siddharth Sharma <![CDATA[ICYMI: New AI Tools and Technologies Announced at NVIDIA GTC Keynote]]> http://www.open-lab.net/blog/?p=39300 2023-03-22T01:16:48Z 2021-11-09T19:08:00Z

At NVIDIA GTC this November, new software tools were announced that help developers build real-time speech applications, optimize inference for a variety of...]]>

At NVIDIA GTC this November, new software tools were announced that help developers build real-time speech applications, optimize inference for a variety of use-cases, optimize open-source interoperability for recommender systems, and more. Watch the keynote from CEO, Jensen Huang, to learn about the latest NVIDIA breakthroughs. Today, NVIDIA unveiled a new version of NVIDIA Riva with a...

]]> 0 Siddharth Sharma <![CDATA[NVIDIA Announces Riva Speech AI and Large Language Modeling Software For Enterprise]]> http://www.open-lab.net/blog/?p=39420 2024-07-24T21:26:46Z 2021-11-09T19:06:00Z

NVIDIA recently unveiled new breakthroughs in NVIDIA Riva for speech AI, and NVIDIA NeMo for large-scale language modeling (LLM). Riva is a GPU-accelerated...]]>

Framework of workflow for NLP.

NVIDIA recently unveiled new breakthroughs in NVIDIA Riva for speech AI, and NVIDIA NeMo for large-scale language modeling (LLM). Riva is a GPU-accelerated Speech AI SDK for enterprises to generate expressive human-like speech for their brand and virtual assistants. NeMo is an accelerated training framework for speech and NLU, that now has the capabilities to develop large-scale language models...

]]> 0 Disha Mehra <![CDATA[Building and Deploying Conversational AI Models Using NVIDIA TAO Toolkit]]> http://www.open-lab.net/blog/?p=24079 2023-03-22T01:16:50Z 2021-11-09T16:15:24Z

Sign up for the latest Speech AI news from NVIDIA. Conversational AI is a set of technologies enabling human-like interactions between humans and devices based...]]>

Sign up for the latest Speech AI news from NVIDIA. Conversational AI is a set of technologies enabling human-like interactions between humans and devices based on the most natural interfaces for us: speech and natural language. Systems based on conversational AI can understand commands by recognizing speech and text, translating on-the-fly between different languages...

]]> 2 Christopher Parisien <![CDATA[Building Transcription and Entity Recognition Apps Using NVIDIA Riva]]> http://www.open-lab.net/blog/?p=24076 2023-03-22T01:16:50Z 2021-11-09T16:15:08Z

In the past several months, many of us have grown accustomed to seeing our doctors over a video call. It's certainly convenient, but after the call ends,...]]>

In the past several months, many of us have grown accustomed to seeing our doctors over a video call. It's certainly convenient, but after the call ends, those important pieces of advice from your doctor start to slip away. What was that new medication I needed to take? Were there any side effects to watch out for? Conversational AI can help in building an application to transcribe speech as...

]]> 10 Nikhil Srihari <![CDATA[Creating Voice-based Virtual Assistants Using NVIDIA Riva and Rasa]]> http://www.open-lab.net/blog/?p=24085 2023-03-22T01:16:51Z 2021-11-09T16:14:54Z

Sign up for the latest Speech AI news from NVIDIA. Virtual assistants have become part of our daily lives. We ask virtual assistants almost anything that we...]]>

Sign up for the latest Speech AI news from NVIDIA. Virtual assistants have become part of our daily lives. We ask virtual assistants almost anything that we wonder about. In addition to providing convenience to our daily lives, virtual assistants are of tremendous help when it comes to enterprise applications. For example, we use online virtual agents to help navigate complex technical issues...

]]> 6 James Sohn <![CDATA[Developing a Question Answering Application Quickly Using NVIDIA Riva]]> http://www.open-lab.net/blog/?p=24073 2023-03-22T01:16:51Z 2021-11-09T16:14:38Z

Sign up for the latest Speech AI news from NVIDIA. There is a high chance that you have asked your smart speaker a question like, "How tall is Mount...]]>

Sign up for the latest Speech AI news from NVIDIA. There is a high chance that you have asked your smart speaker a question like, "How tall is Mount Everest?" If you did, it probably said, "Mount Everest is 29,032 feet above sea level." Have you ever wondered how it found an answer for you? Question answering (QA) is loosely defined as a system consisting of information retrieval (IR)...

]]> 6 Tanay Varshney <![CDATA[Speech Recognition: Deploying Models to Production]]> http://www.open-lab.net/blog/?p=39744 2023-12-30T01:51:25Z 2021-11-09T09:37:00Z

This post is part of a series about generating accurate speech transcription. For part 1, see Speech Recognition: Generating Accurate Domain-Specific Audio...]]>

This post is part of a series about generating accurate speech transcription. For part 1, see Speech Recognition: Generating Accurate Domain-Specific Audio Transcriptions Using NVIDIA Riva. For part 2, see Speech Recognition: Customizing Models to Your Domain Using Transfer Learning. NVIDIA Riva is an AI speech SDK for developing real-time applications like transcription, virtual assistants...

]]> 0 Tanay Varshney <![CDATA[Speech Recognition: Customizing Models to Your Domain Using Transfer Learning]]> http://www.open-lab.net/blog/?p=39742 2023-03-22T01:16:53Z 2021-11-09T09:36:00Z

This post is part of a series about generating accurate speech transcription. For part 1, see Speech Recognition: Generating Accurate Transcriptions Using...]]>

This post is part of a series about generating accurate speech transcription. For part 1, see Speech Recognition: Generating Accurate Transcriptions Using NVIDIA Riva. For part 3, see Speech Recognition: Deploying Models to Production. Creating a new AI deep learning model from scratch is an extremely time- and resource-intensive process. A common solution to this problem is to employ...

]]> 0 Sirisha Rella <![CDATA[Speech Recognition: Generating Accurate Domain-Specific Audio Transcriptions Using NVIDIA Riva]]> http://www.open-lab.net/blog/?p=39715 2025-01-23T19:24:23Z 2021-11-09T09:35:00Z

This post is part of a series about generating accurate speech transcription. For part 2, see Speech Recognition: Customizing Models to Your Domain Using...]]>

This post is part of a series about generating accurate speech transcription. For part 2, see Speech Recognition: Customizing Models to Your Domain Using Transfer Learning. For part 3, see Speech Recognition: Deploying Models to Production. Every day millions of audio minutes are produced across several industries such as Telecommunications, Finance, and Unified Communications as a Service...

]]> 2 Piyush Modi <![CDATA[World's Largest Manufacturing Players Tapping NVIDIA AI Platform for Factory of the Future]]> http://www.open-lab.net/blog/?p=39268 2022-08-21T23:52:56Z 2021-11-02T19:00:00Z

Soon, the industrial internet will have hundreds of billions of connected industrial assets continuously operating at computer speed. This will result in large...]]>

Graphic of people working in an industrial line setting.

Soon, the industrial internet will have hundreds of billions of connected industrial assets continuously operating at computer speed. This will result in large amounts of data from shop-floor machines and sensors. Analyzing operations data to predict operational anomalies, machine failures, and product quality, while improving factory floor operations with industrial AI could yield productivity...

]]> 1 Gordana Neskovic <![CDATA[Improving Real-Time Communication Experiences with NVIDIA Maxine]]> http://www.open-lab.net/blog/?p=39258 2023-11-02T20:14:10Z 2021-10-28T16:00:00Z

The audio and video quality of real-time communication applications such as virtual collaboration and content creation applications is the true gauge of...]]>

The audio and video quality of real-time communication applications such as virtual collaboration and content creation applications is the true gauge of users' real-time communication experience. They rely heavily on network bandwidth and user equipment quality. Narrow network bandwidth and low-quality equipment produce unstable and noisy audio and video outputs. This problem is often...

]]> 0 Dana Sheahen <![CDATA[Upcoming DLI Training: Learn How to Build Conversational AI Applications]]> http://www.open-lab.net/blog/?p=38658 2023-03-14T18:56:46Z 2021-10-15T21:16:38Z

As the world continues to evolve and become more digital, conversational AI is increasingly used as a means for automation. This technology has been shown to...]]>

As the world continues to evolve and become more digital, conversational AI is increasingly used as a means for automation. This technology has been shown to improve customer experience and efficiency, across various industries and applications. The NVIDIA Deep Learning Institute is hosting a workshop on how to build a conversational AI service using the NVIDIA Riva framework.

]]> 0 Michelle Horton <![CDATA[Inception Spotlight: Supercharging Synthetic Speech with Resemble AI]]> http://www.open-lab.net/blog/?p=36414 2023-11-03T07:15:13Z 2021-08-30T16:49:06Z

Deep learning is proving to be a powerful tool when it comes to high-quality synthetic speech development and customization. A Toronto-based startup, and NVIDIA...]]>

Deep learning is proving to be a powerful tool when it comes to high-quality synthetic speech development and customization. A Toronto-based startup, and NVIDIA Inception member, Resemble AI is upping the stakes with a new generative voice tool able to create high-quality synthetic AI Voices. The technology can generate cross-lingual and naturally speaking voices in over 50 of the most...

]]> 0 Jane Polak Scowcroft <![CDATA[NVIDIA and Mozilla Release Common Voice Dataset, Surpassing 13,000 Hours for the First Time]]> http://www.open-lab.net/blog/?p=35561 2022-08-21T23:52:23Z 2021-07-30T15:00:00Z

NVIDIA and Mozilla are proud to announce the latest release of the Common Voice dataset, with over 13,000 hours of crowd-sourced speech data, and adding another...]]>

NVIDIA and Mozilla are proud to announce the latest release of the Common Voice dataset, with over 13,000 hours of crowd-sourced speech data, and adding another 16 languages to the corpus. Common Voice is the world's largest open data voice dataset and designed to democratize voice technology. It is used by researchers, academics, and developers around the world.

]]> 0 Joanne Chang <![CDATA[Fast-Track Production AI with Pretrained Models and NVIDIA TAO Toolkit 3.0]]> http://www.open-lab.net/blog/?p=33650 2022-08-21T23:52:02Z 2021-06-24T13:00:00Z

Today, NVIDIA announced new pretrained models and the general availability of TAO Toolkit 3.0, a core component of the NVIDIA Train, Adapt, and Optimize (TAO)...]]>

Today, NVIDIA announced new pretrained models and the general availability of TAO Toolkit 3.0, a core component of the NVIDIA Train, Adapt, and Optimize (TAO) platform-guided workflow for creating AI. The new release includes a variety of highly accurate and performant pretrained models in computer vision and conversational AI, as well as a set of powerful productivity features that boost AI...

]]> 0 Oleksii Kuchaiev <![CDATA[Accelerating Conversational AI Research with New Cutting-Edge Neural Networks and Features from NeMo 1.0]]> http://www.open-lab.net/blog/?p=32233 2023-02-10T22:26:14Z 2021-06-08T16:00:00Z

NVIDIA NeMo is a conversational AI toolkit built for researchers working on automatic speech recognition (ASR), natural language processing (NLP), and...]]>

NVIDIA NeMo is a conversational AI toolkit built for researchers working on automatic speech recognition (ASR), natural language processing (NLP), and text-to-speech synthesis (TTS). The primary objective of NeMo is to help researchers from industry and academia to reuse prior work (code and pretrained models) and make it easier to create new conversational AI models. NeMo is an open-source project...

]]> 6 Sirisha Rella <![CDATA[NVIDIA Accelerates Conversational AI from Research to Production with Latest Updates in NVIDIA NeMo and NVIDIA Riva]]> http://www.open-lab.net/blog/?p=32530 2022-08-21T23:51:51Z 2021-06-04T18:04:00Z

NVIDIA recently released NVIDIA Riva with world-class speech recognition capability for enterprises to generate highly accurate transcriptions and NVIDIA NeMo...]]>

NVIDIA recently released NVIDIA Riva with world-class speech recognition capability for enterprises to generate highly accurate transcriptions and NVIDIA NeMo 1.0, which includes new state-of-the-art speech and language models for democratizing and accelerating conversational AI research. NVIDIA Riva world-class speech recognition is an out-of-the-box speech service that can be easily...

]]> 0 Siddharth Sharma <![CDATA[ICYMI: New AI Tools and Technologies Announced at GTC 2021 Keynote]]> http://www.open-lab.net/blog/?p=30063 2024-10-28T19:06:39Z 2021-04-12T19:39:00Z

At GTC 2021, NVIDIA announced new software tools to help developers build optimized conversational AI, recommender, and video solutions. Watch the keynote from...]]>

NVIDIA-GTC KV Thumbnail 1000x600

At GTC 2021, NVIDIA announced new software tools to help developers build optimized conversational AI, recommender, and video solutions. Watch the keynote from CEO Jensen Huang for insights on all of the latest GPU technologies. Today NVIDIA announced major conversational AI capabilities in NVIDIA Riva that will help enterprises build engaging and accurate applications for their...

]]> 0 Sirisha Rella <![CDATA[Announcing Megatron for Training Trillion Parameter Models and NVIDIA Riva Availability]]> http://www.open-lab.net/blog/?p=30236 2023-12-30T00:45:19Z 2021-04-12T19:38:00Z

Conversational AI is opening new ways for enterprises to interact with customers in every industry using applications like real-time transcription, translation,...]]>

NVIDIA Megatron featured

Conversational AI is opening new ways for enterprises to interact with customers in every industry using applications like real-time transcription, translation, chatbots, and virtual assistants. Building domain-specific interactive applications requires state-of-the-art models, optimizations for real-time performance, and tools to adapt those models with your data. This week at GTC...

]]> 0 Aidan Campbell <![CDATA[Integrating with Telephone Networks to Enable Real-Time AI Services]]> http://www.open-lab.net/blog/?p=25272 2022-08-21T23:41:11Z 2021-03-29T23:50:17Z

Many of you may not recognize my company, Ribbon Communications. We are best known for building and securing large telecom networks for communication service...]]>

ribbon feature image 250x150

Many of you may not recognize my company, Ribbon Communications. We are best known for building and securing large telecom networks for communication service providers (also known as phone companies). However, there's a good chance that in the next day or two, you'll place a call that traverses a piece of our gear somewhere in the world. In addition to service providers...

]]> 0 Brad Nemire <![CDATA[New on the NVIDIA NGC Catalog: Riva AI, Updates to TensorFlow and PyTorch Containers, plus a New HPC Quantum Espresso Container]]> https://news.www.open-lab.net/?p=19568 2024-10-28T18:21:58Z 2021-03-11T19:35:35Z

The NVIDIA NGC catalog is a hub for GPU-optimized deep learning, machine learning and high-performance computing (HPC) applications. With highly performant...]]>

NGC home featured

The NVIDIA NGC catalog is a hub for GPU-optimized deep learning, machine learning, and high-performance computing (HPC) applications. With highly performant software containers, pretrained models, industry-specific SDKs, and Helm charts, the content available on the catalog helps you simplify and accelerate your end-to-end workflows. The NVIDIA NGC team works closely with our internal and...

]]> 0 Brad Nemire <![CDATA[NVIDIA Releases Riva 1.0 Beta for Building Real-Time Conversational AI Services]]> https://news.www.open-lab.net/?p=19332 2023-02-13T19:00:37Z 2021-02-25T18:00:00Z

Today, NVIDIA released the Riva 1.0 Beta which includes an end-to-end workflow for building and deploying real-time conversational AI apps, such as...]]>

Jarvis Beta Featured Image

Today, NVIDIA released the Riva 1.0 Beta, which includes an end-to-end workflow for building and deploying real-time conversational AI apps, such as transcription, virtual assistants, and chatbots. Riva is an accelerated SDK for multimodal conversational AI services that delivers real-time performance on NVIDIA GPUs. This release of Riva includes new pretrained models for conversational AI and...

]]> 0 Phillip Singh <![CDATA[New AI Technologies Announced at GTC 2020 Keynote]]> https://news.www.open-lab.net/?p=18310 2024-10-28T18:43:12Z 2020-10-05T13:01:00Z

At GTC 2020, NVIDIA announced updates to 80 SDKs, including tools to help you build AI-powered video streaming solutions, conversational AI, recommendation...]]>

nvidia-maxine-ai-compression-sfg-251-151-dtm@2x

At GTC 2020, NVIDIA announced updates to 80 SDKs, including tools to help you build AI-powered video streaming solutions, conversational AI, recommendation systems, and more. Today, we announced NVIDIA Maxine, a cloud-native video streaming AI platform for services such as video conferencing. It includes state-of-the-art AI models and optimized pipelines that can run several...

]]> 0 Raghav Mani <![CDATA[Speeding Up Development of Speech and Language Models with NVIDIA NeMo]]> http://www.open-lab.net/blog/?p=17649 2023-03-22T01:09:09Z 2020-10-05T13:00:00Z

This is an updated version of Neural Modules for Fast Development of Speech and Language Models. This post upgrades the NeMo diagram with...]]>

NeMo-featured-image

This is an updated version of Neural Modules for Fast Development of Speech and Language Models. This post upgrades the NeMo diagram with PyTorch and PyTorch Lightning support and updates the tutorial with the new code base. As a researcher building state-of-the-art speech and language models, you must be able to quickly experiment with novel network architectures.

]]> 0 Dominique LaSalle <![CDATA[Getting a Real Time Factor Over 60 for Text-To-Speech Services Using NVIDIA Riva]]> http://www.open-lab.net/blog/?p=18263 2024-07-24T21:26:47Z 2020-06-24T21:03:04Z

NVIDIA Riva is an application framework that provides several pipelines for accomplishing conversational AI tasks. Generating high-quality, natural-sounding...]]>

Architecture diagram for Riva server.

NVIDIA Riva is an application framework that provides several pipelines for accomplishing conversational AI tasks. Generating high-quality, natural-sounding speech from text with low latency, also known as text-to-speech (TTS), can be one of the most computationally challenging of those tasks. In this post, we focus on optimizations made to a TTS pipeline in Riva, as shown in Figure 1.

]]> 0 Nefi Alarcon <![CDATA[New AI Technologies Introduced at GTC 2020 Keynote]]> https://news.www.open-lab.net/?p=16960 2023-12-29T22:22:58Z 2020-05-14T13:00:00Z

At GTC 2020, NVIDIA announced and shipped a range of new AI SDKs, enabling developers to support the new Ampere architecture. For the first time, developers...]]>

conversational_aI_2020_Feature

At GTC 2020, NVIDIA announced and shipped a range of new AI SDKs, enabling developers to support the new Ampere architecture. For the first time, developers have the tools to build end-to-end deep learning-based pipelines for conversational AI and recommendation systems. Today, NVIDIA announced Riva, a fully accelerated application framework for building multimodal conversational AI services.

]]> 0