Speech AI – NVIDIA Technical Blog News and tutorials for developers, data scientists, and IT admins 2025-05-23T19:27:29Z http://www.open-lab.net/blog/feed/ Michelle Horton <![CDATA[Top Conversational AI Sessions at NVIDIA GTC 2025]]> http://www.open-lab.net/blog/?p=96694 2025-03-06T19:26:36Z 2025-03-04T19:00:00Z Learn how to accelerate the full pipeline, from multilingual speech recognition and translation to generative AI and speech synthesis.]]> Learn how to accelerate the full pipeline, from multilingual speech recognition and translation to generative AI and speech synthesis.An illustration of a person using an AI agent.

Learn how to accelerate the full pipeline, from multilingual speech recognition and translation to generative AI and speech synthesis.

Source

]]>
0
Anu Srivastava <![CDATA[Latest Multimodal Addition to Microsoft Phi SLMs Trained on NVIDIA GPUs]]> http://www.open-lab.net/blog/?p=96519 2025-04-23T02:39:30Z 2025-02-26T22:05:00Z Large language models (LLMs) have permeated every industry and changed the potential of technology. However, due to their massive size they are not practical...]]> Large language models (LLMs) have permeated every industry and changed the potential of technology. However, due to their massive size they are not practical...An image of a phone with a chatbot dialog on the screen but also showing the inside of the phone.

Large language models (LLMs) have permeated every industry and changed the potential of technology. However, due to their massive size they are not practical for the current resource constraints that many companies have. The rise of small language models (SLMs) bridges quality and cost by creating models with a smaller resource footprint. SLMs are a subset of language models that tend to...

Source

]]>
0
Sven Chilton <![CDATA[Deploying NVIDIA Riva Multilingual ASR with Whisper and Canary Architectures While Selectively Deactivating NMT]]> http://www.open-lab.net/blog/?p=95339 2025-04-23T02:42:38Z 2025-02-20T18:54:48Z NVIDIA has consistently developed automatic speech recognition (ASR) models that set the benchmark in the industry. Earlier versions of NVIDIA Riva, a...]]> NVIDIA has consistently developed automatic speech recognition (ASR) models that set the benchmark in the industry. Earlier versions of NVIDIA Riva, a...Two people sitting at their desks with icons for speech translation in the background.

NVIDIA has consistently developed automatic speech recognition (ASR) models that set the benchmark in the industry. Earlier versions of NVIDIA Riva, a collection of GPU-accelerated speech and translation AI microservices for ASR, TTS, and NMT, support English-Spanish and English-Japanese code-switching ASR models based on the Conformer architecture, along with a model supporting multiple...

Source

]]>
0
Sven Chilton <![CDATA[Quickly Voice Your Apps with NVIDIA NIM Microservices for Speech and Translation]]> http://www.open-lab.net/blog/?p=89142 2024-09-19T20:17:19Z 2024-09-18T22:48:43Z NVIDIA NIM, part of NVIDIA AI Enterprise, provides containers to self-host GPU-accelerated inferencing microservices for pretrained and customized AI models...]]> NVIDIA NIM, part of NVIDIA AI Enterprise, provides containers to self-host GPU-accelerated inferencing microservices for pretrained and customized AI models...

NVIDIA NIM, part of NVIDIA AI Enterprise, provides containers to self-host GPU-accelerated inferencing microservices for pretrained and customized AI models across clouds, data centers, and workstations. NIM microservices for speech and translation are now available. The new speech and translation microservices leverage NVIDIA Riva and provide automatic speech recognition (ASR)...

Source

]]>
0
Sang-gil Lee <![CDATA[Achieving State-of-the-Art Zero-Shot Waveform Audio Generation across Audio Types]]> http://www.open-lab.net/blog/?p=88329 2024-09-19T19:34:33Z 2024-09-05T20:30:00Z Stunning audio content is an essential component of virtual worlds. Audio generative AI plays a key role in creating this content, and NVIDIA is continuously...]]> Stunning audio content is an essential component of virtual worlds. Audio generative AI plays a key role in creating this content, and NVIDIA is continuously...

Stunning audio content is an essential component of virtual worlds. Audio generative AI plays a key role in creating this content, and NVIDIA is continuously pushing the limits in this field of research. BigVGAN, developed in collaboration with the NVIDIA Applied Deep Learning Research and NVIDIA NeMo teams, is a generative AI model specialized in audio waveform synthesis that achieves state-of...

Source

]]>
0
Sofia Kostandian <![CDATA[Developing Robust Georgian Automatic Speech Recognition with FastConformer Hybrid Transducer CTC BPE]]> http://www.open-lab.net/blog/?p=85835 2024-08-22T18:25:43Z 2024-08-05T16:52:11Z Building an effective automatic speech recognition (ASR) model for underrepresented languages presents unique challenges due to limited data resources.  In...]]> Building an effective automatic speech recognition (ASR) model for underrepresented languages presents unique challenges due to limited data resources.  In...Image of two people sitting in their cubicles with speech recognition visualizations in the background.

Building an effective automatic speech recognition (ASR) model for underrepresented languages presents unique challenges due to limited data resources. In this post, I discuss the best practices for preparing the dataset, configuring the model, and training it effectively. I also discuss the evaluation metrics and the encountered challenges. By following these practices...

Source

]]>
0
Subhankar Ghosh <![CDATA[Addressing Hallucinations in Speech Synthesis LLMs with the NVIDIA NeMo T5-TTS Model]]> http://www.open-lab.net/blog/?p=84524 2024-07-25T18:19:15Z 2024-07-02T20:00:00Z NVIDIA NeMo has released the T5-TTS model, a significant advancement in text-to-speech (TTS) technology. Based on large language models (LLMs), T5-TTS produces...]]> NVIDIA NeMo has released the T5-TTS model, a significant advancement in text-to-speech (TTS) technology. Based on large language models (LLMs), T5-TTS produces...

NVIDIA NeMo has released the T5-TTS model, a significant advancement in text-to-speech (TTS) technology. Based on large language models (LLMs), T5-TTS produces more accurate and natural-sounding speech. By improving alignment between text and audio, T5-TTS eliminates hallucinations such as repeated spoken words and skipped text. Additionally, T5-TTS makes up to 2x fewer word pronunciation errors...

Source

]]>
0
Elena Rastorgueva <![CDATA[New Standard for Speech Recognition and Translation from the NVIDIA NeMo Canary Model]]> http://www.open-lab.net/blog/?p=80661 2024-08-06T17:19:16Z 2024-04-18T20:09:33Z NVIDIA NeMo is an end-to-end platform for the development of multimodal generative AI models at scale anywhere, on any cloud and on-premises. The NeMo team...]]> NVIDIA NeMo is an end-to-end platform for the development of multimodal generative AI models at scale anywhere, on any cloud and on-premises. The NeMo team...Decorative image of text and speech recognition processes encircling the globe.

NVIDIA NeMo is an end-to-end platform for the development of multimodal generative AI models at scale anywhere, on any cloud and on-premises. The NeMo team just released Canary, a multilingual model that transcribes speech in English, Spanish, German, and French with punctuation and capitalization. Canary also provides bi-directional translation, between English and the three other supported...

Source

]]>
1
Hainan Xu <![CDATA[Turbocharge ASR Accuracy and Speed with NVIDIA NeMo Parakeet-TDT]]> http://www.open-lab.net/blog/?p=80732 2024-08-12T16:06:21Z 2024-04-18T20:03:54Z NVIDIA NeMo, an end-to-end platform for developing multimodal generative AI models at scale anywhere (on any cloud and on-premises), recently released...]]> NVIDIA NeMo, an end-to-end platform for developing multimodal generative AI models at scale anywhere (on any cloud and on-premises), recently released...

NVIDIA NeMo, an end-to-end platform for developing multimodal generative AI models at scale anywhere (on any cloud and on-premises), recently released Parakeet-TDT. This new addition to the NeMo ASR Parakeet model family boasts better accuracy and 64% greater speed over the previously best model, Parakeet-RNNT-1.1B. This post explains Parakeet-TDT and how to use it to generate highly accurate...

Source

]]>
0
Somshubra Majumdar <![CDATA[Pushing the Boundaries of Speech Recognition with NVIDIA NeMo Parakeet ASR Models]]> http://www.open-lab.net/blog/?p=80564 2024-08-12T16:07:43Z 2024-04-18T20:03:07Z NVIDIA NeMo, an end-to-end platform for the development of multimodal generative AI models at scale anywhere (on any cloud and on-premises), released the...]]> NVIDIA NeMo, an end-to-end platform for the development of multimodal generative AI models at scale anywhere (on any cloud and on-premises), released the...Image of two people sitting in their cubicles with speech recognition visualizations in the background.

NVIDIA NeMo, an end-to-end platform for the development of multimodal generative AI models at scale anywhere (on any cloud and on-premises), released the Parakeet family of automatic speech recognition (ASR) models. These state-of-the-art ASR models, developed in collaboration with Suno.ai, transcribe spoken English with exceptional accuracy. This post details Parakeet ASR models that are...

Source

]]>
0
Gordana Neskovic <![CDATA[NVIDIA Speech and Translation AI Models Set Records for Speed and Accuracy]]> http://www.open-lab.net/blog/?p=79365 2024-08-12T16:09:12Z 2024-03-19T16:00:00Z Speech and translation AI models developed at NVIDIA are pushing the boundaries of performance and innovation. The NVIDIA Parakeet automatic speech recognition...]]> Speech and translation AI models developed at NVIDIA are pushing the boundaries of performance and innovation. The NVIDIA Parakeet automatic speech recognition...

Speech and translation AI models developed at NVIDIA are pushing the boundaries of performance and innovation. The NVIDIA Parakeet automatic speech recognition (ASR) family of models and the NVIDIA Canary multilingual, multitask ASR and translation model currently top the Hugging Face Open ASR Leaderboard. In addition, a multilingual P-Flow-based text-to-speech (TTS) model won the LIMMITS '24...

Source

]]>
0
Tanya Lenz <![CDATA[Event: Speech and Generative AI Developer Day at NVIDIA GTC 2024]]> http://www.open-lab.net/blog/?p=78609 2024-03-07T19:29:14Z 2024-02-29T21:00:00Z Learn how to build a RAG-powered application with a human voice interface at NVIDIA GTC 2024 Speech and Generative AI Developer Day.]]> Learn how to build a RAG-powered application with a human voice interface at NVIDIA GTC 2024 Speech and Generative AI Developer Day.

Learn how to build a RAG-powered application with a human voice interface at NVIDIA GTC 2024 Speech and Generative AI Developer Day.

Source

]]>
0
Piotr Żelasko <![CDATA[New Support for Dutch and Persian Released by NVIDIA NeMo ASR]]> http://www.open-lab.net/blog/?p=76636 2024-02-08T18:52:04Z 2024-01-16T18:29:16Z Breaking barriers in speech recognition, NVIDIA NeMo proudly presents pretrained models tailored for Dutch and Persian, languages often overlooked in the AI...]]> Breaking barriers in speech recognition, NVIDIA NeMo proudly presents pretrained models tailored for Dutch and Persian, languages often overlooked in the AI...Person sitting at a desk having a conversation with a speech ai chatbot.

Breaking barriers in speech recognition, NVIDIA NeMo proudly presents pretrained models tailored for Dutch and Persian, languages often overlooked in the AI landscape. These models leverage the recently introduced FastConformer architecture and were trained simultaneously with CTC and transducer objectives to maximize each model's accuracy. Automatic speech recognition (ASR) is a...

Source

]]>
1
Paweł Budzianowski <![CDATA[Enhancing Phone Customer Service with ASR Customization]]> http://www.open-lab.net/blog/?p=75584 2024-01-25T18:17:37Z 2024-01-09T17:00:00Z At the core of understanding people correctly and having natural conversations is automatic speech recognition (ASR). To make customer-led voice assistants and...]]> At the core of understanding people correctly and having natural conversations is automatic speech recognition (ASR). To make customer-led voice assistants and...Decorative image.

At the core of understanding people correctly and having natural conversations is automatic speech recognition (ASR). To make customer-led voice assistants and automate customer service interactions over the phone, companies must solve the unique challenge of gaining a caller's trust through qualities such as understanding, empathy, and clarity. Telephony-bound voice is inherently challenging...

Source

]]>
0
Yasmina Benkhoui <![CDATA[Spotlight: Convai Reinvents Non-Playable Character Interactions]]> http://www.open-lab.net/blog/?p=76184 2024-01-25T18:17:40Z 2024-01-08T16:30:00Z Convai is a versatile developer platform for designing characters with advanced multimodal perception abilities. These characters are designed to integrate...]]> Convai is a versatile developer platform for designing characters with advanced multimodal perception abilities. These characters are designed to integrate...

Convai is a versatile developer platform for designing characters with advanced multimodal perception abilities. These characters are designed to integrate seamlessly into both the virtual and real worlds. Whether you're a creator, game designer, or developer, Convai enables you to quickly modify a non-playable character (NPC), from backstory and knowledge to voice and personality.

Source

]]>
0
Ike Nnoli <![CDATA[Create Lifelike Avatars with AI Animation and Speech Features in NVIDIA ACE]]> http://www.open-lab.net/blog/?p=74159 2024-11-20T23:02:47Z 2023-12-04T22:00:00Z NVIDIA today unveiled major upgrades to the NVIDIA Avatar Cloud Engine (ACE) suite of technologies, bringing enhanced realism and accessibility to AI-powered...]]> NVIDIA today unveiled major upgrades to the NVIDIA Avatar Cloud Engine (ACE) suite of technologies, bringing enhanced realism and accessibility to AI-powered...

NVIDIA today unveiled major upgrades to the NVIDIA Avatar Cloud Engine (ACE) suite of technologies, bringing enhanced realism and accessibility to AI-powered avatars and digital humans. These latest animation and speech capabilities enable more natural conversations and emotional expressions. Developers can now easily implement and scale intelligent avatars across applications using new...

Source

]]>
0
Mohamed Elshenawy <![CDATA[Boost Meeting Productivity with AI-Powered Note-Taking and Summarization]]> http://www.open-lab.net/blog/?p=73964 2023-12-14T19:27:34Z 2023-11-29T21:00:00Z Meetings are the lifeblood of an organization. They foster collaboration and informed decision-making. They eliminate silos through brainstorming and...]]> Meetings are the lifeblood of an organization. They foster collaboration and informed decision-making. They eliminate silos through brainstorming and...

Meetings are the lifeblood of an organization. They foster collaboration and informed decision-making. They eliminate silos through brainstorming and problem-solving. And they further strategic goals and planning. Yet, leading meetings that accomplish these goals, especially those involving cross-functional teams and external participants, can be challenging. A unique blend of people...

Source

]]>
0
Belen Tegegn <![CDATA[Video: Exploring Speech AI from Research to Practical Production Applications]]> http://www.open-lab.net/blog/?p=72433 2023-11-16T19:16:46Z 2023-11-07T16:07:22Z The integration of speech and translation AI into our daily lives is rapidly reshaping our interactions, from virtual assistants to call centers and augmented...]]> The integration of speech and translation AI into our daily lives is rapidly reshaping our interactions, from virtual assistants to call centers and augmented...Decorative image of groups of people using speech AI in different ways standing around a globe.

The integration of speech and translation AI into our daily lives is rapidly reshaping our interactions, from virtual assistants to call centers and augmented reality experiences. Speech AI Day provided valuable insights into the latest advancements in speech AI, showcasing how this technology addresses real-world challenges. In this first of three Speech AI Day sessions...

Source

]]>
0
Sven Chilton <![CDATA[How to Deploy NVIDIA Riva Speech and Translation AI in the Public Cloud]]> http://www.open-lab.net/blog/?p=69702 2023-10-20T18:13:34Z 2023-08-29T17:00:00Z From start-ups to large enterprises, businesses use cloud marketplaces to find the new solutions needed to quickly transform their businesses. Cloud...]]> From start-ups to large enterprises, businesses use cloud marketplaces to find the new solutions needed to quickly transform their businesses. Cloud...Image of two boxes with text, in two languages, with speech icons joining them to a central box symbolizing translation. The English language box displays,

From start-ups to large enterprises, businesses use cloud marketplaces to find the new solutions needed to quickly transform their businesses. Cloud marketplaces are online storefronts where customers can purchase software and services with flexible billing models, including pay-as-you-go, subscriptions, and privately negotiated offers. Businesses further benefit from committed spending at...

Source

]]>
0
Michelle Horton <![CDATA[Event: Speech AI Day]]> http://www.open-lab.net/blog/?p=69814 2023-08-24T19:18:11Z 2023-08-21T19:24:00Z On Sept. 20, join experts from leading companies at NVIDIA-hosted Speech AI Day.]]> On Sept. 20, join experts from leading companies at NVIDIA-hosted Speech AI Day.Speech AI Day promo asset, showing an illustration of several people from different countries standing around a globe saying hello in their language.

On Sept. 20, join experts from leading companies at NVIDIA-hosted Speech AI Day.

Source

]]>
0
Edresson Casanova <![CDATA[Overview of Zero-Shot Multi-Speaker TTS Systems: Top Q&As]]> http://www.open-lab.net/blog/?p=65974 2023-07-13T19:00:32Z 2023-06-22T17:28:24Z The Speech AI Summit is an annual conference that brings together experts in the field of AI and speech technology to discuss the latest industry trends and...]]> The Speech AI Summit is an annual conference that brings together experts in the field of AI and speech technology to discuss the latest industry trends and...Title slide for Speech AI Summit session.

The Speech AI Summit is an annual conference that brings together experts in the field of AI and speech technology to discuss the latest industry trends and advancements. This post summarizes the top questions asked during Overview of Zero-Shot Multi-Speaker TTS System, a recorded talk from the 2022 summit featuring Coqui.ai. Text-to-speech (TTS) systems have significantly advanced in...

Source

]]>
0
Caroline Gottlieb <![CDATA[Unlocking Speech AI Technology for Global Language Users: Top Q&As]]> http://www.open-lab.net/blog/?p=66216 2023-11-03T07:15:00Z 2023-06-06T17:00:00Z Voice-enabled technology is becoming ubiquitous. But many are being left behind by an anglocentric and demographically biased algorithmic world. Mozilla Common...]]> Voice-enabled technology is becoming ubiquitous. But many are being left behind by an anglocentric and demographically biased algorithmic world. Mozilla Common...

Voice-enabled technology is becoming ubiquitous. But many are being left behind by an anglocentric and demographically biased algorithmic world. Mozilla Common Voice (MCV) and NVIDIA are collaborating to change that by partnering on a public crowdsourced multilingual speech corpus (now the largest of its kind in the world) and open-source pretrained models. It is now easier than ever before to...

Source

]]>
0
Swaroop Kumar <![CDATA[Enhancing Customer Experience in Telecom with NVIDIA Customized Speech AI]]> http://www.open-lab.net/blog/?p=65421 2023-10-30T23:24:55Z 2023-05-30T15:00:00Z The telecom sector is transforming how communication happens. Striving to provide reliable, uninterrupted service, businesses are tackling the challenge of...]]> The telecom sector is transforming how communication happens. Striving to provide reliable, uninterrupted service, businesses are tackling the challenge of...Image of a chatbot as the interface between customers, with speech bubbles.

The telecom sector is transforming how communication happens. Striving to provide reliable, uninterrupted service, businesses are tackling the challenge of delivering an optimal customer experience. This optimal customer experience is something many long-time customers of large telecom service providers do not have. Take Jack, for example. His call was on hold for 10 minutes...

Source

]]>
0
Michelle Horton <![CDATA[Webinar: Empower Telco Contact Center Agents with Multi-Language Speech-AI-Customized Agent Assists]]> http://www.open-lab.net/blog/?p=64106 2023-08-18T20:52:27Z 2023-05-16T16:00:00Z Join Infosys, NVIDIA, and Quantiphi on June 7 to learn how to use speech and translation AI to improve agent-assist solutions in multiple languages.]]> Join Infosys, NVIDIA, and Quantiphi on June 7 to learn how to use speech and translation AI to improve agent-assist solutions in multiple languages.

Join Infosys, NVIDIA, and Quantiphi on June 7 to learn how to use speech and translation AI to improve agent-assist solutions in multiple languages.

Source

]]>
0
Angie Lee <![CDATA[Explainer: What Is Agent Assist?]]> http://www.open-lab.net/blog/?p=64129 2023-06-13T17:35:33Z 2023-05-04T18:00:00Z Agent-assist technology uses AI and ML to provide facts and make real-time suggestions that help human agents across retail, telecom, and other industries...]]> Agent-assist technology uses AI and ML to provide facts and make real-time suggestions that help human agents across retail, telecom, and other industries...Image of headphones with speech bubbles saying hello in five languages.

Agent-assist technology uses AI and ML to provide facts and make real-time suggestions that help human agents across retail, telecom, and other industries conduct conversations with customers.

Source

]]>
0
Kristen Rumley <![CDATA[How Speech Recognition Improves Customer Service in Telecommunications]]> http://www.open-lab.net/blog/?p=63789 2023-11-03T07:15:01Z 2023-05-02T16:00:00Z The telecommunication industry has seen a proliferation of AI-powered technologies in recent years, with speech recognition and translation leading the charge....]]> The telecommunication industry has seen a proliferation of AI-powered technologies in recent years, with speech recognition and translation leading the charge....

The telecommunication industry has seen a proliferation of AI-powered technologies in recent years, with speech recognition and translation leading the charge. Multi-lingual AI virtual assistants, digital humans, chatbots, agent assists, and audio transcription are technologies that are revolutionizing the telco industry. Businesses are implementing AI in call centers to address incoming requests...

Source

]]>
0
Michelle Horton <![CDATA[Webinar: How Telcos Transform Customer Experiences with Conversational AI]]> http://www.open-lab.net/blog/?p=64118 2023-08-18T20:52:43Z 2023-05-02T16:00:00Z Join Infosys, Quantiphi, Talkmap, and NVIDIA on May 31 for a live webinar to learn how telecommunications companies are using AI to improve operational...]]> Join Infosys, Quantiphi, Talkmap, and NVIDIA on May 31 for a live webinar to learn how telecommunications companies are using AI to improve operational...

Join Infosys, Quantiphi, Talkmap, and NVIDIA on May 31 for a live webinar to learn how telecommunications companies are using AI to improve operational efficiency and enhance customer engagement.

Source

]]>
0
Kristen Rumley <![CDATA[Workshop: How to Enable Your Product with Voice Interface]]> http://www.open-lab.net/blog/?p=63525 2023-08-18T20:50:31Z 2023-04-19T16:29:08Z This hands-on workshop guides you through the process of voice-enabling your product, from familiarizing yourself with NVIDIA Riva to assessing the costs and...]]> This hands-on workshop guides you through the process of voice-enabling your product, from familiarizing yourself with NVIDIA Riva to assessing the costs and...

This hands-on workshop guides you through the process of voice-enabling your product, from familiarizing yourself with NVIDIA Riva to assessing the costs and resources required for your project.

Source

]]>
0
Nicola Sessions <![CDATA[ICYMI: New and Updated AI Workflows Announced at NVIDIA GTC 2023]]> http://www.open-lab.net/blog/?p=62115 2023-03-30T17:43:51Z 2023-03-22T15:00:00Z NVIDIA showed how AI workflows can be leveraged to help you accelerate the development of AI solutions to address a range of use cases at NVIDIA GTC 2023. AI...]]> NVIDIA showed how AI workflows can be leveraged to help you accelerate the development of AI solutions to address a range of use cases at NVIDIA GTC 2023. AI...Gif scrolling through different workflow examples such as personalized recommenders, speech AI, and route optimization.

NVIDIA showed how AI workflows can be leveraged to help you accelerate the development of AI solutions to address a range of use cases at NVIDIA GTC 2023. AI workflows are cloud-native, packaged reference examples showing how NVIDIA AI frameworks can be used to efficiently build AI solutions such as intelligent virtual assistants, digital fingerprinting for cybersecurity...

Source

]]>
0
Michelle Horton <![CDATA[Top Conversational AI Sessions at NVIDIA GTC 2023]]> http://www.open-lab.net/blog/?p=61425 2023-03-09T19:18:45Z 2023-02-28T19:30:27Z Learn about the latest tools, trends, and technologies for building and deploying conversational AI.]]> Learn about the latest tools, trends, and technologies for building and deploying conversational AI.4 images showing different applications for conversational AI such as virtual assistants and avatars.

Learn about the latest tools, trends, and technologies for building and deploying conversational AI.

Source

]]>
0
Michelle Horton <![CDATA[Top Speech AI Developer Day Sessions at NVIDIA GTC 2023]]> http://www.open-lab.net/blog/?p=60997 2023-03-14T19:01:01Z 2023-02-14T22:00:00Z Explore the latest advances in accurate and customizable automatic speech recognition, multi-language translation, and text-to-speech.]]> Explore the latest advances in accurate and customizable automatic speech recognition, multi-language translation, and text-to-speech.Black background with bright green sound waves and GTC banner in the corner.

Explore the latest advances in accurate and customizable automatic speech recognition, multi-language translation, and text-to-speech.

Source

]]>
0
David Taubenheim <![CDATA[Speech AI Spotlight: How Pendulum Nabs Harmful Narratives Online]]> http://www.open-lab.net/blog/?p=60694 2023-11-03T07:15:05Z 2023-02-08T17:00:00Z Over 55% of the global population uses social media, easily sharing online content with just one click. While connecting with others and consuming entertaining...]]> Over 55% of the global population uses social media, easily sharing online content with just one click. While connecting with others and consuming entertaining...

Over 55% of the global population uses social media, easily sharing online content with just one click. While connecting with others and consuming entertaining content, you can also spot harmful narratives posing real-life threats. That's why VP of Engineering at Pendulum, Ammar Haris, wants his company's AI to help clients to gain deeper insight into the harmful content being generated...

Source

]]>
1
Dima Rekesh <![CDATA[Multilingual and Code-Switched Automatic Speech Recognition with NVIDIA NeMo]]> http://www.open-lab.net/blog/?p=60289 2023-11-03T07:15:06Z 2023-01-31T17:00:00Z Multilingual automatic speech recognition (ASR) models have gained significant interest because of their ability to transcribe speech in more than one language....]]> Multilingual automatic speech recognition (ASR) models have gained significant interest because of their ability to transcribe speech in more than one language....

Multilingual automatic speech recognition (ASR) models have gained significant interest because of their ability to transcribe speech in more than one language. This is fueled by the growing multilingual communities as well as by the need to reduce complexity. You only need one model to handle multiple languages. This post explains how to use pretrained multilingual NeMo ASR models from the...

Source

]]>
0
Michelle Horton <![CDATA[New Hands-on Lab: Intelligent Virtual Assistant]]> http://www.open-lab.net/blog/?p=60074 2023-06-12T07:58:15Z 2023-01-24T21:00:00Z Learn to build an engaging and intelligent virtual assistant with NVIDIA AI workflows powered by NVIDIA Riva in this free hands-on lab from NVIDIA LaunchPad,]]> Learn to build an engaging and intelligent virtual assistant with NVIDIA AI workflows powered by NVIDIA Riva in this free hands-on lab from NVIDIA LaunchPad,Illustration of a cell phone with a virtual assistant helping with a purchase.

Learn to build an engaging and intelligent virtual assistant with NVIDIA AI workflows powered by NVIDIA Riva in this free hands-on lab from NVIDIA LaunchPad...

Source

]]>
0
Gordana Neskovic <![CDATA[Upcoming Webinar: Building an Intelligent Virtual Assistant for Financial Services]]> http://www.open-lab.net/blog/?p=59470 2023-08-18T20:43:13Z 2023-01-11T17:00:00Z Join this webinar on January 25 and learn how to build a voice-enabled intelligent virtual assistant to improve customer experiences at contact centers.]]> Join this webinar on January 25 and learn how to build a voice-enabled intelligent virtual assistant to improve customer experiences at contact centers.Graphic of a woman sitting on a couch reading her phone with popups.

Join this webinar on January 25 and learn how to build a voice-enabled intelligent virtual assistant to improve customer experiences at contact centers.

Source

]]>
0
Nefi Alarcon <![CDATA[OpenAI Presents GPT-3, a 175 Billion Parameters Language Model]]> https://news.www.open-lab.net/?p=17148 2023-06-12T21:16:13Z 2020-07-07T19:49:00Z OpenAI researchers recently released a paper describing the development of GPT-3, a state-of-the-art language model made up of 175 billion parameters.  For...]]> OpenAI researchers recently released a paper describing the development of GPT-3, a state-of-the-art language model made up of 175 billion parameters.  For...

OpenAI researchers recently released a paper describing the development of GPT-3, a state-of-the-art language model made up of 175 billion parameters. For comparison, the previous version, GPT-2, was made up of 1.5 billion parameters. The largest Transformer-based language model was released by Microsoft earlier this month and is made up of 17 billion parameters. "GPT-3 achieves strong...

Source

]]>
0