Conversational AI – NVIDIA Technical Blog News and tutorials for developers, data scientists, and IT admins 2025-03-11T16:19:32Z http://www.open-lab.net/blog/feed/ Michelle Horton <![CDATA[Top Conversational AI Sessions at NVIDIA GTC 2025]]> http://www.open-lab.net/blog/?p=96694 2025-03-06T19:26:36Z 2025-03-04T19:00:00Z Learn how to accelerate the full pipeline, from multilingual speech recognition and translation to generative AI and speech synthesis.]]>

Learn how to accelerate the full pipeline, from multilingual speech recognition and translation to generative AI and speech synthesis.

Source

]]>
Aditi Bodhankar <![CDATA[Measuring the Effectiveness and Performance of AI Guardrails in Generative AI Applications]]> http://www.open-lab.net/blog/?p=96562 2025-03-06T19:26:38Z 2025-03-03T17:22:09Z Safeguarding AI agents and other conversational AI applications to ensure safe, on-brand and reliable behavior is essential for enterprises. NVIDIA NeMo...]]>

Safeguarding AI agents and other conversational AI applications to ensure safe, on-brand and reliable behavior is essential for enterprises. NVIDIA NeMo Guardrails offers robust protection with AI guardrails for content safety, topic control, jailbreak detection, and more to evaluate and optimize guardrail performance. In this post, we explore techniques for measuring and optimizing your AI…

Source

]]>
Sangjune Park <![CDATA[Spotlight: NAVER Place Optimizes SLM-Based Vertical Services with NVIDIA TensorRT-LLM]]> http://www.open-lab.net/blog/?p=96279 2025-03-06T19:26:41Z 2025-02-28T17:57:49Z NAVER is a popular South Korean search engine company that offers Naver Place, a geo-based service that provides detailed information about millions of...]]>

NAVER is a popular South Korean search engine company that offers Naver Place, a geo-based service that provides detailed information about millions of businesses and points of interest across Korea. Users can search about different places, leave reviews, and place bookings or orders in real time. NAVER Place vertical services are based on small language models (SLMs) to improve usability…

Source

]]>
Yifan Wu <![CDATA[Accelerating Scientific Literature Reviews with NVIDIA NIM Microservices for LLMs]]> http://www.open-lab.net/blog/?p=96324 2025-03-06T19:26:43Z 2025-02-26T17:00:00Z A well-crafted systematic review is often the initial step for researchers exploring a scientific field. For scientists new to this field, it provides a...]]>

A well-crafted systematic review is often the initial step for researchers exploring a scientific field. For scientists new to this field, it provides a structured overview of the domain. For experts, it refines their understanding and sparks new ideas. In 2024 alone, 218,650 review articles were indexed in the Web of Science database, highlighting the importance of these resources in research.

Source

]]>
Sven Chilton <![CDATA[Deploying NVIDIA Riva Multilingual ASR with Whisper and Canary Architectures While Selectively Deactivating NMT]]> http://www.open-lab.net/blog/?p=95339 2025-03-06T19:26:55Z 2025-02-20T18:54:48Z NVIDIA has consistently developed automatic speech recognition (ASR) models that set the benchmark in the industry. Earlier versions of NVIDIA Riva, a...]]>

NVIDIA has consistently developed automatic speech recognition (ASR) models that set the benchmark in the industry. Earlier versions of NVIDIA Riva, a collection of GPU-accelerated speech and translation AI microservices for ASR, TTS, and NMT, support English-Spanish and English-Japanese code-switching ASR models based on the Conformer architecture, along with a model supporting multiple…

Source

]]>
Cheng-Han (Hank) Du <![CDATA[Improving Translation Quality with Domain-Specific Fine-Tuning and NVIDIA NIM]]> http://www.open-lab.net/blog/?p=95756 2025-02-06T19:33:46Z 2025-02-05T21:30:00Z Translation plays an essential role in enabling companies to expand across borders, with requirements varying significantly in terms of tone, accuracy, and...]]>

Translation plays an essential role in enabling companies to expand across borders, with requirements varying significantly in terms of tone, accuracy, and technical terminology handling. The emergence of sovereign AI has highlighted critical challenges in large language models (LLMs), particularly their struggle to capture nuanced cultural and linguistic contexts beyond English-dominant…

Source

]]>
1
Dan Su <![CDATA[Announcing Nemotron-CC: A Trillion-Token English Language Dataset for LLM Pretraining]]> http://www.open-lab.net/blog/?p=94818 2025-01-23T19:54:30Z 2025-01-09T19:20:16Z NVIDIA is excited to announce the release of Nemotron-CC, a 6.3-trillion-token English language Common Crawl dataset for pretraining highly accurate large...]]>

NVIDIA is excited to announce the release of Nemotron-CC, a 6.3-trillion-token English language Common Crawl dataset for pretraining highly accurate large language models (LLMs), including 1.9 trillion tokens of synthetically generated data. One of the keys to training state-of-the-art LLMs is a high-quality pretraining dataset, and recent top LLMs, such as the Meta Llama series…

Source

]]>
Brad Nemire <![CDATA[Upcoming Livestream: NVIDIA Developer Highlights from CES 2025]]> http://www.open-lab.net/blog/?p=94843 2025-01-23T19:54:32Z 2025-01-09T10:00:00Z Tune in January 16th at 9:00 AM PT for a live recap, followed by a Q&A of the latest developer announcements at CES 2025.]]>

Tune in January 16th at 9:00 AM PT for a live recap, followed by a Q&A of the latest developer announcements at CES 2025.

Source

]]>
Katie Link <![CDATA[Build a Generative AI Medical Device Training Assistant with NVIDIA NIM Microservices]]> http://www.open-lab.net/blog/?p=94379 2024-12-20T19:55:30Z 2024-12-20T18:00:00Z Innovation in medical devices continues to accelerate, with a record number authorized by the FDA every year. When these new or updated devices are introduced...]]>

Innovation in medical devices continues to accelerate, with a record number authorized by the FDA every year. When these new or updated devices are introduced to clinicians and patients, they require training to use them properly and safely. Once in use, clinicians or patients may need help troubleshooting issues. Medical devices are often accompanied by lengthy and technically complex…

Source

]]>
Joseph Lucas <![CDATA[Sandboxing Agentic AI Workflows with WebAssembly]]> http://www.open-lab.net/blog/?p=93975 2024-12-16T21:06:56Z 2024-12-16T20:33:46Z Agentic AI workflows often involve the execution of large language model (LLM)-generated code to perform tasks like creating data visualizations. However, this...]]>

Agentic AI workflows often involve the execution of large language model (LLM)-generated code to perform tasks like creating data visualizations. However, this code should be sanitized and executed in a safe environment to mitigate risks from prompt injection and errors in the returned code. Sanitizing Python with regular expressions and restricted runtimes is insufficient…

Source

]]>
Isabel Hulseman <![CDATA[Three Building Blocks for Creating AI Virtual Assistants for Customer Service with an NVIDIA AI Blueprint]]> http://www.open-lab.net/blog/?p=90672 2024-12-12T19:35:14Z 2024-12-11T23:49:16Z In today's fast-paced business environment, providing exceptional customer service is no longer just a nice-to-have��it's a necessity. Whether addressing...]]>

In today’s fast-paced business environment, providing exceptional customer service is no longer just a nice-to-have—it’s a necessity. Whether addressing technical issues, resolving billing questions, or providing service updates, customers expect quick, accurate, and personalized responses at their convenience. However, achieving this level of service comes with significant challenges.

Source

]]>
Xin Dong <![CDATA[Hymba Hybrid-Head Architecture Boosts Small Language Model Performance]]> http://www.open-lab.net/blog/?p=92595 2024-12-12T19:38:36Z 2024-11-22T17:31:14Z Transformers, with their attention-based architecture, have become the dominant choice for language models (LMs) due to their strong performance,...]]>

Transformers, with their attention-based architecture, have become the dominant choice for language models (LMs) due to their strong performance, parallelization capabilities, and long-term recall through key-value (KV) caches. However, their quadratic computational cost and high memory demands pose efficiency challenges. In contrast, state space models (SSMs) like Mamba and Mamba-2 offer constant…

Source

]]>
Xhoni Shollaj <![CDATA[Create a Custom Slackbot LLM Agent with NVIDIA NIM and LangChain]]> http://www.open-lab.net/blog/?p=89825 2025-02-17T05:12:38Z 2024-11-19T17:00:00Z In the dynamic world of modern business, where communication and efficient workflows are crucial for success, AI-powered solutions have become a competitive...]]>

In the dynamic world of modern business, where communication and efficient workflows are crucial for success, AI-powered solutions have become a competitive advantage. AI agents, built on cutting-edge large language models (LLMs) and powered by NVIDIA NIM provide a seamless way to enhance productivity and information flow. NIM, part of NVIDIA AI Enterprise, is a suite of easy-to-use…

Source

]]>
1
Chris Krapu <![CDATA[Creating RAG-Based Question-and-Answer LLM Workflows at NVIDIA]]> http://www.open-lab.net/blog/?p=90872 2024-11-11T20:00:23Z 2024-10-28T16:00:00Z The rapid development of solutions using retrieval augmented generation (RAG) for question-and-answer LLM workflows has led to new types of system...]]>

The rapid development of solutions using retrieval augmented generation (RAG) for question-and-answer LLM workflows has led to new types of system architectures. Our work at NVIDIA using AI for internal operations has led to several important findings for finding alignment between system capabilities and user expectations. We found that regardless of the intended scope or use case…

Source

]]>
Maggie Zhang <![CDATA[Scaling LLMs with NVIDIA Triton and NVIDIA TensorRT-LLM Using Kubernetes]]> http://www.open-lab.net/blog/?p=90412 2025-01-30T22:30:29Z 2024-10-22T16:53:55Z Large language models (LLMs) have been widely used for chatbots, content generation, summarization, classification, translation, and more. State-of-the-art LLMs...]]>

Large language models (LLMs) have been widely used for chatbots, content generation, summarization, classification, translation, and more. State-of-the-art LLMs and foundation models, such as Llama, Gemma, GPT, and Nemotron, have demonstrated human-like understanding and generative abilities. Thanks to these models, AI developers do not need to go through the expensive and time consuming training…

Source

]]>
Maryam Ashoori <![CDATA[IBM��s New Granite 3.0 Generative AI Models Are Small, Yet Highly Accurate and Efficient]]> http://www.open-lab.net/blog/?p=90636 2024-11-22T23:09:36Z 2024-10-21T19:15:35Z Today, IBM released the third generation of IBM Granite, a collection of open language models and complementary tools. Prior generations of Granite focused on...]]>

Today, IBM released the third generation of IBM Granite, a collection of open language models and complementary tools. Prior generations of Granite focused on domain-specific use cases; the latest IBM Granite models meet or exceed the performance of leading similarly sized open models across both academic and enterprise benchmarks. The developer-friendly Granite 3.0 generative AI models are…

Source

]]>
Anurag Guda https://www.linkedin.com/in/anuragguda/ <![CDATA[Simplify AI Application Development with NVIDIA Cloud Native Stack]]> http://www.open-lab.net/blog/?p=89970 2024-10-29T21:00:38Z 2024-10-16T16:00:00Z In the rapidly evolving landscape of AI and data science, the demand for scalable, efficient, and flexible infrastructure has never been higher. Traditional...]]>

In the rapidly evolving landscape of AI and data science, the demand for scalable, efficient, and flexible infrastructure has never been higher. Traditional infrastructure can often struggle to meet the demands of modern AI workloads, leading to bottlenecks in development and deployment processes. As organizations strive to deploy AI models and data-intensive applications at scale…

Source

]]>
Amit Bleiweiss <![CDATA[Evaluating Medical RAG with NVIDIA AI Endpoints and Ragas]]> http://www.open-lab.net/blog/?p=89625 2024-11-07T23:29:42Z 2024-10-01T16:00:00Z In the rapidly evolving field of medicine, the integration of cutting-edge technologies is crucial for enhancing patient care and advancing research. One such...]]>

In the rapidly evolving field of medicine, the integration of cutting-edge technologies is crucial for enhancing patient care and advancing research. One such innovation is retrieval-augmented generation (RAG), which is transforming how medical information is processed and used. RAG combines the capabilities of large language models (LLMs) with external knowledge retrieval…

Source

]]>
Nick Comly <![CDATA[Low Latency Inference Chapter 2: Blackwell is Coming. NVIDIA GH200 NVL32 with NVLink Switch Gives Signs of Big Leap in Time to First Token Performance]]> http://www.open-lab.net/blog/?p=88938 2024-11-29T21:06:06Z 2024-09-26T21:44:00Z Many of the most exciting applications of large language models (LLMs), such as interactive speech bots, coding co-pilots, and search, need to begin responding...]]>

Many of the most exciting applications of large language models (LLMs), such as interactive speech bots, coding co-pilots, and search, need to begin responding to user queries quickly to deliver positive user experiences. The time that it takes for an LLM to ingest a user prompt (and context, which can be sizable) and begin outputting a response is called time to first token (TTFT).

Source

]]>
Vinay Bagade <![CDATA[Build a Digital Human Interface for AI Apps with an NVIDIA NIM Agent Blueprint]]> http://www.open-lab.net/blog/?p=89345 2024-10-22T20:34:33Z 2024-09-25T20:30:00Z Providing customers with quality service remains a top priority for businesses across industries, from answering questions and troubleshooting issues to...]]>

Providing customers with quality service remains a top priority for businesses across industries, from answering questions and troubleshooting issues to facilitating online orders. As businesses scale operations and expand offerings globally to compete, the demand for seamless customer service grows exponentially. Searching knowledge base articles or navigating complex phone trees can be a…

Source

]]>
Anjali Shah <![CDATA[Deploying Accelerated Llama 3.2 from the Edge to the Cloud]]> http://www.open-lab.net/blog/?p=89436 2024-11-07T05:08:12Z 2024-09-25T18:39:49Z Expanding the open-source Meta Llama collection of models, the Llama 3.2 collection includes vision language models (VLMs), small language models (SLMs), and an...]]>

Expanding the open-source Meta Llama collection of models, the Llama 3.2 collection includes vision language models (VLMs), small language models (SLMs), and an updated Llama Guard model with support for vision. When paired with the NVIDIA accelerated computing platform, Llama 3.2 offers developers, researchers, and enterprises valuable new capabilities and optimizations to realize their…

Source

]]>
Daniel Galvez <![CDATA[Accelerating Leaderboard-Topping ASR Models 10x with NVIDIA NeMo]]> http://www.open-lab.net/blog/?p=89330 2024-10-17T19:07:17Z 2024-09-24T18:27:35Z NVIDIA NeMo has consistently developed automatic speech recognition (ASR) models that set the benchmark in the industry, particularly those topping the Hugging...]]>

NVIDIA NeMo has consistently developed automatic speech recognition (ASR) models that set the benchmark in the industry, particularly those topping the Hugging Face Open ASR Leaderboard. These NVIDIA NeMo ASR models that transcribe speech into text offer a range of architectures designed to optimize both speed and accuracy: Previously, these models faced speed performance…

Source

]]>
Sven Chilton <![CDATA[Quickly Voice Your Apps with NVIDIA NIM Microservices for Speech and Translation]]> http://www.open-lab.net/blog/?p=89142 2024-09-19T20:17:19Z 2024-09-18T22:48:43Z NVIDIA NIM, part of NVIDIA AI Enterprise, provides containers to self-host GPU-accelerated inferencing microservices for pretrained and customized AI models...]]>

NVIDIA NIM, part of NVIDIA AI Enterprise, provides containers to self-host GPU-accelerated inferencing microservices for pretrained and customized AI models across clouds, data centers, and workstations. NIM microservices for speech and translation are now available. The new speech and translation microservices leverage NVIDIA Riva and provide automatic speech recognition (ASR)…

Source

]]>
Aaron Erickson <![CDATA[Optimizing Data Center Performance with AI Agents and the OODA Loop Strategy]]> http://www.open-lab.net/blog/?p=88729 2025-02-17T05:11:15Z 2024-09-17T14:30:00Z For any data center, operating large, complex GPU clusters is not for the faint of heart! There is a tremendous amount of complexity. Cooling, power,...]]>

For any data center, operating large, complex GPU clusters is not for the faint of heart! There is a tremendous amount of complexity. Cooling, power, networking, and even such benign things like fan replacement cycles all must be managed effectively and governed well in accelerated computing data centers. Managing all of this requires an accelerated understanding of the petabytes of telemetry data…

Source

]]>
10
Jan Lasek <![CDATA[Post-Training Quantization of LLMs with NVIDIA NeMo and NVIDIA TensorRT Model Optimizer]]> http://www.open-lab.net/blog/?p=88489 2024-09-19T19:33:05Z 2024-09-10T16:00:00Z As large language models (LLMs) are becoming even bigger, it is increasingly important to provide easy-to-use and efficient deployment paths because the cost of...]]>

As large language models (LLMs) are becoming even bigger, it is increasingly important to provide easy-to-use and efficient deployment paths because the cost of serving such LLMs is becoming higher. One way to reduce this cost is to apply post-training quantization (PTQ), which consists of techniques to reduce computational and memory requirements for serving trained models. In this post…

Source

]]>
Sang-gil Lee <![CDATA[Achieving State-of-the-Art Zero-Shot Waveform Audio Generation across Audio Types]]> http://www.open-lab.net/blog/?p=88329 2024-09-19T19:34:33Z 2024-09-05T20:30:00Z Stunning audio content is an essential component of virtual worlds. Audio generative AI plays a key role in creating this content, and NVIDIA is continuously...]]>

Stunning audio content is an essential component of virtual worlds. Audio generative AI plays a key role in creating this content, and NVIDIA is continuously pushing the limits in this field of research. BigVGAN, developed in collaboration with the NVIDIA Applied Deep Learning Research and NVIDIA NeMo teams, is a generative AI model specialized in audio waveform synthesis that achieves state-of…

Source

]]>
Annamalai Chockalingam <![CDATA[Deploy Diverse AI Apps with Multi-LoRA Support on RTX AI PCs and Workstations]]> http://www.open-lab.net/blog/?p=88097 2024-11-14T16:09:00Z 2024-08-28T13:00:00Z Today��s large language models (LLMs) achieve unprecedented results across many use cases. Yet, application developers often need to customize and tune these...]]>

Today’s large language models (LLMs) achieve unprecedented results across many use cases. Yet, application developers often need to customize and tune these models to work specifically for their use cases, due to the general nature of foundation models. Full fine-tuning requires a large amount of data and compute infrastructure, resulting in model weights being updated.

Source

]]>
Davide Tricarico <![CDATA[Enhancing RAG Applications with NVIDIA NIM]]> http://www.open-lab.net/blog/?p=87747 2024-10-28T21:55:21Z 2024-08-27T16:00:00Z The advent of large language models (LLMs) has significantly benefited the AI industry, offering versatile tools capable of generating human-like text and...]]>

The advent of large language models (LLMs) has significantly benefited the AI industry, offering versatile tools capable of generating human-like text and handling a wide range of tasks. However, while LLMs demonstrate impressive general knowledge, their performance in specialized fields, such as veterinary science, is limited when used out of the box. To enhance their utility in specific areas…

Source

]]>
Michelle Horton <![CDATA[Practical Strategies for Optimizing LLM Inference Sizing and Performance]]> http://www.open-lab.net/blog/?p=87511 2024-09-05T17:57:29Z 2024-08-21T16:00:00Z As the use of large language models (LLMs) grows across many applications, such as chatbots and content creation, it's important to understand the process of...]]>

As the use of large language models (LLMs) grows across many applications, such as chatbots and content creation, it’s important to understand the process of scaling and optimizing inference systems to make informed decisions about hardware and resources for LLM inference. In the following talk, Dmitry Mironov and Sergio Perez, senior deep learning solutions architects at NVIDIA…

Source

]]>
Sama Bali <![CDATA[Hackathon: Build Groundbreaking Generative AI Projects Using NVIDIA AI Workbench]]> http://www.open-lab.net/blog/?p=87736 2024-09-05T17:57:30Z 2024-08-20T20:04:23Z Hosted by Dell and NVIDIA, demonstrate how AI Workbench can be used to build and deliver apps for a wide range of tasks and workflows.]]>

Hosted by Dell and NVIDIA, demonstrate how AI Workbench can be used to build and deliver apps for a wide range of tasks and workflows.

Source

]]>
Ike Nnoli <![CDATA[Deploy the First On-Device Small Language Model for Improved Game Character Roleplay]]> http://www.open-lab.net/blog/?p=87302 2024-08-22T18:24:51Z 2024-08-20T13:05:00Z At Gamescom 2024, NVIDIA announced our first on-device small language model (SLM) for improving the conversation abilities of game characters. We also announced...]]>

At Gamescom 2024, NVIDIA announced our first on-device small language model (SLM) for improving the conversation abilities of game characters. We also announced that the first game to showcase NVIDIA ACE and digital human technologies is Amazing Seasun Games’ Mecha BREAK, bringing its characters to life and providing a more dynamic and immersive gameplay experience on NVIDIA GeForce RTX AI PCs.

Source

]]>
Erin Ho <![CDATA[NVIDIA TensorRT Model Optimizer v0.15 Boosts Inference Performance and Expands Model Support]]> http://www.open-lab.net/blog/?p=87227 2024-08-22T18:24:54Z 2024-08-15T17:11:37Z NVIDIA has announced the latest v0.15 release of NVIDIA TensorRT Model Optimizer, a state-of-the-art quantization toolkit of model optimization techniques...]]>

NVIDIA has announced the latest v0.15 release of NVIDIA TensorRT Model Optimizer, a state-of-the-art quantization toolkit of model optimization techniques including quantization, sparsity, and pruning. These techniques reduce model complexity and enable downstream inference frameworks like NVIDIA TensorRT-LLM and NVIDIA TensorRT to more efficiently optimize the inference speed of generative AI…

Source

]]>
Sepi Motamedi <![CDATA[Video: Build Live Media Applications for AI-Enabled Infrastructure with NVIDIA Holoscan for Media]]> http://www.open-lab.net/blog/?p=87234 2024-11-04T22:50:16Z 2024-08-14T17:35:11Z NVIDIA Holoscan for Media is a software-defined, AI-enabled platform that enables live video pipelines to run on the same infrastructure as AI.  This video...]]>

NVIDIA Holoscan for Media is a software-defined, AI-enabled platform that enables live video pipelines to run on the same infrastructure as AI. This video explains how developers in live media can use NVIDIA Holoscan for Media to build and deploy applications as software on repurposable, NVIDIA-accelerated, commercial off-the-shelf hardware. The video features Guillaume Polaillon…

Source

]]>
Chintan Patel <![CDATA[New NIM Available: Mistral Large 2 Instruct LLM]]> http://www.open-lab.net/blog/?p=87308 2024-08-22T18:24:59Z 2024-08-13T20:37:24Z The new model by Mistral excels at a variety of complex tasks including text summarization, multilingual translation and reasoning, programming, question and...]]>

The new model by Mistral excels at a variety of complex tasks including text summarization, multilingual translation and reasoning, programming, question and answering, and conversational AI.

Source

]]>
Hayden Wolff <![CDATA[Building AI Agents with NVIDIA NIM Microservices and LangChain]]> http://www.open-lab.net/blog/?p=86543 2024-10-28T21:55:34Z 2024-08-07T16:00:00Z NVIDIA NIM, part of NVIDIA AI Enterprise, now supports tool-calling for models like Llama 3.1. It also integrates with LangChain to provide you with a...]]>

NVIDIA NIM, part of NVIDIA AI Enterprise, now supports tool-calling for models like Llama 3.1. It also integrates with LangChain to provide you with a production-ready solution for building agentic workflows. NIM microservices provide the best performance for open-source models such as Llama 3.1 and are available to test for free from NVIDIA API Catalog in LangChain applications.

Source

]]>
Kasikrit Chantharuang <![CDATA[Securing Generative AI Deployments with NVIDIA NIM and NVIDIA NeMo Guardrails]]> http://www.open-lab.net/blog/?p=86615 2024-11-20T19:58:44Z 2024-08-05T20:30:00Z As enterprises adopt generative AI applications powered by large language models (LLMs), there is an increasing need to implement guardrails to ensure safety...]]>

As enterprises adopt generative AI applications powered by large language models (LLMs), there is an increasing need to implement guardrails to ensure safety and compliance with principles of trustworthy AI. NVIDIA NeMo Guardrails provides programmable guardrails for ensuring trustworthiness, safety, security, and controlled dialog while protecting against common LLM vulnerabilities.

Source

]]>
Sofia Kostandian <![CDATA[Developing Robust Georgian Automatic Speech Recognition with FastConformer Hybrid Transducer CTC BPE]]> http://www.open-lab.net/blog/?p=85835 2024-08-22T18:25:43Z 2024-08-05T16:52:11Z Building an effective automatic speech recognition (ASR) model for underrepresented languages presents unique challenges due to limited data resources.  In...]]>

Source

]]>
Amit Bleiweiss <![CDATA[Enhancing RAG Pipelines with Re-Ranking]]> http://www.open-lab.net/blog/?p=86037 2024-10-28T21:56:26Z 2024-07-30T16:00:00Z In the rapidly evolving landscape of AI-driven applications, re-ranking has emerged as a pivotal technique to enhance the precision and relevance of enterprise...]]>

In the rapidly evolving landscape of AI-driven applications, re-ranking has emerged as a pivotal technique to enhance the precision and relevance of enterprise search results. By using advanced machine learning algorithms, re-ranking refines initial search outputs to better align with user intent and context, thereby significantly improving the effectiveness of semantic search.

Source

]]>
Yasmina Benkhoui <![CDATA[Spotlight: UneeQ Revolutionizes Customer Engagement with AI-Powered Digital Human Technology]]> http://www.open-lab.net/blog/?p=82662 2024-08-19T17:56:31Z 2024-07-18T22:31:45Z With the rise of chatbots and virtual assistants, customer interactions have evolved to embrace the versatility of voice and text inputs. However, integrating...]]>

With the rise of chatbots and virtual assistants, customer interactions have evolved to embrace the versatility of voice and text inputs. However, integrating visual and personalized components into these interactions is essential for creating immersive, user-centric experiences. Enter UneeQ, a leading platform known for its creation of lifelike digital characters through AI-powered…

Source

]]>
Artem Chirkin <![CDATA[Accelerating Vector Search: NVIDIA cuVS IVF-PQ Part 2, Performance Tuning]]> http://www.open-lab.net/blog/?p=81681 2024-10-03T21:18:45Z 2024-07-18T17:10:03Z In the first part of the series, we presented an overview of the IVF-PQ algorithm and explained how it builds on top of the IVF-Flat algorithm, using the...]]>

In the first part of the series, we presented an overview of the IVF-PQ algorithm and explained how it builds on top of the IVF-Flat algorithm, using the Product Quantization (PQ) technique to compress the index and support larger datasets. In this part two of the IVF-PQ post, we cover the practical aspects of tuning IVF-PQ performance. It’s worth noting again that IVF-PQ uses a lossy…

Source

]]>
Artem Chirkin <![CDATA[Accelerating Vector Search: NVIDIA cuVS IVF-PQ Part 1, Deep Dive]]> http://www.open-lab.net/blog/?p=81652 2024-10-03T21:19:09Z 2024-07-18T17:09:45Z In this post, we continue the series on accelerating vector search using NVIDIA cuVS. Our previous post in the series introduced IVF-Flat, a fast algorithm for...]]>

In this post, we continue the series on accelerating vector search using NVIDIA cuVS. Our previous post in the series introduced IVF-Flat, a fast algorithm for accelerating approximate nearest neighbors (ANN) search on GPUs. We discussed how using an inverted file index (IVF) provides an intuitive way to reduce the complexity of the nearest neighbor search by limiting it to only a small subset of…

Source

]]>
Ashraf Eassa <![CDATA[NVIDIA NeMo Accelerates LLM Innovation with Hybrid State Space Model Support]]> http://www.open-lab.net/blog/?p=85602 2024-08-08T18:48:47Z 2024-07-17T17:32:08Z Today��s large language models (LLMs) are based on the transformer model architecture introduced in 2017. Since then, rapid advances in AI compute performance...]]>

Today’s large language models (LLMs) are based on the transformer model architecture introduced in 2017. Since then, rapid advances in AI compute performance have enabled the creation of even larger transformer-based LLMs, dramatically improving their capabilities. Advanced transformer-based LLMs are enabling many exciting applications such as intelligent chatbots, computer code generation…

Source

]]>
1
Tianna Nguy <![CDATA[New Workshops: Customize LLMs, Build and Deploy Large Neural Networks]]> http://www.open-lab.net/blog/?p=85505 2024-08-08T18:48:51Z 2024-07-16T21:39:50Z Register now for an instructor-led public workshop in July, August or September. Space is limited.]]>

Register now for an instructor-led public workshop in July, August or September. Space is limited.

Source

]]>
Erin Ho <![CDATA[Train Generative AI Models More Efficiently with New NVIDIA Megatron-Core Functionalities]]> http://www.open-lab.net/blog/?p=84953 2024-07-25T18:14:45Z 2024-07-12T22:25:42Z First introduced in 2019, NVIDIA Megatron-LM sparked a wave of innovation in the AI community, enabling researchers and developers to use the underpinnings of...]]>

First introduced in 2019, NVIDIA Megatron-LM sparked a wave of innovation in the AI community, enabling researchers and developers to use the underpinnings of this open-source library to further large language model (LLM) advancements. Today, many of the most popular LLM developer frameworks have been inspired by and built using the Megatron-LM library, spurring a wave of foundation models and AI…

Source

]]>
Subhankar Ghosh <![CDATA[Addressing Hallucinations in Speech Synthesis LLMs with the NVIDIA NeMo T5-TTS Model]]> http://www.open-lab.net/blog/?p=84524 2024-07-25T18:19:15Z 2024-07-02T20:00:00Z NVIDIA NeMo has released the T5-TTS model, a significant advancement in text-to-speech (TTS) technology. Based on large language models (LLMs), T5-TTS produces...]]>

NVIDIA NeMo has released the T5-TTS model, a significant advancement in text-to-speech (TTS) technology. Based on large language models (LLMs), T5-TTS produces more accurate and natural-sounding speech. By improving alignment between text and audio, T5-TTS eliminates hallucinations such as repeated spoken words and skipped text. Additionally, T5-TTS makes up to 2x fewer word pronunciation errors…

Source

]]>
Min-Hung Chen https://minhungchen.netlify.app/ <![CDATA[Introducing DoRA, a High-Performing Alternative to LoRA for Fine-Tuning]]> http://www.open-lab.net/blog/?p=84454 2024-11-07T05:09:12Z 2024-06-28T15:00:00Z Full fine-tuning (FT) is commonly employed to tailor general pretrained models for specific downstream tasks. To reduce the training cost, parameter-efficient...]]>

Full fine-tuning (FT) is commonly employed to tailor general pretrained models for specific downstream tasks. To reduce the training cost, parameter-efficient fine-tuning (PEFT) methods have been introduced to fine-tune pretrained models with a minimal number of parameters. Among these, Low-Rank Adaptation (LoRA) and its variants have gained considerable popularity because they avoid additional…

Source

]]>
Hannah Simmons <![CDATA[Generate High-Quality, Context-Aware Responses for Chatbots and Search Engines with Llama 3-ChatQA]]> http://www.open-lab.net/blog/?p=84548 2024-07-10T15:28:34Z 2024-06-26T16:44:52Z Experience and test Llama3-ChatQA models at scale with performance optimized NVIDIA NIM inference microservice using the NVIDIA API catalog.]]>

Experience and test Llama3-ChatQA models at scale with performance optimized NVIDIA NIM inference microservice using the NVIDIA API catalog.

Source

]]>
Elias Wolfberg <![CDATA[AI Brain Implant Restores Bilingual Communication for Stroke Survivor]]> http://www.open-lab.net/blog/?p=84040 2024-06-27T18:17:55Z 2024-06-20T15:57:05Z Scientists have enabled a stroke survivor, who is unable to speak, to communicate in both Spanish and English by training a neuroprosthesis implant to decode...]]>

Scientists have enabled a stroke survivor, who is unable to speak, to communicate in both Spanish and English by training a neuroprosthesis implant to decode his bilingual brain activity. The research, published in Nature Biomedical Engineering, comes from the lab of University of California, San Francisco professor Dr. Edward Chang. It builds on his groundbreaking work from 2021 with the…

Source

]]>
Babak Hejazi <![CDATA[Introducing Grouped GEMM APIs in cuBLAS and More Performance Updates]]> http://www.open-lab.net/blog/?p=83888 2024-07-16T17:19:07Z 2024-06-12T20:30:00Z The latest release of NVIDIA cuBLAS library, version 12.5, continues to deliver functionality and performance to deep learning (DL) and high-performance...]]>

The latest release of NVIDIA cuBLAS library, version 12.5, continues to deliver functionality and performance to deep learning (DL) and high-performance computing (HPC) workloads. This post provides an overview of the following updates on cuBLAS matrix multiplications (matmuls) since version 12.0, and a walkthrough: Grouped GEMM APIs can be viewed as a generalization of the batched…

Source

]]>
Tanay Varshney <![CDATA[NVIDIA Text Embedding Model Tops MTEB Leaderboard]]> http://www.open-lab.net/blog/?p=83571 2024-10-28T21:57:46Z 2024-06-10T17:00:00Z The latest embedding model from NVIDIA��NV-Embed��set a new record for embedding accuracy with a score of 69.32 on the Massive Text Embedding Benchmark...]]>

The latest embedding model from NVIDIA—NV-Embed—set a new record for embedding accuracy with a score of 69.32 on the Massive Text Embedding Benchmark (MTEB), which covers 56 embedding tasks. Highly accurate and effective models like NV-Embed are key to transforming vast amounts of data into actionable insights. NVIDIA provides top-performing models through the NVIDIA API catalog.

Source

]]>
Ike Nnoli <![CDATA[Build Lifelike Digital Human Technology with NVIDIA ACE, Now Generally Available]]> http://www.open-lab.net/blog/?p=83173 2024-11-14T16:09:51Z 2024-06-04T16:42:49Z NVIDIA ACE��a suite of generative AI-enabled digital human technologies��is now generally available for developers. Packaged as NVIDIA NIM microservices, ACE...]]>

NVIDIA ACE—a suite of generative AI-enabled digital human technologies—is now generally available for developers. Packaged as NVIDIA NIM microservices, ACE enables developers to deliver high-quality natural language understanding, speech synthesis, and facial animation for gaming, customer service, healthcare, and more. NVIDIA is also introducing ACE PC NIM microservices for deployment…

Source

]]>
Jesse Clayton <![CDATA[Streamline Development of AI-Powered Apps with NVIDIA RTX AI Toolkit for Windows RTX PCs]]> http://www.open-lab.net/blog/?p=83165 2024-11-14T16:10:37Z 2024-06-02T12:30:00Z NVIDIA today launched the NVIDIA RTX AI Toolkit, a collection of tools and SDKs for Windows application developers to customize, optimize, and deploy AI models...]]>

NVIDIA today launched the NVIDIA RTX AI Toolkit, a collection of tools and SDKs for Windows application developers to customize, optimize, and deploy AI models for Windows applications. It’s free to use, doesn’t require prior experience with AI frameworks and development tools, and delivers the best AI performance for both local and cloud deployments. The wide availability of generative…

Source

]]>
Aditi Bodhankar <![CDATA[Building Safer LLM Apps with LangChain Templates and NVIDIA NeMo Guardrails]]> http://www.open-lab.net/blog/?p=83057 2025-02-04T19:52:06Z 2024-05-31T21:37:43Z An easily deployable reference architecture can help developers get to production faster with custom LLM use cases. LangChain Templates are a new way of...]]>

An easily deployable reference architecture can help developers get to production faster with custom LLM use cases. LangChain Templates are a new way of creating, sharing, maintaining, downloading, and customizing LLM-based agents and chains. The process is straightforward. You create an application project with directories for chains, identify the template you want to work with…

Source

]]>
Nisanur Genc <![CDATA[Personalized Learning with Gipi, NVIDIA TensortRT-LLM, and AI Foundation Models]]> http://www.open-lab.net/blog/?p=82913 2024-05-30T19:55:44Z 2024-05-30T16:00:00Z Over 1.2B people are actively learning new languages, with over 500M learners on digital learning platforms such as Duolingo. At the same time, a significant...]]>

Over 1.2B people are actively learning new languages, with over 500M learners on digital learning platforms such as Duolingo. At the same time, a significant portion of the global population, including 73% of Gen-Z, experiences feelings of disconnection and unhappiness, often exacerbated by social media. This highlights a unique dichotomy: People are hungry for personalized learning…

Source

]]>
Mitesh Patel <![CDATA[Generative AI Agents Developer Contest: Top Tips for Getting Started]]> http://www.open-lab.net/blog/?p=82980 2024-10-18T20:21:31Z 2024-05-29T16:01:10Z Join our contest that runs through June 17 and showcase your innovation using cutting-edge generative AI-powered applications using NVIDIA and LangChain...]]>

Join our contest that runs through June 17 and showcase your innovation using cutting-edge generative AI-powered applications using NVIDIA and LangChain technologies. To get you started, we explore a few applications for inspiring your creative journey, while sharing tips and best practices to help you succeed in the development process. There are many different practical applications…

Source

]]>
Matthew Nicely <![CDATA[Accelerating Transformers with NVIDIA cuDNN 9]]> http://www.open-lab.net/blog/?p=82592 2024-05-30T19:55:46Z 2024-05-24T16:00:00Z The NVIDIA CUDA Deep Neural Network library (cuDNN) is a GPU-accelerated library for accelerating deep learning primitives with state-of-the-art performance....]]>

The NVIDIA CUDA Deep Neural Network library (cuDNN) is a GPU-accelerated library for accelerating deep learning primitives with state-of-the-art performance. cuDNN is integrated with popular deep learning frameworks like PyTorch, TensorFlow, and XLA (Accelerated Linear Algebra). These frameworks abstract the complexities of direct GPU programming, enabling you to focus on designing and…

Source

]]>
1
Nicole Luo <![CDATA[Training Localized Multilingual LLMs with NVIDIA NeMo, Part 2]]> http://www.open-lab.net/blog/?p=82295 2025-02-17T05:27:39Z 2024-05-17T17:29:49Z In Part 1, we discussed how to train a monolingual tokenizer and merge it with a pretrained LLM��s tokenizer to form a multilingual tokenizer. In this post, we...]]>

In Part 1, we discussed how to train a monolingual tokenizer and merge it with a pretrained LLM’s tokenizer to form a multilingual tokenizer. In this post, we show you how to integrate the customized tokenizer into the pretrained LLM as well as how to start a continual pretraining task in NVIDIA NeMo. Please import the following libraries before starting: After…

Source

]]>
1
Nicole Luo <![CDATA[Training Localized Multilingual LLMs with NVIDIA NeMo, Part 1]]> http://www.open-lab.net/blog/?p=82294 2024-10-18T20:22:45Z 2024-05-17T17:29:13Z In today's globalized world, the ability of AI systems to understand and communicate in diverse languages is increasingly crucial. Large language models (LLMs)...]]>

In today’s globalized world, the ability of AI systems to understand and communicate in diverse languages is increasingly crucial. Large language models (LLMs) have revolutionized the field of natural language processing, enabling AI to generate human-like text, answer questions, and perform various language tasks. However, most mainstream LLMs are trained on data corpora that primarily consist of…

Source

]]>
3
Siddha Ganju <![CDATA[Develop Secure, Reliable Medical Apps with RAG and NVIDIA NeMo Guardrails]]> http://www.open-lab.net/blog/?p=82588 2025-02-04T19:52:46Z 2024-05-15T20:00:00Z Imagine an application that can sift through mountains of patient data, intelligently searching and answering questions about diagnoses, health histories, and...]]>

Imagine an application that can sift through mountains of patient data, intelligently searching and answering questions about diagnoses, health histories, and more. This AI-powered virtual “clinical assistant” could streamline preparation for an appointment with a patient, summarize health records, and readily answer queries about an individual patient. Such a system can also be fine-tuned to…

Source

]]>
Zhiyong Ban <![CDATA[Customizing Neural Machine Translation Models with NVIDIA NeMo, Part 2]]> http://www.open-lab.net/blog/?p=82196 2025-02-17T05:23:38Z 2024-05-13T17:17:38Z In the first post, we walked through the prerequisites for a neural machine translation example from English to Chinese, running the pretrained model with NeMo,...]]>

In the first post, we walked through the prerequisites for a neural machine translation example from English to Chinese, running the pretrained model with NeMo, and evaluating its performance. In this post, we walk you through curating a custom dataset and fine-tuning the model on that dataset. Custom data collection is crucial in model fine-tuning because it enables a model to adapt to…

Source

]]>
Zhiyong Ban <![CDATA[Customizing Neural Machine Translation Models with NVIDIA NeMo, Part 1]]> http://www.open-lab.net/blog/?p=82195 2024-05-30T19:55:58Z 2024-05-13T17:15:13Z Neural machine translation (NMT) is an automatic task of translating a sequence of words from one language to another. In recent years, the development of...]]>

Neural machine translation (NMT) is an automatic task of translating a sequence of words from one language to another. In recent years, the development of attention-based transformer models has had a profound impact on complicated language modeling tasks, which predict the next upcoming token in the sentence. NMT is one of the typical instances. There are plenty of open-source NMT models…

Source

]]>
Chintan Patel <![CDATA[Regional LLMs SEA-LION and SeaLLM Serve Languages and Cultures of Southeast Asia]]> http://www.open-lab.net/blog/?p=82014 2024-05-30T19:55:59Z 2024-05-13T17:00:00Z At the recent World Governments Summit in Dubai, NVIDIA CEO Jensen Huang emphasized the importance of sovereign AI, which refers to a nation��s capability to...]]>

At the recent World Governments Summit in Dubai, NVIDIA CEO Jensen Huang emphasized the importance of sovereign AI, which refers to a nation’s capability to develop and deploy AI technologies. Nations have started building regional large language models (LLMs) that codify their culture, history, and intelligence and serve their citizens with the benefits of generative AI.

Source

]]>
Amit Bleiweiss <![CDATA[Tips for Building a RAG Pipeline with NVIDIA AI LangChain AI Endpoints]]> http://www.open-lab.net/blog/?p=81895 2025-03-11T16:19:32Z 2024-05-08T16:00:00Z Retrieval-augmented generation (RAG) is a technique that combines information retrieval with a set of carefully designed system prompts to provide more...]]>

Retrieval-augmented generation (RAG) is a technique that combines information retrieval with a set of carefully designed system prompts to provide more accurate, up-to-date, and contextually relevant responses from large language models (LLMs). By incorporating data from various sources such as relational databases, unstructured document repositories, internet data streams, and media news feeds…

Source

]]>
7
Elena Rastorgueva <![CDATA[New Standard for Speech Recognition and Translation from the NVIDIA NeMo Canary Model]]> http://www.open-lab.net/blog/?p=80661 2024-08-06T17:19:16Z 2024-04-18T20:09:33Z NVIDIA NeMo is an end-to-end platform for the development of multimodal generative AI models at scale anywhere��on any cloud and on-premises. The NeMo team...]]>

NVIDIA NeMo is an end-to-end platform for the development of multimodal generative AI models at scale anywhere—on any cloud and on-premises. The NeMo team just released?Canary, a multilingual model that transcribes speech in English, Spanish, German, and French with punctuation and capitalization. Canary also provides bi-directional translation, between English and the three other supported…

Source

]]>
1
Hainan Xu <![CDATA[Turbocharge ASR Accuracy and Speed with NVIDIA NeMo Parakeet-TDT]]> http://www.open-lab.net/blog/?p=80732 2024-08-12T16:06:21Z 2024-04-18T20:03:54Z NVIDIA NeMo, an end-to-end platform for developing multimodal generative AI models at scale anywhere��on any cloud and on-premises��recently released...]]>

NVIDIA NeMo, an end-to-end platform for developing multimodal generative AI models at scale anywhere—on any cloud and on-premises—recently released Parakeet-TDT. This new addition to the?NeMo ASR Parakeet model family boasts better accuracy and 64% greater speed over the previously best model, Parakeet-RNNT-1.1B. This post explains Parakeet-TDT and how to use it to generate highly accurate…

Source

]]>
0
Somshubra Majumdar <![CDATA[Pushing the Boundaries of Speech Recognition with NVIDIA NeMo Parakeet ASR Models]]> http://www.open-lab.net/blog/?p=80564 2024-08-12T16:07:43Z 2024-04-18T20:03:07Z NVIDIA NeMo, an end-to-end platform for the development of multimodal generative AI models at scale anywhere��on any cloud and on-premises��released the...]]>

NVIDIA NeMo, an end-to-end platform for the development of multimodal generative AI models at scale anywhere—on any cloud and on-premises—released the Parakeet family of automatic speech recognition (ASR) models. These state-of-the-art ASR models, developed in collaboration with Suno.ai, transcribe spoken English with exceptional accuracy. This post details Parakeet ASR models that are…

Source

]]>
0
Tiffany Yeung <![CDATA[Explainer: What Is a Convolutional Neural Network?]]> http://www.open-lab.net/blog/?p=75991 2024-06-05T22:20:53Z 2024-04-12T19:00:00Z A convolutional neural network is a type of deep learning network used primarily to identify and classify images and to recognize objects within images.]]>

A convolutional neural network is a type of deep learning network used primarily to identify and classify images and to recognize objects within images.

Source

]]>
0
Amanda Saunders <![CDATA[Develop Custom Enterprise Generative AI with NVIDIA NeMo]]> http://www.open-lab.net/blog/?p=80360 2025-02-17T05:27:49Z 2024-03-27T20:00:00Z Generative AI is transforming computing, paving new avenues for humans to interact with computers in natural, intuitive ways. For enterprises, the prospect of...]]>

Generative AI is transforming computing, paving new avenues for humans to interact with computers in natural, intuitive ways. For enterprises, the prospect of generative AI is vast. Businesses can tap into their rich datasets to streamline time-consuming tasks—from text summarization and translation to insight prediction and content generation. But they must also navigate adoption challenges.

Source

]]>
Ike Nnoli <![CDATA[Generative AI for Digital Human Technologies and New AI-powered NVIDIA RTX Lighting]]> http://www.open-lab.net/blog/?p=79707 2024-12-09T16:51:28Z 2024-03-19T17:00:00Z At GDC 2024, NVIDIA announced that leading AI application developers such as Inworld AI are using NVIDIA digital human technologies to accelerate the deployment...]]>

At GDC 2024, NVIDIA announced that leading AI application developers such as Inworld AI are using NVIDIA digital human technologies to accelerate the deployment of generative AI-powered game characters alongside updated NVIDIA RTX SDKs that simplify the creation of beautiful worlds. You can incorporate the full suite of NVIDIA digital human technologies or individual microservices into…

Source

]]>
Gordana Neskovic <![CDATA[NVIDIA Speech and Translation AI Models Set Records for Speed and Accuracy]]> http://www.open-lab.net/blog/?p=79365 2024-08-12T16:09:12Z 2024-03-19T16:00:00Z Speech and translation AI models developed at NVIDIA are pushing the boundaries of performance and innovation. The NVIDIA Parakeet automatic speech recognition...]]>

Speech and translation AI models developed at NVIDIA are pushing the boundaries of performance and innovation. The NVIDIA Parakeet automatic speech recognition (ASR) family of models and the NVIDIA Canary multilingual, multitask ASR and translation model currently top the Hugging Face Open ASR Leaderboard. In addition, a multilingual P-Flow-based text-to-speech (TTS) model won the LIMMITS ’24…

Source

]]>
Chester Chen <![CDATA[Turning Machine Learning to Federated Learning in Minutes with NVIDIA FLARE 2.4]]> http://www.open-lab.net/blog/?p=78870 2024-05-10T00:20:39Z 2024-03-07T00:39:33Z Federated learning (FL) is experiencing accelerated adoption due to its decentralized, privacy-preserving nature. In sectors such as healthcare and financial...]]>

Federated learning (FL) is experiencing accelerated adoption due to its decentralized, privacy-preserving nature. In sectors such as healthcare and financial services, FL, as a privacy-enhanced technology, has become a critical component of the technical stack. In this post, we discuss FL and its advantages, delving into why federated learning is gaining traction. We also introduce three key…

Source

]]>
Chintan Patel <![CDATA[Solve Complex AI Tasks with Leaderboard-Topping Smaug 72B from NVIDIA AI Foundation Models]]> http://www.open-lab.net/blog/?p=78769 2024-05-07T16:50:32Z 2024-03-04T21:22:47Z This week��s model release features the NVIDIA-optimized language model Smaug 72B, which you can experience directly from your browser. NVIDIA AI Foundation...]]>

This week’s model release features the NVIDIA-optimized language model Smaug 72B, which you can experience directly from your browser. NVIDIA AI Foundation Models and Endpoints are a curated set of community and NVIDIA-built generative AI models to experience, customize, and deploy in enterprise applications. Try leading models such as Nemotron-3, Mixtral 8x7B, Gemma 7B…

Source

]]>
Ziyue Xu <![CDATA[Scalable Federated Learning with NVIDIA FLARE for Enhanced LLM Performance]]> http://www.open-lab.net/blog/?p=78348 2024-05-10T00:21:02Z 2024-02-29T21:00:00Z In the ever-evolving landscape of large language models (LLMs), effective data management is a key challenge. Data is at the heart of model performance. While...]]>

In the ever-evolving landscape of large language models (LLMs), effective data management is a key challenge. Data is at the heart of model performance. While most advanced machine learning algorithms are data-centric, necessary data can’t always be centralized. This is due to various factors such as privacy, regulation, geopolitics, copyright issues, and the sheer effort required to move vast…

Source

]]>
0
Tanya Lenz <![CDATA[Event: Speech and Generative AI Developer Day at NVIDIA GTC 2024]]> http://www.open-lab.net/blog/?p=78609 2024-03-07T19:29:14Z 2024-02-29T21:00:00Z Learn how to build a RAG-powered application with a human voice interface at NVIDIA GTC 2024 Speech and Generative AI Developer Day.?]]>

Learn how to build a RAG-powered application with a human voice interface at NVIDIA GTC 2024 Speech and Generative AI Developer Day.

Source

]]>
0
Chia-Chih Chen <![CDATA[Unlock Your LLM Coding Potential with StarCoder2]]> http://www.open-lab.net/blog/?p=78552 2024-03-07T19:32:10Z 2024-02-28T14:00:00Z Coding is essential in the digital age, but it can also be tedious and time-consuming. That's why many developers are looking for ways to automate and...]]>

Coding is essential in the digital age, but it can also be tedious and time-consuming. That’s why many developers are looking for ways to automate and streamline their coding tasks with the help of large language models (LLMs). These models are trained on massive amounts of code from permissively licensed GitHub repositories and can generate, analyze, and document code with little human…

Source

]]>
0
Jess Nguyen <![CDATA[Video: Build a RAG-Powered Chatbot in Five Minutes]]> http://www.open-lab.net/blog/?p=78248 2024-05-02T16:46:56Z 2024-02-27T21:30:00Z Retrieval-augmented generation (RAG) is exploding in popularity as a technique for boosting large language model (LLM) application performance. From highly...]]>

Retrieval-augmented generation (RAG) is exploding in popularity as a technique for boosting large language model (LLM) application performance. From highly accurate question-answering AI chatbots to code-generation copilots, organizations across industries are exploring how RAG can help optimize processes. According to State of AI in Financial Services: 2024 Trends, 55%

Source

]]>
0
Chintan Patel <![CDATA[Unlock the Power of Small Language Model Phi-2 for Chat, Research, Coding, and More]]> http://www.open-lab.net/blog/?p=78402 2024-06-06T14:55:12Z 2024-02-27T18:00:39Z This week��s model release features the NVIDIA-optimized language model Phi-2, which can be used for a wide range of natural language processing (NLP) tasks....]]>

This week’s model release features the NVIDIA-optimized language model Phi-2, which can be used for a wide range of natural language processing (NLP) tasks. You can experience Phi-2 directly from your browser. NVIDIA AI Foundation Models and Endpoints are a curated set of community and NVIDIA-built generative AI models to experience, customize, and deploy in enterprise applications.

Source

]]>
0
Michelle Horton <![CDATA[Top Inference for Large Language Models Sessions at NVIDIA GTC 2024]]> http://www.open-lab.net/blog/?p=77749 2024-02-22T19:58:59Z 2024-02-13T17:00:00Z Learn how inference for LLMs is driving breakthrough performance for AI-enabled applications and services.]]>

Learn how inference for LLMs is driving breakthrough performance for AI-enabled applications and services.

Source

]]>
0
Brad Nemire <![CDATA[Featured Large Language Models Sessions at NVIDIA GTC 2024]]> http://www.open-lab.net/blog/?p=77649 2024-06-06T16:13:47Z 2024-02-08T02:09:25Z Speakers from NVIDIA, Meta, Microsoft, OpenAI, and ServiceNow will be talking about the latest tools, optimizations, trends and best practices for large...]]>

Speakers from NVIDIA, Meta, Microsoft, OpenAI, and ServiceNow will be talking about the latest tools, optimizations, trends and best practices for large language models (LLMs).

Source

]]>
0
Brad Nemire <![CDATA[Top Retrieval-Augmented Generation (RAG) Sessions at NVIDIA GTC 2024 Sessions]]> http://www.open-lab.net/blog/?p=77562 2024-06-06T16:14:28Z 2024-02-06T19:38:44Z Join us in-person or virtually and learn about the power of RAG with insights and best practices from experts at NVIDIA, visionary CEOs, data scientists, and...]]>

Join us in-person or virtually and learn about the power of RAG with insights and best practices from experts at NVIDIA, visionary CEOs, data scientists, and others.

Source

]]>
0
Chintan Patel <![CDATA[Generate Code, Answer Queries, and Translate Text with New NVIDIA AI Foundation Models]]> http://www.open-lab.net/blog/?p=77364 2024-05-07T19:14:10Z 2024-02-05T18:48:17Z This week��s Model Monday release features the NVIDIA-optimized code Llama, Kosmos-2, and SeamlessM4T, which you can experience directly from your browser....]]>

This week’s Model Monday release features the NVIDIA-optimized code Llama, Kosmos-2, and SeamlessM4T, which you can experience directly from your browser. With NVIDIA AI Foundation Models and Endpoints, you can access a curated set of community and NVIDIA-built generative AI models to experience, customize, and deploy in enterprise applications. Meta’s Code Llama 70B is the latest…

Source

]]>
0
Amit Bleiweiss <![CDATA[Deploy an AI Coding Assistant with NVIDIA TensorRT-LLM and NVIDIA Triton]]> http://www.open-lab.net/blog/?p=77200 2024-05-07T19:14:23Z 2024-02-01T21:00:00Z Large language models (LLMs) have revolutionized the field of AI, creating entirely new ways of interacting with the digital world. While they provide a good...]]>

Large language models (LLMs) have revolutionized the field of AI, creating entirely new ways of interacting with the digital world. While they provide a good generalized solution, they often must be tuned to support specific domains and tasks. AI coding assistants, or code LLMs, have emerged as one domain to help accomplish this. By 2025, 80% of the product development lifecycle will make…

Source

]]>
0
Shashank Verma <![CDATA[Query Graphs with Optimized DePlot Model]]> http://www.open-lab.net/blog/?p=77003 2024-05-07T16:48:52Z 2024-01-23T00:34:34Z NVIDIA AI Foundation Models and Endpoints provides access to a curated set of community and NVIDIA-built generative AI models to experience, customize, and...]]>

NVIDIA AI Foundation Models and Endpoints provides access to a curated set of community and NVIDIA-built generative AI models to experience, customize, and deploy in enterprise applications. On Mondays throughout the year, we’ll be releasing new models. This week, we released the NVIDIA-optimized DePlot model, which you can experience directly from your browser. If you haven’t already…

Source

]]>
0
Piotr ?elasko <![CDATA[New Support for Dutch and Persian Released by NVIDIA NeMo ASR]]> http://www.open-lab.net/blog/?p=76636 2024-02-08T18:52:04Z 2024-01-16T18:29:16Z Breaking barriers in speech recognition, NVIDIA NeMo proudly presents pretrained models tailored for Dutch and Persian��languages often overlooked in the AI...]]>

Breaking barriers in speech recognition, NVIDIA NeMo proudly presents pretrained models tailored for Dutch and Persian—languages often overlooked in the AI landscape. These models leverage the recently introduced FastConformer architecture and were trained simultaneously with CTC and transducer objectives to maximize each model’s accuracy. Automatic speech recognition (ASR) is a…

Source

]]>
1
Pawe? Budzianowski <![CDATA[Enhancing Phone Customer Service with ASR Customization]]> http://www.open-lab.net/blog/?p=75584 2024-01-25T18:17:37Z 2024-01-09T17:00:00Z At the core of understanding people correctly and having natural conversations is automatic speech recognition (ASR). To make customer-led voice assistants and...]]>

At the core of understanding people correctly and having natural conversations is automatic speech recognition (ASR). To make customer-led voice assistants and automate customer service interactions over the phone, companies must solve the unique challenge of gaining a caller’s trust through qualities such as understanding, empathy, and clarity. Telephony-bound voice is inherently challenging…

Source

]]>
0
Annamalai Chockalingam <![CDATA[Contest: Build Generative AI on NVIDIA RTX PCs]]> http://www.open-lab.net/blog/?p=76141 2024-06-06T16:19:30Z 2024-01-08T16:30:00Z NVIDIA is announcing the Generative AI on RTX PCs Developer Contest - designed to inspire innovation within the developer community. Build and submit your next...]]>

NVIDIA is announcing the Generative AI on RTX PCs Developer Contest – designed to inspire innovation within the developer community. Build and submit your next innovative generative AI projects on Windows PC with RTX Systems, and you could win an RTX 4090 GPU, a full GTC in-person conference pass, and more in great prizes.

Source

]]>
0
Seth Schneider <![CDATA[Building Lifelike Digital Avatars with NVIDIA ACE Microservices]]> http://www.open-lab.net/blog/?p=76147 2024-01-25T18:17:41Z 2024-01-08T16:30:00Z Generative AI technologies are revolutionizing how games are produced and played. Game developers are exploring how these technologies can accelerate their...]]>

Generative AI technologies are revolutionizing how games are produced and played. Game developers are exploring how these technologies can accelerate their content pipelines and provide new gameplay experiences previously thought impossible. One area of focus, digital avatars, will have a transformative impact on how gamers will interact with non-playable characters (NPCs). Historically…

Source

]]>
0
Annamalai Chockalingam <![CDATA[Supercharging LLM Applications on Windows PCs with NVIDIA RTX Systems]]> http://www.open-lab.net/blog/?p=76174 2024-11-14T16:11:22Z 2024-01-08T16:30:00Z Large language models (LLMs) are fundamentally changing the way we interact with computers. These models are being incorporated into a wide range of...]]>

Large language models (LLMs) are fundamentally changing the way we interact with computers. These models are being incorporated into a wide range of applications, from internet search to office productivity tools. They are advancing real-time content generation, text summarization, customer service chatbots, and question-answering use cases. Today, LLM-powered applications are running…

Source

]]>
0
Jesse Clayton <![CDATA[Get Started with Generative AI Development for Windows PCs with NVIDIA RTX]]> http://www.open-lab.net/blog/?p=76227 2024-11-14T16:14:11Z 2024-01-08T16:30:00Z Generative AI and large language models (LLMs) are changing human-computer interaction as we know it. Many use cases would benefit from running LLMs locally on...]]>

Generative AI and large language models (LLMs) are changing human-computer interaction as we know it. Many use cases would benefit from running LLMs locally on Windows PCs, including gaming, creativity, productivity, and developer experiences. This post discusses several NVIDIA end-to-end developer tools for creating and deploying both text-based and visual LLM applications on NVIDIA RTX AI-ready…

Source

]]>
7
Hayden Wolff <![CDATA[RAG 101: Retrieval-Augmented Generation Questions Answered]]> http://www.open-lab.net/blog/?p=75743 2024-11-20T23:02:36Z 2023-12-18T19:44:42Z Data scientists, AI engineers, MLOps engineers, and IT infrastructure professionals must consider a variety of factors when designing and deploying a RAG...]]>

Data scientists, AI engineers, MLOps engineers, and IT infrastructure professionals must consider a variety of factors when designing and deploying a RAG pipeline: from core components like LLM to evaluation approaches. The key point is that RAG is a system, not just a model or set of models. This system consists of several stages, which were discussed at a high level in RAG 101…

Source

]]>
2
Hayden Wolff <![CDATA[RAG 101: Demystifying Retrieval-Augmented Generation Pipelines]]> http://www.open-lab.net/blog/?p=75493 2024-08-22T21:46:12Z 2023-12-18T19:44:31Z Large language models (LLMs) have impressed the world with their unprecedented capabilities to comprehend and generate human-like responses. Their chat...]]>

Large language models (LLMs) have impressed the world with their unprecedented capabilities to comprehend and generate human-like responses. Their chat functionality provides a fast and natural interaction between humans and large corpora of data. For example, they can summarize and extract highlights from data or replace complex queries such as SQL queries with natural language.

Source

]]>
1
Ike Nnoli <![CDATA[Create Lifelike Avatars with AI Animation and Speech Features in NVIDIA ACE]]> http://www.open-lab.net/blog/?p=74159 2024-11-20T23:02:47Z 2023-12-04T22:00:00Z NVIDIA today unveiled major upgrades to the NVIDIA Avatar Cloud Engine (ACE) suite of technologies, bringing enhanced realism and accessibility to AI-powered...]]>

NVIDIA today unveiled major upgrades to the NVIDIA Avatar Cloud Engine (ACE) suite of technologies, bringing enhanced realism and accessibility to AI-powered avatars and digital humans. These latest animation and speech capabilities enable more natural conversations and emotional expressions. Developers can now easily implement and scale intelligent avatars across applications using new…

Source

]]>
0
Mohamed Elshenawy <![CDATA[Boost Meeting Productivity with AI-Powered Note-Taking and Summarization]]> http://www.open-lab.net/blog/?p=73964 2023-12-14T19:27:34Z 2023-11-29T21:00:00Z Meetings are the lifeblood of an organization. They foster collaboration and informed decision-making. They eliminate silos through brainstorming and...]]>

Meetings are the lifeblood of an organization. They foster collaboration and informed decision-making. They eliminate silos through brainstorming and problem-solving. And they further strategic goals and planning. Yet, leading meetings that accomplish these goals—especially those involving cross-functional teams and external participants—can be challenging. A unique blend of people…

Source

]]>
0
Zhilin Wang <![CDATA[Announcing HelpSteer: An Open-Source Dataset for Building Helpful LLMs]]> http://www.open-lab.net/blog/?p=73937 2024-01-03T23:48:02Z 2023-11-27T17:00:00Z NVIDIA recently announced the NVIDIA NeMo SteerLM technique as part of the NVIDIA NeMo framework. This technique enables users to control large language model...]]>

NVIDIA recently announced the NVIDIA NeMo SteerLM technique as part of the NVIDIA NeMo framework. This technique enables users to control large language model (LLM) responses during inference. The developer community has shown great interest in using the approach for building custom LLMs. The NVIDIA NeMo team is now open-sourcing a multi-attribute dataset called Helpfulness SteerLM dataset…

Source

]]>
0
Brad Nemire <![CDATA[Early Bird Pricing Now Open for Hands-on Training at GTC]]> http://www.open-lab.net/blog/?p=73447 2024-06-06T16:22:29Z 2023-11-20T16:00:00Z Register for expert-led technical workshops at NVIDIA GTC and save with early bird pricing through February 7, 2024.]]>

Register for expert-led technical workshops at NVIDIA GTC and save with early bird pricing through February 7, 2024.

Source

]]>
0
Shashank Verma <![CDATA[Mastering LLM Techniques: Inference Optimization]]> http://www.open-lab.net/blog/?p=73739 2024-01-25T18:57:32Z 2023-11-17T15:00:00Z Stacking transformer layers to create large models results in better accuracies, few-shot learning capabilities, and even near-human emergent abilities on a...]]>

Stacking transformer layers to create large models results in better accuracies, few-shot learning capabilities, and even near-human emergent abilities on a wide range of language tasks. These foundation models are expensive to train, and they can be memory- and compute-intensive during inference (a recurring cost). The most popular large language models (LLMs) today can reach tens to hundreds of…

Source

]]>
0
Anjali Shah <![CDATA[Mastering LLM Techniques: Training?]]> http://www.open-lab.net/blog/?p=73464 2024-01-22T22:05:25Z 2023-11-16T14:00:00Z Large language models (LLMs) are a class of generative AI models built using transformer networks that can recognize, summarize, translate, predict, and...]]>

Large language models (LLMs) are a class of generative AI models built using transformer networks that can recognize, summarize, translate, predict, and generate language using very large datasets. LLMs have the promise of transforming society as we know it, yet training these foundation models is incredibly challenging. This blog articulates the basic principles behind LLMs…

Source

]]>
0
Nik Spirin <![CDATA[Mastering LLM Techniques: LLMOps]]> http://www.open-lab.net/blog/?p=73575 2023-12-08T18:53:36Z 2023-11-15T18:00:00Z Businesses rely more than ever on data and AI to innovate, offer value to customers, and stay competitive. The adoption of machine learning (ML), created a need...]]>

Businesses rely more than ever on data and AI to innovate, offer value to customers, and stay competitive. The adoption of machine learning (ML), created a need for tools, processes, and organizational principles to manage code, data, and models that work reliably, cost-effectively, and at scale. This is broadly known as machine learning operations (MLOps). The world is venturing rapidly into…

Source

]]>
0
Rich Harang <![CDATA[Best Practices for Securing LLM-Enabled Applications]]> http://www.open-lab.net/blog/?p=73609 2024-07-08T20:07:28Z 2023-11-15T18:00:00Z Large language models (LLMs) provide a wide range of powerful enhancements to nearly any application that processes text. And yet they also introduce new risks,...]]>

Large language models (LLMs) provide a wide range of powerful enhancements to nearly any application that processes text. And yet they also introduce new risks, including: This post walks through these security vulnerabilities in detail and outlines best practices for designing or evaluating a secure LLM-enabled application. Prompt injection is the most common and well-known…

Source

]]>
0
Nigel Nelson <![CDATA[Deploy Large Language Models at the Edge with NVIDIA IGX Orin Developer Kit]]> http://www.open-lab.net/blog/?p=72986 2024-05-02T16:47:03Z 2023-11-15T17:30:00Z As large language models (LLMs) become more powerful and techniques for reducing their computational requirements mature, two compelling questions emerge....]]>

As large language models (LLMs) become more powerful and techniques for reducing their computational requirements mature, two compelling questions emerge. First, what is the most advanced LLM that can be run and deployed at the edge? And second, how can real-world applications leverage these advancements? Running a state-of-the-art open-source LLM like Llama 2 70B, even at reduced FP16…

Source

]]>
0
���˳���97caoporen����