The rapid evolution of AI models has driven the need for more efficient and scalable inferencing solutions. As organizations strive to harness the power of AI, they face challenges in deploying, managing, and scaling AI inference workloads. NVIDIA NIM and Google Kubernetes Engine (GKE) together offer a powerful solution to address these challenges. NVIDIA has collaborated with Google Cloud to…
Deploying AI-enabled applications and services presents enterprises with significant challenges. Addressing these challenges requires a full-stack approach that can optimize performance, manage scalability effectively, and navigate the complexities of deployment, enabling organizations to realize AI’s full potential while maintaining operational efficiency and cost-effectiveness.
Large language models (LLMs) are revolutionizing data science, enabling advanced capabilities in natural language understanding, AI, and machine learning. Custom LLMs, tailored for domain-specific insights, are finding increased traction in enterprise applications. The NVIDIA Nemotron-3 8B family of foundation models is a powerful new tool for building production-ready generative AI…
Generative AI is revolutionizing how organizations across all industries are leveraging data to increase productivity, advance personalized customer engagement, and foster innovation. Given its tremendous value, enterprises are looking for tools and expertise that help them integrate this new technology into their business operations and strategies effectively and reliably.
Organizations are increasingly adopting hybrid and multi-cloud strategies to access the latest compute resources, consistently support worldwide customers, and optimize cost. However, a major challenge that engineering teams face is operationalizing AI applications across different platforms as the stack changes. This requires MLOps teams to familiarize themselves with different environments and…
Watch this on-demand webinar, Build A Computer Vision Application with NVIDIA AI on Google Cloud Vertex AI, where we walk you step by step through using these resources to build your own action recognition application. Advances in computer vision models are providing deeper insights to make our lives increasingly productive, our communities safer, and our planet cleaner. We’ve come a…
In the past several months, many of us have grown accustomed to seeing our doctors over a video call. It’s certainly convenient, but after the call ends, those important pieces of advice from your doctor start to slip away. What was that new medication I needed to take? Were there any side effects to watch out for? Conversational AI can help in building an application to transcribe speech as…
With audio and video streaming, conferencing, and telecommunication on the rise, it has become essential for developers to build applications with outstanding audio quality and enable end users to communicate and collaborate effectively. Various background noises can disrupt communication, ranging from traffic and construction to dogs barking and babies crying. Moreover, a user could talk in a…
This post was updated July 20, 2021 to reflect NVIDIA TensorRT 8.0 updates. Join the NVIDIA Triton and NVIDIA TensorRT community to stay current on the latest product updates, bug fixes, content, best practices, and more. When deploying a neural network, it’s useful to think about how the network could be made to run faster or take less space. A more efficient network can make better…
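One way a network can be made to take less space is post-training quantization, one of the techniques TensorRT applies. The following is a minimal NumPy sketch of the underlying idea (symmetric INT8 quantization), not the TensorRT API itself; the function names are illustrative:

```python
import numpy as np

def quantize_int8(weights: np.ndarray):
    """Symmetric post-training quantization: map FP32 weights to INT8.

    Returns the INT8 tensor plus the per-tensor scale needed to dequantize.
    """
    scale = float(np.max(np.abs(weights))) / 127.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover an FP32 approximation of the original weights."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal((256, 256)).astype(np.float32)
q, scale = quantize_int8(w)

# INT8 storage is 4x smaller than FP32 (1 byte vs. 4 bytes per weight),
# at the cost of a bounded rounding error of at most scale / 2 per weight.
print(w.nbytes, q.nbytes)
print(np.max(np.abs(dequantize(q, scale) - w)))
```

In practice TensorRT also uses calibration data to choose activation scales, but the storage saving shown here is the same principle.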
Recommender systems are a critical resource for enterprises that are relentlessly striving to improve customer engagement. They work by suggesting potentially relevant products and services among an overwhelmingly large and ever-increasing number of offerings. NVIDIA Merlin is an application framework that accelerates all phases of recommender system development on NVIDIA GPUs…
NVIDIA NeMo is a conversational AI toolkit built for researchers working on automatic speech recognition (ASR), natural language processing (NLP), and text-to-speech synthesis (TTS). The primary objective of NeMo is to help researchers from industry and academia reuse prior work (code and pretrained models) and make it easier to create new conversational AI models. NeMo is an open-source project…
Conversational AI solutions such as chatbots are now deployed in the data center, on the cloud, and at the edge to deliver lower latency and high quality of service while meeting an ever-increasing demand. The strategic decision to run AI inference on any or all of these compute platforms varies not only by the use case but also evolves over time with the business. Hence…
Seamlessly deploying AI services at scale in production is as critical as creating the most accurate AI model. Conversational AI services, for example, need multiple models handling functions of automatic speech recognition (ASR), natural language understanding (NLU), and text-to-speech (TTS) to complete the application pipeline. To provide real-time conversation to users…
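Serving several such models from one endpoint is what NVIDIA Triton Inference Server's model repository is for: each model lives in its own directory with a config.pbtxt describing its inputs and outputs. A minimal sketch for one hypothetical ASR model follows; the model name, tensor names, and shapes are illustrative assumptions, not values from the post:

```
# model_repository/
# └── asr_model/
#     ├── 1/
#     │   └── model.onnx      # version 1 of the model
#     └── config.pbtxt        # the file below

name: "asr_model"
platform: "onnxruntime_onnx"
max_batch_size: 8
input [
  {
    name: "audio_signal"     # hypothetical input tensor name
    data_type: TYPE_FP32
    dims: [ -1 ]             # variable-length audio features
  }
]
output [
  {
    name: "logprobs"         # hypothetical output tensor name
    data_type: TYPE_FP32
    dims: [ -1, 29 ]         # per-frame character log-probabilities
  }
]
```

The NLU and TTS stages of the pipeline would each get a sibling directory with their own config, and Triton serves all of them concurrently from the same repository.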
Natural language processing (NLP) is one of the most challenging tasks for AI because it needs to understand context, phonetics, and accent to convert human speech into text. Building this AI workflow starts with training a model that can understand and process spoken language to text. BERT is one of the best models for this task. Instead of starting from scratch to build state-of-the-art…