AI agents are transforming business operations by automating processes, optimizing decision-making, and streamlining actions. Their effectiveness hinges on expert reasoning, enabling smarter planning and efficient execution. Agentic AI applications could benefit from the capabilities of models such as DeepSeek-R1. Built for solving problems that require advanced AI reasoning…
Large language models (LLMs) that specialize in coding have been steadily adopted into developer workflows. From pair programming to self-improving AI agents, these models assist developers with various tasks, including enhancing code, fixing bugs, generating tests, and writing documentation. To promote the development of open-source LLMs, the Qwen team recently released Qwen2.5-Coder…
Agentic AI, the next wave of generative AI, is a paradigm shift with the potential to revolutionize industries by enabling AI systems to act autonomously and achieve complex goals. Agentic AI combines the power of large language models (LLMs) with advanced reasoning and planning capabilities, opening a world of possibilities across industries, from healthcare and finance to manufacturing and…
Generative AI has rapidly evolved from text-based models to multimodal capabilities. These models perform tasks like image captioning and visual question answering, reflecting a shift toward more human-like AI. The community is now expanding from text and images to video, opening new possibilities across industries. Video AI models are poised to revolutionize industries such as robotics…
Today, IBM released the third generation of IBM Granite, a collection of open language models and complementary tools. Prior generations of Granite focused on domain-specific use cases; the latest IBM Granite models meet or exceed the performance of leading similarly sized open models across both academic and enterprise benchmarks. The developer-friendly Granite 3.0 generative AI models are…
Mathstral, an advanced AI model developed from the ground up for mathematical reasoning, delivers superior performance in math, engineering, and science.
Reinforcement learning from human feedback (RLHF) is essential for developing AI systems that are aligned with human values and preferences. RLHF enables the most capable LLMs, including ChatGPT, Claude, and Nemotron families, to generate exceptional responses. By integrating human feedback into the training process, RLHF enables models to learn more nuanced behaviors and make decisions that…
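The excerpt above describes integrating human feedback only at a high level. As a minimal sketch (not any particular vendor's implementation), reward models in RLHF pipelines are commonly trained with a pairwise Bradley-Terry objective: given scores for a human-preferred and a rejected response, the loss falls as the preferred response's reward margin grows. The function name here is illustrative.

```python
import math

def preference_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Pairwise Bradley-Terry loss for reward-model training.

    The loss is -log(sigmoid(margin)), where margin is the gap between
    the reward assigned to the human-preferred response and the reward
    assigned to the rejected one. A larger margin means a smaller loss.
    """
    margin = reward_chosen - reward_rejected
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# A wider reward margin for the preferred response lowers the loss:
# preference_loss(2.0, 0.0) < preference_loss(0.5, 0.0)
```

In a full RLHF pipeline this loss trains the reward model on human preference pairs; the resulting scores then guide policy optimization of the LLM itself.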
The Llama 3.1 Nemotron 70B Reward model helps generate high-quality training data that aligns with human preferences for finance, retail, healthcare, scientific research, telecommunications, and sovereign AI.
Dracarys, fine-tuned from Llama 3.1 70B and available as an NVIDIA NIM microservice, supports a variety of applications, including data analysis, text summarization, and multi-language support.
The new model by Mistral excels at a variety of complex tasks, including text summarization, multilingual translation and reasoning, programming, question answering, and conversational AI.
NVIDIA collaborated with Mistral to co-build the next-generation language model that achieves leading performance across benchmarks in its class. With a growing number of language models purpose-built for select tasks, NVIDIA Research and Mistral AI combined forces to offer a versatile, open language model that’s performant and runs on a single GPU, such as NVIDIA A100 or H100 GPUs.
In the rapidly evolving field of generative AI, coding models have become indispensable tools for developers, enhancing productivity and precision in software development. They provide significant benefits by automating complex tasks, enhancing scalability, and fostering innovation, making them invaluable tools in modern software development. This post explores the benefits of Codestral Mamba…
Synthetic data isn’t about creating new information. It’s about transforming existing information to create different variants. For over a decade, synthetic data has been used to improve model accuracy across the board—whether it is transforming images to improve object detection models, strengthening fraudulent credit card detection, or improving BERT models for QA. What’s new?
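The idea of transforming existing information to create variants can be made concrete with a toy augmentation routine. This is a hedged sketch: the substitution table, function name, and example text are made up for illustration, not drawn from any real augmentation library.

```python
import random

# Illustrative word-substitution table; a real pipeline would use
# richer transforms (paraphrase models, image warps, noise injection).
SWAPS = {"purchase": "buy", "large": "big", "quickly": "fast"}

def augment(example: str, n_variants: int = 3, seed: int = 0) -> list:
    """Return n_variants transformed copies of one existing example.

    Each variant randomly applies the substitution table, so no new
    information is created; the existing example is merely varied.
    """
    rng = random.Random(seed)
    variants = []
    for _ in range(n_variants):
        words = [SWAPS.get(w, w) if rng.random() < 0.5 else w
                 for w in example.split()]
        variants.append(" ".join(words))
    return variants

variants = augment("please purchase a large coffee quickly")
```

The same principle scales up: image flips and crops for object detection, or masked-token paraphrases for QA fine-tuning, all derive training variants from data you already have.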
The newly unveiled Llama 3.1 collection of 8B, 70B, and 405B large language models (LLMs) is narrowing the gap between proprietary and open-source models. Their open nature is attracting more developers and enterprises to integrate these models into their AI applications. These models excel at various tasks including content generation, coding, and deep reasoning, and can be used to power…
DeepSeek Coder V2, available as an NVIDIA NIM microservice, enhances project-level coding and infilling tasks.
Trained on 600+ programming languages, StarCoder2-15B is now packaged as a NIM inference microservice available for free from the NVIDIA API catalog.
As generative AI experiences rapid growth, the community has stepped up to foster this expansion in two significant ways: swiftly publishing state-of-the-art foundational models, and streamlining their integration into application development and production. NVIDIA is aiding this effort by optimizing foundation models to enhance performance, allowing enterprises to generate tokens faster…
Generative AI is revolutionizing virtually every use case across every industry, thanks to the constant influx of groundbreaking foundation models capable of understanding context and reasoning to generate quality content and high-accuracy answers. NVIDIA is constantly optimizing and publishing community-, partner-, and NVIDIA-built models. This week’s release features two families…
At the recent World Governments Summit in Dubai, NVIDIA CEO Jensen Huang emphasized the importance of sovereign AI, which refers to a nation’s capability to develop and deploy AI technologies. Nations have started building regional large language models (LLMs) that codify their culture, history, and intelligence and serve their citizens with the benefits of generative AI.
This week’s model release features DBRX, a state-of-the-art large language model (LLM) developed by Databricks. With demonstrated strength in programming and coding tasks, DBRX is adept at handling specialized topics and writing specific algorithms in languages like Python. It can also be used for text completion tasks and few-turn interactions. DBRX’s long-context abilities can be used in RAG…
Large language models (LLMs) have revolutionized natural language processing (NLP) in recent years, enabling a wide range of applications such as text summarization, question answering, and natural language generation. Arctic, developed by Snowflake, is a new open LLM designed to achieve high inference performance while maintaining low cost on various NLP tasks. Arctic is…
This week’s model release features two new NVIDIA AI Foundation models, Mistral Large and Mixtral 8x22B, both developed by Mistral AI. These cutting-edge text-generation AI models are supported by NVIDIA NIM microservices, which provide prebuilt containers powered by NVIDIA inference software that enable developers to reduce deployment times from weeks to minutes. Both models are available through…
This week’s model release features the NVIDIA-optimized language model Smaug 72B, which you can experience directly from your browser. NVIDIA AI Foundation Models and Endpoints are a curated set of community and NVIDIA-built generative AI models to experience, customize, and deploy in enterprise applications. Try leading models such as Nemotron-3, Mixtral 8x7B, Gemma 7B…
Coding is essential in the digital age, but it can also be tedious and time-consuming. That’s why many developers are looking for ways to automate and streamline their coding tasks with the help of large language models (LLMs). These models are trained on massive amounts of code from permissively licensed GitHub repositories and can generate, analyze, and document code with little human…
This week’s model release features the NVIDIA-optimized language model Phi-2, which can be used for a wide range of natural language processing (NLP) tasks. You can experience Phi-2 directly from your browser. NVIDIA AI Foundation Models and Endpoints are a curated set of community and NVIDIA-built generative AI models to experience, customize, and deploy in enterprise applications.
This week’s release features the NVIDIA-optimized Mamba-Chat model, which you can experience directly from your browser. This post is part of Model Mondays, a program focused on enabling easy access to state-of-the-art community and NVIDIA-built models. These models are optimized by NVIDIA using TensorRT-LLM and offered as .nemo files for easy customization and deployment.
This week’s Model Monday release features the NVIDIA-optimized Code Llama, Kosmos-2, and SeamlessM4T models, which you can experience directly from your browser. With NVIDIA AI Foundation Models and Endpoints, you can access a curated set of community and NVIDIA-built generative AI models to experience, customize, and deploy in enterprise applications. Meta’s Code Llama 70B is the latest…
NVIDIA AI Foundation Models and Endpoints provide access to a curated set of community and NVIDIA-built generative AI models to experience, customize, and deploy in enterprise applications. On Mondays throughout the year, we’ll be releasing new models. This week, we released the NVIDIA-optimized DePlot model, which you can experience directly from your browser. If you haven’t already…
Generative AI has become a transformative force of our era, empowering organizations spanning every industry to achieve unparalleled levels of productivity, elevate customer experiences, and deliver superior operational efficiencies. Large language models (LLMs) are the brains behind generative AI. Access to incredibly powerful and knowledgeable foundation models, like Llama and Falcon…
Large language models (LLMs) are becoming an integral tool for businesses to improve their operations, customer interactions, and decision-making processes. However, off-the-shelf LLMs often fall short in meeting the specific needs of enterprises due to industry-specific terminology, domain expertise, or unique requirements. This is where custom LLMs come into play.
New SDKs are available in the NGC catalog, a hub of GPU-optimized deep learning, machine learning, and HPC applications. With highly performant software containers, pretrained models, industry-specific SDKs, and Jupyter notebooks available, AI developers and data scientists can simplify and reduce complexities in their end-to-end workflows. This post provides an overview of new and updated…
Watch this on-demand webinar, Build A Computer Vision Application with NVIDIA AI on Google Cloud Vertex AI, where we walk you step-by-step through using these resources to build your own action recognition application. Advances in computer vision models are providing deeper insights to make our lives increasingly productive, our communities safer, and our planet cleaner. We’ve come a…
The NGC catalog is a hub for GPU-optimized deep learning, machine learning, and HPC applications. With highly performant software containers, pretrained models, industry-specific SDKs, and Jupyter Notebooks, the content helps simplify and accelerate end-to-end workflows. There are new features, software, and updates to help you streamline your workflow and build your solutions faster on NGC.
Developing AI with your favorite tool, Jupyter Notebooks, just got easier due to a partnership between NVIDIA and Google Cloud. The NVIDIA NGC catalog offers GPU-optimized frameworks, SDKs, pretrained AI models, and example notebooks to help you build AI solutions faster. To further speed up your development workflow, a simplified deployment of this software with the NGC catalog’s new one…
The NVIDIA NGC catalog is a hub for GPU-optimized deep learning, machine learning, and HPC applications. With highly performant software containers, pretrained models, industry-specific SDKs, and Jupyter Notebooks, the content helps simplify and accelerate end-to-end workflows. New features, software, and updates help you streamline your workflow and build your solutions faster on NGC…
AI is driving the fourth Industrial Revolution with machines that can hear, see, understand, analyze, and then make smart decisions at superhuman levels. However, the effectiveness of AI depends on the quality of the underlying models. So, whether you’re an academic researcher or a data scientist, you want to quickly build models with a variety of parameters and identify the most effective ones…
The NVIDIA NGC catalog is a hub for GPU-optimized deep learning, machine learning, and HPC applications. With highly performant software containers, pretrained models, industry-specific SDKs, and Helm Charts, the content available on the catalog helps simplify and accelerate end-to-end workflows. A few additions and software updates to the NGC catalog include: NVIDIA NeMo (Neural…
At GTC ’21, experts presented a variety of technical talks to help people new to AI, or those looking for tools to speed up their AI development using the various components of the NGC catalog. Watch these on-demand sessions to learn how to build solutions in the cloud with NVIDIA AI software from NGC. Building a Text-to-Speech Service that Sounds Like You This…
Enterprises across industries are leveraging natural language processing (NLP) solutions—from chatbots to audio transcription—to improve customer engagement, increase employee productivity, and drive revenue growth. NLP is one of the most challenging tasks for AI because it must understand the underlying context of text without explicit rules in human language. Building an AI-powered solution…
Conversational AI solutions such as chatbots are now deployed in the data center, on the cloud, and at the edge to deliver lower latency and high quality of service while meeting an ever-increasing demand. The strategic decision to run AI inference on any or all these compute platforms varies not only by the use case but also evolves over time with the business. Hence…
Bare-metal installations of HPC applications on a shared system require system administrators to build environment modules for hundreds of applications, which is complicated, high-maintenance, and time-consuming. Furthermore, upgrading an application to the latest revision requires carefully updating the environment modules. Networks of dependencies often break during new installs while upgrades…