Uttara Kumar – NVIDIA Technical Blog

News and tutorials for developers, data scientists, and IT admins
Feed: http://www.open-lab.net/blog/feed/ (updated 2025-03-20T18:45:42Z)

Boost Llama Model Performance on Microsoft Azure AI Foundry with NVIDIA TensorRT-LLM
Uttara Kumar | http://www.open-lab.net/blog/?p=97008 | updated 2025-03-20T18:45:42Z | published 2025-03-20T15:00:00Z

Microsoft, in collaboration with NVIDIA, announced transformative performance improvements for the Meta Llama family of models on its Azure AI Foundry platform. These advancements, enabled by NVIDIA TensorRT-LLM optimizations, deliver significant gains in throughput, reduced latency, and improved cost efficiency, all while preserving the quality of model outputs. With these improvements…

Google Cloud Run Adds Support for NVIDIA L4 GPUs, NVIDIA NIM, and Serverless AI Inference Deployments at Scale
Uttara Kumar | http://www.open-lab.net/blog/?p=87666 | updated 2024-09-05T17:57:27Z | published 2024-08-21T18:00:00Z

Deploying AI-enabled applications and services presents enterprises with significant challenges: Performance is critical as it directly shapes user...

Addressing these challenges requires a full-stack approach that can optimize performance, manage scalability effectively, and navigate the complexities of deployment, enabling organizations to maximize AI’s full potential while maintaining operational efficiency and cost-effectiveness.

Protecting Sensitive Data and AI Models with Confidential Computing
Uttara Kumar | http://www.open-lab.net/blog/?p=65575 | updated 2023-12-05T18:57:47Z | published 2023-05-31T18:47:55Z

Rapid digital transformation has led to an explosion of sensitive data being generated across the enterprise. That data has to be stored and processed in data centers on-premises, in the cloud, or at the edge. Examples of activities that generate sensitive and personally identifiable information (PII) include credit card transactions, medical imaging or other diagnostic tests, insurance claims…

Building a Speech-Enabled AI Virtual Assistant with NVIDIA Riva on Amazon EC2
Uttara Kumar | http://www.open-lab.net/blog/?p=50606 | updated 2023-03-14T18:54:05Z | published 2022-07-28T15:30:00Z

Speech AI can assist human agents in contact centers, power virtual assistants and digital avatars, generate live captioning in video conferencing, and much more. Under the hood, these voice-based technologies orchestrate a network of automatic speech recognition (ASR) and text-to-speech (TTS) pipelines to deliver intelligent, real-time responses.

Deploy AI Workloads at Scale with Bottlerocket and NVIDIA-Powered Amazon EC2 Instances
Uttara Kumar | http://www.open-lab.net/blog/?p=44139 | updated 2022-03-10T20:09:18Z | published 2022-03-08T00:28:25Z

Deploying AI-powered services like voice-based assistants, e-commerce product recommendations, and contact-center automation into production at scale is challenging. Delivering the best end-user experience while reducing operational costs requires accounting for multiple factors. These include composition and performance of underlying infrastructure, flexibility to scale resources based on user…

AWS Launches First NVIDIA GPU-Accelerated Graviton-Based Instance with Amazon EC2 G5g
Uttara Kumar | http://www.open-lab.net/blog/?p=41688 | updated 2022-08-21T23:53:09Z | published 2021-11-29T17:57:46Z

Today at AWS re:Invent 2021, AWS announced the general availability of Amazon EC2 G5g instances, bringing the first NVIDIA GPU-accelerated Arm-based instance to the AWS cloud. The new EC2 G5g instance features AWS Graviton2 processors, based on 64-bit Arm Neoverse cores, and NVIDIA T4G Tensor Core GPUs, enhanced for graphics-intensive applications. This powerful combination creates an…

AWS Brings NVIDIA A10G Tensor Core GPUs to the Cloud with New EC2 G5 Instances
Uttara Kumar | http://www.open-lab.net/blog/?p=41118 | updated 2023-03-22T01:16:44Z | published 2021-11-12T01:46:12Z

Today, AWS announced the general availability of the new Amazon EC2 G5 instances, powered by NVIDIA A10G Tensor Core GPUs. These instances are designed for the most demanding graphics-intensive applications, as well as for machine learning inference and for training simple to moderately complex machine learning models on the AWS cloud. The new EC2 G5 instances feature up to eight NVIDIA A10G Tensor…

One-click Deployment of NVIDIA Triton Inference Server to Simplify AI Inference on Google Kubernetes Engine (GKE)
Uttara Kumar | http://www.open-lab.net/blog/?p=36650 | updated 2022-11-14T21:40:49Z | published 2021-08-23T20:30:29Z

The rapid growth in artificial intelligence is driving up the size of data sets, as well as the size and complexity of networks. AI-enabled applications like e-commerce product recommendations, voice-based assistants, and contact center automation…

MLOps Made Simple & Cost Effective with Google Kubernetes Engine and NVIDIA A100 Multi-Instance GPUs
Uttara Kumar | http://www.open-lab.net/blog/?p=30918 | updated 2024-10-28T19:09:18Z | published 2021-05-03T16:29:00Z

Building, deploying, and managing end-to-end ML pipelines in production, particularly for applications like recommender systems, is challenging. Operationalizing ML models within enterprise applications to deliver business value involves much more than developing the machine learning algorithms and models themselves – it’s a continuous process of data collection and preparation, model building…
