Best practice – NVIDIA Technical Blog
News and tutorials for developers, data scientists, and IT admins
Feed: http://www.open-lab.net/blog/feed/ | Last built 2025-03-24

Seamlessly Scale AI Across Cloud Environments with NVIDIA DGX Cloud Serverless Inference
By Vishal Ganeriwala | Published 2025-03-18 | http://www.open-lab.net/blog/?p=97192

NVIDIA DGX Cloud Serverless Inference is an auto-scaling AI inference solution that enables application deployment with speed and reliability. Powered by NVIDIA Cloud Functions (NVCF), DGX Cloud Serverless Inference abstracts multi-cluster infrastructure setups across multi-cloud and on-premises environments for GPU-accelerated workloads. Whether managing AI workloads…

Measure and Improve AI Workload Performance with NVIDIA DGX Cloud Benchmarking
By Emily Potyraj | Published 2025-03-18 | http://www.open-lab.net/blog/?p=97548

As AI capabilities advance, understanding the impact of hardware and software infrastructure choices on workload performance is crucial for both technical validation and business planning. Organizations need a better way to assess real-world, end-to-end AI workload performance and the total cost of ownership rather than just comparing raw FLOPs or hourly cost per GPU.

Build Enterprise AI Agents with Advanced Open NVIDIA Llama Nemotron Reasoning Models
By Chris Alexiuk | Published 2025-03-18 | http://www.open-lab.net/blog/?p=97155

Organizations are embracing AI agents to enhance productivity and streamline operations. To maximize their impact, these agents need strong reasoning abilities to navigate complex problems, uncover hidden connections, and make logical decisions autonomously in dynamic environments. Due to their ability to tackle complex problems, reasoning models have become a key part of the agentic AI…

Streamline LLM Deployment for Autonomous Vehicle Applications with NVIDIA DriveOS LLM SDK
By Chen Fu | Published 2025-03-10 | http://www.open-lab.net/blog/?p=96776

Large language models (LLMs) have shown remarkable generalization capabilities in natural language processing (NLP). They are used in a wide range of applications, including translation, digital assistants, recommendation systems, context analysis, code generation, cybersecurity, and more. In automotive applications, there is growing demand for LLM-based solutions for both autonomous driving and…

Ensuring Reliable Model Training on NVIDIA DGX Cloud
By Shelby Thomas | Published 2025-03-10 | http://www.open-lab.net/blog/?p=96789

Training AI models on massive GPU clusters presents significant challenges for model builders. Because manual intervention becomes impractical as job scale increases, automation is critical to maintaining high GPU utilization and training productivity. An exceptional training experience requires resilient systems that provide low-latency error attribution and automatic failover based on root…

Accelerate Medical Imaging AI Operations with Databricks Pixels 2.0 and MONAI
By Douglas Moore | Published 2025-02-28 | http://www.open-lab.net/blog/?p=96530

According to the World Health Organization (WHO), 3.6 billion medical imaging tests are performed every year globally to diagnose, monitor, and treat various conditions. Most of these images are stored in a globally recognized standard called DICOM (Digital Imaging and Communications in Medicine). Imaging studies in DICOM format are a combination of unstructured images and structured metadata.

Simplify System Memory Management with the Latest NVIDIA GH200 NVL2 Enterprise RA
By Leigh Engel | Published 2025-02-13 | http://www.open-lab.net/blog/?p=96079

NVIDIA Enterprise Reference Architectures (Enterprise RAs) can reduce the time and cost of deploying AI infrastructure solutions. They provide a streamlined approach for building flexible and cost-effective accelerated infrastructure while ensuring compatibility and interoperability. The latest Enterprise RA details an optimized cluster configuration for systems integrated with NVIDIA GH200…

Get Started with GPU Acceleration for Data Science
By Allison Ding | Published 2025-02-06 | http://www.open-lab.net/blog/?p=95894

In data science, operational efficiency is key to handling increasingly complex and large datasets. GPU acceleration has become essential for modern workflows, offering significant performance improvements. RAPIDS is a suite of open-source libraries and frameworks developed by NVIDIA, designed to accelerate data science pipelines using GPUs with minimal code changes.
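
As a quick illustration of the "minimal code changes" point, the pandas-like cuDF API from RAPIDS looks roughly like this (a sketch, assuming RAPIDS cuDF is installed on a CUDA-capable machine; the column names are made up):

    import cudf  # RAPIDS GPU DataFrame library

    # A small GPU-resident DataFrame; the API mirrors pandas.
    df = cudf.DataFrame({"region": ["east", "west", "east", "west"],
                         "sales": [100.0, 250.0, 175.0, 90.0]})

    # Familiar pandas-style operations execute on the GPU.
    totals = df.groupby("region").sales.sum()
    print(totals)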

Render Path-Traced Hair in Real Time with NVIDIA GeForce RTX 50 Series GPUs
By David Hart | Published 2025-02-06 | http://www.open-lab.net/blog/?p=95790

Hardware support for ray tracing triangle meshes was introduced as part of NVIDIA RTX in 2018. But ray tracing for hair and fur has remained a compute-intensive problem that has been difficult to further accelerate. That is, until now. NVIDIA GeForce RTX 50 Series GPUs include a major advancement in the acceleration of ray tracing for hair and fur: hardware ray tracing support for the linear…

NVIDIA RTX Mega Geometry Now Available with New Vulkan Samples
By Christoph Kubisch | Published 2025-02-06 | http://www.open-lab.net/blog/?p=95842

Geometric detail in computer graphics has increased exponentially in the past 30 years. To render high-quality assets with higher instance counts and greater triangle density, NVIDIA introduced RTX Mega Geometry. RTX Mega Geometry is available today through NVIDIA RTX Kit, a suite of rendering technologies to ray trace games with AI, render scenes with immense geometry, and create game characters…

Streamline Collaboration Across Local and Cloud Systems with NVIDIA AI Workbench
By Shruthii Sathyanarayanan | Published 2025-02-05 | http://www.open-lab.net/blog/?p=95720

NVIDIA AI Workbench is a free development environment manager to develop, customize, and prototype AI applications on your GPUs. AI Workbench provides a frictionless experience across PCs, workstations, servers, and cloud for AI, data science, and machine learning (ML) projects. This post provides details about the January 2025 release of NVIDIA AI Workbench…

Build Apps with Neural Rendering Using NVIDIA Nsight Developer Tools on GeForce RTX 50 Series GPUs
By Jonathan Litt | Published 2025-01-30 | http://www.open-lab.net/blog/?p=95580

The next generation of NVIDIA graphics hardware has arrived. Powered by NVIDIA Blackwell, GeForce RTX 50 Series GPUs deliver groundbreaking new RTX features such as DLSS 4 with Multi Frame Generation, and NVIDIA RTX Kit with RTX Mega Geometry and RTX Neural Shaders. The NVIDIA RTX Blackwell architecture introduces fifth-generation Tensor Cores to drive AI workloads and fourth-generation RT Cores with…

Mastering LLM Techniques: Evaluation
By Amit Bleiweiss | Published 2025-01-29 | http://www.open-lab.net/blog/?p=95447

Evaluating large language models (LLMs) and retrieval-augmented generation (RAG) systems is a complex and nuanced process, reflecting the sophisticated and multifaceted nature of these systems. Unlike traditional machine learning (ML) models, LLMs generate a wide range of diverse and often unpredictable outputs, making standard evaluation metrics insufficient. Key challenges include the…

Optimize AI Inference Performance with NVIDIA Full-Stack Solutions
By Nick Comly | Published 2025-01-24 | http://www.open-lab.net/blog/?p=95310

As of 3/18/25, NVIDIA Triton Inference Server is now NVIDIA Dynamo. The explosion of AI-driven applications has placed unprecedented demands on both developers, who must balance delivering cutting-edge performance with managing operational complexity and cost, and AI infrastructure. NVIDIA is empowering developers with full-stack innovations spanning chips, systems…

Lessons Learned from Building an AI Sales Assistant
By Chris Krapu | Published 2025-01-21 | http://www.open-lab.net/blog/?p=95231

At NVIDIA, the Sales Operations team equips the Sales team with the tools and resources needed to bring cutting-edge hardware and software to market. Managing this across NVIDIA's diverse technology is a complex challenge shared by many enterprises. Through collaboration with our Sales team, we found that they rely on internal and external documentation…

Introducing New KV Cache Reuse Optimizations in NVIDIA TensorRT-LLM
By John Thomson | Published 2025-01-16 | http://www.open-lab.net/blog/?p=95040

Language models generate text by predicting the next token, given all the previous tokens including the input text tokens. Key and value elements of the previous tokens are used as historical context in LLM serving for generation of the next set of tokens. Caching these key and value elements from previous tokens avoids expensive recomputation and effectively leads to higher throughput. However…
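
A toy, framework-free sketch of the idea (this is not the TensorRT-LLM API): the per-token key and value projections are cached so that each decode step only computes projections for the newest token.

    import numpy as np

    d = 8                                   # toy model width
    W_q, W_k, W_v = (np.random.randn(d, d) for _ in range(3))
    k_cache, v_cache = [], []               # grow by one entry per generated token

    def decode_step(x):
        # Project only the newest token; reuse cached keys/values for the rest.
        k_cache.append(x @ W_k)
        v_cache.append(x @ W_v)
        K, V = np.stack(k_cache), np.stack(v_cache)
        scores = K @ (x @ W_q)
        w = np.exp(scores - scores.max())
        w /= w.sum()
        return w @ V                        # attention output for the new token

    for _ in range(5):
        out = decode_step(np.random.randn(d))
    print(len(k_cache), out.shape)          # 5 (8,)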

GPU Memory Essentials for AI Performance
By Sama Bali | Published 2025-01-15 | http://www.open-lab.net/blog/?p=94979

Generative AI has revolutionized how people bring ideas to life, and agentic AI represents the next leap forward in this technological evolution. By leveraging sophisticated, autonomous reasoning and iterative planning, AI agents can tackle complex, multistep problems with remarkable efficiency. As AI continues to revolutionize industries, the demand for running AI models locally has surged.

Accelerating GPU Analytics Using RAPIDS and Ray
By Peter Entschev | Published 2024-12-20 | http://www.open-lab.net/blog/?p=94495

RAPIDS is a suite of open-source GPU-accelerated data science and AI libraries that are well supported for scale-out with distributed engines like Spark and Dask. Ray is a popular open-source distributed Python framework commonly used to scale AI and machine learning (ML) applications. Ray particularly excels at simplifying and scaling training and inference pipelines and can easily target both…
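
A small sketch of the pattern (assuming ray and cudf are installed and at least one GPU is visible; the partition file names are hypothetical):

    import ray
    import cudf

    ray.init()

    @ray.remote(num_gpus=1)
    def partition_sum(path):
        # Each Ray task is scheduled onto its own GPU and processes one partition with cuDF.
        return cudf.read_csv(path)["value"].sum()

    paths = ["part_0.csv", "part_1.csv"]  # hypothetical partition files
    print(sum(ray.get([partition_sum.remote(p) for p in paths])))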

Fine-Tuning Small Language Models to Optimize Code Review Accuracy
By Japinder Singh | Published 2024-12-17 | http://www.open-lab.net/blog/?p=94078

Generative AI is transforming enterprises by driving innovation and boosting efficiency across numerous applications. However, adopting large foundational…

Sandboxing Agentic AI Workflows with WebAssembly
By Joseph Lucas | Published 2024-12-16 | http://www.open-lab.net/blog/?p=93975

Agentic AI workflows often involve the execution of large language model (LLM)-generated code to perform tasks like creating data visualizations. However, this code should be sanitized and executed in a safe environment to mitigate risks from prompt injection and errors in the returned code. Sanitizing Python with regular expressions and restricted runtimes is insufficient…

Integration of NVIDIA BlueField DPUs with WEKA Client Boosts AI Workload Efficiency
By Tim Lustig | Published 2024-12-12 | http://www.open-lab.net/blog/?p=93578

WEKA, a pioneer in scalable software-defined data platforms, and NVIDIA are collaborating to unite WEKA's state-of-the-art data platform solutions with powerful NVIDIA BlueField DPUs. The WEKA Data Platform advanced storage software unlocks the full potential of AI and performance-intensive workloads, while NVIDIA BlueField DPUs revolutionize data access, movement, and security.

Optimize GPU Workloads for Graphics Applications with NVIDIA Nsight Graphics
By Jonathan Litt | Published 2024-12-05 | http://www.open-lab.net/blog/?p=93418

One of the great pastimes of graphics developers and enthusiasts is comparing specifications of GPUs and marveling at the ever-increasing counts of shader cores, RT cores, teraflops, and overall computational power with each new generation. Achieving the maximum theoretical performance represented by those numbers is a major focus in the world of graphics programming. Massive amounts of rendering…

Best Practices for Multi-GPU Data Analysis Using RAPIDS with Dask
By Ben Zaitlen (https://www.linkedin.com/in/benjamin-zaitlen-62ab7b4/) | Published 2024-11-21 | http://www.open-lab.net/blog/?p=92480

As we move toward a denser computing infrastructure, with more compute, more GPUs, accelerated networking, and so forth, multi-GPU training and analysis grows in popularity. Developers and practitioners moving from CPU to GPU clusters need both tools and best practices. RAPIDS is a suite of open-source GPU-accelerated data science and AI libraries. These libraries can easily scale out for…
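
A minimal multi-GPU sketch with Dask-CUDA (assuming dask-cuda and dask-cudf are installed; the file pattern is hypothetical):

    from dask_cuda import LocalCUDACluster
    from dask.distributed import Client
    import dask_cudf

    # One Dask worker per local GPU.
    client = Client(LocalCUDACluster())

    # Partitions are cuDF DataFrames spread across the GPUs.
    ddf = dask_cudf.read_parquet("transactions_*.parquet")  # hypothetical files
    print(ddf.groupby("customer_id")["amount"].sum().compute().head())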

Accelerate Drug and Material Discovery with New Math Library NVIDIA cuEquivariance
By Mario Geiger | Published 2024-11-18 | http://www.open-lab.net/blog/?p=91896

AI models for science are often trained to make predictions about the workings of nature, such as predicting the structure of a biomolecule or the properties of a new solid that can become the next battery material. These tasks require high precision and accuracy. What makes AI for science even more challenging is that highly accurate and precise scientific data is often scarce…

Frictionless Collaboration and Rapid Prototyping in Hybrid Environments with NVIDIA AI Workbench
By Tyler Whitehouse | Published 2024-11-04 | http://www.open-lab.net/blog/?p=91234

NVIDIA AI Workbench is a free development environment manager that streamlines data science, AI, and machine learning (ML) projects on systems of choice. The goal is to provide a frictionless way to create, compute, and collaborate on and across PCs, workstations, data centers, and clouds. This post explores highlights of the October release…

Protect Your Network with Secure Boot in SONiC
By Sophia Schuur | Published 2024-10-29 | http://www.open-lab.net/blog/?p=91056

NVIDIA technology helps organizations build and maintain secure, scalable, and high-performance network infrastructure. Advances in AI, with NVIDIA at the forefront, contribute every day to security improvements. One way NVIDIA has taken a more direct approach to network security is through a secure network operating system (NOS). A secure NOS is a specialized type of…

Learning Fluid Flow with AI-Enabled Virtual Wind Tunnels
By Nathan Patterson | Published 2024-10-14 | http://www.open-lab.net/blog/?p=87861

There's never enough time to do everything, even in engineering education. Employers want engineers capable of wielding simulation tools to expedite iterative research, design, and development. Some instructors try to address this by spending weeks or months teaching derivations of numerical methods, approaches to discretization, the intricacies of turbulence models, and more. Unfortunately…

Accelerating LLMs with llama.cpp on NVIDIA RTX Systems
By Annamalai Chockalingam | Published 2024-10-02 | http://www.open-lab.net/blog/?p=89663

The NVIDIA RTX AI for Windows PCs platform offers a thriving ecosystem of thousands of open-source models for application developers to leverage and integrate into Windows applications. Notably, llama.cpp is one popular tool, with over 65K GitHub stars at the time of writing. Originally released in 2023, this open-source repository is a lightweight, efficient framework for large language model…
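
Through the llama-cpp-python bindings, the basic workflow looks roughly like this (a sketch; the model path and settings are placeholders, and the post itself focuses on llama.cpp's native builds accelerated on RTX GPUs):

    from llama_cpp import Llama

    # Load a GGUF model and offload as many layers as possible to the GPU.
    llm = Llama(model_path="model.gguf", n_gpu_layers=-1, n_ctx=4096)

    out = llm("Q: What is llama.cpp used for? A:", max_tokens=64)
    print(out["choices"][0]["text"])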

Optimizing Inference Efficiency for LLMs at Scale with NVIDIA NIM Microservices
By Rajvir Singh | Published 2024-08-14 | http://www.open-lab.net/blog/?p=87091

As large language models (LLMs) continue to evolve at an unprecedented pace, enterprises are looking to build generative AI-powered applications that maximize throughput to lower operational costs and minimize latency to deliver superior user experiences. This post discusses the critical performance metrics of throughput and latency for LLMs, exploring their importance and trade-offs between…

Shader Debugging Made Easy with NVIDIA Nsight Graphics
By Robert Jensen | Published 2024-07-31 | http://www.open-lab.net/blog/?p=86432

Shaders are specialized programs that run on the GPU and manipulate rays, pixels, vertices, and textures to achieve unique visual effects. With shaders, you can add creative expression and realism to the rendered image. They're essential in ray tracing for simulating realistic lighting, shadows, and reflections. We love shaders, but they can be hard to debug. Shader calculations are complex…

Developing Product Configurators with OpenUSD
By James Mills | Published 2024-07-24 | http://www.open-lab.net/blog/?p=85709

Developers from advertising agencies to software vendors are empowering global brands to deliver hyperpersonalization for digital experiences and visual storytelling with product configurator solutions. Integrating NVIDIA Omniverse with OpenUSD and generative AI into product configurators enables solution providers and software developers to deliver interactive, ray-traced…

Building Cyber Language Models to Unlock New Cybersecurity Capabilities
By Gorkem Batmaz (https://twitter.com/gorkembatmaz) | Published 2024-07-09 | http://www.open-lab.net/blog/?p=84556

General-purpose large language models (LLMs) have proven their usefulness across various fields, offering substantial benefits in applications ranging from text generation to complex problem-solving. However, there are circumstances where developing a bespoke language model becomes not just beneficial but essential. This necessity arises particularly in specialized domains characterized by…

Secure LLM Tokenizers to Maintain Application Integrity
By Joseph Lucas | Published 2024-06-27 | http://www.open-lab.net/blog/?p=84504

This post is part of the NVIDIA AI Red Team's continuing vulnerability and technique research. Use the concepts presented to responsibly assess and increase the security of your AI development and deployment processes and applications. Large language models (LLMs) don't operate over strings. Instead, prompts are passed through an often-transparent translator called a tokenizer that creates an…
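
A quick way to see the translation step the post refers to, using a Hugging Face tokenizer as a stand-in (the tokenizer name is just an example):

    from transformers import AutoTokenizer

    tok = AutoTokenizer.from_pretrained("gpt2")  # example tokenizer

    text = "Ignore previous instructions."
    ids = tok.encode(text)
    print(ids)                              # the model only ever sees these integers
    print(tok.convert_ids_to_tokens(ids))   # how the string was segmented
    print(tok.decode(ids))                  # round trip back to text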

Introducing Grouped GEMM APIs in cuBLAS and More Performance Updates
By Babak Hejazi | Published 2024-06-12 | http://www.open-lab.net/blog/?p=83888

The latest release of the NVIDIA cuBLAS library, version 12.5, continues to deliver functionality and performance to deep learning (DL) and high-performance computing (HPC) workloads. This post provides an overview of the following updates on cuBLAS matrix multiplications (matmuls) since version 12.0, and a walkthrough: Grouped GEMM APIs can be viewed as a generalization of the batched…
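
Conceptually, a grouped GEMM runs several independent matrix multiplications whose shapes may differ per group, which a batched GEMM (uniform shapes) cannot express. A plain NumPy sketch of the semantics, not the cuBLAS API:

    import numpy as np

    rng = np.random.default_rng(0)

    # Three groups with different problem sizes (M, N, K).
    shapes = [(4, 5, 3), (2, 6, 8), (7, 7, 7)]
    groups = [(rng.standard_normal((m, k)), rng.standard_normal((k, n)))
              for m, n, k in shapes]

    # A grouped GEMM computes every C_i = A_i @ B_i in a single call;
    # the loop below just spells out the independent products.
    for a, b in groups:
        print((a @ b).shape)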

New Webinar: Deploying Generative AI in Production
By Jess Nguyen | Published 2024-05-29 | http://www.open-lab.net/blog/?p=83086

Ready to move your pilot to production? Get an expert overview on how to deploy generative AI applications.

Tips for Building a RAG Pipeline with NVIDIA AI LangChain AI Endpoints
By Amit Bleiweiss | Published 2024-05-08 | http://www.open-lab.net/blog/?p=81895

Retrieval-augmented generation (RAG) is a technique that combines information retrieval with a set of carefully designed system prompts to provide more accurate, up-to-date, and contextually relevant responses from large language models (LLMs). By incorporating data from various sources such as relational databases, unstructured document repositories, internet data streams, and media news feeds…
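
A compressed sketch of such a pipeline with the LangChain NVIDIA AI Endpoints integration (assuming langchain-nvidia-ai-endpoints, langchain-community, and faiss are installed, an NVIDIA API key is configured, and the model name is a placeholder):

    from langchain_nvidia_ai_endpoints import ChatNVIDIA, NVIDIAEmbeddings
    from langchain_community.vectorstores import FAISS

    docs = ["NVIDIA NIM packages models as microservices.",
            "RAG grounds LLM answers in retrieved context."]

    # Index the documents, then retrieve context for a question.
    store = FAISS.from_texts(docs, NVIDIAEmbeddings())
    question = "What does RAG do?"
    context = store.similarity_search(question, k=1)[0].page_content

    llm = ChatNVIDIA(model="meta/llama3-8b-instruct")  # placeholder model name
    print(llm.invoke(f"Context: {context}\nQuestion: {question}").content)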

Top Data Science Sessions from NVIDIA GTC 2024 Now Available On Demand
By Belen Tegegn | Published 2024-04-29 | http://www.open-lab.net/blog/?p=81594

At GTC 2024, experts from NVIDIA and our partners shared insights about GPU-accelerated tools, optimizations, and best practices for data scientists. From the hundreds of sessions covering various topics, we've handpicked the top three data science sessions that you won't want to miss. RAPIDS in 2024: Accelerated Data Science Everywhere. Speakers: Dante Gama Dessavre…

Limiting CPU Threads for Better Game Performance
By Jon Kennedy | Published 2024-02-21 | http://www.open-lab.net/blog/?p=77628

Many PC games are designed around an eight-core console with an assumption that their software threading system "just works" on all PCs, especially regarding the number of threads in the worker thread pool. This was a reasonable assumption not too long ago when most PCs had similar core counts to consoles: the CPUs were just faster and performance just scaled. In recent years, though…

Simplifying Network Operations for AI with NVIDIA Quantum InfiniBand
By Taylor Allison | Published 2024-01-23 | http://www.open-lab.net/blog/?p=76977

A common technological misconception is that performance and complexity are directly linked. That is, the highest-performance implementation is also the most challenging to implement and manage. When considering data center networking, however, this is not the case. InfiniBand is a protocol that sounds daunting and exotic in comparison to Ethernet, but because it is built from the ground up…

Improving CUDA Initialization Times Using cgroups in Certain Scenarios
By Rahul Ramasubramanian | Published 2024-01-05 | http://www.open-lab.net/blog/?p=75534

Many CUDA applications running on multi-GPU platforms usually use a single GPU for their compute needs. In such scenarios, applications pay a performance penalty because CUDA has to enumerate and initialize all the GPUs on the system. If a CUDA application does not require other GPUs to be visible and accessible, you can launch such applications by isolating the unwanted GPUs from the CUDA…
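
The post's fix uses cgroups; a lighter-weight alternative that shows the same effect is to restrict which devices CUDA enumerates before it initializes, for example with CUDA_VISIBLE_DEVICES (a sketch; cupy is used here only as a convenient CUDA-backed library):

    import os

    # Hide all but one GPU *before* any CUDA library is loaded.
    os.environ["CUDA_VISIBLE_DEVICES"] = "0"

    import cupy as cp

    print(cp.cuda.runtime.getDeviceCount())  # CUDA now enumerates a single GPU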

Advanced API Performance: Swap Chains
By Lars Nordskog | Published 2023-12-15 | http://www.open-lab.net/blog/?p=74280

Swap chains are an integral part of how you get rendering data output to a screen. They usually consist of some group of output-ready buffers, each of which can be rendered to one at a time in rotation. In parallel with rendering to one of a swap chain's buffers, some other buffer in the swap chain is generally read from for display output. This post covers best practices when working with…

Advanced API Performance: Intrinsics
By Oleg Kuznetsov | Published 2023-11-21 | http://www.open-lab.net/blog/?p=71300

Intrinsics can be thought of as higher-level abstractions of specific hardware instructions. They offer direct access to low-level operations or hardware-specific features, enabling increased performance. In this way, operations can be performed across threads within a warp, also known as a wavefront. The post includes a code example with…
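
The post's own example is not included in this teaser. As a rough analogue of a warp-wide (wavefront-wide) operation, here is a Python/Numba sketch of a warp sum built on shuffle intrinsics (an assumption-laden stand-in: the post targets shader and graphics-API intrinsics, and this sketch requires numba plus a CUDA-capable GPU):

    import numpy as np
    from numba import cuda

    @cuda.jit
    def warp_sum(data, out):
        # Each warp of 32 threads cooperatively sums its 32 elements.
        i = cuda.grid(1)
        val = data[i]
        offset = 16
        while offset > 0:
            # Pull the value held by the lane `offset` positions higher and accumulate.
            val += cuda.shfl_down_sync(0xFFFFFFFF, val, offset)
            offset //= 2
        if i % 32 == 0:                 # lane 0 of each warp holds the warp total
            out[i // 32] = val

    data = np.arange(64, dtype=np.float32)
    out = np.zeros(2, dtype=np.float32)
    warp_sum[1, 64](data, out)          # 1 block, 64 threads = 2 warps
    print(out)                          # [ 496. 1520.]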

Best Practices for Securing LLM-Enabled Applications
By Rich Harang | Published 2023-11-15 | http://www.open-lab.net/blog/?p=73609

Large language models (LLMs) provide a wide range of powerful enhancements to nearly any application that processes text. And yet they also introduce new risks. This post walks through these security vulnerabilities in detail and outlines best practices for designing or evaluating a secure LLM-enabled application. Prompt injection is the most common and well-known…

Accelerating Ptychography Workflows with NVIDIA Holoscan at Diamond Light Source
By Harry Petty | Published 2023-11-14 | http://www.open-lab.net/blog/?p=72819

Diamond Light Source is a world-renowned synchrotron facility in the UK that provides scientists with access to intense beams of x-rays, infrared, and other forms of light to study materials and biological structures. The facility boasts over 30 experimental stations or beamlines, and is home to some of the most advanced and complex scientific research projects in the world. I08-1…

Advanced API Performance: Descriptors
By Leroy Sikkes | Published 2023-10-27 | http://www.open-lab.net/blog/?p=71317

By using descriptor types, you can bind resources to shaders and specify how those resources are accessed. This creates efficient communication between the CPU and GPU and enables shaders to access the necessary data during rendering.

Boost Synthetic Data Generation with Low-Code Workflows in NVIDIA Omniverse Replicator 1.10
By Bhumin Pathak | Published 2023-10-18 | http://www.open-lab.net/blog/?p=71526

Data is the lifeblood of AI systems, which rely on robust datasets to learn and make predictions or decisions. For perception AI models specifically, it is essential that data reflects real-world environments and incorporates a broad array of scenarios. This includes edge use cases for which data is often difficult to collect, such as street traffic and manufacturing assembly lines.

Networking for Data Centers and the Era of AI
By Brian Sparks | Published 2023-10-12 | http://www.open-lab.net/blog/?p=71474

Traditional cloud data centers have served as the bedrock of computing infrastructure for over a decade, catering to a diverse range of users and applications. However, data centers have evolved in recent years to keep up with advancements in technology and the surging demand for AI-driven computing. This post explores the pivotal role that networking plays in shaping the future of data centers…

Analyzing the Security of Machine Learning Research Code
By Joseph Lucas | Published 2023-10-04 | http://www.open-lab.net/blog/?p=71113

The NVIDIA AI Red Team is focused on scaling secure development practices across the data science and AI ecosystems. We participate in open-source security initiatives, release tools, present at industry conferences, host educational competitions, and provide innovative training. Covering 3 years and totaling almost 140 GB of source code, the recently released Meta Kaggle for Code dataset is…

Comparing Solutions for Boosting Data Center Redundancy
By Berkin Kartal | Published 2023-09-29 | http://www.open-lab.net/blog/?p=70873

In today's data center, there are many ways to achieve system redundancy from a server connected to a fabric. Customers usually seek redundancy to increase service availability (such as achieving end-to-end AI workloads) and find system efficiency using different multihoming techniques. In this post, we discuss the pros and cons of the well-known proprietary multi-chassis link aggregation…

NVIDIA CUDA Toolkit Symbol Server
By Zachary Bourque | Published 2023-09-07 | http://www.open-lab.net/blog/?p=70493

NVIDIA has already made available a GPU driver binary symbols server for Windows. Now, NVIDIA is making available a repository of CUDA Toolkit symbols for Linux to enhance application development. During application development, you can now download obfuscated symbols for NVIDIA libraries that are being debugged or profiled in…

Advanced API Performance: Shaders
By Johannes Deligiannis | Published 2023-09-01 | http://www.open-lab.net/blog/?p=70243

This post covers best practices when working with shaders on NVIDIA GPUs. To get a high and consistent frame rate in your applications, see all Advanced API Performance tips. Shaders play a critical role in graphics programming by enabling you to control various aspects of the rendering process. They run on the GPU and are responsible for manipulating vertices, pixels, and other data.

Pro Tips for Building Multilingual Recommender Systems
By Chris Deotte (https://www.kaggle.com/cdeotte) | Published 2023-08-10 | http://www.open-lab.net/blog/?p=69059

Picture this: You're browsing through an online store, looking for the perfect pair of running shoes. But with thousands of options available, where do you even begin? Suddenly, a section catches your eye: "Recommended for You." Intrigued, you click and, within seconds, a curated list of running shoes tailored to your unique preferences appears. It's as if the website understands your tastes…

Advanced API Performance: Pipeline State Objects
By Tim Cheblokov | Published 2023-07-18 | http://www.open-lab.net/blog/?p=67779

This post covers best practices when working with pipeline state objects on NVIDIA GPUs. To get a high and consistent frame rate in your applications, see all Advanced API Performance tips. Pipeline state objects (PSOs) define how input data is interpreted and rendered by the hardware when submitting work to the GPUs. Proper management of PSOs is essential for optimal usage of system…

GPUs for ETL? Run Faster, Less Costly Workloads with NVIDIA RAPIDS Accelerator for Apache Spark and Databricks
By Joel Lashmore | Published 2023-07-17 | http://www.open-lab.net/blog/?p=67503

We were stuck. Really stuck. With a hard delivery deadline looming, our team needed to figure out how to process a complex extract-transform-load (ETL) job on trillions of point-of-sale transaction records in a few hours. The results of this job would feed a series of downstream machine learning (ML) models that would make critical retail assortment allocation decisions for a global retailer.

Accelerated Data Analytics: Machine Learning with GPU-Accelerated Pandas and Scikit-learn
By Jay Rodge | Published 2023-07-11 | http://www.open-lab.net/blog/?p=67937

If you are looking to take your machine learning (ML) projects to new levels of speed and scalability, GPU-accelerated data analytics can help you deliver insights quickly with breakthrough performance. From faster computation to efficient model training, GPUs bring many benefits to everyday ML tasks. Update: The blog below describes how to use GPU-only RAPIDS cuDF…
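
A small sketch of the scikit-learn-style workflow on the GPU with cuDF and cuML (assuming RAPIDS is installed; the data is synthetic):

    import cudf
    from cuml.linear_model import LinearRegression

    # Synthetic data living on the GPU.
    df = cudf.DataFrame({"x1": [1.0, 2.0, 3.0, 4.0],
                         "x2": [0.5, 1.2, 2.9, 3.1],
                         "y":  [3.1, 4.9, 7.2, 8.8]})

    # The estimator mirrors the scikit-learn API but runs on the GPU.
    model = LinearRegression().fit(df[["x1", "x2"]], df["y"])
    print(model.predict(df[["x1", "x2"]]))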

In-Game GPU Profiling for DirectX 12 Using SetBackgroundProcessingMode
By Louis Bavoil | Published 2023-07-10 | http://www.open-lab.net/blog/?p=67605

If you are a DirectX 12 (DX12) game developer, you may have noticed that GPU times displayed in real time in your game HUD may change over time for a given pass. This may be the case even if nothing has changed on the application side. One reason for GPU time variations may be GPU Boost dynamically changing the GPU core clock frequency. Still, even with GPU Boost disabled using the DX12…

Advanced API Performance: CPUs
By Joseph Cavanaugh | Published 2023-05-17 | http://www.open-lab.net/blog/?p=64153

This post covers CPU best practices when working with NVIDIA GPUs. To get a high and consistent frame rate in your applications, see all Advanced API Performance tips. To get the best performance from your NVIDIA GPU, pair it with efficient work delegation on the CPU. Frame-rate caps, stutter, and other subpar application performance events can often be traced back to a bottleneck on the CPU.

Advanced API Performance: Sampler Feedback
By Yury Uralsky | Published 2023-05-04 | http://www.open-lab.net/blog/?p=62908

This post covers best practices for using sampler feedback on NVIDIA GPUs. To get a high and consistent frame rate in your applications, see all Advanced API Performance tips. Sampler feedback is a DirectX 12 Ultimate feature for capturing and recording texture sampling information and locations. Sampler feedback was designed to provide better support for streaming and texture-space shading.

Tips on Scaling Storage for AI Training and Inferencing
By André Franklin | Published 2023-01-25 | http://www.open-lab.net/blog/?p=60056

There are many benefits of GPUs in scaling AI, ranging from faster model training to GPU-accelerated fraud detection. While planning AI models and deployed apps, scalability challenges, especially performance and storage, must be accounted for. Regardless of the use case, AI solutions have four elements in common. Of these elements, data storage is often the most neglected during…

Benefits of Using Pull Requests for Collaboration and Code Review
By Fatos Morina | Published 2022-12-01 | http://www.open-lab.net/blog/?p=57808

Software teams comprise a broad range of professionals, from software engineers and data scientists to project managers and technical writers. Sharing code with other team members is common when working on a project, and it is important to track all changes. This is where pull requests come in. In software development, a pull request is used to push local changes into a shared repository…

Data Storytelling Best Practices for Data Scientists and AI Practitioners
By Richmond Alake | Published 2022-11-07 | http://www.open-lab.net/blog/?p=56909

Storytelling with data is a crucial soft skill for AI and data professionals. To ensure that stakeholders understand the technical requirements, value, and impact of data science team efforts, it is necessary for data scientists, data engineers, and machine learning (ML) engineers to communicate effectively. This post provides a framework and tips you can adopt to incorporate key elements of…

Best Practices for Using NVIDIA RTX Ray Tracing (Updated)
By Juha Sjoholm | Published 2022-07-25 | http://www.open-lab.net/blog/?p=50632

This post is an update of Best Practices: Using NVIDIA RTX Ray Tracing. It gathers best practices based on our experiences so far using NVIDIA RTX ray tracing in games. The practical tips are organized into short, actionable items for developers working on ray tracing today. They aim to provide insight into what kinds of solutions lead to good performance in most cases.

Advanced API Performance: Vulkan Clearing and Presenting
By Ana Mihut | Published 2022-07-01 | http://www.open-lab.net/blog/?p=48112

This post covers best practices for Vulkan clearing and presenting on NVIDIA GPUs. To get a high and consistent frame rate in your applications, see all Advanced API Performance tips. With the recent Vulkan 1.3 release, it's timely to add some Vulkan-specific tips that are not necessarily explicitly covered by the other Advanced API Performance posts. In addition to introducing new Vulkan 1.3…

Source

]]>
1
Ryan Prescott <![CDATA[Advanced API Performance: SetStablePowerState]]> http://www.open-lab.net/blog/?p=48106 2024-08-28T17:45:35Z 2022-06-28T15:00:00Z This post covers best practices for using SetStablePowerState on NVIDIA GPUs. To get a high and consistent frame rate in your applications, see all Advanced API...]]> This post covers best practices for using SetStablePowerState on NVIDIA GPUs. To get a high and consistent frame rate in your applications, see all Advanced API...A graphic of a computer sending code to multiple stacks.

This post covers best practices for using SetStablePowerState on NVIDIA GPUs. To get a high and consistent frame rate in your applications, see all Advanced API Performance tips. Most modern processors, including GPUs, change processor core and memory clock rates during application execution. These changes can cause performance to vary, introducing errors in measurements and rendering comparisons…
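
For reference, a minimal sketch of the D3D12 call the post discusses is shown below. It assumes an existing ID3D12Device and Windows Developer Mode enabled on the test machine; the helper name EnableStableClocks is illustrative only.

#include <d3d12.h>

// Lock GPU core and memory clocks to a fixed, reproducible frequency for benchmarking.
// The call typically fails unless Windows Developer Mode is enabled; pass FALSE
// (or reboot) to restore normal clock-boost behavior afterwards.
void EnableStableClocks(ID3D12Device* device)
{
    HRESULT hr = device->SetStablePowerState(TRUE);
    if (FAILED(hr))
    {
        // Fall back to measuring with clock variation, or enable Developer Mode.
    }
}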

Source

]]>
14
Justin Kim <![CDATA[Advanced API Performance: Variable Rate Shading]]> http://www.open-lab.net/blog/?p=36325 2023-10-02T05:00:53Z 2022-05-16T21:42:00Z This post covers best practices for variable rate shading on NVIDIA GPUs. To get a high and consistent frame rate in your applications, see all Advanced API...]]> This post covers best practices for variable rate shading on NVIDIA GPUs. To get a high and consistent frame rate in your applications, see all Advanced API...A graphic of a computer sending code to multiple stacks.

This post covers best practices for variable rate shading on NVIDIA GPUs. To get a high and consistent frame rate in your applications, see all Advanced API Performance tips. Variable rate shading (VRS) is a graphics feature allowing applications to control the frequency of pixel shader invocations independent of the resolution of the render target. It is available in both D3D12 and Vulkan.
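
As a hedged example of per-draw VRS in D3D12 (one of the two APIs mentioned), the sketch below assumes an ID3D12GraphicsCommandList5 and VRS-capable hardware; the function name and the choice of a 2x2 rate with passthrough combiners are illustrative.

#include <d3d12.h>

// Shade one pixel-shader invocation per 2x2 pixel block for subsequent draws.
// Passthrough combiners keep the per-draw rate; on Tier 2 hardware a per-primitive
// rate or a screen-space shading-rate image could override it.
void SetCoarseShadingRate(ID3D12GraphicsCommandList5* cmdList)
{
    const D3D12_SHADING_RATE_COMBINERS combiners[2] = {
        D3D12_SHADING_RATE_COMBINER_PASSTHROUGH,
        D3D12_SHADING_RATE_COMBINER_PASSTHROUGH
    };
    cmdList->RSSetShadingRate(D3D12_SHADING_RATE_2X2, combiners);
}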

Source

]]>
1
Ivan Belyavtsev <![CDATA[Advanced API Performance: Clears]]> http://www.open-lab.net/blog/?p=34146 2023-10-02T05:00:53Z 2022-05-11T22:51:00Z This post covers best practices for clears on NVIDIA GPUs. To get a high and consistent frame rate in your applications, see all Advanced API Performance tips....]]> This post covers best practices for clears on NVIDIA GPUs. To get a high and consistent frame rate in your applications, see all Advanced API Performance tips....A graphic of a computer sending code to multiple stacks.

This post covers best practices for clears on NVIDIA GPUs. To get a high and consistent frame rate in your applications, see all Advanced API Performance tips. Surface clearing is a widely used accessory operation. Thanks to Michael Murphy, Maurice Harris, Dmitry Zhdan, and Patric Neil for their advice and feedback.
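
To make the topic concrete, here is a minimal D3D12 clear written the way the series generally recommends: a full-surface clear whose color is assumed to match the optimized clear value chosen at resource creation. The helper name and color are placeholders.

#include <d3d12.h>

// Clear the whole render target (no rects) with the same color that was registered
// as the optimized clear value when the resource was created; full clears with the
// optimized value are generally the fastest path.
void ClearTarget(ID3D12GraphicsCommandList* cmdList, D3D12_CPU_DESCRIPTOR_HANDLE rtv)
{
    const FLOAT clearColor[4] = { 0.0f, 0.0f, 0.0f, 1.0f };
    cmdList->ClearRenderTargetView(rtv, clearColor, 0, nullptr);
}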

Source

]]>
1
Ana Mihut <![CDATA[Advanced API Performance: Mesh Shaders]]> http://www.open-lab.net/blog/?p=35887 2023-10-02T05:00:54Z 2021-10-25T16:10:00Z This post covers best practices for mesh shaders on NVIDIA GPUs. To get a high and consistent frame rate in your applications, see all Advanced API Performance...]]> This post covers best practices for mesh shaders on NVIDIA GPUs. To get a high and consistent frame rate in your applications, see all Advanced API Performance...A graphic of a computer sending code to multiple stacks.

This post covers best practices for mesh shaders on NVIDIA GPUs. To get a high and consistent frame rate in your applications, see all Advanced API Performance tips. Mesh shaders are a recent addition to the programmable pipeline and aim to overcome the bottlenecks of the fixed layout used by the classical geometry pipeline. This post covers best practices for both DirectX and Vulkan…
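
As a rough sketch of the D3D12 side, the snippet below assumes an ID3D12GraphicsCommandList6 and a pipeline state object built with mesh (and optionally amplification) shaders; mapping one thread group to one meshlet is a common convention used here for illustration, not a requirement stated by the post.

#include <d3d12.h>

// Launch a mesh-shader pipeline: each thread group is assumed to process one meshlet,
// replacing the fixed-function input assembler of the classical geometry pipeline.
void DrawWithMeshShader(ID3D12GraphicsCommandList6* cmdList,
                        ID3D12PipelineState* meshPso,
                        UINT meshletCount)
{
    cmdList->SetPipelineState(meshPso);
    cmdList->DispatchMesh(meshletCount, 1, 1);
}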

Source

]]>
0
Andrew Allan <![CDATA[Advanced API Performance: Memory and Resources]]> http://www.open-lab.net/blog/?p=35933 2023-10-02T05:00:55Z 2021-10-25T16:05:00Z This post covers best practices for memory and resources on NVIDIA GPUs. To get a high and consistent frame rate in your applications, see all Advanced API...]]> This post covers best practices for memory and resources on NVIDIA GPUs. To get a high and consistent frame rate in your applications, see all Advanced API...A graphic of a computer sending code to multiple stacks.

This post covers best practices for memory and resources on NVIDIA GPUs. To get a high and consistent frame rate in your applications, see all Advanced API Performance tips. Optimal memory management in DirectX 12 is critical to a performant application. The following advice should be followed for the best performance while avoiding stuttering.
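
One pattern commonly recommended in this area is sub-allocating placed resources from a larger heap rather than creating many small committed resources. The sketch below is a hypothetical illustration that assumes a suitably sized, buffer-capable ID3D12Heap already exists; the helper name and parameters are not from the post.

#include <windows.h>
#include <d3d12.h>

// Sub-allocate a buffer as a placed resource inside an existing heap; the caller
// manages offsets, aliasing, and reuse, which avoids one allocation per resource.
HRESULT CreatePlacedBuffer(ID3D12Device* device, ID3D12Heap* heap,
                           UINT64 offsetInHeap, UINT64 sizeInBytes,
                           ID3D12Resource** outBuffer)
{
    D3D12_RESOURCE_DESC desc = {};
    desc.Dimension        = D3D12_RESOURCE_DIMENSION_BUFFER;
    desc.Width            = sizeInBytes;
    desc.Height           = 1;
    desc.DepthOrArraySize = 1;
    desc.MipLevels        = 1;
    desc.Format           = DXGI_FORMAT_UNKNOWN;
    desc.SampleDesc.Count = 1;
    desc.Layout           = D3D12_TEXTURE_LAYOUT_ROW_MAJOR;

    return device->CreatePlacedResource(heap, offsetInHeap, &desc,
                                        D3D12_RESOURCE_STATE_COMMON, nullptr,
                                        __uuidof(ID3D12Resource),
                                        reinterpret_cast<void**>(outBuffer));
}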

Source

]]>
1
Wessam Bahnassi <![CDATA[Advanced API Performance: Command Buffers]]> http://www.open-lab.net/blog/?p=34148 2023-10-02T05:00:55Z 2021-10-25T16:00:00Z This post covers best practices for command buffers on NVIDIA GPUs. To get a high and consistent frame rate in your applications, see all Advanced API...]]> This post covers best practices for command buffers on NVIDIA GPUs. To get a high and consistent frame rate in your applications, see all Advanced API...A graphic of a computer sending code to multiple stacks.

This post covers best practices for command buffers on NVIDIA GPUs. To get a high and consistent frame rate in your applications, see all Advanced API Performance tips. Command buffers are the main mechanism for sending commands from the CPU to be executed on the GPU. By following the best practices listed in this post, you can achieve performance gains on both the CPU and the GPU by…
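
A minimal sketch of the per-frame reuse pattern for D3D12 command lists is shown below; it assumes the allocator's previously recorded work has already been fenced off, and the function name is a placeholder.

#include <d3d12.h>

// Per-frame reuse: reset the allocator only after the GPU has finished the commands
// recorded from it (guarded by a fence elsewhere), then reopen the command list.
void BeginFrameRecording(ID3D12CommandAllocator* allocator,
                         ID3D12GraphicsCommandList* cmdList,
                         ID3D12PipelineState* initialPso)
{
    allocator->Reset();
    cmdList->Reset(allocator, initialPso);
    // ... record commands, then Close() and submit the list on a queue.
}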

Source

]]>
1
Jiho Choi <![CDATA[Advanced API Performance: Barriers]]> http://www.open-lab.net/blog/?p=33064 2023-10-02T05:00:56Z 2021-10-22T23:49:00Z This post covers best practices for barriers on NVIDIA GPUs. To get a high and consistent frame rate in your applications, see all Advanced API Performance...]]> This post covers best practices for barriers on NVIDIA GPUs. To get a high and consistent frame rate in your applications, see all Advanced API Performance...A graphic of a computer sending code to multiple stacks.

This post covers best practices for barriers on NVIDIA GPUs. To get a high and consistent frame rate in your applications, see all Advanced API Performance tips. For the best performance on our hardware, here's what you should and shouldn't do when you're using barriers with DX12 or Vulkan. This is updated from DX12 Do's And Don'ts.
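
As one concrete example of the kind of advice involved, the sketch below batches two resource transitions into a single ResourceBarrier call; the resources, states, and function name are assumptions for illustration.

#include <d3d12.h>

// Batch two transitions into one ResourceBarrier call; fewer, wider barrier calls
// generally cost less than many individual ones.
void TransitionForShaderRead(ID3D12GraphicsCommandList* cmdList,
                             ID3D12Resource* colorTarget,
                             ID3D12Resource* depthTarget)
{
    D3D12_RESOURCE_BARRIER barriers[2] = {};

    barriers[0].Type = D3D12_RESOURCE_BARRIER_TYPE_TRANSITION;
    barriers[0].Transition.pResource   = colorTarget;
    barriers[0].Transition.Subresource = D3D12_RESOURCE_BARRIER_ALL_SUBRESOURCES;
    barriers[0].Transition.StateBefore = D3D12_RESOURCE_STATE_RENDER_TARGET;
    barriers[0].Transition.StateAfter  = D3D12_RESOURCE_STATE_PIXEL_SHADER_RESOURCE;

    barriers[1].Type = D3D12_RESOURCE_BARRIER_TYPE_TRANSITION;
    barriers[1].Transition.pResource   = depthTarget;
    barriers[1].Transition.Subresource = D3D12_RESOURCE_BARRIER_ALL_SUBRESOURCES;
    barriers[1].Transition.StateBefore = D3D12_RESOURCE_STATE_DEPTH_WRITE;
    barriers[1].Transition.StateAfter  = D3D12_RESOURCE_STATE_PIXEL_SHADER_RESOURCE;

    cmdList->ResourceBarrier(2, barriers);
}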

Source

]]>
1
Katherine Sun <![CDATA[Advanced API Performance: Async Copy]]> http://www.open-lab.net/blog/?p=33041 2023-10-02T05:00:56Z 2021-10-22T23:47:00Z This post covers best practices for async copy on NVIDIA GPUs. To get a high and consistent frame rate in your applications, see all Advanced API Performance...]]> This post covers best practices for async copy on NVIDIA GPUs. To get a high and consistent frame rate in your applications, see all Advanced API Performance...A graphic of a computer sending code to multiple stacks.

This post covers best practices for async copy on NVIDIA GPUs. To get a high and consistent frame rate in your applications, see all Advanced API Performance tips. Async copy runs on completely independent hardware, but you have to schedule it onto a separate queue. You can consider turning an async copy into an async compute as a performance strategy. NVIDIA has a dedicated async copy…
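
For illustration, the sketch below creates the kind of dedicated D3D12 copy queue such transfers are scheduled on; the helper name is hypothetical.

#include <d3d12.h>

// Create a queue of type COPY so transfers run on the GPU's independent copy engine
// and can overlap with graphics and compute work submitted to other queues.
HRESULT CreateCopyQueue(ID3D12Device* device, ID3D12CommandQueue** outQueue)
{
    D3D12_COMMAND_QUEUE_DESC desc = {};
    desc.Type     = D3D12_COMMAND_LIST_TYPE_COPY;
    desc.Priority = D3D12_COMMAND_QUEUE_PRIORITY_NORMAL;
    return device->CreateCommandQueue(&desc, __uuidof(ID3D12CommandQueue),
                                      reinterpret_cast<void**>(outQueue));
}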

Source

]]>
2
Vladimir Bondarev <![CDATA[Advanced API Performance: Async Compute and Overlap]]> http://www.open-lab.net/blog/?p=33048 2023-10-02T05:00:57Z 2021-10-22T23:45:00Z This post covers best practices for async compute and overlap on NVIDIA GPUs. To get a high and consistent frame rate in your applications, see all Advanced API...]]> This post covers best practices for async compute and overlap on NVIDIA GPUs. To get a high and consistent frame rate in your applications, see all Advanced API...A graphic of a computer sending code to multiple stacks.

This post covers best practices for async compute and overlap on NVIDIA GPUs. To get a high and consistent frame rate in your applications, see all Advanced API Performance tips. The general principle behind async compute is to increase the overall unit throughput by reducing the number of unused warp slots and to facilitate the simultaneous use of nonconflicting datapaths.
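
A minimal sketch of the overlap pattern in D3D12 is shown below, assuming a second command queue created with D3D12_COMMAND_LIST_TYPE_COMPUTE, a closed compute command list, and a shared fence; the function and parameter names are placeholders.

#include <d3d12.h>

// Submit compute work on its own queue and express the dependency with a fence:
// the graphics queue waits on the GPU timeline, so the CPU never stalls.
void SubmitAsyncCompute(ID3D12CommandQueue* graphicsQueue,
                        ID3D12CommandQueue* computeQueue,
                        ID3D12CommandList* computeWork,
                        ID3D12Fence* fence, UINT64 fenceValue)
{
    computeQueue->ExecuteCommandLists(1, &computeWork);
    computeQueue->Signal(fence, fenceValue);
    graphicsQueue->Wait(fence, fenceValue);   // GPU-side wait before consuming results
}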

Source

]]>
1
Amanda Saunders <![CDATA[Considerations for Deploying AI at the Edge]]> http://www.open-lab.net/blog/?p=37124 2023-07-27T19:55:35Z 2021-09-07T19:15:58Z The growth of edge computing has been a hot topic in many industries. The value of smart infrastructure can mean improvements to overall operational efficiency,...]]> The growth of edge computing has been a hot topic in many industries. The value of smart infrastructure can mean improvements to overall operational efficiency,...

The growth of edge computing has been a hot topic in many industries. The value of smart infrastructure can mean improvements to overall operational efficiency, safety, and even the bottom line. However, not all workloads need to be, or even should be, deployed at the edge. Enterprises use a combination of edge computing and cloud computing when developing and deploying AI applications.

Source

]]>
0
Jiho Choi <![CDATA[Tips: Acceleration Structure Compaction]]> http://www.open-lab.net/blog/?p=31830 2023-07-27T19:56:16Z 2021-05-20T18:56:34Z In ray tracing, more geometries can reside in the GPU memory than with the rasterization approach because rays may hit the geometries out of the view frustum....]]> In ray tracing, more geometries can reside in the GPU memory than with the rasterization approach because rays may hit the geometries out of the view frustum....

In ray tracing, more geometries can reside in GPU memory than with the rasterization approach because rays may hit geometries outside the view frustum. You can let the GPU compact acceleration structures to save memory usage. For some games, compaction reduces the memory footprint for a bottom-level acceleration structure (BLAS) by at least 50%. BLASes usually take more GPU memory than top-level acceleration structures…
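
The D3D12 (DXR) flavor of this technique roughly looks like the sketch below: request the compacted size when building the BLAS, read it back, allocate a smaller buffer, and copy with the COMPACT mode. The two helper functions and the readback plumbing between them are assumptions for illustration.

#include <d3d12.h>

// Step 1: build a BLAS created with ALLOW_COMPACTION and ask the GPU to write its
// compacted size to 'compactedSizeBuffer'.
void BuildWithCompactionQuery(ID3D12GraphicsCommandList4* cmdList,
                              const D3D12_BUILD_RAYTRACING_ACCELERATION_STRUCTURE_DESC& buildDesc,
                              D3D12_GPU_VIRTUAL_ADDRESS compactedSizeBuffer)
{
    D3D12_RAYTRACING_ACCELERATION_STRUCTURE_POSTBUILD_INFO_DESC postbuild = {};
    postbuild.InfoType   = D3D12_RAYTRACING_ACCELERATION_STRUCTURE_POSTBUILD_INFO_COMPACTED_SIZE;
    postbuild.DestBuffer = compactedSizeBuffer;
    cmdList->BuildRaytracingAccelerationStructure(&buildDesc, 1, &postbuild);
}

// Step 2: after reading back the size and allocating a smaller buffer, copy the BLAS
// across with the COMPACT mode; the original, larger buffer can then be released.
void CompactBlas(ID3D12GraphicsCommandList4* cmdList,
                 D3D12_GPU_VIRTUAL_ADDRESS compactedBlas,
                 D3D12_GPU_VIRTUAL_ADDRESS originalBlas)
{
    cmdList->CopyRaytracingAccelerationStructure(
        compactedBlas, originalBlas,
        D3D12_RAYTRACING_ACCELERATION_STRUCTURE_COPY_MODE_COMPACT);
}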

Source

]]>
1
Kazuki Onodera <![CDATA[Best Practices for Using AI to Develop the Most Accurate Retail Forecasting Solution]]> http://www.open-lab.net/blog/?p=25108 2024-10-28T18:31:55Z 2021-03-26T14:00:00Z A leading global retailer has invested heavily in becoming one of the most competitive technology companies around.  Accurate and...]]> A leading global retailer has invested heavily in becoming one of the most competitive technology companies around.  Accurate and...

A leading global retailer has invested heavily in becoming one of the most competitive technology companies around. Accurate and timely demand forecasting for millions of item-by-store combinations is critical to serving their millions of weekly customers. Key to their success in forecasting is RAPIDS, an open-source suite of GPU-accelerated libraries. RAPIDS helps them tear through their…

Source

]]>
0
Richard Cowgill <![CDATA[Tips: Getting the Most out of the DLSS Unreal Engine 4 Plugin]]> http://www.open-lab.net/blog/?p=24048 2023-10-25T23:53:06Z 2021-02-17T19:00:29Z DLSS is a deep learning, super-resolution network that boosts frame rates by rendering fewer pixels and then using AI to construct sharp, higher-resolution...]]> DLSS is a deep learning, super-resolution network that boosts frame rates by rendering fewer pixels and then using AI to construct sharp, higher-resolution...

DLSS is a deep learning, super-resolution network that boosts frame rates by rendering fewer pixels and then using AI to construct sharp, higher-resolution images. Dedicated computational units on NVIDIA RTX GPUs called Tensor Cores accelerate the AI calculations, allowing the algorithm to run in real time. DLSS pairs perfectly with computationally intensive rendering algorithms such as real-time…

Source

]]>
2
Juha Sjoholm <![CDATA[Best Practices: Using NVIDIA RTX Ray Tracing (Updated)]]> http://www.open-lab.net/blog/?p=19410 2023-07-27T19:50:29Z 2020-08-10T20:40:45Z [stextbox id="info"]This post has been updated: Best Practices for Using NVIDIA RTX Ray Tracing (Updated).[/stextbox] This post gathers best practices based on...]]> [stextbox id="info"]This post has been updated: Best Practices for Using NVIDIA RTX Ray Tracing (Updated).[/stextbox] This post gathers best practices based on...

This post has been updated: Best Practices for Using NVIDIA RTX Ray Tracing (Updated). This post gathers best practices based on our experience so far using NVIDIA RTX ray tracing in games. I've organized the tips into short, actionable items with practical guidance for developers working on ray tracing today. They aim to give a broad picture of what kind of solutions lead to good…

Source

]]>
0
Evan Hart <![CDATA[Tips and Tricks: Getting the Best Ray Tracing Performance Out of Unreal Engine 4.23]]> http://www.open-lab.net/blog/?p=15732 2023-10-25T23:54:12Z 2019-10-15T21:37:51Z Roughly five months ago, we introduced you to the new ray tracing support (via DirectX Raytracing) in the 4.22 release of Unreal Engine. Recently, Epic Games...]]> Roughly five months ago, we introduced you to the new ray tracing support (via DirectX Raytracing) in the 4.22 release of Unreal Engine. Recently, Epic Games...

Roughly five months ago, we introduced you to the new ray tracing support (via DirectX Raytracing) in the 4.22 release of Unreal Engine. Recently, Epic Games released version 4.23 which brings a number of upgrades for those working with ray tracing. Even better, many of these new and improved features, such as enhancements to performance, quality, and stability, require no direct user effort.

Source

]]>
0
Valerie Sarge <![CDATA[Tips for Optimizing GPU Performance Using Tensor Cores]]> http://www.open-lab.net/blog/?p=14687 2023-07-27T20:01:41Z 2019-06-10T13:00:06Z Our most popular question is "What can I do to get great GPU performance for deep learning?"?We��ve recently published a detailed Deep Learning Performance...]]> Our most popular question is "What can I do to get great GPU performance for deep learning?"?We��ve recently published a detailed Deep Learning Performance...

Our most popular question is "What can I do to get great GPU performance for deep learning?" We've recently published a detailed Deep Learning Performance Guide to help answer this question. The guide explains how GPUs process data and gives tips on how to design networks for better performance. We also take a close look at Tensor Core optimization to help improve performance. This post takes a…
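
One of the practical tips in this area is to size GEMM dimensions (for FP16 data) as multiples of 8 so they map cleanly onto Tensor Cores. The tiny helper below is a hypothetical illustration of that sizing rule, not code from the guide.

#include <cstdint>
#include <cstdio>

// Round a layer dimension up to the next multiple of 8 so FP16 GEMMs map cleanly
// onto Tensor Cores (for example, 33708 is padded up to 33712).
constexpr int64_t PadForTensorCores(int64_t dim, int64_t multiple = 8)
{
    return ((dim + multiple - 1) / multiple) * multiple;
}

int main()
{
    std::printf("%lld\n", static_cast<long long>(PadForTensorCores(33708)));
    return 0;
}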

Source

]]>
15
Nuno Subtil <![CDATA[Tips and Tricks: Vulkan Dos and Don'ts]]> http://www.open-lab.net/blog/?p=14696 2025-01-14T20:07:39Z 2019-06-06T17:14:24Z Note: This post was updated on 1/14/2025 to reflect updates. The increased performance potential of modern graphics APIs is coupled with a dramatically...]]> Note: This post was updated on 1/14/2025 to reflect updates. The increased performance potential of modern graphics APIs is coupled with a dramatically...

Note: This post was updated on 1/14/2025. The increased performance potential of modern graphics APIs is coupled with a dramatically increased level of developer responsibility. Optimal use of Vulkan is not trivial, especially in the context of a large engine, and information about how to maximize performance is still somewhat sparse.
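
As a flavor of the kind of advice collected there, the sketch below batches several command buffers into a single vkQueueSubmit call rather than submitting them one by one; the function name and parameters are illustrative assumptions.

#include <vulkan/vulkan.h>

// Batch several command buffers into one vkQueueSubmit instead of one submit per
// command buffer; queue submissions are comparatively expensive.
void SubmitBatched(VkQueue queue, const VkCommandBuffer* cmdBuffers,
                   uint32_t count, VkFence fence)
{
    VkSubmitInfo submit = {};
    submit.sType              = VK_STRUCTURE_TYPE_SUBMIT_INFO;
    submit.commandBufferCount = count;
    submit.pCommandBuffers    = cmdBuffers;
    vkQueueSubmit(queue, 1, &submit, fence);
}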

Source

]]>
6
Alex Dunn <![CDATA[Tips and Tricks: Ray Tracing Best Practices]]> http://www.open-lab.net/blog/?p=14120 2023-07-27T20:03:06Z 2019-03-20T18:01:07Z This post presents best practices for implementing ray tracing in games and other real-time graphics applications. We present these as briefly as possible to...]]> This post presents best practices for implementing ray tracing in games and other real-time graphics applications. We present these as briefly as possible to...

This post presents best practices for implementing ray tracing in games and other real-time graphics applications. We present these as briefly as possible to help you quickly find key ideas. This is based on a presentation made at the 2019 GDC by NVIDIA engineers. 1.1 General Practices: Move AS management (build/update) to an async compute queue. Using an async compute queue pairs…
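
A rough sketch of that first tip in D3D12 might look like the following, assuming a command list and queue both created with the compute type and a fence handling cross-queue synchronization elsewhere; all names are placeholders.

#include <d3d12.h>

// Record an acceleration-structure build on a compute-type command list and submit it
// to an async compute queue so AS management overlaps with graphics work elsewhere.
void BuildAsOnComputeQueue(ID3D12GraphicsCommandList4* computeCmdList,
                           ID3D12CommandQueue* computeQueue,  // created with TYPE_COMPUTE
                           const D3D12_BUILD_RAYTRACING_ACCELERATION_STRUCTURE_DESC& buildDesc)
{
    computeCmdList->BuildRaytracingAccelerationStructure(&buildDesc, 0, nullptr);

    // UAV barrier: make the freshly built AS visible to later work on this queue.
    // Cross-queue visibility is handled with a fence at submission time (not shown).
    D3D12_RESOURCE_BARRIER uav = {};
    uav.Type = D3D12_RESOURCE_BARRIER_TYPE_UAV;
    computeCmdList->ResourceBarrier(1, &uav);

    computeCmdList->Close();
    ID3D12CommandList* lists[] = { computeCmdList };
    computeQueue->ExecuteCommandLists(1, lists);
}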

Source

]]>
3
Cliff Woolley <![CDATA[CUDA Pro Tip: Improve NVIDIA Visual Profiler Loading of Large Profiles]]> http://www.open-lab.net/blog/parallelforall/?p=3213 2024-12-10T17:13:44Z 2014-05-06T21:03:51Z Post updated on December 10, 2024. NVIDIA has deprecated nvprof and NVIDIA Visual Profiler and these tools are not supported on current GPU architectures. The...]]> Post updated on December 10, 2024. NVIDIA has deprecated nvprof and NVIDIA Visual Profiler and these tools are not supported on current GPU architectures. The...GPU Pro Tip

Post updated on December 10, 2024. NVIDIA has deprecated nvprof and NVIDIA Visual Profiler and these tools are not supported on current GPU architectures. The original post still applies to previous GPU architectures, up to and including Volta. For Volta and newer architectures, profile your applications with NVIDIA Nsight Compute and NVIDIA Nsight Systems. For more information about how to…

Source

]]>
4
Jiri Kraus <![CDATA[CUDA Pro Tip: Generate Custom Application Profile Timelines with NVTX]]> http://www.open-lab.net/blog/parallelforall/?p=2003 2024-08-12T15:49:35Z 2013-09-04T01:49:42Z The last time you used the timeline feature in the NVIDIA Visual Profiler, Nsight VSE or the new Nsight Systems to analyze a complex application, you might have...]]> The last time you used the timeline feature in the NVIDIA Visual Profiler, Nsight VSE or the new Nsight Systems to analyze a complex application, you might have...GPU Pro Tip

The last time you used the timeline feature in the NVIDIA Visual Profiler, Nsight VSE, or the new Nsight Systems to analyze a complex application, you might have wished to see a bit more than just CUDA API calls and GPU kernels. In this post I will show you how you can use the NVIDIA Tools Extension (NVTX) to annotate the timeline with useful information. I will demonstrate how to add time…
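
The core of the technique is only a push/pop pair around the region of interest, as in the hedged sketch below; the header path varies by CUDA toolkit version and the function body is illustrative.

#include <nvtx3/nvToolsExt.h>   // older CUDA toolkits: #include <nvToolsExt.h>

// Wrap a phase of the application in a named NVTX range so it appears as a labeled
// interval on the profiler timeline alongside CUDA API calls and kernels.
void RunAnnotatedPhase()
{
    nvtxRangePushA("load input data");   // open a named range on the current thread
    // ... the work you want to see on the timeline ...
    nvtxRangePop();                      // close the most recently opened range
}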

Source

]]>
6