This updated post was originally published on March 18, 2025. Organizations are embracing AI agents to enhance productivity and streamline operations. To maximize their impact, these agents need strong reasoning abilities to navigate complex problems, uncover hidden connections, and make logical decisions autonomously in dynamic environments. Due to their ability to tackle complex…
Reinforcement learning from human feedback (RLHF) is essential for developing AI systems that are aligned with human values and preferences. RLHF enables the most capable LLMs, including the ChatGPT, Claude, and Nemotron families, to generate exceptional responses. By integrating human feedback into the training process, RLHF enables models to learn more nuanced behaviors and make decisions that…
NVIDIA recently announced the NVIDIA NeMo SteerLM technique as part of the NVIDIA NeMo framework. This technique enables users to control large language model (LLM) responses during inference. The developer community has shown great interest in using the approach for building custom LLMs. The NVIDIA NeMo team is now open-sourcing a multi-attribute dataset called Helpfulness SteerLM dataset…
With the advent of large language models (LLMs) such as GPT-3, Megatron-Turing, Chinchilla, PaLM-2, Falcon, and Llama 2, remarkable progress in natural language generation has been made in recent years. However, despite their ability to produce human-like text, foundation LLMs can fail to provide helpful and nuanced responses aligned with user preferences. The current approach to improving…