Build Your First Human-in-the-Loop AI Agent with NVIDIA NIM

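AI agents powered by large language models (LLMs) help organizations streamline and reduce manual workloads. These agents use multilevel, iterative reasoning to analyze problems, devise solutions, and execute tasks with a variety of tools. Unlike traditional chatbots, LLM-powered agents can understand and process information well enough to automate complex tasks. To avoid potential risks in specific applications, maintaining human oversight remains essential when working with autonomous AI agents.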

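In this post, you will learn how to build a human-in-the-loop AI agent using NVIDIA NIM microservices, an accelerated API optimized for AI inference. The post walks through a social media use case that shows how these versatile AI agents can handle complex tasks. With NIM microservices, you can integrate advanced LLMs (such as Llama 3.1-70B-Instruct and Falcon 180B) into your workflows, providing the scalability and flexibility that AI-driven tasks require. Whether you are creating promotional content with tools such as PyTorch, pandas, and LangChain, or automating complex workflows, this tutorial is designed to accelerate your processes.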

To see a demo, watch How to Build a Simple AI Agent in 5 Minutes with NVIDIA NIM.

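Build an AI agent for personalized social media content

One of the biggest challenges marketers face today is generating high-quality, creative promotional content across platforms. The goal here is to create a variety of promotional messages and artwork that can be published on social media.

Traditionally, a project leader would assign these tasks to specialists such as content writers and digital artists. But what if AI agents could help make this process more efficient?

This use case involves two AI agents: the Content Creator Agent and the Digital Artist Agent. The agents generate promotional content and submit it to a human decision-maker for final approval, so that human control stays at the core of the creative process.

Video 1. Watch how a marketer drafts social media posts with the help of AI agents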

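Building the human-agent decision-making workflow

Building this human-in-the-loop system involves creating a cognitive workflow in which AI agents assist with specific tasks while the human makes the final decision. Figure 1 outlines the interaction between the human decision-maker and the agents.

Diagram showing the interaction flow between a human decision maker and the AI agents. — *Figure 1. Conceptual architecture of the human-agent interaction*

The Content Creator Agent uses the Llama 3.1 405B model, accelerated by the NVIDIA LLM NIM microservice. LangChain ChatNVIDIA is integrated with NIM function calling and structured output so that results are well organized and reliable. ChatNVIDIA is an open-source Python library that NVIDIA contributes to LangChain, and it lets developers connect to NVIDIA NIM easily. These capabilities are combined in a LangChain Expression Language (LCEL) runnable chain to create a robust agent workflow.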

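Building the Content Creator Agent

Start by building the Content Creator Agent. This agent uses the NVIDIA API Catalog preview API endpoints to generate promotional messages that follow specific format guidelines. NVIDIA AI Enterprise customers can also download and run NIM endpoints locally.

Use the following Python code to get started: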

from langchain_nvidia_ai_endpoints import ChatNVIDIA
from langchain import prompts, chat_models, hub
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate, PromptTemplate
from langchain_core.pydantic_v1 import BaseModel, Field, validator
from typing import Optional, List
 
 
## 1. construct the system prompt ---------
prompt_template = """
### [INST]
 
 
You are an expert social media content creator.
Your task is to create a different promotion message with the following 
Product Description :
------
{product_desc}
------
The output promotion message MUST use the following format :
'''
Title: a powerful, short message that depicts what this product is about 
Message: be creative for the promotion message, but make it short and ready for social media feeds.
Tags: the hashtags people normally use on social media
'''
Begin!
[/INST]
 """
prompt = PromptTemplate(
    input_variables=['product_desc'],  # must match the {product_desc} placeholder
    template=prompt_template,
)
 
 
## 2. provide seeded product_desc text
product_desc="Explore the latest community-built AI models with an API optimized and accelerated by NVIDIA, then deploy anywhere with NVIDIA NIM inference microservices."
 
 
## 3. structural output using LMFE 
class StructureOutput(BaseModel):     
    Title: str = Field(description="Title of the promotion message")
    Message : str = Field(description="The actual promotion message")
    Tags: List[str] = Field(description="Hashtags for social media, usually starts with #")
 
 
## 4. A powerful LLM 
llm_with_output_structure=ChatNVIDIA(model="meta/llama-3.1-405b-instruct").with_structured_output(StructureOutput)     
 
 
## construct the content_creator agent
content_creator = ( prompt | llm_with_output_structure )
out=content_creator.invoke({"product_desc":product_desc})
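
The keys passed to `invoke` must match the placeholders in the prompt template, so a quick standard-library check can catch a misspelled variable name before any model call is made. The following is a minimal sketch: `template_fields` is a hypothetical helper, and the shortened template is an illustrative stand-in for the full prompt.

```python
from string import Formatter


def template_fields(template: str) -> set:
    """Return the placeholder names used in a str.format-style template."""
    # Formatter.parse yields (literal_text, field_name, format_spec, conversion);
    # field_name is None for trailing literal text.
    return {name for _, name, _, _ in Formatter().parse(template) if name}


# Shortened stand-in for the prompt template (assumption, for illustration only)
demo_template = "Product Description :\n------\n{product_desc}\n------\nBegin!"

declared = {"product_desc"}
missing = template_fields(demo_template) - declared  # used in template, not declared
unused = declared - template_fields(demo_template)   # declared, not used in template
print("missing:", missing, "unused:", unused)
```

Both sets being empty suggests the declared input variables and the template agree.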

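Working with the Digital Artist Agent

Next, introduce the Digital Artist Agent, which uses the NVIDIA-hosted SDXL-Turbo text-to-image model to turn promotional text into creative visuals. The agent rewrites the input query and generates high-quality images designed for social media promotion campaigns. The following code shows how the agent is integrated: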

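import base64
import os

import requests
from PIL import Image

# Assumption: the API key is exported as NVIDIA_API_KEY (the variable ChatNVIDIA also reads)
nvapi_key = os.environ["NVIDIA_API_KEY"]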
def generate_image(prompt :str) -> str :
    """
    generate image from text
    Args:
        prompt: input text
    """
    ## re-writing the input promotion title in to appropriate image_gen prompt 
    gen_prompt=llm_rewrite_to_image_prompts(prompt)
    print("start generating image with llm re-write prompt:", gen_prompt)
    invoke_url = "https://ai.api.nvidia.com/v1/genai/stabilityai/sdxl-turbo"
     
    headers = {
        "Authorization": f"Bearer {nvapi_key}",
        "Accept": "application/json",
    }
     
    payload = {
        "text_prompts": [{"text": gen_prompt}],
        "seed": 0,
        "sampler": "K_EULER_ANCESTRAL",
        "steps": 2
    }
     
    response = requests.post(invoke_url, headers=headers, json=payload)
     
    response.raise_for_status()
    response_body = response.json()
    ## decode the base64-encoded image and save it to disk
    print(response_body['artifacts'][0].keys())
    imgdata = base64.b64decode(response_body["artifacts"][0]["base64"])
    filename = 'output.jpg'
    with open(filename, 'wb') as f:
        f.write(imgdata)   
    im = Image.open(filename)  
    img_location=f"the output of the generated image will be stored in this path : {filename}"
    return img_location
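
The decoding step can be exercised without calling the endpoint. The sketch below assumes the response body has the shape used in the code above (an `artifacts` list whose first entry carries a `base64` field). `save_first_artifact` is a hypothetical helper, and the payload is a few fake bytes rather than a real image.

```python
import base64
import os
import tempfile


def save_first_artifact(response_body: dict, filename: str) -> str:
    """Decode the first base64 artifact in the response and write it to filename."""
    artifacts = response_body.get("artifacts") or []
    if not artifacts:
        raise ValueError("response contains no artifacts")
    data = base64.b64decode(artifacts[0]["base64"])
    with open(filename, "wb") as f:
        f.write(data)
    return filename


# Fake payload standing in for an image (assumption, not real endpoint output)
fake_body = {"artifacts": [{"base64": base64.b64encode(b"not-a-real-image").decode()}]}
out_path = save_first_artifact(
    fake_body, os.path.join(tempfile.gettempdir(), "demo_output.bin")
)
with open(out_path, "rb") as f:
    saved = f.read()
print(len(saved), "bytes written to", out_path)
```

Raising on an empty `artifacts` list gives a clearer failure than an `IndexError` if the service returns no image.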

Use the following Python script to rewrite the user's query into an image generation prompt:

from langchain_nvidia_ai_endpoints import ChatNVIDIA
from langchain import prompts, chat_models, hub
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate, PromptTemplate
 
 
def llm_rewrite_to_image_prompts(user_query):
    prompt = prompts.ChatPromptTemplate.from_messages(
        [
            (
                "system",
                "Summarize the following user query into a very short, one-sentence theme for image generation, MUST follow this format : A iconic, futuristic image of , no text, no amputation, no face, bright, vibrant",
            ),
            ("user", "{input}"),
        ]
    )
    model = ChatNVIDIA(model="mistralai/mixtral-8x7b-instruct-v0.1")
    chain = ( prompt    | model   | StrOutputParser() )
    out= chain.invoke({"input":user_query})
    #print(type(out))
    return out

Next, bind image generation to the selected LLM as a tool and wrap it in LCEL to create the Digital Artist Agent:

## bind image generation as tool into llama3.1-405b llm
llm=ChatNVIDIA(model="meta/llama-3.1-405b-instruct")
llm_with_img_gen_tool=llm.bind_tools([generate_image],tool_choice="generate_image")
## use LCEL to construct Digital Artist Agent
digital_artist = (
    llm_with_img_gen_tool
    | output_to_invoke_tools
)
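
The chain above pipes into `output_to_invoke_tools`, which is not defined in this post. The following is a minimal sketch of what such a step might look like, assuming the model message exposes a `tool_calls` list of dicts with `name` and `args` keys, as LangChain AI messages do. The function name, the `TOOLS` registry, and the stand-in tool are all hypothetical, chosen so the sketch runs offline.

```python
from types import SimpleNamespace


def fake_generate_image(prompt: str) -> str:
    """Stand-in for generate_image so the sketch runs without network access."""
    return f"image generated for: {prompt}"


# Registry mapping tool names to callables (hypothetical; the real tool is generate_image)
TOOLS = {"generate_image": fake_generate_image}


def output_to_invoke_tools_sketch(message) -> str:
    """Run the first tool call found on a model message and return its result."""
    tool_calls = getattr(message, "tool_calls", None) or []
    if not tool_calls:
        return "no tool call found in the model output"
    call = tool_calls[0]
    tool = TOOLS.get(call["name"])
    if tool is None:
        return f"unknown tool: {call['name']}"
    return tool(**call["args"])


# Stand-in for the message returned by an LLM with bound tools
fake_message = SimpleNamespace(
    tool_calls=[{"name": "generate_image", "args": {"prompt": "robot"}, "id": "call_0"}]
)
result = output_to_invoke_tools_sketch(fake_message)
print(result)
```

Returning a string in every branch keeps the step compatible with downstream code that concatenates the agent output with other text.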

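Integrating the human-in-the-loop as decision-maker

To maintain human oversight, the agents share their outputs for final approval. The human decision-maker reviews the text generated by the Content Creator Agent and the artwork produced by the Digital Artist Agent.

This interaction allows multiple iterations, so that the promotional message and image are refined and ready for deployment.

The agentic logic places the human at the center as decision-maker, assigning the appropriate agent to each task. LangGraph is used to orchestrate the agentic cognitive architecture.

This involves a function that asks for human input: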

# HumanInputRun wraps a plain Python function as a LangChain tool
from langchain_community.tools import HumanInputRun
 
 
def get_human_input() -> str:
    """ Put human as decision maker, human will decide which agent is best for the task"""
    print("You have been given 2 agents. Please select exactly _ONE_ agent to help you with the task, enter 'y' to confirm your choice.")
    print("""Available agents are : \n
            1 ContentCreator  \n
            2 DigitalArtist \n          
            Enter 1 or 2""")
    tool = None
    while True:
        try:
            line = input().strip()
        except EOFError:
            break
        if line == '1':
            tool = "ContentCreator"
        elif line == '2':
            tool = "DigitalArtist"
        elif line == 'y' and tool is not None:
            print(f"tool selected : {tool} ")
            break
        else:
            print("Please enter 1 or 2, then 'y' to confirm.")
    # return the last selected agent name (empty string if none was selected)
    return tool or ""
 
 
# Wrap the input function in the HumanInputRun tool
ask_human = HumanInputRun(input_func=get_human_input)
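
Keyboard-driven code is awkward to test, so it can help to factor the selection rule into a pure function that takes the typed lines as an argument. The sketch below is one way to express a select-then-confirm rule in which the latest selection wins and `y` only counts after a selection. `select_agent` and `AGENTS` are hypothetical names, and this rule is an assumption about the intended behavior.

```python
from typing import Iterable, Optional

AGENTS = {"1": "ContentCreator", "2": "DigitalArtist"}


def select_agent(lines: Iterable[str]) -> Optional[str]:
    """Return the confirmed agent name, or None if nothing was confirmed."""
    choice = None
    for raw in lines:
        line = raw.strip()
        if line in AGENTS:
            choice = AGENTS[line]  # latest selection wins
        elif line == "y" and choice is not None:
            return choice  # selection confirmed
        # any other input is ignored and reading continues
    return None


print(select_agent(["1", "y"]))
print(select_agent(["1", "2", "y"]))
print(select_agent(["y"]))
```

An interactive wrapper can feed this function from `iter(input, None)` or any other source of lines.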

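Next, create two more Python functions to serve as graph nodes, which LangGraph uses to represent steps or actions in the workflow. Nodes let agents perform specific tasks sequentially or in parallel, creating a flexible yet structured process: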

from langgraph.graph import END, StateGraph
from colorama import Fore, Style
# Define the functions needed 
def human_assign_to_agent(state):
    # ensure using original prompt 
    inputs = state["input"]
    input_to_agent = state["input_to_agent"]
    concatenate_str = Fore.BLUE+inputs+ ' : '+Fore.CYAN+input_to_agent + Fore.RESET
    print(concatenate_str)
    print("---"*10)  
    agent_choice=ask_human.invoke(concatenate_str)
    print(Fore.CYAN+ "choosen_agent : " + agent_choice + Fore.RESET)
    return {"agent_choice": agent_choice }
 
 
def agent_execute_task(state):    
    inputs= state["input"]
    input_to_agent = state["input_to_agent"]
    print(Fore.CYAN+input_to_agent + Fore.RESET)
    # choosen agent will execute the task
    choosen_agent = state['agent_choice']
    if choosen_agent=='ContentCreator':
        structured_respond=content_creator.invoke({"product_desc":input_to_agent})
        # join hashtags with spaces so they stay separate in the output
        respond='\n'.join([structured_respond.Title,structured_respond.Message,' '.join(structured_respond.Tags)])
    elif choosen_agent=="DigitalArtist":
        respond=digital_artist.invoke(input_to_agent)
    else:
        respond="please reselect the agent, there are only 2 agents available: 1.ContentCreator or 2.DigitalArtist"
     
    print(Fore.CYAN+ "agent_output: \n" + respond + Fore.RESET)
    return {"agent_use_tool_respond": respond} 

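Finally, bring everything together by connecting the nodes and edges to form the human-in-the-loop multi-agent workflow. Once the graph is compiled, you are ready to proceed: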

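from typing import TypedDict

from langgraph.graph import END, StateGraph


# Graph state; the keys are the ones read and written by the two node functions above
class State(TypedDict):
    input: str
    input_to_agent: str
    agent_choice: str
    agent_use_tool_respond: str


# Define a new graph
workflow = StateGraph(State)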
 
 
# Define the two nodes 
workflow.add_node("start", human_assign_to_agent)
workflow.add_node("end", agent_execute_task)
 
 
# This means that this node is the first one called
workflow.set_entry_point("start")
workflow.add_edge("start", "end")
workflow.add_edge("end", END)
 
 
# Finally, we compile it!
# This compiles it into a LangChain Runnable,
# meaning you can use it as you would any other runnable
app = workflow.compile()
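
To see how state moves from the `start` node to the `end` node, here is a dependency-free sketch that mimics the linear graph with plain functions. It is a simplification rather than LangGraph itself: each stub node returns a partial update, and the hypothetical `run_linear` helper merges it into the shared state by overwriting keys.

```python
def human_assign_stub(state: dict) -> dict:
    """Stand-in for human_assign_to_agent: always picks ContentCreator."""
    return {"agent_choice": "ContentCreator"}


def agent_execute_stub(state: dict) -> dict:
    """Stand-in for agent_execute_task: echoes the chosen agent and its input."""
    respond = f"{state['agent_choice']} handled: {state['input_to_agent']}"
    return {"agent_use_tool_respond": respond}


def run_linear(nodes, initial_state: dict) -> dict:
    """Call each node in order and merge its returned keys into the state."""
    state = dict(initial_state)
    for node in nodes:
        state.update(node(state))
    return state


final_state = run_linear(
    [human_assign_stub, agent_execute_stub],
    {"input": "write a post", "input_to_agent": "NVIDIA NIM microservices"},
)
print(final_state["agent_use_tool_respond"])
```

The second node can read `agent_choice` only because the first node's update was merged before it ran, which is the ordering the `start` to `end` edge enforces.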

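Launching the human-agent workflow

Now launch the application. It prompts you to assign one of the available agents to the given task.

Prompt for writing promotional text

First, query the Content Creator Agent to write promotional text, including a title, message, and social media hashtags (Figure 2). Repeat until you are satisfied with the output.

Flow diagram illustrating human assigning Content Creator Agent to create promotion text. — *Figure 2. The human queries the Content Creator Agent to generate social media promotion text*

Python code example: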

my_query="create a good promotional message for social promotion events using the following inputs"
product_desc="NVIDIA NIM microservices power GenAI workflow"
respond=app.invoke({"input":my_query, "input_to_agent":product_desc})

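The human selects 1, the Content Creator Agent, for the task. The agent executes and returns agent_output, as shown in Figure 3.

Screenshot of a sample response from Content Creator Agent. — *Figure 3. Sample output from invoking the Content Creator Agent pipeline*

Prompt for creating illustrations

Once you are satisfied with the result, go on to query the Digital Artist Agent to create artwork for the social media promotion (Figure 4).

Flow chart showing human assigning the Digital Artist Agent to create artwork. — *Figure 4. The human assigns the Digital Artist Agent to generate social media artwork*

The following Python code example uses the title generated by the Content Creator Agent as input for the image prompt: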

## take the Title from the output of the Content Creator Agent 
prompt_for_image=respond['agent_use_tool_respond'].split('\n')[0].split(':')[-1].strip()
## the human decision-maker gives instructions to the agent workflow app
input_query="generate an image for me from the below promotion message"
respond2=app.invoke({"input":input_query, "input_to_agent":prompt_for_image})
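
The one-line title extraction keeps only the text after the last colon on the first line, so a title that itself contains a colon would be truncated. A slightly more defensive variant is sketched below. `extract_title` is a hypothetical helper and the sample strings are made up for illustration, with the title text borrowed from Figure 7.

```python
def extract_title(agent_output: str) -> str:
    """Take the first line and drop an optional 'Title:' prefix."""
    lines = agent_output.splitlines()
    first_line = lines[0] if lines else ""
    # partition splits on the first colon only, so colons inside the title survive
    prefix, sep, rest = first_line.partition(":")
    if sep and prefix.strip().lower() == "title":
        return rest.strip()
    return first_line.strip()


# Hypothetical agent outputs, for illustration only
sample = "Unlock Next-Gen AI Innovation\nDeploy anywhere with NIM.\n#AI #NVIDIA"
print(extract_title(sample))
print(extract_title("Title: NIM: build faster\nmessage\n#AI"))
```

The helper behaves the same as the one-liner for titles without colons, so it can be swapped in without changing the rest of the flow.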

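The generated image is saved as output.jpg.

Screenshot of a sample response from Digital Artist Agent. — *Figure 5. Sample output from invoking the Digital Artist Agent pipeline*

Iterating for high-quality results

You can iterate on the generated image to obtain different variants of the artwork until you get the result you want (Figure 6). Slightly adjusting the input prompt from the Content Creator Agent produces a variety of images from the Digital Artist Agent.

Three sample images generated by the Digital Artist Agent. — *Figure 6. Sample images generated by the Digital Artist Agent*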

Refining the final product

Finally, post-process and refine the combined output of the two agents, and format it in Markdown for a final visual review (Figure 7).
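
The post does not show the post-processing code. One possible sketch, assuming the approved title, message, and tags are available as plain Python values and the image was saved to a local path, is shown here. `to_markdown` is a hypothetical helper and the message text is a placeholder.

```python
from typing import List


def to_markdown(title: str, message: str, tags: List[str], image_path: str) -> str:
    """Assemble the approved text and image into a small Markdown document."""
    # Normalize tags so each one starts with exactly one '#'
    hashtags = " ".join(
        "#" + tag.strip().lstrip("#") for tag in tags if tag.strip("# ")
    )
    return "\n\n".join([
        f"## {title}",
        message,
        hashtags,
        f"![{title}]({image_path})",
    ])


post = to_markdown(
    "Unlock Next-Gen AI Innovation",
    "Hypothetical promotion message for illustration.",
    ["#AI", "NVIDIA"],
    "output.jpg",
)
print(post)
```

Rendering the result in a notebook or any Markdown viewer gives the reviewer the text and image side by side for the final approval step.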

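An image of a robot head and shoulders with a social media post entitled 'Unlock Next-Gen AI Innovation.' — *Figure 7. Post-processed output of the social media post and accompanying image, generated by the AI agents and approved by the human*

Enhance your AI agents with NVIDIA NIM microservices and AI tools

In this post, you learned how to build a human-in-the-loop AI agent using NVIDIA NIM microservices and LangGraph by LangChain to streamline content creation workflows. By incorporating AI agents into your workflow, you can speed up content production, reduce manual effort, and retain full control over the creative process.

NVIDIA NIM microservices let you scale AI-driven tasks efficiently and flexibly. Whether you are crafting promotional messages or designing visuals, AI agents offer a powerful way to optimize workflows and raise productivity.

Learn more with these additional resources: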

Building AI Agents with NVIDIA NIM Microservices and LangChain (LangChain is an open-source framework for building LLM applications; NVIDIA NIM provides microservices for AI inference)
Integrating human-in-the-loop into agentic logic with LangGraph
Environment setup for the notebook
Llama 3.1 405B Instruct NIM microservice
