Configurable Graph-Based Task Solving with the Marco Multi-AI Agent Framework for Chip Design

Chip and hardware design presents numerous challenges stemming from its complexity and advancing technologies. These challenges result in longer turn-around time (TAT) for optimizing performance, power, area, and cost (PPAC) during synthesis, verification, physical design, and reliability loops.

Large language models (LLMs) have shown a remarkable capacity to comprehend and generate natural language at a massive scale, leading to many potential applications and benefits across various domains. Successful LLM-based AI agents for hardware design can drastically reduce TAT, leading to faster product cycles, lower costs, improved design reliability, and reduced risk of costly errors.

Three diagrams show (a) graph-based task solving both dynamic and static; (b) single-AI and multi-AI configurations of sub-task nodes; and (c) agent memory, knowledge database, and tool configurations. — *Figure 1. Marco framework overview*

Marco: Configurable Graph-Based Task Solving and Multi-AI Agents Framework

We introduce the proposed Marco framework, which encompasses graph-based task solving, agent configurations for sub-tasks, and skill/tool configurations for each AI agent in real time.

Figure 1 shows dynamic and static configurable graph-based task solving, which is flexibly integrated with chip-design knowledge (for example, circuits and timing).

In the task graph, each node represents a sub-task, and each edge represents the execution or knowledge relationship between nodes. To solve each sub-task, we use AutoGen to configure a single-AI or multi-AI agent with a knowledge database, tools, and memory.
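The node-and-edge structure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Marco implementation: the `SubTask` class, the `run_task_graph` function, and the three sub-task names are hypothetical, and plain lambdas stand in for the configured agents. It shows a static graph whose sub-tasks run in dependency order, each receiving the results of its upstream nodes.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class SubTask:
    """One node of the task graph: a named sub-task and the callable that solves it."""
    name: str
    solve: Callable[[Dict[str, str]], str]  # receives results of upstream sub-tasks
    depends_on: List[str] = field(default_factory=list)  # incoming edges


def run_task_graph(tasks: List[SubTask]) -> Dict[str, str]:
    """Execute sub-tasks in dependency order (Kahn's topological sort)."""
    by_name = {t.name: t for t in tasks}
    pending = {t.name: len(t.depends_on) for t in tasks}
    children: Dict[str, List[str]] = {t.name: [] for t in tasks}
    for t in tasks:
        for dep in t.depends_on:
            children[dep].append(t.name)

    ready = deque(name for name, count in pending.items() if count == 0)
    results: Dict[str, str] = {}
    while ready:
        name = ready.popleft()
        task = by_name[name]
        upstream = {dep: results[dep] for dep in task.depends_on}
        results[name] = task.solve(upstream)
        for child in children[name]:
            pending[child] -= 1
            if pending[child] == 0:
                ready.append(child)

    if len(results) != len(tasks):
        raise ValueError("task graph contains a cycle")
    return results


# Hypothetical static graph; the lambdas stand in for configured agents.
graph = [
    SubTask("read_report", lambda up: "report loaded"),
    SubTask("check_wire", lambda up: f"wire checked after: {up['read_report']}", ["read_report"]),
    SubTask("summarize", lambda up: f"summary of {len(up)} upstream result(s)", ["check_wire"]),
]
results = run_task_graph(graph)
```

A dynamic task graph would differ in that nodes are added or revised while the graph is being executed, for example by a planner agent; that part is not modeled here.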

Table 1 summarizes the task graph, agent, and LLM configurations of the Marco framework for various agents. VerilogCoder and the MCMM timing analysis agent use a dynamic task graph to complete the specification-to-RTL task and to extract key takeaways from timing reports, respectively. The timing path debug agent finds the problematic net, wire, and constraints through a static timing debugging task graph (Figure 1).

Agent works	Task category	Task graph	Sub-task agent config.	Customized tools
RTLFixer	Code Syntax Fixing	N/A	Single-AI	RTL Syntax Error RAG Database
Standard Cell Layout Opt.	Optimization	N/A	Single-AI	Cluster Evaluator, Netlist Traverse Tool
MCMM Timing Analysis (Partition/Block-Level)	Summary & Anomaly Identification	Dynamic	Multi-AI	Timing Distribution Calculator, Timing Metric Comparator
DRC-Coder	Code Generation	N/A	Multi-Modality & Multi-AI	Foundry Rule Analysis, Layout DRV Analysis, DRC Code Evaluation
Timing Path Debug (Path-Level)	Summary & Anomaly Identification	Static	Hierarchical Multi-AI	Agentic Timing Report Retrieval
VerilogCoder	Code Generation	Dynamic	Multi-AI	TCRG Retrieval Tool, AST-Based Waveform Tracing Tool

Table 1. Task graph, agent configuration, and customized tools of the Marco framework for various autonomous agent implementations for hardware design tasks

In Table 1, RTLFixer and the Standard Cell Layout Optimization agent use a single-AI configuration supported by the Marco framework. The remaining agents are multi-AI agents implemented on the Marco framework.

For RTLFixer, the LLM agent for standard cell layout optimization, and DRC-Coder, we used single-AI or multi-AI agent configurations with customized tools, memory, and domain knowledge.

Automated hardware description language code generation

One key area where autonomous agents are making an impact is in the generation of hardware description languages (HDLs), such as Verilog. Due to the growing complexity of VLSI design, writing Verilog and VHDL is time-consuming and prone to bugs, necessitating multiple iterations for debugging functional correctness. Consequently, reducing design costs and designer effort for completing hardware specifications has emerged as a critical need.

LLMs can be used to generate Verilog code from natural language descriptions. However, LLMs often struggle to produce code that is both syntactically and functionally correct.

Syntax correctness

RTLFixer uses a combination of retrieval-augmented generation (RAG) and ReAct prompting to enable LLMs to iteratively debug and fix syntax errors. RAG incorporates a database of human expert guidance to provide context for error correction. ReAct enables the LLM to reason about the error, plan a fix, and act on the plan.
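The retrieve, reason, and act loop described above can be sketched as follows. This is a minimal illustration under stated assumptions, not RTLFixer's code: the `GUIDANCE` table, `retrieve_guidance`, `fix_syntax`, `toy_compile`, and `toy_llm_fix` are all hypothetical names, retrieval is a plain keyword match rather than a real RAG database, and a hard-coded function stands in for the LLM and for the Verilog compiler.

```python
from typing import Callable, Dict, Optional

# Hypothetical guidance database: error keyword -> human expert advice.
GUIDANCE: Dict[str, str] = {
    "missing semicolon": "Terminate each statement with ';'.",
    "undeclared": "Declare the signal as wire or reg before use.",
}


def retrieve_guidance(error: str) -> str:
    """Toy retrieval: return advice for the first keyword found in the error message."""
    for keyword, advice in GUIDANCE.items():
        if keyword in error:
            return advice
    return "No matching guidance."


def fix_syntax(
    code: str,
    compile_fn: Callable[[str], Optional[str]],
    llm_fix: Callable[[str, str, str], str],
    max_iters: int = 5,
) -> str:
    """Iterate: observe the compiler error, retrieve guidance, let the model act."""
    for _ in range(max_iters):
        error = compile_fn(code)             # observe
        if error is None:
            return code
        advice = retrieve_guidance(error)    # retrieve expert guidance
        code = llm_fix(code, error, advice)  # reason about the error and act
    if compile_fn(code) is None:
        return code
    raise RuntimeError("syntax errors remain after max_iters attempts")


def toy_compile(code: str) -> Optional[str]:
    """Stand-in compiler: only flags 'assign' lines that lack a trailing semicolon."""
    for line in code.splitlines():
        stripped = line.strip()
        if stripped.startswith("assign") and not stripped.endswith(";"):
            return "missing semicolon"
    return None


def toy_llm_fix(code: str, error: str, advice: str) -> str:
    """Stand-in for the LLM: applies the semicolon advice to 'assign' lines."""
    if "';'" not in advice:
        return code
    fixed_lines = []
    for line in code.splitlines():
        stripped = line.strip()
        if stripped.startswith("assign") and not stripped.endswith(";"):
            line = line + ";"
        fixed_lines.append(line)
    return "\n".join(fixed_lines)


broken = "module m(input a, input b, output y);\n  assign y = a & b\nendmodule"
fixed = fix_syntax(broken, toy_compile, toy_llm_fix)
```

In a real setup, `compile_fn` would wrap an actual Verilog compiler and `llm_fix` would prompt a model with the code, the error message, and the retrieved guidance.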

Functional correctness

VerilogCoder is a multi-agent system that incorporates a task planner and an abstract syntax tree (AST)-based waveform-tracing tool to generate and debug Verilog code. It employs a task and circuit relation graph (TCRG) to break down a task into manageable sub-tasks and link signal transitions to each step (Figure 2).

The diagram shows a task-driven circuit relation graph retrieval agent retrieving the signal, signal transition, and signal example detail information through reasoning on task-driven circuit relation graph. — *Figure 2. Task-driven circuit relation graph retrieval agent reasoning and interacting with the developed TCRG retrieval tool to enrich the task with the relevant circuit and signal descriptions*

An AST-based waveform tracing tool assists the LLM agent in identifying and fixing functional errors by back-tracing signal waveforms. VerilogCoder achieves a 94.2% success rate on the VerilogEval-Human v2 benchmark, demonstrating a significant improvement over previous methods.
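To make the idea of back-tracing concrete, here is a simplified sketch. It is not VerilogCoder's tool: the `DRIVERS` map is written by hand as a stand-in for what could be extracted from a Verilog AST, and the waveforms, signal names, and functions `first_mismatch` and `back_trace` are hypothetical. The sketch finds the first cycle where the simulated output differs from the expected one, walks back through the signals that drive it, and collects their values at that cycle as context for a debugging agent.

```python
from typing import Dict, List

# Hypothetical driver map: each signal maps to the signals on the
# right-hand side of its assignment (what an AST pass might extract).
DRIVERS: Dict[str, List[str]] = {
    "out": ["state", "en"],
    "state": ["next_state"],
    "next_state": ["state", "in"],
}


def first_mismatch(expected: List[int], actual: List[int]) -> int:
    """Return the first time index where the waveforms differ, or -1 if none."""
    for t, (e, a) in enumerate(zip(expected, actual)):
        if e != a:
            return t
    return -1


def back_trace(signal: str, drivers: Dict[str, List[str]], max_depth: int) -> List[str]:
    """Breadth-first walk of the fan-in cone of `signal`, up to `max_depth` levels."""
    seen = {signal}
    frontier = [signal]
    order: List[str] = []
    for _ in range(max_depth):
        next_frontier: List[str] = []
        for sig in frontier:
            for drv in drivers.get(sig, []):
                if drv not in seen:
                    seen.add(drv)
                    order.append(drv)
                    next_frontier.append(drv)
        frontier = next_frontier
    return order


# Hypothetical simulated waveforms, one value per clock cycle.
waveforms = {
    "out": [0, 0, 1, 1],
    "state": [0, 0, 1, 1],
    "en": [1, 1, 1, 1],
    "next_state": [0, 1, 1, 1],
}
expected_out = [0, 0, 0, 1]

t = first_mismatch(expected_out, waveforms["out"])
traced = back_trace("out", DRIVERS, max_depth=2)
report = {sig: waveforms[sig][t] for sig in traced}
```

The `report` dictionary holds the values of the traced signals at the mismatching cycle, which is the kind of focused context an agent could reason over instead of a full waveform dump.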

Video 1. Autonomously Complete Verilog Code with TCRG Planning and AST-Based Waveform Tracing Tools

Video 1 demonstrates VerilogCoder autonomously completing functionally correct Verilog code using TCRG planning and AST-based waveform tracing tools.

Automated DRC layout code generation

DRC-Coder uses multiple autonomous agents with vision capabilities and specialized DRC and Layout DRV analysis tools to generate DRC code. The system interprets design rules from textual descriptions, visual illustrations, and layout representations. The multiple LLM agents include a planner that interprets design rules, and a programmer that translates the rules into executable code.

DRC-Coder incorporates an auto-debugging process, which uses feedback from the code evaluation to refine the generated code.

Video 2. A Demonstration of DRC-Coder in Chip Design

Video 2 demonstrates DRC-Coder generating DRC code that achieves perfect F1 scores on hundreds of testing layouts by leveraging a layout analysis tool, an auto-debugging process, and the capabilities of multi-modality and multi-AI agents.

DRC-Coder achieved a perfect F1 score of 1.000 in generating DRC codes for a sub-3nm technology node, outperforming standard prompting techniques. The proposed automated agentic approach significantly reduces the time required for DRC code generation, from weeks to an average of four minutes per design rule.
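For reference, the F1 score is the harmonic mean of precision and recall. The sketch below computes it for one layout by comparing the violations reported by generated DRC code against a golden set. How DRC-Coder actually matches and counts violations is not described here, so the set-based comparison, the `f1_score` function, and the violation identifiers are assumptions made for illustration.

```python
from typing import Set


def f1_score(predicted: Set[str], golden: Set[str]) -> float:
    """F1 = 2 * precision * recall / (precision + recall) over violation sets."""
    true_pos = len(predicted & golden)   # violations correctly reported
    false_pos = len(predicted - golden)  # reported but not real
    false_neg = len(golden - predicted)  # real but missed
    if true_pos == 0:
        # Convention assumed here: a clean layout with nothing reported counts as perfect.
        return 1.0 if not predicted and not golden else 0.0
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)


# Hypothetical violation identifiers for one test layout.
golden_violations = {"M1.spacing@(10,20)", "M1.width@(30,5)"}
perfect = f1_score({"M1.spacing@(10,20)", "M1.width@(30,5)"}, golden_violations)
partial = f1_score({"M1.spacing@(10,20)", "M2.spacing@(7,7)"}, golden_violations)
```

A score of 1.000 means the generated code reported every golden violation and nothing else; in the partial case, one miss and one false report bring both precision and recall to 0.5.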

Standard cell layout optimization

The LLM agent for standard cell layout optimization uses the natural language and reasoning abilities of an LLM to incrementally generate high-quality cluster constraints, optimizing the cell layout PPA and debugging routability with ReAct prompting.

The system uses net information and cell layout analysis to group MOSFET devices into clusters. The AI agent not only achieves up to 19.4% smaller cell area, but also generates 23.5% more LVS and DRC clean cell layouts than the Transformer-based device clustering approach on a set of sequential cells in the industrial 2nm technology node.

Multi-corner multi-mode timing report debug and analysis

The multi-corner multi-mode (MCMM) timing analysis agent uses dynamic task graphs to extract key takeaways from timing reports.

The MCMM timing analysis agent achieves an average score of 8.33 out of 10, based on evaluations by experienced engineers on a set of industrial cases, and delivers an approximately 60x speedup compared to human engineers (Figure 3).

The bar chart shows that the MCMM timing analysis agent achieves a 60x speedup compared to experienced human engineers. — *Figure 3. MCMM timing analysis agent results*

The timing path debug agent finds the problematic net, wire, and constraints through the static timing debugging task graph (Figure 1).

As Table 2 shows, the timing path debug agent resolves 86% of path-level debugging tasks, whereas the standard task solving approach fails to resolve any of them.

Multi Report Task Description	Required Analyzed Sub-Tasks	Standard Task Solving	Timing Path Debug Agent
Find missing clk signals that have no rise/fall information	max, clk	X	V
Identify pairs of nets with high RC mismatch	max, wire	X	V
Detect unusual constraints between victim and its aggressors	max, xtalk, LC	X	V
Identify unusual RC values between victim and its aggressors	max, wire, xtalk, LC	X	V
Find the constraints of slowest stages with highest RC values	max, wire, xtalk, LC	X	V
Compare each timing table for number of stages, point values and timing mismatch	max	X	X
Task M2 and Task M3 for specific stages in list of paths	max, wire, xtalk, LC	X	V
Avg Pass-rate		0%	86%

Table 2. Pass-rate (%) of timing path debug agent with static task graph solving, and a naïve standard task solving without task graph information

X=Failed to solve the task. V=Solved the task successfully.
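The average pass rates in Table 2 follow directly from the per-task outcomes: the agent solves six of the seven tasks, and 6/7 is about 85.7%, which rounds to 86%. The short check below reproduces that arithmetic; the list encoding and the `pass_rate` helper are illustrative only.

```python
from typing import List

# Outcomes for the seven tasks in Table 2, in row order
# (True = solved, "V"; False = failed, "X").
standard_solving: List[bool] = [False] * 7
debug_agent: List[bool] = [True, True, True, True, True, False, True]


def pass_rate(outcomes: List[bool]) -> float:
    """Percentage of tasks solved."""
    return 100.0 * sum(outcomes) / len(outcomes)


standard_rate = round(pass_rate(standard_solving))
agent_rate = round(pass_rate(debug_agent))
```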

Conclusion

The proposed Marco framework enables more flexible and domain-specialized methods for real-time hardware design task solving. By using task graphs and flexible single-AI and multi-AI agent configurations with domain-specific tools and knowledge, we developed various agents for tasks such as cell layout optimization, Verilog syntax error fixing, Verilog and DRC code generation, and timing debugging on problematic blocks, nets, and wires.

The experimental results show the performance and efficiency benefits of using collaborative LLM-based agents for chip design: for example, a 94.2% success rate for VerilogCoder on VerilogEval-Human v2, and an approximately 60x speedup over human engineers for MCMM timing analysis.

The future directions for agent research on hardware design include the following:

Training LLMs with high-quality hardware design data
Improving LLM-based agents’ ability for hardware signal and waveform debugging
Incorporating PPA metrics into the design flow
Developing more efficient self-learning techniques and memory systems for LLM agents for solving more complex hardware tasks

For more papers and projects on electronic design automation, see the NVIDIA Design Automation Research Group page.

For those interested in the technologies highlighted in the post, here’s a list of relevant papers: