Chip and hardware design presents numerous challenges stemming from its complexity and advancing technologies. These challenges result in longer turn-around time (TAT) for optimizing performance, power, area, and cost (PPAC) during synthesis, verification, physical design, and reliability loops.
Large language models (LLMs) have shown a remarkable capacity to comprehend and generate natural language at a massive scale, leading to many potential applications and benefits across various domains. Successful LLM-based AI agents for hardware design can drastically reduce TAT, leading to faster product cycles, lower costs, improved design reliability, and reduced risk of costly errors.

Marco: Configurable Graph-Based Task Solving and Multi-AI Agents Framework
We introduce the proposed Marco framework, which encompasses graph-based task solving, agent configurations for sub-tasks, and skill/tool configurations for each AI agent in real time.
Figure 1 shows dynamic and static configurable graph-based task solving, which is flexibly integrated with chip-design knowledge (for example, circuits, timing, and so on).
In the task graph, each node represents a sub-task, and each edge represents the execution or knowledge relationship between nodes. To solve each sub-task, we use AutoGen to configure a single-AI or multi-AI agent with a knowledge database, tools, and memory.
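The node-and-edge structure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Marco implementation: the sub-task names, the `run_task_graph` helper, and the `stub_agent` function are hypothetical, and the stub stands in for a real single-AI or multi-AI agent call.

```python
from graphlib import TopologicalSorter

# Hypothetical sub-tasks: each key maps to the set of sub-tasks it depends on.
task_graph = {
    "load_timing_report": set(),
    "analyze_wire": {"load_timing_report"},
    "analyze_xtalk": {"load_timing_report"},
    "summarize": {"analyze_wire", "analyze_xtalk"},
}

def run_task_graph(graph, solve):
    """Run every sub-task after its predecessors and collect the results."""
    results = {}
    for task in TopologicalSorter(graph).static_order():
        # Pass along the outputs of the sub-tasks this one depends on.
        upstream = {dep: results[dep] for dep in graph[task]}
        results[task] = solve(task, upstream)
    return results

def stub_agent(task, upstream):
    """Stand-in for an agent call; a real agent would use an LLM and tools."""
    return f"{task} done using {sorted(upstream)}"

results = run_task_graph(task_graph, stub_agent)
print(results["summarize"])
```

A static task graph corresponds to a fixed dictionary like the one above. A dynamic task graph would add or rewire entries while the agents run, which this sketch does not attempt.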
Table 1 summarizes the task graph, agent, and LLM configurations of the Marco framework for various agents. VerilogCoder and the MCMM timing analysis agent use a dynamic task graph to complete the specification-to-RTL task and to extract key takeaways from timing reports, respectively. The timing path debug agent finds the problematic net, wire, and constraints through a static timing debugging task graph (Figure 1).
Agent work | Task category | Marco task graph | Marco sub-task agent config. | Marco customized tools |
RTLFixer | Code Syntax Fixing | N/A | Single-AI | RTL Syntax Error RAG Database |
Standard Cell Layout Opt. | Optimization | N/A | Single-AI | Cluster Evaluator, Netlist Traverse Tool |
MCMM Timing Analysis (Partition/Block-Level) | Summary & Anomaly Identification | Dynamic | Multi-AI | Timing Distribution Calculator, Timing Metric Comparator |
DRC-Coder | Code Generation | N/A | Multi-Modality & Multi-AI | Foundry Rule Analysis, Layout DRV Analysis, DRC Code Evaluation |
Timing Path Debug (Path-Level) | Summary & Anomaly Identification | Static | Hierarchical Multi-AI | Agentic Timing Report Retrieval |
VerilogCoder | Code Generation | Dynamic | Multi-AI | TCRG Retrieval Tool, AST-Based Waveform Tracing Tool |
In Table 1, the RTLFixer and standard cell layout optimization agents, which use a single-AI configuration, are supported by the Marco framework. The remaining agents use multi-AI configurations and are implemented on the Marco framework.
For RTLFixer, the LLM agent for standard cell layout optimization, and DRC-Coder, we used single-AI or multi-AI agent configurations with customized tools, memory, and domain knowledge.
Automated hardware description language code generation
One key area where autonomous agents are making an impact is in the generation of hardware description languages (HDLs), such as Verilog. Due to the growing complexity of VLSI design, writing Verilog and VHDL is time-consuming and prone to bugs, necessitating multiple iterations for debugging functional correctness. Consequently, reducing design costs and designer effort for completing hardware specifications has emerged as a critical need.
LLMs can be used to generate Verilog code from natural language descriptions. However, LLMs often struggle to produce code that is both syntactically and functionally correct.
Syntax correctness
RTLFixer uses a combination of retrieval-augmented generation (RAG) and ReAct prompting to enable LLMs to iteratively debug and fix syntax errors. RAG incorporates a database of human expert guidance to provide context for error correction. ReAct enables the LLM to reason about the error, plan a fix, and act on the plan.
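The compile, retrieve, and revise loop described above can be sketched as follows. This is a toy illustration, not RTLFixer's code: `toy_compile` stands in for a real Verilog compiler, `GUIDANCE` for the expert-guidance database, and `toy_llm_fix` for the LLM's reason-and-act step. All names and the single error type are assumptions made for the example.

```python
# Hypothetical guidance database: error tag -> human-written hint.
GUIDANCE = {
    "missing_semicolon": "Terminate every assign statement with ';'.",
}

def needs_semicolon(line):
    stripped = line.strip()
    return stripped.startswith("assign") and not stripped.endswith(";")

def toy_compile(code):
    """Stand-in for a Verilog compiler: returns an error tag, or None if clean."""
    if any(needs_semicolon(line) for line in code.splitlines()):
        return "missing_semicolon"
    return None

def retrieve(error):
    """Retrieval step: look up expert guidance for the reported error."""
    return GUIDANCE.get(error, "")

def toy_llm_fix(code, error, hint):
    """Stand-in for the LLM; a real agent would prompt a model with code, error, and hint."""
    if error == "missing_semicolon":
        return "\n".join(
            line + ";" if needs_semicolon(line) else line
            for line in code.splitlines()
        )
    return code

def fix_syntax(code, max_iters=5):
    """Compile, retrieve guidance, revise, and repeat until the code is clean."""
    for attempt in range(max_iters):
        error = toy_compile(code)
        if error is None:
            return code, attempt
        code = toy_llm_fix(code, error, retrieve(error))
    return code, max_iters

buggy = "module top(input a, output b);\n  assign b = ~a\nendmodule"
fixed, revisions = fix_syntax(buggy)
print(fixed)
```

The point of the loop is that compiler feedback and retrieved guidance are fed back to the model on every iteration, rather than asking for a correct answer in one shot.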
Functional correctness
VerilogCoder is a multi-agent system that incorporates a task planner and an abstract syntax tree (AST)-based waveform-tracing tool to generate and debug Verilog code. It employs a task and circuit relation graph (TCRG) to break down a task into manageable sub-tasks and link signal transitions to each step (Figure 2).

An AST-based waveform tracing tool assists the LLM agent in identifying and fixing functional errors by back-tracing signal waveforms. VerilogCoder achieves a 94.2% success rate on the VerilogEval-Human v2 benchmark, demonstrating a significant improvement over previous methods.
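To give a rough sense of what back-tracing means here, the sketch below walks backward from a mismatched output through a driver map to list the signals that could have caused the mismatch. It is a simplified assumption-laden illustration: the real tool works on a Verilog AST and simulated waveforms, whereas `DRIVERS` is a hand-written, hypothetical map and `back_trace` is a hypothetical helper.

```python
from collections import deque

# Hypothetical driver map, of the kind that could be extracted from a Verilog AST:
# each signal maps to the signals on the right-hand side of its assignment.
DRIVERS = {
    "out": ["state", "en"],
    "state": ["next_state"],
    "next_state": ["state", "in"],
}

def back_trace(signal, drivers, max_depth=2):
    """List signals that can influence `signal`, nearest first, up to max_depth levels back."""
    seen = {signal}
    order = []
    queue = deque([(signal, 0)])
    while queue:
        current, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand beyond the requested depth
        for src in drivers.get(current, []):
            if src not in seen:
                seen.add(src)
                order.append(src)
                queue.append((src, depth + 1))
    return order

suspects = back_trace("out", DRIVERS)
print(suspects)
```

Limiting the depth keeps the list of suspect signals short enough to hand to an LLM agent alongside their waveform values.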
Video 1 demonstrates VerilogCoder autonomously completing functionally correct Verilog code using TCRG planning and AST-based waveform tracing tools.
Automated DRC layout code generation
DRC-Coder uses multiple autonomous agents with vision capabilities and specialized DRC and Layout DRV analysis tools to generate DRC code. The system interprets design rules from textual descriptions, visual illustrations, and layout representations. The multiple LLM agents include a planner that interprets design rules, and a programmer that translates the rules into executable code.
DRC-Coder incorporates an auto-debugging process, which uses feedback from the code evaluation to refine the generated code.
Video 2 demonstrates DRC-Coder generating DRC code that achieves perfect F1 scores on hundreds of testing layouts by leveraging a layout analysis tool, an auto-debugging process, and the capabilities of multi-modality and multi-AI agents.
DRC-Coder achieved a perfect F1 score of 1.000 in generating DRC code for a sub-3nm technology node, outperforming standard prompting techniques. The proposed automated agentic approach significantly reduces the time required for DRC code generation, from weeks to an average of four minutes per design rule.
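For readers less familiar with the metric, the sketch below shows how an F1 score could be computed for generated DRC code. It assumes, as a simplification, that both the generated checker and a reference checker report violations as sets of (layout, location) markers; the post does not describe the exact evaluation protocol, and `f1_score` and the sample markers are hypothetical.

```python
def f1_score(predicted, actual):
    """F1 between the violations flagged by generated code and the reference violations."""
    if not predicted and not actual:
        return 1.0  # nothing to find and nothing flagged
    true_positives = len(predicted & actual)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(actual)
    return 2 * precision * recall / (precision + recall)

# Hypothetical violation markers: (layout name, location index).
reference = {("layout_1", 3), ("layout_2", 7)}
perfect = {("layout_1", 3), ("layout_2", 7)}
partial = {("layout_1", 3), ("layout_2", 9)}

print(f1_score(perfect, reference))
print(f1_score(partial, reference))
```

An F1 of 1.000 therefore means the generated code flagged every reference violation and nothing else on the evaluated layouts.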
Standard cell layout optimization
The LLM agent for standard cell layout optimization uses the natural language and reasoning abilities of an LLM to incrementally generate high-quality cluster constraints, optimizing cell layout PPA and debugging routability with ReAct prompting.
The system uses net information and cell layout analysis to group MOSFET devices into clusters. The AI agent not only achieves up to 19.4% smaller cell area, but also generates 23.5% more LVS- and DRC-clean cell layouts than the Transformer-based device clustering approach on a set of sequential cells in an industrial 2nm technology node.
Multi-corner multi-mode timing report debug and analysis
The multi-corner multi-mode (MCMM) timing analysis agent uses a dynamic task graph to extract key takeaways from timing reports.
The MCMM timing analysis agent achieves an average score of 8.33 out of 10, based on evaluations by experienced engineers on a set of industrial cases, and delivers approximately 60x speedups compared to human engineers (Figure 3).

The timing path debug agent finds the problematic net, wire, and constraints through the static timing debugging task graph (Figure 1).
Table 2 shows that the timing path debug agent resolves 86% of path-level debugging tasks, whereas the standard task-solving approach fails to resolve any of them.
Multi-Report Task Description | Required Analyzed Sub-Tasks | Standard Task Solving | Timing Path Debug Agent |
Find missing clk signals that have no rise/fall information | max, clk | X | V |
Identify pairs of nets with high RC mismatch | max, wire | X | V |
Detect unusual constraints between victim and its aggressors | max, xtalk, LC | X | V |
Identify unusual RC values between victim and its aggressors | max, wire, xtalk, LC | X | V |
Find the constraints of slowest stages with highest RC values | max, wire, xtalk, LC | X | V |
Compare each timing table for number of stages, point values and timing mismatch | max | X | X |
Task M2 and Task M3 for specific stages in list of paths | max, wire, xtalk, LC | X | V |
Avg Pass-rate | | 0% | 86% |
X=Failed to solve the task. V=Solved the task successfully.
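The average pass rates in Table 2 follow directly from the per-task outcomes. The short check below reproduces them, assuming each task is weighted equally; the `outcomes` list and `pass_rate` helper are illustrative names, not part of the original work.

```python
# Outcomes copied from Table 2: (standard task solving, timing path debug agent).
# True means the task was solved (V); False means it failed (X).
outcomes = [
    (False, True),   # find missing clk signals
    (False, True),   # pairs of nets with high RC mismatch
    (False, True),   # unusual constraints between victim and aggressors
    (False, True),   # unusual RC values between victim and aggressors
    (False, True),   # constraints of slowest stages with highest RC values
    (False, False),  # compare each timing table
    (False, True),   # Task M2 and Task M3 for specific stages
]

def pass_rate(flags):
    """Percentage of solved tasks, rounded to the nearest integer."""
    flags = list(flags)
    return round(100 * sum(flags) / len(flags))

standard_rate = pass_rate(s for s, _ in outcomes)
agent_rate = pass_rate(a for _, a in outcomes)
print(f"standard: {standard_rate}%, agent: {agent_rate}%")
```

Six solved tasks out of seven gives 85.7%, which rounds to the 86% reported in the table.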
Conclusion
The proposed Marco framework enables more flexible and domain-specialized methods for solving hardware design tasks in real time. By using task graphs and flexible single-AI and multi-AI agent configurations with domain-specific tools and knowledge, we developed various agents for tasks such as cell layout optimization, Verilog syntax error fixing, Verilog and DRC code generation, and timing debugging on problematic blocks, nets, and wires.
The experimental results show impressive performance and efficiency benefits from using collaborative LLM-based agents for chip design.
The future directions for agent research on hardware design include the following:
- Training LLMs with high-quality hardware design data
- Improving LLM-based agents’ ability for hardware signal and waveform debugging
- Incorporating PPA metrics into the design flow
- Developing more efficient self-learning techniques and memory systems for LLM agents for solving more complex hardware tasks
For more papers and projects on electronic design automation, see the NVIDIA Design Automation Research Group page.
For those interested in the technologies highlighted in the post, here’s a list of relevant papers:
- RTLFixer: Automatically Fixing RTL Syntax Errors with Large Language Models
- /NVlabs/RTLFixer GitHub repo
- VerilogCoder: Autonomous Verilog Coding Agents with Graph-based Planning and Abstract Syntax Tree (AST)-based Waveform Tracing Tool
- /NVlabs/VerilogCoder GitHub repo
- DRC-Coder: Automated DRC Checker Code Generation using LLM Autonomous Agent
- Large Language Model (LLM) for Standard Cell Layout Design Optimization