PostEDA-Bench: New Benchmark for LLM-Based Circuit Design Repair

ai-technology · 2026-05-11

Researchers have introduced PostEDA-Bench, a hierarchical benchmark for evaluating LLM-based agents on the 'last mile' of electronic design automation (EDA): repairing sign-off Design Rule Check (DRC) violations and optimizing toward Power-Performance-Area (PPA) targets. The benchmark comprises 145 tasks in four categories (DRC-Essential, DRC-Reasoning, PPA-Mono, and PPA-Multi) and is backed by EDA toolchains that make evaluation machine-checkable. Tests of eight commercial and open-source LLMs under multiple agent scaffolds showed that agents perform reasonably well on the synthetic DRC-Essential tasks and the single-objective PPA-Mono tasks but struggle with the more practical DRC-Reasoning tasks (best success rate 36.66%) and PPA-Multi tasks (best success rate 20.00%). Vision augmentation consistently improved DRC-Bench performance. The results point to a need for better trade-off reasoning in multi-objective optimization.
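
To make the gap between single-objective and multi-objective tasks concrete, here is a minimal Python sketch of a pass/fail check over PPA targets. It is an illustration under stated assumptions, not the benchmark's actual scoring: the metric names, the limits, the "lower is better" convention, and the helper `meets_targets` are all hypothetical.

```python
def meets_targets(measured, targets):
    """Return True only if every targeted metric is at or below its limit.

    Assumption: all metrics are "lower is better". The metric names and
    limits used below are hypothetical, not taken from PostEDA-Bench.
    """
    return all(measured[name] <= limit for name, limit in targets.items())


# Hypothetical post-optimization measurements for one design.
measured = {"power_mw": 12.0, "delay_ns": 1.8, "area_um2": 5200.0}

# A single-objective task constrains one metric ...
mono_targets = {"delay_ns": 2.0}
# ... while a multi-objective task constrains several at once.
multi_targets = {"power_mw": 10.0, "delay_ns": 2.0, "area_um2": 5500.0}

mono_ok = meets_targets(measured, mono_targets)
multi_ok = meets_targets(measured, multi_targets)
```

In this toy case the design meets the delay-only target but fails the joint target because power exceeds its limit, which is the kind of trade-off an agent has to reason about when several objectives are constrained together.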

Key facts

PostEDA-Bench is a hierarchical benchmark for evaluating LLM-based agents on last-mile EDA tasks.
It includes 145 tasks across DRC-Essential, DRC-Reasoning, PPA-Mono, and PPA-Multi.
Existing EDA-LLM benchmarks omit DRC fixing and rely on flat hierarchies.
Eight commercial and open-source LLMs were tested under multiple agent scaffolds.
Best success rate on DRC-Reasoning is 36.66%.
Best success rate on PPA-Multi is 20.00%.
Vision augmentation consistently enhances DRC-Bench performance.
The benchmark uses machine-checkable evaluation with EDA toolchains.
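
The success rates quoted above are per-category percentages of tasks that pass an automated check. A minimal sketch of that aggregation follows, assuming each task yields a simple pass/fail verdict; the function `success_rates` and the sample outcomes are hypothetical and are not data from the benchmark.

```python
from collections import defaultdict


def success_rates(results):
    """Aggregate (category, passed) pairs into a percent success rate per category."""
    totals = defaultdict(int)
    passes = defaultdict(int)
    for category, passed in results:
        totals[category] += 1
        passes[category] += int(bool(passed))
    return {c: 100.0 * passes[c] / totals[c] for c in totals}


# Made-up verdicts for illustration only; not results from PostEDA-Bench.
example = [
    ("DRC-Essential", True),
    ("DRC-Essential", True),
    ("DRC-Reasoning", False),
    ("DRC-Reasoning", True),
    ("PPA-Multi", False),
]
rates = success_rates(example)
```

In a real harness, the pass/fail verdict would presumably come from the EDA toolchain itself (for example, a clean DRC report), which is what makes the evaluation machine-checkable.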

Entities

—

Sources

arXiv cs.AI — 2026-05-11