PostEDA-Bench: New Benchmark for LLM-Based Circuit Design Repair
Researchers have introduced PostEDA-Bench, a hierarchical benchmark for evaluating LLM-based agents in the 'last mile' of electronic design automation (EDA), specifically for repairing sign-off Design Rule Check (DRC) violations and optimizing Power-Performance-Area (PPA) targets. The benchmark comprises 145 tasks across four categories: DRC-Essential, DRC-Reasoning, PPA-Mono, and PPA-Multi, and is supported by EDA toolchains with machine-checkable evaluation. Testing on eight commercial and open-source LLMs under multiple agent scaffolds revealed that agents perform reasonably well on synthetic DRC-Essential and single-objective PPA-Mono tasks but struggle with more practical DRC-Reasoning (best success rate 36.66%) and PPA-Multi (best success rate 20.00%). Vision augmentation consistently improved DRC-Bench performance. The work highlights the need for better trade-off reasoning in multi-objective optimization.
Key facts
- PostEDA-Bench is a hierarchical benchmark for LLM-based EDA last-mile tasks.
- It includes 145 tasks across DRC-Essential, DRC-Reasoning, PPA-Mono, and PPA-Multi.
- Existing EDA-LLM benchmarks omit DRC fixing and rely on flat hierarchies.
- Eight commercial and open-source LLMs were tested under multiple agent scaffolds.
- Best success rate on DRC-Reasoning is 36.66%.
- Best success rate on PPA-Multi is 20.00%.
- Vision augmentation consistently enhances DRC-Bench performance.
- The benchmark uses machine-checkable evaluation with EDA toolchains.
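As a rough illustration of how per-category success rates like those above could be tallied from machine-checked pass/fail verdicts, here is a minimal sketch. The four category names come from the benchmark; the record layout, the `success_rates` function, and the sample verdicts are assumptions made for this example and are not the benchmark's actual harness or data.

```python
from collections import defaultdict

# Category names are taken from the benchmark description;
# everything else in this sketch is hypothetical.
CATEGORIES = ("DRC-Essential", "DRC-Reasoning", "PPA-Mono", "PPA-Multi")


def success_rates(results):
    """Return the percentage of passed tasks per category.

    `results` is an iterable of (category, passed) pairs, where `passed`
    is the boolean verdict of a machine check (assumed format).
    """
    totals = defaultdict(int)
    passes = defaultdict(int)
    for category, passed in results:
        if category not in CATEGORIES:
            raise ValueError(f"unknown category: {category}")
        totals[category] += 1
        passes[category] += int(passed)
    # Only categories that appear in the input are reported.
    return {c: 100.0 * passes[c] / totals[c] for c in totals}


# Made-up verdicts for illustration only; not results from the benchmark.
example = [
    ("DRC-Essential", True),
    ("DRC-Essential", True),
    ("DRC-Reasoning", True),
    ("DRC-Reasoning", False),
    ("PPA-Multi", False),
]
rates = success_rates(example)
```

Because each verdict is a plain boolean produced by a tool run rather than by human judgment, the same tally can be repeated for every model and agent scaffold and compared directly.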