Reinforcement Learning Achieves Expert-Level Chip Placement via Reward Learning

ai-technology · 2026-04-30

A new reinforcement learning (RL) framework for chip placement reaches expert-level layout quality by learning its reward from expert designs rather than optimizing wirelength alone. The researchers identify reward design as the main reason earlier RL methods fall short of human experts. Their method infers step-by-step expert trajectories from final layouts and uses them as demonstrations or preferences to train a reward model that captures the latent rewards implicit in expert layouts. Experiments show that the framework learns efficiently from as little as a single design and generalizes well to unseen cases. Chip placement is a critical step in physical design, and prior RL methods often failed to match expert quality there.

Key facts

Chip placement is a critical step in physical design.
Existing RL-based methods focus on wirelength optimization and often fail to achieve expert-quality layouts.
Reward design is identified as the primary cause of the performance gap between RL methods and human experts.
The new approach learns directly from expert layouts to derive a reward model.
The method infers step-by-step expert trajectories from final expert layouts.
The inferred trajectories serve as demonstrations or preferences for training a reward model that captures the latent rewards implicit in expert layouts.
The framework can learn efficiently from even a single design.
The framework generalizes well to unseen cases.
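
To make the idea concrete, the following is a minimal sketch of how a reward model might be learned from a single expert layout. It is not the researchers' implementation. Everything in it is an assumption made for illustration: the largest-first replay order used to infer a trajectory, the two hand-made features, the linear reward, and the Bradley-Terry preference loss that treats each expert cell as preferred over a randomly sampled free cell. The names `infer_trajectory`, `features`, `reward`, and `train_reward` are hypothetical.

```python
import math
import random


def infer_trajectory(layout, sizes):
    """Turn a final layout into ordered (placed_so_far, macro, cell) steps.

    Assumption: macros are replayed largest-first (ties broken by name).
    The source does not say how the placement order is inferred.
    """
    order = sorted(layout, key=lambda m: (-sizes[m], m))
    steps, placed = [], {}
    for macro in order:
        steps.append((dict(placed), macro, layout[macro]))
        placed[macro] = layout[macro]
    return steps


def features(placed, cell):
    """Hypothetical features of placing a macro at `cell` on a grid."""
    x, y = cell
    if placed:
        # Manhattan distance to the nearest macro that is already placed.
        dist = min(abs(x - px) + abs(y - py) for px, py in placed.values())
    else:
        dist = 0
    # Second feature: distance to the nearer of the left and bottom edges.
    return [float(dist), float(min(x, y))]


def reward(w, feats):
    """Linear reward model (an assumption; any differentiable model would do)."""
    return sum(wi * fi for wi, fi in zip(w, feats))


def train_reward(steps, grid, epochs=200, lr=0.05, seed=0):
    """Fit weights so the expert cell is preferred over a random free cell."""
    rng = random.Random(seed)
    w = [0.0, 0.0]
    for _ in range(epochs):
        for placed, _macro, expert_cell in steps:
            occupied = set(placed.values()) | {expert_cell}
            free = [(x, y) for x in range(grid) for y in range(grid)
                    if (x, y) not in occupied]
            other = rng.choice(free)
            f_exp = features(placed, expert_cell)
            f_oth = features(placed, other)
            # Bradley-Terry: P(expert preferred) = sigmoid(r_exp - r_oth).
            diff = reward(w, f_exp) - reward(w, f_oth)
            diff = max(-30.0, min(30.0, diff))  # keep exp() well-behaved
            p = 1.0 / (1.0 + math.exp(-diff))
            # Gradient ascent on log P(expert preferred).
            for i in range(len(w)):
                w[i] += lr * (1.0 - p) * (f_exp[i] - f_oth[i])
    return w
```

For a toy expert layout that clusters four macros in one corner, calling `train_reward(infer_trajectory(layout, sizes), grid=6)` yields a negative weight on the nearest-neighbor distance, so the learned reward favors placing macros close to those already placed. A policy could then be trained against this learned reward instead of a wirelength-only objective.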
