Self-Improving WA* Learning with Relational Graph Neural Networks for Planning

other · 2026-05-26

A new arXiv paper (2605.25720) proposes a self-improving weighted A* (WA*) learning framework that pairs best-first search with a value heuristic represented by a Relational Graph Neural Network (RGNN). Its goal is combinatorial generalization in Deep Reinforcement Learning (DRL). Rather than relying on expert demonstrations or random walks, the approach uses best-first search methods like A* to solve planning problems from scratch. The heuristic guides the search, and the resulting search data updates the heuristic via Q-learning; this loop yields general policies that can solve new instances. The work targets sparse-reward planning domains, where standard RL exploration is ineffective.
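
For orientation, the two ingredients named above can be written in their textbook forms. This is a sketch under assumptions, not the paper's exact formulation: the weight $w$, the parameter symbol $\theta$, the step cost $c(s,a)$, and the cost-minimizing convention (instead of reward-maximizing) are all chosen here for illustration, since the summary does not state them.

```latex
% Weighted A* expands the open node n with the smallest priority
% (g = cost so far, h_theta = learned cost-to-go estimate, w >= 1):
f(n) = g(n) + w \, h_\theta(n), \qquad w \ge 1

% One-step Q-learning target in cost form for a transition (s, a, s'),
% with the value heuristic read off as the best Q-value:
y(s,a) = c(s,a) + \min_{a'} Q_\theta(s', a'), \qquad
h_\theta(s) = \min_{a} Q_\theta(s, a)
```

With $w = 1$ the priority reduces to plain A*; larger $w$ trusts the learned heuristic more and typically trades solution quality for fewer expansions.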

Key facts

arXiv paper 2605.25720 proposes a self-improving WA* learning framework with RGNN
Addresses combinatorial generalization in Deep Reinforcement Learning
Uses best-first search methods like A* for planning
Heuristic guides search; search data then updates the heuristic via Q-learning
Aims to solve new instances without expert demonstrations or random walks
Focuses on sparse-reward domains in planning
Relational Graph Neural Network represents the value heuristic
Listed on arXiv with announcement type "new"
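
The search-then-learn loop in the facts above can be illustrated with a minimal Python sketch. It is not the paper's implementation: a tabular Q-function stands in for the RGNN, the grid world, the unit step cost, the weight `w=2.0`, and every function name (`successors`, `heuristic`, `weighted_a_star`, `q_update`, `train`) are hypothetical choices made only for this example.

```python
import heapq
from collections import defaultdict

# Hypothetical toy domain: 4-connected grid, unit step cost, '#' cells are walls.
GRID = [
    "....#...",
    ".##.#.#.",
    ".#..#.#.",
    ".#.##.#.",
    ".#....#.",
    ".####.#.",
    "......#.",
]
MOVES = [(0, 1), (1, 0), (0, -1), (-1, 0)]


def successors(state):
    """Yield (action_index, next_state) for every legal move from state."""
    r, c = state
    for a, (dr, dc) in enumerate(MOVES):
        nr, nc = r + dr, c + dc
        if 0 <= nr < len(GRID) and 0 <= nc < len(GRID[0]) and GRID[nr][nc] != "#":
            yield a, (nr, nc)


def heuristic(q, state, goal):
    """Cost-to-go estimate: the smallest learned Q-value over legal actions."""
    if state == goal:
        return 0.0
    return min((q[(state, a)] for a, _ in successors(state)), default=0.0)


def weighted_a_star(start, goal, q, w=2.0):
    """Weighted A*: expand by f = g + w * h. Returns (path, expanded states)."""
    frontier = [(w * heuristic(q, start, goal), 0, start)]
    best_g = {start: 0}
    parent = {start: None}
    expanded = []
    while frontier:
        _, g, s = heapq.heappop(frontier)
        if g > best_g[s]:
            continue  # stale queue entry, a cheaper route was found later
        expanded.append(s)
        if s == goal:
            path = []
            while s is not None:
                path.append(s)
                s = parent[s]
            return path[::-1], expanded
        for _, nxt in successors(s):
            ng = g + 1
            if ng < best_g.get(nxt, float("inf")):
                best_g[nxt] = ng
                parent[nxt] = s
                f = ng + w * heuristic(q, nxt, goal)
                heapq.heappush(frontier, (f, ng, nxt))
    return None, expanded


def q_update(q, states, goal, alpha=1.0):
    """Q-learning backup (cost form) on every state the search expanded."""
    for s in states:
        if s == goal:
            continue
        for a, nxt in successors(s):
            target = 1.0 + heuristic(q, nxt, goal)
            q[(s, a)] += alpha * (target - q[(s, a)])


def train(start, goal, rounds=30, w=2.0):
    """Alternate search and learning; record expansions per round."""
    q = defaultdict(float)  # all estimates start at zero (no prior knowledge)
    history, path = [], None
    for _ in range(rounds):
        path, expanded = weighted_a_star(start, goal, q, w)
        history.append(len(expanded))
        q_update(q, expanded, goal)
    return q, history, path
```

Calling `train((0, 0), (6, 7))` returns the learned table, the number of expansions in each round, and the last path found. The first round has no information (all estimates are zero) and behaves like uniform-cost search; later rounds search with whatever the backups have learned. A neural heuristic such as an RGNN would replace the table so that estimates can transfer to unseen instances, which a lookup table cannot do.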
