Self-Improving WA* Learning with Relational Graph Neural Networks for Planning
A new paper on arXiv (2605.25720) proposes a self-improving WA* (weighted A*) learning framework that pairs best-first search with a Relational Graph Neural Network (RGNN) value heuristic to address combinatorial generalization in Deep Reinforcement Learning (DRL). The approach uses best-first search methods like A* to solve planning problems from scratch, without relying on expert demonstrations or random walks. The heuristic guides the search, and the data produced by the search is used to update the heuristic via Q-learning. This loop yields general policies that can solve new instances. The work targets sparse-reward planning domains, where standard RL exploration is ineffective.
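To make the search side of this loop concrete, here is a minimal sketch of weighted A*, which orders nodes by f = g + w * h. It is not the paper's implementation: the function name `weighted_a_star`, the weight `w=2.0`, the hand-written heuristic, and the toy number-line domain are all assumptions chosen for illustration. In the framework described above, the heuristic `h` would instead come from the learned RGNN.

```python
import heapq
from itertools import count


def weighted_a_star(start, is_goal, successors, h, w=2.0):
    """Weighted A*: expand nodes in order of f = g + w * h.

    Returns (plan, expanded): the state sequence from start to a goal
    (or None if no goal is reachable) and the list of expanded states.
    All names here are hypothetical, not taken from the paper.
    """
    tie = count()  # tie-breaker so the heap never compares states directly
    g = {start: 0.0}
    parent = {start: None}
    open_list = [(w * h(start), next(tie), 0.0, start)]
    expanded = []
    while open_list:
        _, _, g_s, s = heapq.heappop(open_list)
        if g_s > g[s]:
            continue  # stale entry: a cheaper path to s was found later
        expanded.append(s)
        if is_goal(s):
            plan = []
            while s is not None:
                plan.append(s)
                s = parent[s]
            return plan[::-1], expanded
        for t, cost in successors(s):
            new_g = g_s + cost
            if new_g < g.get(t, float("inf")):
                g[t] = new_g
                parent[t] = s
                heapq.heappush(open_list, (new_g + w * h(t), next(tie), new_g, t))
    return None, expanded


# Toy domain (an assumption): walk along the integers from 0 to 5,
# unit-cost steps left or right, bounded to [-10, 10].
GOAL = 5


def line_successors(s):
    return [(t, 1.0) for t in (s + 1, s - 1) if -10 <= t <= 10]


plan, expanded = weighted_a_star(
    start=0,
    is_goal=lambda s: s == GOAL,
    successors=line_successors,
    h=lambda s: float(abs(GOAL - s)),  # stand-in for a learned heuristic
)
print(plan)
```

The expanded states and the plan returned by the search are the kind of data the article says is fed back into heuristic learning.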
Key facts
- arXiv paper 2605.25720 proposes a self-improving WA* learning framework with RGNN
- Addresses combinatorial generalization in Deep Reinforcement Learning
- Uses best-first search methods like A* for planning
- Heuristic guides search, search data updates heuristic via Q-learning
- Learns from scratch, without expert demonstrations or random walks, and aims at general policies that solve new instances
- Focuses on sparse-reward domains in planning
- Relational Graph Neural Network represents the value heuristic
- Announced on arXiv as a new submission
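The feedback step in the list above (search data updates the heuristic) can be illustrated with a small tabular sketch. This is only a stand-in under stated assumptions: the paper trains an RGNN with Q-learning, whereas the code below keeps a plain table of cost-to-go estimates and applies a Bellman-style backup to each state visited by the search. The function name `update_heuristic`, the learning rate, and the four-state chain domain are hypothetical.

```python
from collections import defaultdict


def update_heuristic(h, visited, is_goal, successors, lr=1.0):
    """One sweep of Bellman-style backups over states seen during search.

    h is a mutable table of cost-to-go estimates (a tabular stand-in for
    a learned value heuristic; an assumption, not the paper's RGNN).
    """
    for s in visited:
        if is_goal(s):
            h[s] = 0.0  # no cost remains once the goal is reached
            continue
        # Target: cheapest one-step lookahead using current estimates.
        target = min(cost + h[t] for t, cost in successors(s))
        h[s] += lr * (target - h[s])


# Toy chain (an assumption): states 0 -> 1 -> 2 -> 3, unit costs, goal is 3.
def chain_successors(s):
    return [(s + 1, 1.0)] if s < 3 else []


h_table = defaultdict(float)  # every estimate starts at 0
visited_states = [0, 1, 2, 3]  # stands in for states expanded by a search

for _ in range(5):  # repeated sweeps let values propagate back from the goal
    update_heuristic(h_table, visited_states, lambda s: s == 3, chain_successors)

print(dict(h_table))
```

In this toy setting the estimates settle at the true remaining cost for each state. A learned, relational heuristic is what would let such estimates carry over to unseen instances, which a lookup table cannot do.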
Entities
Institutions
- arXiv