V-VLAPS: Value-Guided Planning Improves VLA Models for Robotics
Researchers introduce V-VLAPS (Value-Guided Vision-Language-Action Planning and Search), a method that enhances vision-language-action (VLA) models for robotic manipulation by adding a lightweight value head trained on offline rollouts. The value head predicts Monte Carlo returns, and those predictions steer Monte Carlo Tree Search (MCTS) toward higher-value branches. The approach targets failures of reactive VLA policies under distribution shift and on long-horizon tasks, where prior planning methods relied on policy priors and visit-count exploration and had no learned value signal to correct poor policy actions. The work builds on findings that VLA representations encode rollout success and failure, which makes value estimation during planning feasible.
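To make "predicts Monte Carlo returns" concrete, here is a minimal sketch of how regression targets for such a value head could be computed from one offline rollout. The sparse reward (1.0 on success at the final step, 0.0 elsewhere), the discount factor, and the function name are illustrative assumptions; the summary does not state the paper's reward definition or discounting.

```python
from typing import List


def monte_carlo_returns(rewards: List[float], gamma: float = 0.99) -> List[float]:
    """Discounted return-to-go for every step of a single rollout.

    Uses the backward recursion G_t = r_t + gamma * G_{t+1}.
    The reward scheme and gamma are assumptions made for illustration.
    """
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk the rollout backward so each step reuses the next step's return.
    for t in range(len(rewards) - 1, -1, -1):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


# Hypothetical successful rollout of three steps with a sparse terminal reward.
targets = monte_carlo_returns([0.0, 0.0, 1.0], gamma=0.5)
print(targets)
```

Each entry of `targets` would be paired with the VLA representation at that step as a supervised label, so a failed rollout (all-zero rewards) yields all-zero targets under this assumed reward scheme.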
Key facts
- V-VLAPS augments VLA-guided planning with a lightweight value head
- Value head is trained on offline VLA rollouts to predict Monte Carlo returns
- Predictions guide Monte Carlo Tree Search toward higher-value branches
- Reactive VLA policies can fail under distribution shift and on long-horizon tasks
- Prior planning methods lacked learned value signals to correct poor policy actions
- VLA representations encode rollout success and failure information
- Method aims to improve robotic manipulation execution
- Paper available on arXiv with ID 2601.00969
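
The difference between prior-and-visit-count search and value-guided search can be sketched with a generic PUCT-style selection rule. This is an AlphaZero-style formula used here as a stand-in, not the paper's confirmed scoring rule; the `Node` class, `c_puct`, and the example numbers are all hypothetical.

```python
import math
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    prior: float                 # policy prior for the action leading here (assumed to come from the VLA)
    visits: int = 0
    value_sum: float = 0.0       # sum of value-head estimates backed up through this node
    children: Dict[str, "Node"] = field(default_factory=dict)

    def mean_value(self) -> float:
        return self.value_sum / self.visits if self.visits else 0.0


def puct_score(parent: Node, child: Node, c_puct: float = 1.0) -> float:
    # Exploration term: policy prior scaled by visit counts.
    exploration = c_puct * child.prior * math.sqrt(parent.visits) / (1 + child.visits)
    # Exploitation term: mean learned value, which lets search override a poor prior.
    return child.mean_value() + exploration


def select_child(parent: Node, c_puct: float = 1.0) -> str:
    return max(parent.children, key=lambda a: puct_score(parent, parent.children[a], c_puct))


def backup(path: List[Node], value: float) -> None:
    # Propagate a leaf's value estimate to every node on the path from the root.
    for node in path:
        node.visits += 1
        node.value_sum += value


# Hypothetical tree: the prior prefers "a", but backed-up values favor "b".
root = Node(prior=1.0, visits=4)
root.children["a"] = Node(prior=0.6, visits=2, value_sum=0.2)
root.children["b"] = Node(prior=0.4, visits=2, value_sum=1.8)
print(select_child(root))
```

In this toy tree the policy prior ranks action "a" first, yet the value term moves selection to "b", which is the kind of correction a search driven only by priors and visit counts cannot make.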
Entities
Institutions
- arXiv