GAP: Geometric Anchor Pre-training Boosts Robot Learning from Few Demonstrations
Researchers have introduced Geometric Anchor Pre-training (GAP), a method for improving the data efficiency of visuomotor policy learning in robotic manipulation. GAP targets the setting where expert demonstrations are scarce. Pre-trained Vision Foundation Models (VFMs) improve data efficiency, but they shift the burden of adaptation onto a small spatial pooling module that adapts VFM features for control. Fine-tuned on only a few samples, this adapter can overfit to task-irrelevant shortcuts and lose its geometric grounding. GAP regularizes the adapter by pre-training the pooling layer on a lightweight simulated proxy task before downstream imitation learning, which instills robustness-oriented inductive biases. The method is simple and action-free, and it serves as a warm-up stage. The paper is available on arXiv under ID 2605.15836.
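To make the idea of a spatial pooling module concrete, here is a minimal sketch of one common form, spatial soft-argmax, which reduces a 2D activation map to a single expected (x, y) coordinate. This is an illustrative assumption only: the summary does not say which pooling design GAP uses or what its proxy task looks like, and the function name `spatial_soft_argmax` and its `temperature` parameter are hypothetical.

```python
import math


def spatial_soft_argmax(feature_map, temperature=1.0):
    """Pool an H x W activation map into one expected (x, y) in [-1, 1].

    Hypothetical illustration of a spatial pooling module; not the
    pooling layer described in the GAP paper.
    """
    height = len(feature_map)
    width = len(feature_map[0])

    # Numerically stable softmax over all spatial locations.
    peak = max(max(row) for row in feature_map)
    weights = [
        [math.exp((value - peak) / temperature) for value in row]
        for row in feature_map
    ]
    total = sum(sum(row) for row in weights)

    def coord(index, size):
        # Map a pixel index in 0..size-1 to a normalized coordinate in [-1, 1].
        return 0.0 if size == 1 else 2.0 * index / (size - 1) - 1.0

    # Expected coordinates under the softmax distribution.
    x = sum(
        w * coord(col, width) for row in weights for col, w in enumerate(row)
    ) / total
    y = sum(
        w * coord(r, height) for r, row in enumerate(weights) for w in row
    ) / total
    return x, y


if __name__ == "__main__":
    # A map with one strong activation in the top-right corner.
    activations = [[0.0, 0.0, 50.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
    print(spatial_soft_argmax(activations))
```

Because the output of such a layer is an explicit image coordinate, it is easy to see how a few demonstrations could pull it toward a visually salient but task-irrelevant region, which is the failure mode the warm-up stage is meant to guard against.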
Key facts
- GAP stands for Geometric Anchor Pre-training
- GAP is a warm-up stage for visuomotor policy learning
- It regularizes the spatial adapter before imitation learning
- GAP pre-trains on a lightweight simulated proxy task
- It addresses overfitting from scarce expert demonstrations
- Pre-trained VFMs improve data efficiency but shift adaptation to a small pooling module
- The pooling module can latch onto task-irrelevant shortcuts
- The paper is available on arXiv with ID 2605.15836
Entities
Institutions
- arXiv