GAP: Geometric Anchor Pre-training Boosts Robot Learning from Few Demonstrations

ai-technology · 2026-05-18

Researchers have introduced Geometric Anchor Pre-training (GAP), a method for improving data efficiency in visuomotor policy learning for robotic manipulation. Pre-trained Vision Foundation Models (VFMs) already improve data efficiency, but they shift the burden of adaptation onto a small spatial pooling module that adapts VFM features for control. When expert demonstrations are scarce, this adapter can overfit to task-irrelevant shortcuts and lose its geometric grounding. GAP regularizes the module before downstream imitation learning: it pre-trains the pooling layer on a lightweight simulated proxy task, which instills robustness-oriented inductive biases. The method is simple and action-free, and it serves as a warm-up stage. The paper is available on arXiv under ID 2605.15836.

Key facts

GAP stands for Geometric Anchor Pre-training
GAP is a warm-up stage for visuomotor policy learning
It regularizes the spatial adapter before imitation learning
GAP pre-trains on a lightweight simulated proxy task
It addresses overfitting from scarce expert demonstrations
Pre-trained VFMs improve data efficiency but shift adaptation to a small pooling module
The pooling module can latch onto task-irrelevant shortcuts
The paper is available on arXiv with ID 2605.15836
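
Illustrative sketch

The warm-up idea can be illustrated with a toy example. This is a minimal sketch under stated assumptions, not the paper's implementation: the summary does not describe GAP's actual proxy task or pooling architecture, so the synthetic localization task, the single-query attention pooling, and every name and constant here (make_scene, pool, warm_up, GRID, DIM) are hypothetical. Random feature grids stand in for frozen VFM patch features, and only the pooling query is trained, with no actions involved.

```python
import math
import random

GRID = 4   # hypothetical 4x4 grid of patch tokens
DIM = 3    # hypothetical feature dimension

# Patch centers in normalized [-1, 1] image coordinates.
COORDS = [((2 * c + 1) / GRID - 1.0, (2 * r + 1) / GRID - 1.0)
          for r in range(GRID) for c in range(GRID)]


def make_scene(rng):
    """Stand-in for frozen backbone features of one simulated image."""
    feats = [[rng.gauss(0.0, 0.1) for _ in range(DIM)] for _ in COORDS]
    target = rng.randrange(len(COORDS))
    feats[target][0] += 3.0  # assumption: channel 0 marks the object
    return feats, COORDS[target]


def pool(feats, query):
    """Attention pooling: softmax over patches, then the expected (x, y)."""
    scores = [sum(f * q for f, q in zip(feat, query)) for feat in feats]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    x = sum(w * c[0] for w, c in zip(weights, COORDS))
    y = sum(w * c[1] for w, c in zip(weights, COORDS))
    return (x, y), weights


def loss_and_grad(feats, target_xy, query):
    """Squared localization error and its gradient w.r.t. the query."""
    (x, y), weights = pool(feats, query)
    ex, ey = x - target_xy[0], y - target_xy[1]
    grad = [0.0] * DIM
    for feat, w, (cx, cy) in zip(feats, weights, COORDS):
        # d(loss)/d(score_j) = 2 * w_j * <pred - target, c_j - pred>
        d_score = 2.0 * w * (ex * (cx - x) + ey * (cy - y))
        for k in range(DIM):
            grad[k] += d_score * feat[k]
    return ex * ex + ey * ey, grad


def mean_loss(scenes, query):
    return sum(loss_and_grad(f, t, query)[0] for f, t in scenes) / len(scenes)


def warm_up(steps=300, lr=0.5, seed=0):
    """Action-free warm-up: plain SGD on the pooling query only."""
    rng = random.Random(seed)
    held_out = [make_scene(rng) for _ in range(50)]
    query = [0.0] * DIM  # zero query gives uniform pooling
    before = mean_loss(held_out, query)
    for _ in range(steps):
        feats, target_xy = make_scene(rng)
        _, grad = loss_and_grad(feats, target_xy, query)
        query = [q - lr * g for q, g in zip(query, grad)]
    after = mean_loss(held_out, query)
    return query, before, after


if __name__ == "__main__":
    query, before, after = warm_up()
    print(f"held-out localization error: {before:.3f} -> {after:.3f}")
```

In a real pipeline, the warmed-up pooling parameters would then initialize the adapter for imitation learning on the few available demonstrations. That stage is omitted here because the summary gives no details about it.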
