ARTFEED — Contemporary Art Intelligence

VGAS: Value-Guided Action-Chunk Selection for Few-Shot VLA Adaptation

ai-technology · 2026-05-25

Researchers have proposed VGAS (Value-Guided Action-chunk Selection), a framework for few-shot adaptation of Vision-Language-Action (VLA) models. VLA models bridge multimodal reasoning and physical control, but adapting them to new tasks from only a few demonstrations is unreliable because of geometric ambiguities. VGAS splits the problem in two: a fine-tuned VLA serves as a high-recall proposal generator, and a Transformer critic called Q-Chunk-Former picks the most geometrically precise action chunk from those proposals through best-of-N selection at inference time. The stated goal is to improve both semantic faithfulness and geometric precision. The paper is available on arXiv (arXiv:2602.07399).
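The generate-then-select pattern described above can be illustrated with a minimal Python sketch. This is not the paper's implementation: `select_best_chunk`, `make_toy_proposer`, and `toy_critic` are hypothetical names, the proposer is a random sampler standing in for the fine-tuned VLA, and the critic is a simple distance-based score standing in for Q-Chunk-Former. Only the best-of-N control flow is meant to carry over.

```python
import random
from typing import Callable, List

# An action chunk is modeled here as a short sequence of action vectors.
ActionChunk = List[List[float]]


def select_best_chunk(
    propose: Callable[[], ActionChunk],
    score: Callable[[ActionChunk], float],
    n: int,
) -> ActionChunk:
    """Sample n candidate chunks and return the one the critic scores highest."""
    if n < 1:
        raise ValueError("n must be at least 1")
    candidates = [propose() for _ in range(n)]
    return max(candidates, key=score)


def make_toy_proposer(
    rng: random.Random, horizon: int = 4, dim: int = 2, noise: float = 0.5
) -> Callable[[], ActionChunk]:
    """Assumed stand-in for the proposal generator: noisy chunks around 1.0."""

    def propose() -> ActionChunk:
        return [[rng.gauss(1.0, noise) for _ in range(dim)] for _ in range(horizon)]

    return propose


def toy_critic(chunk: ActionChunk) -> float:
    """Assumed stand-in for the critic: higher is better, and the score is the
    negative squared distance of every action value from the target 1.0."""
    return -sum((a - 1.0) ** 2 for step in chunk for a in step)


rng = random.Random(0)
propose = make_toy_proposer(rng)
best = select_best_chunk(propose, toy_critic, n=16)
```

In this toy setup, a larger `n` gives the critic more candidates to choose from, at the cost of more proposer and critic calls per decision step.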

Key facts

  • VGAS stands for Value-Guided Action-chunk Selection.
  • It targets few-shot adaptation of Vision-Language-Action (VLA) models.
  • VLA models bridge multimodal reasoning with physical control.
  • Adaptation with scarce demonstrations is unreliable due to geometric ambiguities.
  • VGAS uses a fine-tuned VLA as a high-recall proposal generator.
  • It employs a Transformer critic called Q-Chunk-Former.
  • Action chunks are chosen by best-of-N selection at inference time.
  • The paper is on arXiv with ID 2602.07399.

Entities

Institutions

  • arXiv
