Hide-and-Seek Framework Detects VLA Robot Failures from Trajectory Labels
Researchers propose Hide-and-Seek, a framework for runtime failure detection in Vision-Language-Action (VLA) models. VLA models let robots follow natural-language instructions but are prone to execution failures. Existing detection methods either rely on expensive resampling or external models, or propagate trajectory-level labels uniformly across a trajectory, which obscures localized failure signals. Hide-and-Seek instead formulates failure detection as a coarsely supervised learning problem with inter- and intra-trajectory contrastive objectives, localizing failure-indicative actions from trajectory-level supervision alone, without step-level annotations. The paper is available on arXiv.
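To make the idea of inter- and intra-trajectory contrast concrete, here is a minimal sketch in plain Python. It is not the paper's implementation: the summary does not give the actual loss functions, so the top-k hinge formulation, the function names, and the parameters `k` and `margin` below are all illustrative assumptions. The sketch assumes a model has already produced one failure score per action step, and that only a trajectory-level label (failed or succeeded) is known.

```python
# Illustrative sketch only: the hinge/top-k losses below are assumptions,
# not the objectives defined in the Hide-and-Seek paper.

def topk_mean(scores, k):
    """Mean of the k highest per-step scores (scores must be non-empty)."""
    top = sorted(scores, reverse=True)[:k]
    return sum(top) / len(top)

def bottomk_mean(scores, k):
    """Mean of the k lowest per-step scores (scores must be non-empty)."""
    low = sorted(scores)[:k]
    return sum(low) / len(low)

def inter_trajectory_loss(fail_scores, success_scores, k=2, margin=0.5):
    """Hypothetical inter-trajectory term: the most suspicious steps of a
    failed rollout should outscore those of a successful rollout by `margin`."""
    gap = topk_mean(fail_scores, k) - topk_mean(success_scores, k)
    return max(0.0, margin - gap)

def intra_trajectory_loss(fail_scores, k=2, margin=0.5):
    """Hypothetical intra-trajectory term: inside a failed rollout, the most
    suspicious steps should stand out from the least suspicious ones."""
    gap = topk_mean(fail_scores, k) - bottomk_mean(fail_scores, k)
    return max(0.0, margin - gap)

def localize(scores, k=2):
    """Indices of the k highest-scoring steps, returned in time order."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])

if __name__ == "__main__":
    # Made-up per-step scores for one failed and one successful rollout.
    failed = [0.1, 0.2, 0.9, 0.8, 0.1]
    succeeded = [0.1, 0.2, 0.1, 0.15, 0.1]
    print(inter_trajectory_loss(failed, succeeded))
    print(intra_trajectory_loss(failed))
    print(localize(failed))
```

Under these assumptions, both terms use only the trajectory-level label to decide which rollout is "failed", yet the gradient of a top-k hinge would concentrate on a few steps rather than being spread uniformly over the whole trajectory. That is the general intuition behind localizing failure-indicative actions without step-level annotations.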
Key facts
- Hide-and-Seek is a framework for VLA failure detection.
- It uses coarsely supervised learning with contrastive objectives.
- No step-level annotations are required.
- VLA models are vulnerable to execution failures.
- Existing methods rely on expensive resampling or external models.
- Uniformly propagating trajectory-level labels obscures localized failure signals.
- The approach localizes failure-indicative actions.
- The paper is published on arXiv with ID 2605.30834.
Entities
Institutions
- arXiv