Capability-Oriented Failure Attribution for VLN Agents
This paper proposes a capability-oriented testing strategy for Vision-Language Navigation (VLN) agents that both detects failures and attributes them to the capability responsible. It combines three components: adaptive test case generation through seed selection and mutation, capability oracles that detect capability-specific errors, and a feedback mechanism that attributes each failure to a capability and uses that attribution to guide further test generation. In the reported experiments, the approach uncovers more failure cases and localizes capability-level deficiencies more precisely than state-of-the-art baselines, giving clearer and more actionable guidance for improving embodied agents in safety-critical settings.
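The loop described above (seed selection, mutation, capability oracles, failure feedback) can be sketched roughly as follows. This is a toy illustration under stated assumptions, not the paper's implementation: the names `TestCase`, `mutate`, `toy_agent`, and `run_campaign` are hypothetical, the agent is a stand-in that always fails on memory and planning, and the additive weight update is an assumed feedback rule chosen only for simplicity.

```python
import random
from collections import Counter
from dataclasses import dataclass

# The four capabilities named in the summary.
CAPABILITIES = ("perception", "memory", "planning", "decision")


@dataclass(frozen=True)
class TestCase:
    instruction: str
    target: str  # capability this test case is meant to stress


def mutate(seed: TestCase, capability: str, rng: random.Random) -> TestCase:
    """Hypothetical mutation: tag the seed instruction with a perturbation label."""
    tag = rng.randint(0, 999)
    return TestCase(f"{seed.instruction} [{capability}-perturbed #{tag}]", capability)


def toy_agent(case: TestCase) -> set:
    """Stand-in for running a VLN agent and applying capability oracles.

    Returns the set of capabilities whose oracle reports a violation.
    This fake agent is weak at memory and planning only (an assumption).
    """
    return {case.target} if case.target in ("memory", "planning") else set()


def run_campaign(seeds, agent, budget: int, rng: random.Random) -> Counter:
    """Generate tests adaptively and attribute each failure to a capability."""
    failures = Counter()
    weights = {c: 1.0 for c in CAPABILITIES}
    for _ in range(budget):
        seed = rng.choice(seeds)  # seed selection
        capability = rng.choices(
            CAPABILITIES, weights=[weights[c] for c in CAPABILITIES]
        )[0]
        case = mutate(seed, capability, rng)  # mutation
        for violated in agent(case):  # oracle verdicts
            failures[violated] += 1  # attribution
            weights[violated] += 1.0  # feedback: test weak capabilities more often
    return failures


if __name__ == "__main__":
    seeds = [TestCase("Walk past the sofa and stop at the door.", "perception")]
    result = run_campaign(seeds, toy_agent, budget=200, rng=random.Random(0))
    for capability in CAPABILITIES:
        print(capability, result[capability])
```

In this sketch the feedback step simply raises the sampling weight of any capability that has just failed, so later test cases concentrate on the capabilities where the agent is weakest. A real system would replace `toy_agent` with an actual simulator rollout and per-capability oracle checks.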
Key facts
- arXiv:2604.25161v1
- Proposes capability-oriented testing for VLN agents
- Combines adaptive test case generation, capability oracles, and a feedback mechanism
- Outperforms state-of-the-art baselines in failure detection and attribution
- Focuses on safety-critical applications
- Capabilities include perception, memory, planning, and decision
Entities
Institutions
- arXiv