Study Reveals Reasoning Fragility in Autonomous Driving VLAs Under Sensor Perturbations
A recent study published on arXiv (2605.21446) examines how robust Vision-Language-Action (VLA) models for autonomous driving are to real-world sensor impairments. The researchers ran a systematic perturbation analysis on Alpamayo R1, a 10B-parameter model, across 1,996 scenarios. They applied eight sensor perturbations (Gaussian noise at four levels, two lighting extremes, and two fog conditions), for nearly 18,000 inference runs. The main finding is that reasoning consistency is a strong predictor of trajectory accuracy: when a perturbation changes the model's Chain-of-Causation (CoC) explanation, trajectory deviation is 5.3 times higher (21.8 m versus 4.1 m), with a correlation of r=0.99 across attack types and a per-sample point-biserial correlation of r_pb=0.53 (Cohen's d=1.12). A controlled ablation further suggests that enabling CoC generation improves trajectory accuracy by 11.8%.
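The eight perturbations described above can be illustrated with a minimal Python sketch. This is an assumption-laden illustration, not the study's pipeline: the function names, the flat list-of-floats image representation (pixel values in [0, 1]), and every numeric setting (noise sigmas, brightness factors, fog densities) are hypothetical, since the summary does not report the actual parameters.

```python
import random

def clamp01(x):
    """Keep a pixel value inside the valid [0, 1] range."""
    return max(0.0, min(1.0, x))

def add_gaussian_noise(pixels, sigma, rng):
    """Add zero-mean Gaussian noise with standard deviation sigma."""
    return [clamp01(p + rng.gauss(0.0, sigma)) for p in pixels]

def scale_brightness(pixels, factor):
    """Simulate a lighting extreme: factor < 1 darkens, factor > 1 brightens."""
    return [clamp01(p * factor) for p in pixels]

def add_fog(pixels, density, fog_value=0.8):
    """Blend each pixel toward a uniform fog colour; density is in [0, 1]."""
    return [clamp01((1.0 - density) * p + density * fog_value) for p in pixels]

# Hypothetical settings: four noise levels, two lighting extremes, two fog
# conditions. The real values used in the study are not given in the summary.
PERTURBATIONS = {
    "noise_l1": lambda px, rng: add_gaussian_noise(px, 0.02, rng),
    "noise_l2": lambda px, rng: add_gaussian_noise(px, 0.05, rng),
    "noise_l3": lambda px, rng: add_gaussian_noise(px, 0.10, rng),
    "noise_l4": lambda px, rng: add_gaussian_noise(px, 0.20, rng),
    "dark": lambda px, rng: scale_brightness(px, 0.3),
    "bright": lambda px, rng: scale_brightness(px, 1.8),
    "fog_light": lambda px, rng: add_fog(px, 0.3),
    "fog_heavy": lambda px, rng: add_fog(px, 0.7),
}

def perturb_all(pixels, seed=0):
    """Return one perturbed copy of the frame per perturbation type."""
    rng = random.Random(seed)
    return {name: fn(pixels, rng) for name, fn in PERTURBATIONS.items()}

# Toy 4-pixel "frame", used only to show the call pattern.
frame = [0.1, 0.4, 0.6, 0.9]
variants = perturb_all(frame)
```

In an evaluation like the one described, each perturbed frame would be passed to the model alongside the clean frame so that both the explanation and the trajectory can be compared against the unperturbed output.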
Key facts
- Study evaluates Alpamayo R1 (10B parameters) VLA model for autonomous driving
- 1,996 scenarios tested under 8 sensor perturbations
- Approximately 18,000 inference trials conducted
- Reasoning consistency is a high-fidelity indicator of trajectory reliability
- Trajectory deviation spikes 5.3x when CoC explanations change
- Correlation coefficient r=0.99 across attack types
- Per-sample point-biserial correlation r_pb=0.53, with effect size Cohen's d=1.12
- Enabling CoC generation improves trajectory accuracy by 11.8%
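For readers who want to see how the reported statistics are defined, here is a minimal sketch using only the Python standard library. The 5.3x ratio follows directly from the two reported deviations. The run count assumes one clean baseline run per scenario in addition to the eight perturbations, which is consistent with "approximately 18,000" but is not stated in the summary. The per-sample data below is a toy example invented for illustration, not data from the study, and the function names are hypothetical.

```python
import math
from statistics import mean, pstdev, stdev

# Reported figures: 21.8 m deviation when the CoC explanation changes, 4.1 m otherwise.
deviation_ratio = 21.8 / 4.1  # about 5.3

# Assumption: 8 perturbed runs plus 1 clean baseline run per scenario.
total_runs = 1996 * (8 + 1)  # 17,964

def cohens_d(group_a, group_b):
    """Standardised mean difference using the pooled sample standard deviation."""
    na, nb = len(group_a), len(group_b)
    var_a, var_b = stdev(group_a) ** 2, stdev(group_b) ** 2
    pooled = math.sqrt(((na - 1) * var_a + (nb - 1) * var_b) / (na + nb - 2))
    return (mean(group_a) - mean(group_b)) / pooled

def point_biserial(flags, values):
    """Correlation between a binary flag and a continuous value.

    Equivalent to the Pearson correlation when one variable is coded 0/1.
    """
    g1 = [v for f, v in zip(flags, values) if f]
    g0 = [v for f, v in zip(flags, values) if not f]
    n1, n0, n = len(g1), len(g0), len(values)
    return (mean(g1) - mean(g0)) / pstdev(values) * math.sqrt(n1 * n0 / n ** 2)

# Toy data: 1 = CoC explanation changed under perturbation, 0 = unchanged.
coc_changed = [1, 1, 1, 0, 0, 0]
deviation_m = [20.0, 25.0, 18.0, 4.0, 5.0, 3.0]

changed = [d for f, d in zip(coc_changed, deviation_m) if f]
unchanged = [d for f, d in zip(coc_changed, deviation_m) if not f]

r_pb = point_biserial(coc_changed, deviation_m)
d = cohens_d(changed, unchanged)
```

The toy values are far more cleanly separated than real driving data would be, so the resulting r_pb and d are much larger than the study's 0.53 and 1.12; the sketch only shows how a binary "explanation changed" flag is related to a continuous trajectory deviation.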
Entities
Institutions
- arXiv