Temporal Grounding in LLM-Based Autonomous Vehicle Planning
A recent arXiv preprint (2605.19824) examines how temporal grounding affects autonomous vehicle (AV) reasoning in the step from scene analysis to planning, using ensembles of large language models (LLMs) and large multimodal models (LMMs). The authors argue that existing AV systems treat time as a secondary concern, which leads to flawed reasoning about ongoing actions and ultimately compromises safety and clarity. They propose three planner architectures with increasing degrees of temporal integration and evaluate them on curated subsets of the BDD-X dataset using semantic, syntactic, and logical metrics. The results indicate that temporal conditioning changes the style of reasoning but does not produce statistically significant improvements on conventional NLP correctness metrics.
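A claim that an improvement is not significant usually rests on comparing per-sample scores from two systems on the same test clips. The summary does not say which statistical test the authors used, so the following is only a minimal sketch of one common choice, a paired sign-flip permutation test. The function name, the parameters, and the example scores are hypothetical illustrations and are not taken from the preprint.

```python
import random
from statistics import mean


def paired_permutation_test(baseline, temporal, n_resamples=10_000, seed=0):
    """Two-sided sign-flip permutation test on paired per-sample scores.

    Hypothetical helper: `baseline` and `temporal` hold one score per test
    clip for a planner without and with temporal conditioning.
    Returns (mean difference, estimated p-value).
    """
    if len(baseline) != len(temporal):
        raise ValueError("score lists must be paired and equal in length")
    if not baseline:
        raise ValueError("need at least one paired score")

    diffs = [t - b for b, t in zip(baseline, temporal)]
    observed = mean(diffs)
    rng = random.Random(seed)  # fixed seed so the estimate is reproducible

    extreme = 0
    for _ in range(n_resamples):
        # Under the null hypothesis the two planners are interchangeable,
        # so each paired difference is equally likely to have either sign.
        flipped = [d if rng.random() < 0.5 else -d for d in diffs]
        # Small tolerance guards against floating-point rounding in the mean.
        if abs(mean(flipped)) >= abs(observed) - 1e-12:
            extreme += 1

    # Add-one smoothing keeps the estimated p-value strictly positive.
    p_value = (extreme + 1) / (n_resamples + 1)
    return observed, p_value


if __name__ == "__main__":
    # Illustrative numbers only; these are NOT results from the preprint.
    baseline_scores = [0.61, 0.58, 0.70, 0.66, 0.59, 0.63, 0.68, 0.60]
    temporal_scores = [0.63, 0.57, 0.71, 0.65, 0.60, 0.64, 0.67, 0.62]
    diff, p = paired_permutation_test(baseline_scores, temporal_scores)
    print(f"mean difference = {diff:.4f}, p = {p:.4f}")
```

A large p-value from such a test would match the reported pattern: the outputs may read differently with temporal conditioning, yet the scores do not separate the two planners beyond what chance pairing would produce.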
Key facts
- arXiv preprint 2605.19824
- Focus on temporal grounding in AV scene-to-plan reasoning
- Uses ensembles of LLMs and LMMs
- Proposes three planner architectures with increasing temporal integration
- Evaluated on curated subsets of BDD-X dataset
- Metrics: semantic, syntactic, logical
- Temporal conditioning reshapes reasoning style
- No statistically significant improvements in correctness metrics
Entities
Institutions
- arXiv