New Method Detects Hallucinations in LLM Reasoning Steps
Researchers propose a method for detecting hallucinations in large language models during multi-step reasoning by analyzing hidden-state trajectories. A label-conditioned teacher builds a contrastive-PCA lens over those trajectories, and a BiLSTM student is trained for deployment. The method pinpoints the first error as a localized excursion in transport cost away from a stable manifold of coherent transitions.
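The paper's exact lens construction is not reproduced here; the following is a minimal sketch of standard contrastive PCA, assuming the label-conditioned teacher contrasts hidden states drawn from hallucinated and faithful reasoning steps. All names (`contrastive_pca_lens`, `h_halluc`, `h_faithful`, `alpha`) are illustrative, not from the paper.

```python
import numpy as np

def contrastive_pca_lens(h_halluc, h_faithful, k=8, alpha=1.0):
    """Top-k directions along which hallucinated-step hidden states
    vary more than faithful-step ones (standard contrastive PCA)."""
    a = h_halluc - h_halluc.mean(axis=0)      # (n_pos, d_hidden)
    b = h_faithful - h_faithful.mean(axis=0)  # (n_neg, d_hidden)
    cov_a = a.T @ a / (len(a) - 1)
    cov_b = b.T @ b / (len(b) - 1)
    # Contrast the covariances; the difference is symmetric, so eigh applies.
    evals, evecs = np.linalg.eigh(cov_a - alpha * cov_b)
    order = np.argsort(evals)[::-1]
    return evecs[:, order[:k]]                # (d_hidden, k) projection lens

# A trajectory of per-step hidden states is then projected through the lens:
# z = trajectory @ lens                       # (n_steps, k)
```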
Key facts
- arXiv:2605.13772v1
- Hallucination detection at step level
- Hidden-state trajectory analysis
- Contrastive PCA lens
- BiLSTM student model (sketched after this list)
- Single forward pass required
- Transport-separation objective
- First-error localization
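The student's architecture and threshold are not specified in this summary, so the sketch below assumes a small bidirectional LSTM that scores every step of a lens-projected trajectory in one forward pass, with the first error read off as the earliest score crossing a threshold, standing in for the paper's transport-cost excursion. `BiLSTMStudent`, `first_error_step`, `d_in`, and `tau` are hypothetical.

```python
import torch
import torch.nn as nn

class BiLSTMStudent(nn.Module):
    """Scores each reasoning step from the lens-projected hidden-state
    trajectory in a single forward pass (hypothetical student)."""
    def __init__(self, d_in=8, d_hid=64):
        super().__init__()
        self.rnn = nn.LSTM(d_in, d_hid, batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * d_hid, 1)

    def forward(self, z):                  # z: (batch, n_steps, d_in)
        out, _ = self.rnn(z)               # (batch, n_steps, 2 * d_hid)
        return self.head(out).squeeze(-1)  # per-step scores

def first_error_step(scores, tau=0.0):
    """Earliest step whose score exceeds tau, read as the first localized
    excursion from the coherent-transition regime; -1 if none."""
    hits = (scores > tau).nonzero(as_tuple=True)[0]
    return int(hits[0]) if hits.numel() else -1
```

Scoring every step rather than the sequence as a whole is what makes single-pass first-error localization possible: one forward pass yields the full score trajectory, and localization reduces to a threshold scan.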