New Method Detects Hallucinations in Speech Large Language Models Using Attention Metrics
A recent study introduces an inference-time technique for detecting hallucinations in Speech Large Language Models (SpeechLLMs) by examining their attention patterns. The method computes four attention-derived metrics (AUDIORATIO, AUDIOCONSISTENCY, AUDIOENTROPY, and TEXTENTROPY) that flag the pathological attention behavior associated with hallucinated output. Lightweight logistic regression classifiers trained on these features make detection cheap at inference time, with no need for expensive gold-standard reference outputs.

Evaluations on Qwen-2-Audio and Voxtral-3B across automatic speech recognition (ASR) and speech-to-text translation tasks show the technique outperforming both uncertainty-based and prior attention-based baselines on in-domain data, with gains of up to +0.23 PR-AUC, and it also generalizes to out-of-domain ASR settings. This addresses a gap in SpeechLLMs: audio-specific hallucination signals evade detection methods designed for text-only LLMs. The paper is available on arXiv (arXiv:2604.19565v1) as a cross-listing, underscoring its practical relevance for real-time hallucination detection in speech models.
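To make the four metrics concrete, here is a minimal sketch of how attention-derived features of this kind can be computed from one decoding step's attention weights. The exact definitions in the paper are not given in this summary, so the formulas below are plausible assumptions: AUDIORATIO as the share of attention mass on audio positions, AUDIOENTROPY and TEXTENTROPY as entropies of the (renormalized) attention over audio and text positions, and AUDIOCONSISTENCY as mean pairwise cosine similarity of per-head audio attention profiles.

```python
import numpy as np

def attention_metrics(attn, audio_mask):
    """Sketch of four attention-derived features for one generation step.

    attn:       (heads, src_len) attention weights; each row sums to 1.
    audio_mask: boolean (src_len,), True where the source position is audio.
    Definitions are illustrative assumptions, not the paper's exact formulas.
    """
    eps = 1e-12
    audio = attn[:, audio_mask]    # attention mass on audio positions
    text = attn[:, ~audio_mask]    # attention mass on text positions

    # AUDIORATIO: share of total attention that lands on audio tokens
    audio_ratio = audio.sum(axis=1).mean()

    def mean_entropy(p):
        # Renormalize each head's slice to a distribution, then average entropy
        p = p / (p.sum(axis=1, keepdims=True) + eps)
        return (-(p * np.log(p + eps)).sum(axis=1)).mean()

    audio_entropy = mean_entropy(audio)  # AUDIOENTROPY
    text_entropy = mean_entropy(text)    # TEXTENTROPY

    # AUDIOCONSISTENCY (assumed): mean pairwise cosine similarity of the
    # per-head audio attention profiles -- heads that "agree" score high.
    norm = audio / (np.linalg.norm(audio, axis=1, keepdims=True) + eps)
    sim = norm @ norm.T
    h = sim.shape[0]
    audio_consistency = (sim.sum() - np.trace(sim)) / (h * (h - 1))

    return np.array([audio_ratio, audio_consistency,
                     audio_entropy, text_entropy])
```

In a real pipeline these per-step features would be aggregated over the generated sequence before being fed to the classifier.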
Key facts
- Hallucinations in Speech Large Language Models pose significant risks
- Existing detection methods often rely on costly gold-standard outputs
- Text-based LLM hallucination detection methods don't capture audio-specific signals
- Four attention-derived metrics were investigated: AUDIORATIO, AUDIOCONSISTENCY, AUDIOENTROPY, TEXTENTROPY
- Lightweight logistic regression classifiers were trained on these features for inference-time detection
- Evaluations used Qwen-2-Audio and Voxtral-3B models
- Method outperformed uncertainty-based and prior attention-based baselines on in-domain data
- Achieved improvements of up to +0.23 PR-AUC and generalized to out-of-domain ASR settings
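The detection step described above can be sketched with scikit-learn: a logistic regression over the four features, scored with average precision (PR-AUC, the metric the study reports). The function names and the synthetic data here are illustrative, not from the paper.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import average_precision_score

def train_detector(features, labels):
    """Fit a lightweight hallucination detector on attention features.

    features: (n_samples, 4) array of the four attention metrics.
    labels:   (n_samples,) binary array, 1 = hallucinated output.
    """
    clf = LogisticRegression(max_iter=1000)
    clf.fit(features, labels)
    return clf

def pr_auc(clf, features, labels):
    """PR-AUC (average precision) of the detector's hallucination scores."""
    scores = clf.predict_proba(features)[:, 1]
    return average_precision_score(labels, scores)

# Illustrative usage on synthetic, well-separated feature clusters
rng = np.random.default_rng(1)
halluc = rng.normal(1.0, 0.3, size=(50, 4))    # hallucinated examples
clean = rng.normal(-1.0, 0.3, size=(50, 4))    # faithful examples
X = np.vstack([halluc, clean])
y = np.array([1] * 50 + [0] * 50)
clf = train_detector(X, y)
print(f"PR-AUC: {pr_auc(clf, X, y):.3f}")
```

Because the classifier only sees four scalar features, training and scoring are cheap enough to run alongside inference, which is what makes the approach practical for real-time detection.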
Entities
Institutions
- arXiv