Attention Head Imbalance Drives Modality-Conflict Hallucination in MLLMs

ai-technology · 2026-05-20

A new study on arXiv (2605.19250) investigates modality-conflict hallucination in multimodal large language models (MLLMs): cases where a model follows an incorrect textual premise even though the image contradicts it. Applying head-level causal analysis via path patching to five open-source MLLMs, the researchers identified two groups of attention heads, hallucination-driving and hallucination-resisting. They found a consistent asymmetry: driving effects are spread across many heads and carry greater aggregate weight, while resisting effects concentrate in a few high-importance heads. Ablation experiments confirmed that the two groups exert opposing effects during generation, and that distributed driving influence combined with localized resistance creates an imbalanced routing structure that underlies hallucination. The result is a causally supported, mechanistic explanation for why visual evidence fails to prevail during generation.
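
To make the idea of head-level causal analysis concrete, here is a minimal sketch in Python. It is not the paper's code, and it uses a deliberately simplified form of the technique: plain activation patching on a toy additive model, rather than full path patching, which restricts the effect to specific downstream paths. The names `hallucination_metric` and `patch_each_head`, the per-head scalar outputs, and the readout weights are all hypothetical and chosen only for illustration.

```python
def hallucination_metric(head_outputs, readout):
    """Toy metric: a weighted sum of per-head outputs.

    Assumption: higher values mean the model leans toward the incorrect
    textual premise and away from the visually grounded answer.
    """
    return sum(w * o for w, o in zip(readout, head_outputs))


def patch_each_head(conflict_outputs, clean_outputs, readout):
    """Patch one head at a time and record how much the metric drops.

    For each head, its output from the conflict run is replaced by its
    output from a clean (non-conflicting) run. A positive effect means the
    head was pushing toward hallucination; a negative effect means it was
    pushing against it.
    """
    base = hallucination_metric(conflict_outputs, readout)
    effects = []
    for h in range(len(conflict_outputs)):
        patched = list(conflict_outputs)
        patched[h] = clean_outputs[h]
        effects.append(base - hallucination_metric(patched, readout))
    return effects


# Hypothetical per-head outputs for four heads (illustrative numbers only).
conflict_outputs = [0.5, 0.25, 0.75, -1.0]
clean_outputs = [0.0, 0.0, 0.25, 0.0]
readout = [1.0, 1.0, 1.0, 1.0]

effects = patch_each_head(conflict_outputs, clean_outputs, readout)
print(effects)  # [0.5, 0.25, 0.5, -1.0]
```

In this toy setup the first three heads come out as driving and the last as resisting. In a real MLLM the head outputs are vectors, the metric would be computed from model logits, and the patching would be applied through the framework's hook mechanism.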

Key facts

Study examines modality-conflict hallucination in MLLMs
Uses head-level causal analysis via path patching
Analyzed five open-source MLLMs
Identified hallucination-driving and hallucination-resisting attention heads
Driving effects are broadly distributed and carry greater aggregate weight
Resisting effects concentrate in a small number of high-importance heads
Ablation experiments confirm opposing effects during generation
Imbalanced routing structure underlies hallucination
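
The asymmetry described above can be quantified in several ways. The sketch below shows one possible summary, assuming a list of signed per-head effect scores where positive means driving and negative means resisting. The function name `summarize_asymmetry`, the 75% mass threshold, and the example scores are assumptions made for illustration, not values or metrics taken from the study.

```python
def summarize_asymmetry(effects, mass_fraction=0.75):
    """Compare aggregate weight and concentration of driving vs. resisting heads.

    effects: signed per-head scores (positive = driving, negative = resisting).
    mass_fraction: share of each group's total effect used to measure how
        many heads are needed to account for it (an illustrative choice).
    """
    driving = sorted((e for e in effects if e > 0), reverse=True)
    resisting = sorted((-e for e in effects if e < 0), reverse=True)

    def heads_needed(values):
        # Count the largest heads required to reach the target share.
        total = sum(values)
        running = 0.0
        for count, value in enumerate(values, start=1):
            running += value
            if running >= mass_fraction * total:
                return count
        return 0  # empty group

    return {
        "driving_total": sum(driving),
        "resisting_total": sum(resisting),
        "driving_heads_needed": heads_needed(driving),
        "resisting_heads_needed": heads_needed(resisting),
    }


# Hypothetical scores: ten weak driving heads, a few resisting heads
# dominated by two strong ones.
example_effects = [0.5] * 10 + [-2.0, -1.0, -0.25, -0.25]
summary = summarize_asymmetry(example_effects)
print(summary)
```

With these made-up scores, the driving group has the larger total (5.0 versus 3.5) but needs eight heads to reach 75% of it, while the resisting group reaches 75% with two heads. That is the pattern the study reports: broad, heavier driving influence against narrow, concentrated resistance.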
