LLMs Lose Focus Over Multi-Turn Interactions: Attention Degradation Study
A recent study posted to arXiv (2605.12922) examines why large language models (LLMs) lose track of instructions, persona, and rules over extended multi-turn conversations. The researchers propose a channel-transition account: goal-defining tokens become less accessible as attention shifts away from them, even though goal-related information may persist in residual representations. They introduce the Goal Accessibility Ratio (GAR), which measures the attention that generated tokens direct at task-defining goal tokens, and complement it with sliding-window ablations and residual-stream probes. The findings show that architectures fail in distinct ways: some maintain goal-conditioned behavior even as attention to the goal vanishes, while others fail despite goal information remaining decodable from the residual stream, and the layer at which that information is encoded differs across models. The study offers a mechanistic account of a behavioral decline that had previously been observed but not explained.
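The summary does not give the paper's exact formula for GAR, but its description (attention from generated tokens to task-defining goal tokens) suggests a ratio of attention mass. The sketch below is a minimal illustration under that assumption, computed from a single layer's attention weights; the function name, tensor shapes, and toy data are illustrative placeholders, not taken from the paper.

```python
import torch

def goal_accessibility_ratio(attn, goal_idx, gen_idx):
    """Illustrative GAR: share of attention mass that generated tokens place
    on the goal-defining tokens, averaged over heads and query positions.

    attn:     (num_heads, seq_len, seq_len) attention weights from one layer;
              each row (query position) sums to 1 over key positions.
    goal_idx: indices of the task-defining goal tokens (e.g. the system prompt).
    gen_idx:  indices of the tokens generated in the current turn.
    """
    # Attention from each generated (query) position onto the goal (key) positions.
    to_goal = attn[:, gen_idx][:, :, goal_idx].sum(dim=-1)  # (num_heads, len(gen_idx))
    # Rows of `attn` sum to 1, so the mean is already a ratio in [0, 1].
    return to_goal.mean().item()

# Toy usage: 2 heads over 8 tokens; positions 0-2 hold the goal/system prompt,
# positions 5-7 are the model's most recently generated tokens.
attn = torch.softmax(torch.randn(2, 8, 8), dim=-1)
print(goal_accessibility_ratio(attn, goal_idx=[0, 1, 2], gen_idx=[5, 6, 7]))
```

In a real analysis the attention maps would come from a forward pass over a multi-turn transcript (e.g. with attentions returned per layer), and the ratio would be tracked across turns to see whether goal tokens fall out of the model's effective attention window.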
Key facts
- Study on arXiv with ID 2605.12922
- Investigates LLM degradation in multi-turn interactions
- Proposes a channel-transition account of why goal tokens lose attention
- Introduces Goal Accessibility Ratio (GAR) metric
- Uses sliding-window ablations and residual-stream probes
- Finds distinct failure modes across architectures
- Some models maintain goal-conditioned behavior despite vanishing attention to goal tokens
- Other models fail even when goal information remains decodable from the residual stream (see the probe sketch after this list)
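A central claim is that goal information can remain linearly decodable from the residual stream even when attention to the goal tokens has collapsed. The sketch below illustrates the general residual-stream-probe idea with a linear (logistic-regression) probe on synthetic activations; the paper's actual probe design, layers, and data are not specified in the summary, so every name and number here is a placeholder.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Hypothetical setup: residual-stream activations at one layer/position,
# labeled by which goal (persona/rule) the conversation was given. In practice
# these would be cached from forward passes over real transcripts; random
# vectors offset by a per-goal direction stand in for them here.
rng = np.random.default_rng(0)
n_per_goal, d_model, n_goals = 200, 512, 4
goal_dirs = rng.normal(size=(n_goals, d_model))            # one direction per goal class
X = np.concatenate([rng.normal(size=(n_per_goal, d_model)) + g for g in goal_dirs])
y = np.repeat(np.arange(n_goals), n_per_goal)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

# Linear probe: accuracy well above chance means the goal is still linearly
# encoded in the residual stream at this layer, whether or not the model's
# attention (and behavior) still makes use of it.
probe = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print(f"probe accuracy: {probe.score(X_te, y_te):.2f}  (chance = {1 / n_goals:.2f})")
```

Contrasting this probe's accuracy with the attention-based GAR is what lets the authors separate the two failure modes: information lost from the residual stream versus information present but no longer attended to.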
Entities
Institutions
- arXiv