When Attention Closes: How LLMs Lose the Thread in Multi-Turn Interaction
Summary
Large language models (LLMs) follow single-turn instructions reliably but degrade over long multi-turn interactions, losing track of initial instructions, persona, and rules. This study offers a mechanistic explanation: a "channel-transition" account in which goal-defining tokens become less accessible through attention while goal-related information may persist in residual representations. The researchers introduce the Goal Accessibility Ratio (GAR), a measure of attention to task-defining goal tokens, and combine it with sliding-window ablations and residual-stream probes. They find that GAR declines monotonically across all tested architectures (Mistral, Qwen, LLaMA, Mixtral), and that forcing the attention channel closed with a sliding-window intervention causes predictable behavioral failures. Even after attention to the goal is lost, some models preserve substantial goal-conditioned behavior: linear probes recover per-episode recall outcomes from residual representations with AUC up to 0.99, and the depth at which this information is encoded is architecture-specific.
Key takeaway
For AI engineers developing long-horizon conversational agents, longer context windows alone are not enough. Prioritize architectural designs that maintain robust goal representations in the residual stream, even after direct attention to the initial instructions fades. This supports consistent instruction following and persona compliance in prolonged interactions.
Key insights
Instruction-following degradation in multi-turn LLM interactions stems from closure of the attention channel to goal tokens; what the residual stream encodes determines how the model behaves after closure.
Principles
- Attention to goal tokens decays over multi-turn interactions.
- Residual streams can retain goal information even without direct attention.
- Architectural design influences goal information retention capacity.
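The first principle can be made concrete with a small sketch. The summary does not give the exact GAR formula, so the definition below is an assumption: GAR is taken to be the share of one query position's attention mass that lands on goal-token positions, averaged over heads. The function names and the uniform-attention toy are hypothetical and only illustrate how such a ratio shrinks as context grows; they are not the paper's measurements.

```python
import numpy as np

def goal_accessibility_ratio(attn, goal_positions, query_position=-1):
    """Assumed GAR: mean over heads of attention mass on goal positions.

    attn: array of shape (num_heads, seq_len, seq_len) holding post-softmax
          attention weights, so each row sums to 1.
    goal_positions: indices of the task-defining goal tokens.
    query_position: the query token whose attention is inspected
                    (defaults to the last token).
    """
    attn = np.asarray(attn, dtype=float)
    goal_positions = np.asarray(goal_positions, dtype=int)
    # Attention row of the chosen query for every head: (num_heads, seq_len).
    query_rows = attn[:, query_position, :]
    # Sum the mass that falls on goal tokens, then average across heads.
    mass_on_goal = query_rows[:, goal_positions].sum(axis=-1)
    return float(mass_on_goal.mean())

def uniform_causal_attention(num_heads, seq_len):
    """Toy attention: each query spreads weight evenly over itself and the past."""
    mask = np.tril(np.ones((seq_len, seq_len)))
    weights = mask / mask.sum(axis=-1, keepdims=True)
    return np.broadcast_to(weights, (num_heads, seq_len, seq_len))

# Four goal tokens at the start of the context; the context then grows.
goal_positions = [0, 1, 2, 3]
gar_by_length = {
    seq_len: goal_accessibility_ratio(
        uniform_causal_attention(2, seq_len), goal_positions
    )
    for seq_len in (8, 32, 128)
}
```

With real models the attention weights would come from the model's own attention outputs rather than a uniform toy, and the choice of layers, heads, and query positions to aggregate over is a design decision this sketch leaves open.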
Method
The Goal Accessibility Ratio (GAR) quantifies attention to goal tokens. Sliding-window ablations causally close the attention channel. Linear probes on residual activations measure goal information encoding capacity.
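As a rough illustration of how a sliding-window ablation closes the attention channel, the sketch below builds a causal mask that only lets each query see its most recent `window` tokens. Once the goal tokens fall outside the window, the attention mass on them is exactly zero. The helper names, the random scores, and the window sizes are assumptions for illustration, not the paper's setup.

```python
import numpy as np

def sliding_window_causal_mask(seq_len, window):
    """Boolean mask: query i may attend key j only if j <= i and i - j < window."""
    i = np.arange(seq_len)[:, None]
    j = np.arange(seq_len)[None, :]
    return (j <= i) & (i - j < window)

def masked_softmax(scores, mask):
    """Row-wise softmax that assigns zero weight to masked-out keys."""
    scores = np.where(mask, scores, -np.inf)
    # Every row keeps at least its diagonal entry, so the row max is finite.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    return weights / weights.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(0)
seq_len = 16
goal_positions = [0, 1, 2, 3]  # hypothetical goal tokens at the start
scores = rng.normal(size=(seq_len, seq_len))  # stand-in for pre-softmax scores

# A window as long as the sequence is ordinary causal attention (channel open).
full = masked_softmax(scores, sliding_window_causal_mask(seq_len, window=seq_len))
# A window of 8 hides positions 0..7 from the last query (channel closed).
windowed = masked_softmax(scores, sliding_window_causal_mask(seq_len, window=8))

goal_mass_full = full[-1, goal_positions].sum()
goal_mass_windowed = windowed[-1, goal_positions].sum()
```

In an actual experiment the mask would be applied inside the model's attention layers, and the behavioral effect would be measured on generated outputs; this sketch only shows the masking step itself.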
In practice
- Use GAR as a diagnostic for attention channel openness.
- Implement sliding-window interventions to test channel closure effects.
- Employ linear probes to assess residual stream encoding capacity.
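The probing step from the list above might look roughly like the following. This is a minimal sketch, assuming the probe is a logistic-regression classifier trained on one layer's residual activations to predict a binary per-episode recall outcome. The activations here are synthetic random vectors with a planted linear direction, so the resulting score says nothing about any real model; `fit_linear_probe` and `roc_auc` are hypothetical helper names.

```python
import numpy as np

def fit_linear_probe(X, y, lr=0.1, steps=500, l2=1e-3):
    """Logistic-regression probe trained with plain gradient descent."""
    n, d = X.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(steps):
        logits = np.clip(X @ w + b, -30.0, 30.0)  # clip to avoid overflow in exp
        p = 1.0 / (1.0 + np.exp(-logits))
        w -= lr * (X.T @ (p - y) / n + l2 * w)
        b -= lr * float(np.mean(p - y))
    return w, b

def roc_auc(scores, labels):
    """Probability that a random positive outscores a random negative (ties count half)."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels).astype(bool)
    pos, neg = scores[labels], scores[~labels]
    greater = (pos[:, None] > neg[None, :]).mean()
    ties = (pos[:, None] == neg[None, :]).mean()
    return float(greater + 0.5 * ties)

# Synthetic stand-in for residual activations: one row per episode.
rng = np.random.default_rng(0)
n_episodes, hidden_dim = 400, 16
direction = rng.normal(size=hidden_dim)  # planted "goal" direction (assumption)
X = rng.normal(size=(n_episodes, hidden_dim))
# Label 1.0 means the episode counted as a successful recall (synthetic).
y = (X @ direction + 0.5 * rng.normal(size=n_episodes) > 0).astype(float)

# Train on the first 300 episodes, evaluate on the held-out 100.
train, test = slice(0, 300), slice(300, n_episodes)
w, b = fit_linear_probe(X[train], y[train])
auc = roc_auc(X[test] @ w + b, y[test])
```

Repeating the fit on activations from each layer would give a per-layer AUC profile, which is one way to compare how deep goal information is encoded across architectures.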
Topics
- Multi-Turn LLM Degradation
- Attention Channel
- Residual Stream
- Goal Accessibility Ratio
- Sliding-Window Intervention
Best for: Research Scientist, AI Engineer, NLP Engineer, AI Scientist, Machine Learning Engineer, AI Architect
Editorial summary, takeaway, and curation by AIssential. Original article published by cs.AI updates on arXiv.org.