LLM Representational Curvature Linked to Behavioral Uncertainty
A study posted to arXiv (2604.23985) reports a direct link between the geometry of internal representations in large language models and token-level behavioral uncertainty. Contextual curvature, a measure of how sharply the representational trajectory bends given recent context, correlates with next-token entropy in both GPT-2 XL and Pythia-2.8B, and the correlation emerges over the course of training. Perturbation experiments show that trajectory-aligned interventions causally modulate entropy, while geometrically misaligned perturbations have no effect. The work thus provides a mechanistic link between representation geometry and model behavior, extending the temporal straightening framework to token-level predictions.
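To make the core quantities concrete, here is a minimal sketch of how one might measure discrete trajectory curvature and next-token entropy and correlate them. This is an illustration on synthetic arrays, not the paper's actual method: the curvature definition (turn angle between consecutive difference vectors of hidden states) and the alignment of positions are assumptions for the example.

```python
import numpy as np

def trajectory_curvature(states: np.ndarray) -> np.ndarray:
    """Discrete curvature along a hidden-state trajectory.

    For difference vectors d_t = h_{t+1} - h_t, the turn angle between
    d_{t-1} and d_t measures how sharply the path bends at step t.
    """
    diffs = np.diff(states, axis=0)          # (T-1, D) step vectors
    a, b = diffs[:-1], diffs[1:]
    cos = np.sum(a * b, axis=1) / (
        np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1) + 1e-12
    )
    return np.arccos(np.clip(cos, -1.0, 1.0))  # (T-2,) angles in radians

def next_token_entropy(logits: np.ndarray) -> np.ndarray:
    """Shannon entropy (in nats) of the softmax distribution per position."""
    z = logits - logits.max(axis=1, keepdims=True)  # stabilize exp
    p = np.exp(z) / np.exp(z).sum(axis=1, keepdims=True)
    return -np.sum(p * np.log(p + 1e-12), axis=1)

# Synthetic stand-ins for one sequence's hidden states and output logits.
rng = np.random.default_rng(0)
hidden = rng.normal(size=(64, 128))    # T=64 positions, D=128 dims
logits = rng.normal(size=(64, 50257))  # vocabulary-sized logits

kappa = trajectory_curvature(hidden)   # length T-2
ent = next_token_entropy(logits)[2:]   # align with curvature positions
r = np.corrcoef(kappa, ent)[0, 1]      # Pearson correlation in [-1, 1]
```

With real model activations (e.g. hidden states from a forward pass) in place of the random arrays, the same correlation could be tracked across training checkpoints, which is the kind of analysis the study describes.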
Key facts
- Study links contextual curvature to next-token entropy in LLMs
- Models tested: GPT-2 XL and Pythia-2.8B
- Relationship emerges during training
- Trajectory-aligned perturbations modulate entropy
- Geometrically misaligned perturbations have no effect
- Published on arXiv: 2604.23985
- Extends temporal straightening framework
- Provides direct link between representation geometry and behavior