Causal Decomposition of Function Vectors in Few-Shot In-Context Learning
A recent arXiv preprint (2605.16591) examines the mechanisms behind in-context learning (ICL) in large language models, focusing on how few-shot examples shape the function vector (FV), the representation that directs task performance. The researchers found that an n-shot FV is well approximated by a linear combination of sub-FVs, each derived from an individual example, which indicates that the examples' contributions are additive and composable. They also found that these contributions are contextualized: models adjust how they represent each demonstration based on the examples that precede it, giving more weight to demonstrations that are more informative and less ambiguous. A causal analysis separates Query-Key routing from Value updates and indicates that the gains in FV quality from contextualization stem mainly from Query-Key alignment, especially in ambiguous contexts.
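To make the linear-combination claim concrete, here is a minimal sketch of how one might test it, assuming the sub-FVs and the n-shot FV have already been extracted as plain vectors. The function name `fit_sub_fv_weights`, the dimensions, and the synthetic data are illustrative assumptions, not the paper's procedure or results.

```python
import numpy as np

def fit_sub_fv_weights(sub_fvs, n_shot_fv):
    """Least-squares weights w such that sub_fvs.T @ w ~= n_shot_fv.

    sub_fvs:   (n_examples, d) array, one hypothetical sub-FV per demonstration
    n_shot_fv: (d,) array, the FV extracted from the full n-shot prompt
    """
    weights, *_ = np.linalg.lstsq(sub_fvs.T, n_shot_fv, rcond=None)
    recon = sub_fvs.T @ weights
    # Cosine similarity between the reconstruction and the real n-shot FV.
    cos = float(recon @ n_shot_fv /
                (np.linalg.norm(recon) * np.linalg.norm(n_shot_fv)))
    return weights, cos

# Synthetic stand-ins (assumption): 4 demonstrations, 64-dimensional FVs.
rng = np.random.default_rng(0)
sub_fvs = rng.normal(size=(4, 64))
true_w = np.array([0.4, 0.3, 0.2, 0.1])  # made-up mixing weights
n_shot_fv = true_w @ sub_fvs + 0.01 * rng.normal(size=64)

weights, cos = fit_sub_fv_weights(sub_fvs, n_shot_fv)
print("fitted weights:", np.round(weights, 3))
print("cosine similarity of reconstruction:", round(cos, 4))
```

On real activations, a high cosine similarity would support the additive picture, and unequal fitted weights would be one way to observe demonstrations being reweighted.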
Key facts
- arXiv paper 2605.16591 analyzes function vectors in in-context learning.
- An n-shot FV is well approximated by a linear combination of example-level sub-FVs.
- Models adaptively reweight demonstrations based on informativeness and ambiguity.
- Causal decomposition separates Query-Key routing from Value updates.
- Query-Key alignment contributes most to FV quality in ambiguous settings.
- Study covers multiple tasks and models.
- Research provides mechanistic explanation of few-shot prompting.
- Findings highlight additive and contextualized nature of FV composition.
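The separation of Query-Key routing from Value updates can be illustrated on a toy single attention head. The sketch below is an assumption-laden illustration, not the paper's method: the "base" and "ctx" tensors are random stand-ins for a demonstration encoded without and with preceding examples, and all names are hypothetical. It patches the attention pattern and the values independently to see how much each changes the head's output.

```python
import numpy as np

def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()

def head_output(q, K, V):
    """Single attention head: Q and K decide routing, V decides what is written."""
    attn = softmax(q @ K.T / np.sqrt(q.shape[-1]))
    return attn @ V, attn

# Random stand-ins (assumption): 4 context positions, head dimension 16.
rng = np.random.default_rng(0)
d, n = 16, 4
q_base, q_ctx = rng.normal(size=d), rng.normal(size=d)
K_base, K_ctx = rng.normal(size=(n, d)), rng.normal(size=(n, d))
V_base, V_ctx = rng.normal(size=(n, d)), rng.normal(size=(n, d))

out_base, attn_base = head_output(q_base, K_base, V_base)
out_ctx, attn_ctx = head_output(q_ctx, K_ctx, V_ctx)

# Patch only the routing: contextualized attention pattern, baseline values.
qk_effect = attn_ctx @ V_base - out_base
# Patch only the values: baseline attention pattern, contextualized values.
v_effect = attn_base @ V_ctx - out_base
# Whatever neither single patch explains is the interaction term.
interaction = out_ctx - out_base - qk_effect - v_effect

print("QK-routing effect norm:", np.linalg.norm(qk_effect))
print("Value-update effect norm:", np.linalg.norm(v_effect))
print("interaction norm:", np.linalg.norm(interaction))
```

By construction the three terms sum exactly to the total change in the head's output, so comparing their sizes gives a rough attribution between routing and value changes for this toy head.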
Entities
Institutions
- arXiv