Task Vector Distribution Alignment Improves In-Context Learning Efficiency
A recent arXiv paper proposes distributional alignment as a criterion for designing task vectors in in-context learning (ICL) with large language models (LLMs). Task vectors compress demonstrations into compact hidden-state representations, which lowers inference cost. Earlier evaluations, however, relied only on downstream task accuracy. The researchers introduce d_NTP, a metric that measures the discrepancy in next-token probabilities between task-vector inference and ICL inference. Their empirical analysis finds a strong negative correlation between d_NTP and downstream accuracy, meaning that lower d_NTP tends to accompany higher accuracy. This makes d_NTP a useful proxy when developing more efficient task vector extraction techniques, which matters because ICL inference costs rise with longer context lengths.
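To make the idea behind d_NTP concrete, here is a minimal Python sketch. This summary does not state the paper's exact formula, so the code assumes total variation distance as the discrepancy and averages it over queries. The names `softmax`, `total_variation`, and `d_ntp_sketch` are hypothetical and not taken from the paper.

```python
import math


def softmax(logits):
    """Convert raw logits into a probability distribution (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def total_variation(p, q):
    """Total variation distance between two distributions over one vocabulary."""
    if len(p) != len(q):
        raise ValueError("distributions must share the same vocabulary size")
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def d_ntp_sketch(icl_logits, tv_logits):
    """Mean next-token discrepancy between ICL and task-vector inference.

    Each argument is a list of logit vectors, one per query.
    ASSUMPTION: total variation distance stands in for the paper's
    discrepancy measure, which this summary does not specify.
    """
    if not icl_logits or len(icl_logits) != len(tv_logits):
        raise ValueError("need the same non-zero number of queries for both modes")
    distances = [
        total_variation(softmax(icl), softmax(tv))
        for icl, tv in zip(icl_logits, tv_logits)
    ]
    return sum(distances) / len(distances)
```

Under this assumption, a value of 0 means the two inference modes predict identical next-token distributions, and values near 1 mean they place almost all probability on different tokens. Another divergence, such as KL, could be swapped in for `total_variation` behind the same interface.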
Key facts
- Paper introduces distributional alignment as a criterion for designing task vectors in ICL.
- Task vectors compress demonstrations into compact hidden-state representations.
- Previous evaluations relied only on downstream task accuracy.
- New metric d_NTP measures the discrepancy in next-token probabilities between task-vector inference and ICL inference.
- d_NTP shows strong negative correlation with downstream accuracy.
- Aims to reduce escalating inference costs of ICL.
- Published on arXiv with ID 2605.20730.
- Empirical analysis validates d_NTP as a performance proxy.
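One way to check a proxy relationship like the one reported for d_NTP is to compute the Pearson correlation between per-configuration metric values and accuracies. The sketch below is an illustration only: the function name `pearson` and the toy numbers are invented for this example, and the paper may use a different correlation measure.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Invented placeholder values, NOT results from the paper: one entry per
# hypothetical task-vector configuration.
toy_d_ntp = [0.1, 0.2, 0.3, 0.4]
toy_accuracy = [0.9, 0.8, 0.7, 0.6]

r = pearson(toy_d_ntp, toy_accuracy)  # close to -1 for this linear toy data
```

A coefficient near -1 is what a strong negative correlation looks like: configurations with a smaller distribution gap score higher on the downstream task.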
Entities
Institutions
- arXiv