CapKV: Capacity-Aware KV Cache Eviction via Information Bottleneck
A recent arXiv paper introduces CapKV, a key-value (KV) cache eviction strategy grounded in the Information Bottleneck principle. Using a linear-Gaussian surrogate of attention, the authors derive a closed-form mutual information objective that characterizes the effective information capacity of a selected subset of the KV cache. Viewed through this objective, many existing eviction methods can be interpreted as approximations of the same capacity-maximization principle. CapKV replaces heuristic selection with a log-determinant approximation based on statistical leverage scores, with the goal of retaining as much information as possible under a fixed cache budget. The work targets the memory bottleneck that KV caching creates during long-context LLM inference.
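To make the "closed-form mutual information objective" concrete, the block below writes out the textbook mutual information of a linear-Gaussian channel and the subset-selection problem it suggests. This is only an illustrative sketch: the symbols $K_S$, $z$, $\varepsilon$, $\sigma^2$, and the budget $B$ are assumptions introduced here, and the paper's exact formulation may differ.

```latex
% Assumed surrogate (notation is hypothetical, not taken from the paper):
%   z ~ N(0, I_d) is a latent signal, K_S stacks the |S| retained key rows,
%   and eps ~ N(0, sigma^2 I) is independent observation noise.
\[
  y_S = K_S z + \varepsilon, \qquad
  z \sim \mathcal{N}(0, I_d), \quad
  \varepsilon \sim \mathcal{N}(0, \sigma^2 I_{|S|}).
\]
% Standard Gaussian-channel result: the mutual information is a log-determinant.
\[
  I(z; y_S)
  = \tfrac{1}{2} \log\det\!\left(I_{|S|} + \sigma^{-2} K_S K_S^{\top}\right)
  = \tfrac{1}{2} \log\det\!\left(I_d + \sigma^{-2} K_S^{\top} K_S\right).
\]
% Capacity-aware eviction then amounts to choosing the retained set under a budget B.
\[
  S^{\star} = \arg\max_{|S| \le B} \; \tfrac{1}{2} \log\det\!\left(I_d + \sigma^{-2} K_S^{\top} K_S\right).
\]
```

The two determinant forms are equal by Sylvester's determinant identity. Exact maximization over subsets is combinatorial, which is why an approximation of the log-determinant is needed in practice.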
Key facts
- Paper title: Rethinking KV Cache Eviction via a Unified Information-Theoretic Objective
- arXiv ID: 2604.25975
- arXiv announce type: cross (cross-listing)
- Proposes CapKV, a capacity-aware eviction method
- Uses Information Bottleneck principle
- Derives closed-form mutual information objective under linear-Gaussian surrogate
- Shows that many existing eviction strategies can be interpreted as approximations of capacity maximization
- CapKV uses log-determinant approximation with leverage scores
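As a rough illustration of the last two points, the following sketch scores cached keys by ridge leverage scores and evaluates a log-determinant capacity on the retained subset. It is not the paper's algorithm: the function names, the regularizer `lam`, the top-`budget` selection rule, and the random test data are all assumptions made for this example.

```python
import numpy as np


def ridge_leverage_scores(keys: np.ndarray, lam: float = 1.0) -> np.ndarray:
    """Ridge leverage score of each row k_i: k_i^T (K^T K + lam * I)^{-1} k_i."""
    d = keys.shape[1]
    gram = keys.T @ keys + lam * np.eye(d)      # (d, d), positive definite for lam > 0
    solved = np.linalg.solve(gram, keys.T)      # (d, n), avoids an explicit inverse
    return np.einsum("nd,dn->n", keys, solved)  # diagonal of K (K^T K + lam I)^{-1} K^T


def log_det_capacity(keys: np.ndarray, lam: float = 1.0) -> float:
    """0.5 * log det(I + K^T K / lam) for the given (retained) key rows."""
    d = keys.shape[1]
    _, logdet = np.linalg.slogdet(np.eye(d) + keys.T @ keys / lam)
    return 0.5 * float(logdet)


def select_by_leverage(keys: np.ndarray, budget: int, lam: float = 1.0) -> np.ndarray:
    """Return sorted indices of the `budget` rows with the highest leverage scores."""
    scores = ridge_leverage_scores(keys, lam)
    keep = np.argsort(scores)[-budget:]
    return np.sort(keep)


# Hypothetical data: 64 cached key vectors of dimension 8.
rng = np.random.default_rng(0)
keys = rng.standard_normal((64, 8))

kept = select_by_leverage(keys, budget=16)
print("kept indices:", kept)
print("capacity (kept):", log_det_capacity(keys[kept]))
print("capacity (full):", log_det_capacity(keys))
```

Leverage scores measure how much each key contributes to the span of the key matrix, so keeping high-leverage rows is one inexpensive proxy for keeping the log-determinant large. The retained subset can never exceed the capacity of the full cache, because the log-determinant is monotone in the retained rows.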
Entities
Institutions
- arXiv