ArborKV: Structure-Aware KV Cache Management for Tree-based LLM Reasoning
A recent study published on arXiv (2605.22106) presents ArborKV, a structure-aware Key-Value (KV) cache eviction framework aimed at easing memory limits in Tree-of-Thoughts (ToT) LLM reasoning. ToT structures inference as a tree search with branching and backtracking, so the KV states of a whole frontier of incomplete trajectories must be kept, which strains memory. ArborKV exploits the search dynamics: immediate decoding depends only on the current branch and its ancestors, while inactive subtrees are unlikely to be reused in the short term but must remain recoverable. The framework combines a lightweight value estimator with a tree-aware allocation strategy for eviction, allowing deeper and wider search within a fixed hardware budget.
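The eviction idea described above can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the `Node` class, the `value` scores, and the greedy lowest-value-first policy are all assumptions made for illustration, and the summary does not specify how ArborKV's estimator or allocation strategy actually work. The sketch pins the active branch and its ancestors, then marks inactive nodes as non-resident until a token budget is met.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass(eq=False)  # identity-based equality, so nodes can live in sets
class Node:
    """One thought in the search tree (hypothetical structure)."""
    name: str
    kv_tokens: int                      # tokens whose KV states this node holds
    value: float                        # score from some estimator (assumed given)
    parent: Optional["Node"] = None
    children: List["Node"] = field(default_factory=list)
    resident: bool = True               # False once the KV states are evicted

    def add_child(self, name: str, kv_tokens: int, value: float) -> "Node":
        child = Node(name, kv_tokens, value, parent=self)
        self.children.append(child)
        return child


def path_to_root(node: Node) -> List[Node]:
    """The node itself plus all of its ancestors."""
    path = []
    cur: Optional[Node] = node
    while cur is not None:
        path.append(cur)
        cur = cur.parent
    return path


def subtree(root: Node) -> List[Node]:
    """All nodes reachable from root (depth-first)."""
    out, stack = [], [root]
    while stack:
        n = stack.pop()
        out.append(n)
        stack.extend(n.children)
    return out


def evict_to_budget(root: Node, active: Node, budget_tokens: int) -> List[str]:
    """Evict lowest-value inactive nodes until resident tokens fit the budget."""
    pinned = set(path_to_root(active))          # never evicted
    resident = [n for n in subtree(root) if n.resident]
    used = sum(n.kv_tokens for n in resident)
    candidates = sorted((n for n in resident if n not in pinned),
                        key=lambda n: n.value)
    evicted = []
    for n in candidates:
        if used <= budget_tokens:
            break
        n.resident = False                      # node stays in the tree
        used -= n.kv_tokens
        evicted.append(n.name)
    return evicted


def build_demo_tree() -> Tuple[Node, Node]:
    """Small example tree; returns (root, active leaf)."""
    root = Node("root", 100, 1.0)
    a = root.add_child("a", 50, 0.9)
    b = root.add_child("b", 50, 0.6)
    a1 = a.add_child("a1", 40, 0.8)
    b.add_child("b1", 40, 0.1)
    b.add_child("b2", 40, 0.5)
    return root, a1


root, active = build_demo_tree()
print(evict_to_budget(root, active, budget_tokens=250))
```

Evicted nodes stay in the tree with `resident = False`, which stands in for the requirement that inactive subtrees remain recoverable. How the states are restored (recomputation, offloading, or something else) is left open here. The flat ranking also ignores parent-child dependencies between inactive nodes, which a real tree-aware policy would presumably account for.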
Key facts
- Paper arXiv:2605.22106 proposes ArborKV.
- ArborKV is a structure-aware KV cache eviction framework.
- Tree-of-Thoughts (ToT) organizes inference as tree-structured search.
- Retaining KV caches for a frontier of partial trajectories creates a memory bottleneck.
- ArborKV uses a lightweight value estimator and tree-aware allocation.
- Inactive subtrees have a low probability of short-term reuse but must remain recoverable.
- The active branch and its ancestors are prioritized because immediate decoding depends on them.
- ArborKV aims to increase search depth and width under fixed hardware budgets.
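To give a sense of scale for the memory bottleneck, the following back-of-the-envelope Python sketch compares the KV footprint of a single root-to-leaf path with that of a fully retained tree. The model dimensions (32 layers, 8 KV heads, head dimension 128, 16-bit values) and the tree shape (branching factor 3, depth 4, 200 tokens per node) are hypothetical and not taken from the paper. Retaining every node is a worst-case assumption.

```python
def kv_bytes_per_token(num_layers: int, num_kv_heads: int, head_dim: int,
                       bytes_per_elem: int = 2) -> int:
    """KV cache bytes for one token.

    The leading factor 2 counts one key and one value vector
    per layer and per KV head.
    """
    return 2 * num_layers * num_kv_heads * head_dim * bytes_per_elem


def full_tree_nodes(branching: int, depth: int) -> int:
    """Node count of a complete tree: 1 + b + b^2 + ... + b^depth."""
    return sum(branching ** d for d in range(depth + 1))


def tree_kv_bytes(tokens_per_node: int, branching: int, depth: int,
                  per_token: int) -> int:
    """Bytes needed if every node of the complete tree keeps its KV states."""
    return full_tree_nodes(branching, depth) * tokens_per_node * per_token


# Hypothetical model and search configuration.
per_token = kv_bytes_per_token(num_layers=32, num_kv_heads=8, head_dim=128)
single_path = (4 + 1) * 200 * per_token          # root plus 4 levels
whole_tree = tree_kv_bytes(200, branching=3, depth=4, per_token=per_token)

print(f"per token:   {per_token / 2**10:.0f} KiB")
print(f"single path: {single_path / 2**20:.0f} MiB")
print(f"whole tree:  {whole_tree / 2**30:.2f} GiB")
```

Under these assumptions the active path needs about 125 MiB while the full tree needs close to 3 GiB, which is the gap that evicting inactive subtrees is meant to close.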
Entities
Institutions
- arXiv