ArborKV: Structure-Aware KV Cache Management for Tree-based LLM Reasoning

ai-technology · 2026-05-23

A recent arXiv paper (2605.22106) presents ArborKV, a structure-aware key-value (KV) cache eviction framework aimed at easing memory limits in Tree-of-Thoughts (ToT) LLM reasoning. ToT structures inference as a tree search with branching and backtracking, so KV states must be kept for a whole frontier of incomplete trajectories, which becomes a memory bottleneck. ArborKV exploits the search dynamics: immediate decoding relies on the current branch and its ancestors, while inactive subtrees are unlikely to be reused in the short term but must remain retrievable. The framework combines a lightweight value estimator with a tree-aware allocation strategy for eviction, allowing deeper and wider search under a fixed hardware budget.
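The eviction idea described above can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the `Node` fields, the scalar `value` score standing in for the value estimator, the token-count budget, and the greedy lowest-value-first rule are all assumptions made for illustration. The sketch only shows the general shape: the active branch and its ancestors are never evicted, and inactive nodes lose their KV state in order of estimated value until the cache fits the budget. Evicted nodes stay in the tree, so their state could be recomputed or reloaded later (also an assumption about how "retrievable" might be realized).

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


@dataclass
class Node:
    node_id: str
    parent: Optional[str]   # None for the root
    kv_tokens: int          # tokens of KV state held by this node (hypothetical unit)
    value: float            # hypothetical estimator score; higher means more promising
    resident: bool = True   # False once the node's KV state has been evicted


def protected_path(tree: Dict[str, Node], active_id: str) -> Set[str]:
    """Return the active node together with all of its ancestors."""
    path: Set[str] = set()
    cur: Optional[str] = active_id
    while cur is not None:
        path.add(cur)
        cur = tree[cur].parent
    return path


def evict(tree: Dict[str, Node], active_id: str, budget_tokens: int) -> List[str]:
    """Greedily evict inactive nodes, lowest value first, until under budget.

    Nodes on the active path are never evicted. Evicted nodes remain in
    `tree` with resident=False, so they can still be identified and restored.
    """
    keep = protected_path(tree, active_id)
    used = sum(n.kv_tokens for n in tree.values() if n.resident)
    candidates = sorted(
        (n for n in tree.values() if n.resident and n.node_id not in keep),
        key=lambda n: n.value,
    )
    evicted: List[str] = []
    for node in candidates:
        if used <= budget_tokens:
            break
        node.resident = False
        used -= node.kv_tokens
        evicted.append(node.node_id)
    return evicted


# Toy search tree: "a1" is the branch currently being decoded.
tree = {
    "root": Node("root", None, 100, 1.0),
    "a": Node("a", "root", 50, 0.9),
    "b": Node("b", "root", 50, 0.2),
    "a1": Node("a1", "a", 40, 0.8),
    "a2": Node("a2", "a", 40, 0.5),
    "b1": Node("b1", "b", 30, 0.1),
}
print(evict(tree, active_id="a1", budget_tokens=250))  # ['b1', 'b']
```

In this toy run the cache holds 310 tokens against a budget of 250, so the two lowest-valued inactive nodes are dropped while `root`, `a`, and `a1` stay resident. A real system would also need a policy for what happens when the protected path alone exceeds the budget, which this sketch does not address.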

Key facts

Paper arXiv:2605.22106 proposes ArborKV.
ArborKV is a structure-aware KV cache eviction framework.
Tree-of-Thoughts (ToT) organizes inference as tree-structured search.
KV cache retention for partial trajectories creates memory bottlenecks.
ArborKV uses a lightweight value estimator and tree-aware allocation.
Inactive subtrees have low short-term reuse probability.
Active branch and its ancestors are prioritized for decoding.
ArborKV aims to increase search depth and width under fixed hardware budgets.
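
To give a sense of why retaining KV state for many partial trajectories becomes a bottleneck, the following back-of-the-envelope sketch uses the standard per-token KV cache size for a transformer (keys plus values, per layer, per KV head). The model dimensions and frontier size are illustrative assumptions, not figures from the paper, and the frontier estimate is an upper bound that ignores any sharing of common prefixes between branches.

```python
def kv_bytes_per_token(num_layers: int, num_kv_heads: int, head_dim: int,
                       bytes_per_value: int = 2) -> int:
    """KV cache bytes for one token; the factor 2 covers keys and values."""
    return 2 * num_layers * num_kv_heads * head_dim * bytes_per_value


def frontier_kv_bytes(frontier_size: int, tokens_per_trajectory: int,
                      per_token_bytes: int) -> int:
    """Upper bound for a frontier, assuming no prefix sharing between branches."""
    return frontier_size * tokens_per_trajectory * per_token_bytes


# Hypothetical configuration: 32 layers, 8 KV heads, head_dim 128, fp16 values.
per_token = kv_bytes_per_token(num_layers=32, num_kv_heads=8, head_dim=128)
total = frontier_kv_bytes(frontier_size=16, tokens_per_trajectory=2048,
                          per_token_bytes=per_token)
print(per_token // 1024, "KiB per token")   # 128 KiB per token
print(total // 2**30, "GiB for the frontier")  # 4 GiB for the frontier
```

Under these assumed numbers, a frontier of 16 trajectories of 2,048 tokens each would need up to 4 GiB of KV cache, and the figure grows linearly with both search width and depth.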
