InfoTree: A Submodular Framework for Tool-Use Agentic RL Under Fixed Budget

other · 2026-05-09

A recent arXiv paper formalizes Rollout Informativeness under a Fixed Budget (RIFB), a framing intended to improve reinforcement learning for tool-using agents. The authors show that independent samplers that ignore the budget have a nonzero collapse rate on hard prompts. They recast the selection of intermediate states as a monotone submodular maximization problem, for which a greedy selector achieves a 1 - 1/e approximation guarantee. The resulting InfoTree framework combines Uncertainty-aware Upper Confidence Bound (UUCB) terms with an Adaptive Budget Allocator (ABA) to improve optimization on prompts under a fixed budget.
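For readers unfamiliar with the 1 - 1/e figure, the standard textbook statement behind it can be written as follows. This is a sketch under assumptions: the symbols V (candidate intermediate states), f (the informativeness objective), and k (the number of states the budget allows) are illustrative notation rather than the paper's own, and the budget is modeled here as a simple cardinality constraint.

```latex
% Assumed notation: V = candidate states, f : 2^V -> R_{>=0} with f(\emptyset) = 0,
% k = number of states the budget allows (cardinality constraint, an assumption).
\begin{align*}
\text{Monotone:}\quad
  & f(A) \le f(B)
  && \text{for all } A \subseteq B \subseteq V, \\
\text{Submodular:}\quad
  & f(A \cup \{v\}) - f(A) \;\ge\; f(B \cup \{v\}) - f(B)
  && \text{for all } A \subseteq B \subseteq V,\ v \in V \setminus B, \\
\text{Greedy guarantee:}\quad
  & f(S_{\mathrm{greedy}}) \;\ge\; \Bigl(1 - \tfrac{1}{e}\Bigr) \max_{|S| \le k} f(S).
\end{align*}
```

Here S_greedy is the set built by adding, k times, the element with the largest marginal gain f(S ∪ {v}) - f(S). The bound is the classical result for monotone submodular functions under a cardinality constraint; the summary does not state which constraint form the paper uses.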

Key facts

Paper published on arXiv with ID 2605.05262
Formalizes Rollout Informativeness under a Fixed Budget (RIFB)
Proves budget-agnostic independent samplers collapse for hard prompts
Recasts state selection as monotone submodular maximization
Greedy selector achieves 1 - 1/e approximation guarantee
UUCB terms derived as closed-form marginal gains
InfoTree framework includes UUCB, ABA, and Speculative Expansion
Token-level entropy bonus is an analytic consequence
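
The greedy selection mentioned in the facts above can be illustrated with a minimal sketch. It does not reproduce the paper's objective: set coverage stands in as a simple monotone submodular function, and the names `greedy_select`, `covers`, and `budget` are hypothetical.

```python
from typing import Dict, List, Set


def coverage(selected: List[str], covers: Dict[str, Set[int]]) -> int:
    """Stand-in objective (assumption): number of distinct items covered.

    Set coverage is monotone and submodular, so the greedy guarantee applies.
    """
    covered: Set[int] = set()
    for state in selected:
        covered |= covers[state]
    return len(covered)


def greedy_select(covers: Dict[str, Set[int]], budget: int) -> List[str]:
    """Pick at most `budget` states, each time taking the largest marginal gain."""
    selected: List[str] = []
    covered: Set[int] = set()
    for _ in range(budget):
        best, best_gain = None, 0
        for state, items in covers.items():
            if state in selected:
                continue
            gain = len(items - covered)  # marginal gain of adding this state
            if gain > best_gain:
                best, best_gain = state, gain
        if best is None:  # no remaining state adds anything new
            break
        selected.append(best)
        covered |= covers[best]
    return selected


# Hypothetical candidate states and the items each one "covers".
covers = {"a": {1, 2, 3}, "b": {3, 4}, "c": {4, 5, 6, 7}, "d": {1, 7}}
chosen = greedy_select(covers, budget=2)
print(chosen, coverage(chosen, covers))
```

With a budget of two, the sketch picks the state with the largest initial coverage and then the state that adds the most uncovered items. In the paper's setting the marginal gains are described as closed-form UUCB terms, which this stand-in does not attempt to model.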
