Bits-over-Random Metric Optimizes LLM Tool Selection

ai-technology · 2026-05-26

A new metric, Bits-over-Random (BoR), evaluates how many tools to show an LLM agent during retrieval. Fixed shortlist sizes often fail: too many tools confuse the model, and too few omit the correct one. BoR measures whether success at a given shortlist depth exceeds what random selection would achieve. The metric was tested on three tool-selection benchmarks with registries of 20 to 3,251 tools. It also serves as a reinforcement learning reward for per-query depth selection, with the RL agent kept deliberately simple to probe the metric's effectiveness.
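The summary above does not give BoR's formula, so the following is only a sketch of what a chance-corrected depth score could look like. It assumes, purely for illustration, that each query has one correct tool, that a random shortlist of depth k drawn from N tools contains it with probability k/N, and that the score is the base-2 log of observed success over that baseline. The function names and the log-ratio form are assumptions, not the published definition.

```python
import math


def random_hit_rate(k: int, n_tools: int) -> float:
    """Chance that a uniformly random k-tool shortlist holds the one correct tool.

    Assumption: exactly one correct tool per query, so the baseline is k / N.
    """
    if not 1 <= k <= n_tools:
        raise ValueError("k must be between 1 and n_tools")
    return k / n_tools


def bits_over_random(observed_hit_rate: float, k: int, n_tools: int) -> float:
    """Hypothetical BoR-style score: log2(observed success / random success).

    Positive values mean the retriever beats chance at depth k; zero means
    it does no better than a random shortlist of the same size.
    """
    if not 0.0 <= observed_hit_rate <= 1.0:
        raise ValueError("observed_hit_rate must be in [0, 1]")
    if observed_hit_rate == 0.0:
        return float("-inf")  # never succeeds: unboundedly worse than chance
    return math.log2(observed_hit_rate / random_hit_rate(k, n_tools))


# Small registry (20 tools): a 50% hit rate at depth 5 is one bit above chance.
small = bits_over_random(0.5, k=5, n_tools=20)

# Showing every tool always "succeeds", yet scores zero bits over random.
everything = bits_over_random(1.0, k=20, n_tools=20)

# Large registry (3,251 tools): the same depth carries far more information.
large = bits_over_random(0.5, k=5, n_tools=3251)
```

Under these assumptions, a shortlist containing the entire registry scores zero no matter how often it succeeds, which illustrates why raw success rates alone cannot tell you whether a given depth is useful.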

Key facts

BoR is a chance-corrected metric for tool shortlist depth.
Fixed shortlist sizes are suboptimal for LLM tool retrieval.
BoR compares success at a given depth to random selection.
Evaluated on three tool-selection benchmarks.
Tool registries range from 20 to 3,251 tools.
BoR is used as a reinforcement learning reward.
The RL agent is deliberately simple.
The approach treats tool count as the object of evaluation.
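To make the reward idea concrete, here is a minimal sketch of a simple depth-selecting agent. The source does not describe the agent's design, so everything here is an assumption: the epsilon-greedy bandit, the `DepthBandit` class, the candidate depths, and the `bucket` label standing in for whatever per-query features the real agent uses. The reward passed to `update` is meant to be a finite chance-corrected score for the depth that was tried.

```python
import random
from collections import defaultdict


class DepthBandit:
    """Epsilon-greedy choice of shortlist depth, tracked per query bucket.

    Hypothetical illustration only; not the agent described in the source.
    """

    def __init__(self, depths, epsilon=0.1, seed=0):
        self.depths = list(depths)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.counts = defaultdict(int)    # (bucket, depth) -> times tried
        self.values = defaultdict(float)  # (bucket, depth) -> mean reward

    def choose(self, bucket):
        """Explore a random depth with probability epsilon, else exploit."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.depths)
        return max(self.depths, key=lambda k: self.values[(bucket, k)])

    def update(self, bucket, depth, reward):
        """Fold one finite reward into the running mean for (bucket, depth)."""
        key = (bucket, depth)
        self.counts[key] += 1
        self.values[key] += (reward - self.values[key]) / self.counts[key]


# Usage: with exploration off, the agent picks the depth with the best mean reward.
bandit = DepthBandit(depths=[3, 5, 10], epsilon=0.0)
bandit.update("lookup", 5, 2.0)
bandit.update("lookup", 10, 1.0)
best = bandit.choose("lookup")
```

Keeping the learner this small fits the stated intent of a deliberately simple agent: if depth choices improve, the improvement is more plausibly due to the reward signal than to the learner.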
