ARTFEED — Contemporary Art Intelligence

Bits-over-Random Metric Optimizes LLM Tool Selection

ai-technology · 2026-05-26

A new metric, Bits-over-Random (BoR), evaluates how many tools should be shown to an LLM agent during retrieval. Fixed shortlist sizes often fail: too many tools confuse the model, while too few omit the correct one. BoR measures whether success at a given shortlist depth exceeds what random selection would achieve at that depth. The metric was tested on three benchmarks with registries of 20 to 3,251 tools, and it also serves as a reinforcement learning reward for choosing shortlist depth per query. The RL agent is kept deliberately simple so that the experiment probes the metric's effectiveness.

Key facts

  • BoR is a chance-corrected metric for tool shortlist depth.
  • Fixed shortlist sizes are suboptimal for LLM tool retrieval.
  • BoR compares success at a given depth to random selection.
  • Evaluated on three tool-selection benchmarks.
  • Tool registries range from 20 to 3,251 tools.
  • BoR is used as a reinforcement learning reward.
  • The RL agent is deliberately simple.
  • The approach treats tool count as the object of evaluation.
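
The chance correction described above can be illustrated with a minimal sketch. The summary does not give the formula, so the version below is an assumption rather than the authors' definition. It takes BoR to be the base-2 log of the observed hit rate at depth k divided by the hit rate of a uniformly random shortlist of the same depth, and it assumes each query has exactly one correct tool. The function names are hypothetical.

```python
import math


def random_hit_rate(k: int, n_tools: int) -> float:
    """Chance that a uniformly random shortlist of size k contains the
    single correct tool (assumption: exactly one correct tool per query)."""
    return min(k, n_tools) / n_tools


def bits_over_random(hit_rate: float, k: int, n_tools: int) -> float:
    """Assumed form of the metric, not taken from the source:
    log2(observed hit rate / random-baseline hit rate) at depth k."""
    if hit_rate <= 0.0:
        # A retriever that never surfaces the correct tool earns no credit.
        return float("-inf")
    return math.log2(hit_rate / random_hit_rate(k, n_tools))


# Registry of 20 tools: showing all 20 always contains the correct tool,
# but so would a random shortlist of 20, so the gain over chance is zero.
full_registry = bits_over_random(hit_rate=1.0, k=20, n_tools=20)

# A shortlist of 5 that succeeds half the time beats the 25% random baseline.
shallow = bits_over_random(hit_rate=0.5, k=5, n_tools=20)
```

Under this assumed form, a deeper shortlist only scores well if its success rate grows faster than the random baseline does, which is one way a metric could penalize padding the shortlist with irrelevant tools.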

Entities

Institutions

  • arXiv

Sources