Bits-over-Random: A New Measure Exposing Random-Level Retrieval Despite 99% Success
A recent paper on arXiv presents Bits-over-Random (BoR), a chance-corrected measure of retrieval selectivity that exposes cases where a high success rate disguises random-level performance. The authors contend that conventional information retrieval systems were designed for human users who can sift through irrelevant results, whereas large language models lack this filtering ability and therefore need cleaner retrieved context. BoR is calculated as log2(P_obs/P_rand), where P_obs is the observed success rate and P_rand is the hypergeometric baseline for a specified success criterion (such as having at least one relevant document in the top-K). In experiments on the 20 Newsgroups dataset, both BM25 and SPLADE exceed 99% success at K=100, yet BoR approaches zero: a random draw of the same size would succeed almost as often, so the near-perfect success rate says little about how selective the retriever is. The paper can be found on arXiv with the identifier 2605.18857.
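The calculation can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' code: the function names are hypothetical, and the corpus size, number of relevant documents, and observed success rates are made-up values chosen only to show the effect. For the "at least one relevant in the top-K" criterion, the hypergeometric baseline is taken as 1 - C(N-R, K)/C(N, K), the chance that K documents drawn uniformly at random from N include at least one of the R relevant ones.

```python
from math import comb, log2


def p_rand_at_least_one(n_docs: int, n_relevant: int, k: int) -> float:
    """Chance that k documents drawn at random (without replacement)
    from n_docs contain at least one of the n_relevant documents."""
    # comb() returns 0 when k exceeds the number of non-relevant
    # documents, so the baseline correctly becomes 1.0 in that case.
    p_miss_all = comb(n_docs - n_relevant, k) / comb(n_docs, k)
    return 1.0 - p_miss_all


def bits_over_random(p_obs: float, p_rand: float) -> float:
    """BoR = log2(P_obs / P_rand): bits of selectivity above chance."""
    return log2(p_obs / p_rand)


# Illustrative values only (assumptions, not figures from the paper):
# 18,000 documents, 900 of them relevant to a given query.
N_DOCS, N_RELEVANT = 18_000, 900

# Large K: random selection almost always hits a relevant document,
# so even a very high observed success rate yields BoR close to zero.
p_rand_k100 = p_rand_at_least_one(N_DOCS, N_RELEVANT, 100)
bor_k100 = bits_over_random(0.995, p_rand_k100)

# Small K: the random baseline is low, so a good retriever earns
# several bits over random.
p_rand_k1 = p_rand_at_least_one(N_DOCS, N_RELEVANT, 1)
bor_k1 = bits_over_random(0.80, p_rand_k1)

print(f"K=100: P_rand={p_rand_k100:.4f}, BoR={bor_k100:.4f} bits")
print(f"K=1:   P_rand={p_rand_k1:.4f}, BoR={bor_k1:.4f} bits")
```

With these assumed numbers the baseline at K=100 is already above 99%, so the same success rate that looks excellent in isolation contributes essentially no bits, while at K=1 an 80% success rate is worth about four bits over a 5% baseline.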
Key facts
- Paper introduces Bits-over-Random (BoR) as a chance-corrected measure of retrieval selectivity.
- BoR is defined as log2(P_obs/P_rand).
- P_rand is the hypergeometric baseline for coverage (≥1 relevant in top-K).
- Tested on the 20 Newsgroups dataset with BM25 and SPLADE.
- Both BM25 and SPLADE exceed 99% success at K=100, while BoR approaches zero.
- BoR ≈ 0 indicates performance equivalent to random selection.
- Authors argue LLMs lack human filtering ability, requiring cleaner retrieval.
- Paper available on arXiv: 2605.18857.
Entities
Institutions
- arXiv