ARTFEED — Contemporary Art Intelligence

Active Learning Improves LLM-Based Pairwise Ranking

other · 2026-05-16

Researchers propose reframing Pairwise Ranking Prompting (PRP) for large language models (LLMs) as an active learning problem over noisy pairwise comparisons. Conventional PRP feeds pairwise preferences into sorting algorithms, but LLM judgments can be noisy, order-sensitive, and intransitive, violating the assumptions sorting relies on. The proposed framework uses a randomized-direction oracle that issues a single LLM call per pair, turning systematic position bias into zero-mean noise and enabling unbiased score aggregation without bidirectional calls. Under a call budget, active rankers serve as drop-in replacements for sorting and improve NDCG@10 per call.
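A minimal sketch of the randomized-direction idea, assuming a pairwise judge callable `llm_judge` (a hypothetical stand-in for the actual LLM preference call, whose prompt and API the summary does not specify):

```python
import random

def randomized_direction_oracle(llm_judge, a, b, rng=random):
    """One call per pair: flip a fair coin for presentation order, then
    un-flip the verdict, so any systematic first-position bias in the
    judge averages out to zero-mean noise across queries."""
    # llm_judge(x, y) -> True if the judge prefers x when x is shown first.
    # (Hypothetical stand-in for a real LLM preference call.)
    if rng.random() < 0.5:
        return llm_judge(a, b)       # present as (a, b)
    return not llm_judge(b, a)       # present as (b, a), invert the verdict
```

Because each direction is chosen with probability 1/2, a judge that systematically favors the first slot inflates the win counts of both items equally in expectation, so aggregate scores stay unbiased without paying for a second, reversed call.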

Key facts

  • PRP elicits pairwise preference judgments from an LLM
  • Judgments are noisy, order-sensitive, and sometimes intransitive
  • Sorting aims to recover a full permutation
  • Truncating a sort to fit a call budget does not yield a reliable top-K
  • Active rankers are drop-in replacements for sorting
  • Active rankers improve NDCG@10 per call
  • Randomized-direction oracle uses a single LLM call per pair
  • Approach converts systematic position bias into zero-mean noise
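The summary does not specify which active ranker the authors use; as an illustrative sketch only, a toy uncertainty-driven ranker that spends a fixed call budget on the currently most ambiguous pair might look like:

```python
from itertools import combinations

def active_rank(items, oracle, budget):
    """Toy budget-constrained active ranker (not the paper's algorithm):
    each call goes to the pair whose empirical win rates are closest,
    then items are ranked by win rate."""
    wins = {x: 0 for x in items}
    counts = {x: 0 for x in items}

    def score(x):
        # Empirical win rate; an uncompared item sits at the neutral 0.5.
        return wins[x] / counts[x] if counts[x] else 0.5

    for _ in range(budget):
        # Most ambiguous pair = smallest gap in estimated scores.
        a, b = min(combinations(items, 2),
                   key=lambda p: abs(score(p[0]) - score(p[1])))
        if oracle(a, b):
            wins[a] += 1
        else:
            wins[b] += 1
        counts[a] += 1
        counts[b] += 1

    return sorted(items, key=score, reverse=True)
```

The `oracle` argument can be any single-call pairwise judge, e.g. a randomized-direction wrapper around an LLM, so the ranker consumes exactly one LLM call per iteration.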

Entities

Institutions

  • arXiv

Sources