ARTFEED — Contemporary Art Intelligence

OrcaRouter: Hybrid LLM Router Achieves 72.08 Arena Score

ai-technology · 2026-06-01

Researchers have introduced OrcaRouter, a large language model (LLM) router designed for production use. It is built on a LinUCB contextual bandit that operates over lexical and sentence-embedding features of each prompt. The router follows a hybrid offline-online learning protocol. In the offline stage, OrcaRouter evaluates every candidate model on a curated set of routing prompts and collects the resulting feedback into a reward matrix, which is used to fit one ridge regressor per arm (one arm per candidate model). At deployment, the router initializes from these offline parameters and can continue learning from feedback, updating only the arm of the model it selected. As of May 20, 2026, OrcaRouter-Adaptive ranked second on the RouterArena leaderboard, with an arena score of 72.08 and 75.54% accuracy.
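
The offline-then-online protocol described above can be illustrated with a minimal sketch of a warm-started LinUCB bandit. This is not OrcaRouter's code: the class name WarmStartLinUCB, the parameters alpha and ridge, the scalar reward, and the synthetic data are all assumptions made for illustration, and the feature vector x stands in for the lexical and sentence-embedding features, which are not computed here.

```python
import numpy as np


class WarmStartLinUCB:
    """Hypothetical LinUCB router: one ridge regressor per arm (candidate model)."""

    def __init__(self, n_arms, dim, alpha=1.0, ridge=1.0):
        self.alpha = alpha  # exploration weight (assumed hyperparameter)
        # Per arm: A = ridge * I + sum(x x^T), b = sum(reward * x).
        self.A = np.stack([ridge * np.eye(dim) for _ in range(n_arms)])
        self.b = np.zeros((n_arms, dim))

    def fit_offline(self, X, R):
        """Warm start. X: (n_prompts, dim) features; R: (n_prompts, n_arms) rewards."""
        gram = X.T @ X  # every arm saw every offline prompt, so the Gram matrix is shared
        for arm in range(R.shape[1]):
            self.A[arm] += gram
            self.b[arm] += X.T @ R[:, arm]

    def scores(self, x):
        """Upper confidence bound per arm: ridge prediction plus exploration bonus."""
        out = np.empty(len(self.A))
        for arm in range(len(self.A)):
            theta = np.linalg.solve(self.A[arm], self.b[arm])  # ridge solution
            width = np.sqrt(x @ np.linalg.solve(self.A[arm], x))
            out[arm] = theta @ x + self.alpha * width
        return out

    def select(self, x):
        return int(np.argmax(self.scores(x)))

    def update(self, arm, x, reward):
        """Online step: only the selected arm's statistics change."""
        self.A[arm] += np.outer(x, x)
        self.b[arm] += reward * x


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 4))            # synthetic prompt features
    R = X @ rng.normal(size=(4, 3))          # synthetic reward matrix for 3 arms
    router = WarmStartLinUCB(n_arms=3, dim=4, alpha=0.5)
    router.fit_offline(X, R)
    x = rng.normal(size=4)
    arm = router.select(x)
    router.update(arm, x, reward=1.0)        # feedback for the chosen arm only
    print("selected arm:", arm)
```

In this sketch the offline fit and the online update modify the same per-arm statistics, so deployment simply continues the ridge regression that the offline stage began.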

Key facts

  • OrcaRouter is a production-oriented LLM router
  • It uses a LinUCB-based contextual bandit over lexical and sentence-embedding features
  • It employs a hybrid offline-online learning protocol
  • Offline training evaluates each candidate model on curated routing prompts
  • The resulting reward matrix is used to fit one ridge regressor per arm
  • At deployment, it initializes from the offline parameters and can continue learning, updating only the selected arm
  • OrcaRouter-Adaptive ranked second on the RouterArena leaderboard as of May 20, 2026
  • It achieved an arena score of 72.08 and 75.54% accuracy

Entities

Institutions

  • arXiv
  • RouterArena

Sources