ARTFEED — Contemporary Art Intelligence

Zero-Shot Confidence Estimation for Small LLMs: When Supervised Baselines Aren't Worth Training

ai-technology · 2026-05-06

A recent study posted to arXiv (2605.02241) asks whether small language models (7-8B parameters) can assess their own correctness without supervised training data. The analysis evaluates zero-shot confidence signals against RouteLLM-style supervised baselines across three model families and two datasets (1,000 and 500 queries per model). Average token log-probability, a zero-shot signal, matches or surpasses the supervised baselines in-distribution (AUROC 0.650-0.714 vs. 0.644-0.676) and clearly outperforms them out-of-distribution (0.717-0.833 vs. 0.512-0.564). For local-to-cloud routing systems, the findings suggest that confidence estimation may not require supervised training, which lowers deployment cost.
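The zero-shot signal in question is simple to compute: it is the mean of the per-token log-probabilities the model assigns to its own generated answer. A minimal sketch (the function name and example values are illustrative, not from the paper; most inference APIs expose per-token log-probabilities):

```python
import math

def avg_token_logprob(token_logprobs):
    """Average per-token log-probability of a generated answer.

    `token_logprobs` holds one log-probability per generated token.
    Values closer to 0 indicate higher model confidence.
    """
    if not token_logprobs:
        raise ValueError("empty token sequence")
    return sum(token_logprobs) / len(token_logprobs)

# A confidently generated answer (token probabilities near 1)
# scores higher than an uncertain one.
confident = [math.log(0.9), math.log(0.95), math.log(0.85)]
uncertain = [math.log(0.3), math.log(0.5), math.log(0.4)]
assert avg_token_logprob(confident) > avg_token_logprob(uncertain)
```

Because the signal is a by-product of generation, it costs nothing extra at inference time, which is what makes it attractive relative to training a separate supervised confidence model.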

Key facts

  • Study compares zero-shot confidence signals against supervised baselines for small LLMs.
  • Three 7-8B model families tested on two datasets (1,000 and 500 queries per model).
  • Average token log-probability matches supervised baselines in-distribution (AUROC 0.650-0.714 vs. 0.644-0.676).
  • Zero-shot method outperforms supervised baselines out-of-distribution (AUROC 0.717-0.833 vs. 0.512-0.564).
  • Research published on arXiv with ID 2605.02241.
  • Implications for cost-effective local-to-cloud query routing.
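The routing implication above can be sketched as a simple threshold rule: answer locally when the small model's self-reported confidence is high enough, otherwise escalate to a larger cloud model. The threshold and model names below are hypothetical placeholders, not values from the study:

```python
def route(confidence, threshold=-0.5,
          local_model="local-7b", cloud_model="cloud-large"):
    """Route a query based on the local model's average token
    log-probability for its draft answer.

    Escalates to the cloud model when confidence falls below the
    threshold. Threshold and model names are illustrative only.
    """
    return local_model if confidence >= threshold else cloud_model

# High confidence stays local; low confidence escalates.
assert route(-0.1) == "local-7b"
assert route(-2.0) == "cloud-large"
```

In practice the threshold would be tuned on a validation set to trade off cloud cost against answer quality, which is exactly the setting where the reported AUROC gaps matter.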

Entities

Institutions

  • arXiv

Sources