Consensus Entropy: Multi-VLM Agreement for Self-Verifying OCR
Researchers have introduced Consensus Entropy (CE), a training-free metric that estimates output reliability in Vision-Language Models (VLMs) by measuring the entropy of agreement across multiple models' outputs. The method rests on the observation that correct predictions converge in output space while errors diverge. Building on this, the CE-OCR framework verifies and selects outputs via ensemble agreement, and uses adaptive routing to spend extra compute only where agreement is low. Experiments show CE improves F1 scores by 42.1% over a VLM-as-Judge baseline on quality verification, and CE-OCR outperforms self-consistency and single-model baselines on OCR tasks.
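The summary above does not reproduce the paper's exact formula, but the idea of agreement-based verification can be sketched as follows. This is a hypothetical illustration, assuming pairwise normalized edit distance as the disagreement measure between OCR outputs and a medoid-style rule for selecting the consensus output; the function names (`consensus_divergence`, `select_output`) are this sketch's own, not the paper's.

```python
def levenshtein(a: str, b: str) -> int:
    # Classic dynamic-programming edit distance between two strings.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + (ca != cb)))  # substitution
        prev = curr
    return prev[-1]

def consensus_divergence(outputs: list[str]) -> float:
    """Mean pairwise normalized edit distance among model outputs.
    0.0 = perfect agreement; larger values signal divergence, which
    (per the paper's observation) correlates with errors."""
    n = len(outputs)
    if n < 2:
        return 0.0
    total, pairs = 0.0, 0
    for i in range(n):
        for j in range(i + 1, n):
            denom = max(len(outputs[i]), len(outputs[j])) or 1
            total += levenshtein(outputs[i], outputs[j]) / denom
            pairs += 1
    return total / pairs

def select_output(outputs: list[str]) -> int:
    """Pick the index of the output closest on average to all others
    (a medoid), i.e. the candidate the ensemble most agrees with."""
    def avg_dist(k: int) -> float:
        return sum(levenshtein(outputs[k], o) / (max(len(outputs[k]), len(o)) or 1)
                   for m, o in enumerate(outputs) if m != k)
    return min(range(len(outputs)), key=avg_dist)
```

Under this reading, a low divergence score lets the system accept the medoid output without ground truth, which is what makes the verification self-supervised and training-free.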
Key facts
- Consensus Entropy (CE) is a training-free, model-agnostic metric.
- CE measures inter-model agreement entropy to estimate output reliability.
- CE-OCR is a lightweight multi-model framework for OCR verification and selection.
- CE-OCR uses adaptive routing to improve efficiency.
- CE improves F1 scores by 42.1% over VLM-as-Judge.
- CE-OCR outperforms self-consistency and single-model baselines.
- The research is published on arXiv (2504.11101).
- OCR is fundamental to VLMs and high-quality data generation for LLM training.
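The adaptive-routing point in the list above can be illustrated with a minimal sketch: run cheap models first and escalate to a stronger (costlier) model only when the ensemble fails to agree. The interface (`model(image) -> str`), the exact-match agreement check, and the `min_agree` threshold are all assumptions of this sketch, not details from the paper, which uses its entropy-based agreement score.

```python
from collections import Counter

def ocr_with_routing(image, cheap_models, strong_model, min_agree=2):
    """Hypothetical adaptive routing: accept the majority answer of the
    cheap ensemble if enough models agree exactly; otherwise escalate
    the image to a single stronger model."""
    outputs = [m(image) for m in cheap_models]
    text, count = Counter(outputs).most_common(1)[0]
    if count >= min_agree:
        return text          # consensus reached: no extra compute spent
    return strong_model(image)  # disagreement: pay for the strong model
```

The efficiency gain comes from the routing asymmetry: most inputs are easy, so the strong model is only invoked on the disagreement tail.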