Consensus Entropy: Multi-VLM Agreement for Self-Verifying OCR
Researchers have introduced Consensus Entropy (CE), a training-free metric that estimates output reliability in Vision-Language Models (VLMs) by measuring the entropy of agreement across multiple models' outputs. The method rests on the observation that correct predictions converge in output space while errors diverge. Building on this, the CE-OCR framework verifies and selects outputs via ensemble agreement, and uses adaptive routing to spend extra compute only where agreement is low. Experiments show CE improves F1 scores by 42.1% over a VLM-as-Judge baseline on quality verification, and CE-OCR outperforms self-consistency and single-model baselines on OCR tasks.
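The summary above does not reproduce the paper's exact formula, but the idea of agreement-based verification can be sketched as follows. This is a hypothetical illustration, assuming pairwise normalized edit distance as the disagreement measure between OCR outputs and a medoid-style rule for selecting the consensus output; the function names (`consensus_divergence`, `select_output`) are this sketch's own, not the paper's.

```python
def levenshtein(a: str, b: str) -> int:
    # Classic dynamic-programming edit distance between two strings.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + (ca != cb)))  # substitution
        prev = curr
    return prev[-1]

def consensus_divergence(outputs: list[str]) -> float:
    """Mean pairwise normalized edit distance among model outputs.
    0.0 = perfect agreement; larger values signal divergence, which
    (per the paper's observation) correlates with errors."""
    n = len(outputs)
    if n < 2:
        return 0.0
    total, pairs = 0.0, 0
    for i in range(n):
        for j in range(i + 1, n):
            denom = max(len(outputs[i]), len(outputs[j])) or 1
            total += levenshtein(outputs[i], outputs[j]) / denom
            pairs += 1
    return total / pairs

def select_output(outputs: list[str]) -> int:
    """Pick the index of the output closest on average to all others
    (a medoid), i.e. the candidate the ensemble most agrees with."""
    def avg_dist(k: int) -> float:
        return sum(levenshtein(outputs[k], o) / (max(len(outputs[k]), len(o)) or 1)
                   for m, o in enumerate(outputs) if m != k)
    return min(range(len(outputs)), key=avg_dist)
```

Under this reading, a low divergence score lets the system accept the medoid output without ground truth, which is what makes the verification self-supervised and training-free.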
Key facts
- Consensus Entropy (CE) is a training-free, model-agnostic metric.
- CE measures inter-model agreement entropy to estimate output reliability.
- CE-OCR is a lightweight multi-model framework for OCR verification and selection.
- CE-OCR uses adaptive routing to improve efficiency.
- CE improves F1 scores by 42.1% over VLM-as-Judge.
- CE-OCR outperforms self-consistency and single-model baselines.
- The research is published on arXiv (2504.11101).
- OCR is fundamental to VLMs and high-quality data generation for LLM training.
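The adaptive-routing point in the list above can be illustrated with a minimal sketch: run cheap models first and escalate to a stronger (costlier) model only when the ensemble fails to agree. The interface (`model(image) -> str`), the exact-match agreement check, and the `min_agree` threshold are all assumptions of this sketch, not details from the paper, which uses its entropy-based agreement score.

```python
from collections import Counter

def ocr_with_routing(image, cheap_models, strong_model, min_agree=2):
    """Hypothetical adaptive routing: accept the majority answer of the
    cheap ensemble if enough models agree exactly; otherwise escalate
    the image to a single stronger model."""
    outputs = [m(image) for m in cheap_models]
    text, count = Counter(outputs).most_common(1)[0]
    if count >= min_agree:
        return text          # consensus reached: no extra compute spent
    return strong_model(image)  # disagreement: pay for the strong model
```

The efficiency gain comes from the routing asymmetry: most inputs are easy, so the strong model is only invoked on the disagreement tail.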