SCRIBE: Diagnostic Framework for Indic ASR Error Analysis

publication · 2026-05-22

Researchers have developed SCRIBE, a diagnostic framework for evaluating automatic speech recognition (ASR). It breaks recognition errors down into four categories: lexical, punctuation, numeral, and domain-entity. Word error rate (WER) collapses these error types into a single number and unfairly penalizes agglutinative languages such as Hindi, Malayalam, and Kannada; to address this, SCRIBE uses sandhi-tolerant alignment and domain vocabulary injection. Human validation indicates that SCRIBE’s assessments agree with expert judgment more closely than WER does. The release includes an LLM curation pipeline, benchmarks, and open-weight rich transcription models for the three languages. The paper is available on arXiv under Computer Science, Computation and Language.
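To make the WER problem concrete, here is a minimal Python sketch, not SCRIBE's actual algorithm. It computes plain word-level edit distance and then a variant that treats a hypothesis token as correct when it is the joined form of two adjacent reference tokens (or the reverse). The function names, the single vowel rule (a + i becoming e, as in raja + indra giving rajendra), and the romanized toy tokens are all assumptions chosen for illustration; a real system would need language-specific sandhi rules.

```python
from typing import List, Set


def token_edit_distance(ref: List[str], hyp: List[str]) -> int:
    """Plain word-level Levenshtein distance (the numerator of WER)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (r != h)))  # match or substitution
        prev = cur
    return prev[-1]


def joined_forms(a: str, b: str) -> Set[str]:
    """Hypothetical joining rules: plain concatenation plus one toy sandhi rule."""
    forms = {a + b}
    if a.endswith("a") and b.startswith("i"):
        forms.add(a[:-1] + "e" + b[1:])  # a + i -> e (toy rule, an assumption)
    return forms


def merge_tolerant_distance(ref: List[str], hyp: List[str]) -> int:
    """Edit distance where a two-to-one merge or one-to-two split costs nothing."""
    n, m = len(ref), len(hyp)
    d = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        d[i][0] = i
    for j in range(1, m + 1):
        d[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best = min(d[i - 1][j] + 1,
                       d[i][j - 1] + 1,
                       d[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1]))
            # Two reference words written as one hypothesis word.
            if i >= 2 and hyp[j - 1] in joined_forms(ref[i - 2], ref[i - 1]):
                best = min(best, d[i - 2][j - 1])
            # One reference word written as two hypothesis words.
            if j >= 2 and ref[i - 1] in joined_forms(hyp[j - 2], hyp[j - 1]):
                best = min(best, d[i - 1][j - 2])
            d[i][j] = best
    return d[n][m]


ref = ["raja", "indra", "jayati"]
hyp = ["rajendra", "jayati"]
print(token_edit_distance(ref, hyp) / len(ref))      # plain WER: 2 edits over 3 words
print(merge_tolerant_distance(ref, hyp) / len(ref))  # tolerant variant: 0.0
```

In this toy case the hypothesis is an acceptable written form of the reference, yet plain WER charges one substitution and one deletion. The tolerant variant scores it as error-free.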

Key facts

SCRIBE provides categorical error decomposition for ASR.
It reports separate lexical, punctuation, numeral, and domain-entity error rates.
Sandhi-tolerant alignment addresses agglutinative language issues.
Domain vocabulary injection improves domain-specific recognition.
Human validation indicates SCRIBE aligns with expert judgment more closely than WER.
WER fails by collapsing error types and penalizing agglutinative languages.
Open-weight rich transcription models are released for Hindi, Malayalam, and Kannada.
SCRIBE includes an LLM curation pipeline and benchmarks.
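
The categorical decomposition listed above can be sketched roughly as follows. This is an illustration under stated assumptions, not the paper's definition: the `categorize` heuristics, the use of `difflib.SequenceMatcher` for alignment (which is not guaranteed to find a minimal edit script), the English toy sentence, and the choice to divide every category count by the total number of reference tokens are all hypothetical.

```python
import unicodedata
from collections import Counter
from difflib import SequenceMatcher
from typing import Dict, List, Set

CATEGORIES = ("lexical", "punctuation", "numeral", "domain_entity")


def categorize(token: str, domain_vocab: Set[str]) -> str:
    """Assign a token to one of the four categories (heuristics are assumptions)."""
    if token in domain_vocab:
        return "domain_entity"
    if all(unicodedata.category(ch).startswith("P") for ch in token):
        return "punctuation"
    if any(ch.isdigit() for ch in token):
        return "numeral"
    return "lexical"


def error_counts(ref: List[str], hyp: List[str], domain_vocab: Set[str]) -> Counter:
    """Count erroneous tokens per category from a token-level alignment."""
    counts = Counter({c: 0 for c in CATEGORIES})
    matcher = SequenceMatcher(a=ref, b=hyp, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue
        # Charge the error to the reference tokens; for pure insertions,
        # charge it to the inserted hypothesis tokens instead.
        tokens = ref[i1:i2] if i2 > i1 else hyp[j1:j2]
        for tok in tokens:
            counts[categorize(tok, domain_vocab)] += 1
    return counts


def error_rates(ref: List[str], hyp: List[str], domain_vocab: Set[str]) -> Dict[str, float]:
    """Per-category error rate, normalized by reference length (an assumption)."""
    counts = error_counts(ref, hyp, domain_vocab)
    return {c: counts[c] / len(ref) for c in CATEGORIES}


domain_vocab = {"paracetamol", "mg"}  # hypothetical domain vocabulary
ref = ["the", "dose", "is", "500", "mg", "of", "paracetamol", "."]
hyp = ["the", "dose", "is", "50", "mg", "of", "parasite", "mall"]
counts = error_counts(ref, hyp, domain_vocab)
print(dict(counts))
print(error_rates(ref, hyp, domain_vocab))
```

Here WER alone would report three errors out of eight words. The breakdown shows that none of them is an ordinary lexical error: one is a numeral, one a domain entity, and one a punctuation mark.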
