ARTFEED — Contemporary Art Intelligence

Confidence-Aware Training Boosts Medical ASR for Dravidian Languages

other · 2026-04-24

A new confidence-aware training framework improves automatic speech recognition (ASR) for two low-resource Dravidian languages, Telugu and Kannada, in the medical domain. The approach trains on both real and synthetic speech, using a hybrid confidence mechanism that combines static perceptual and acoustic similarity metrics with dynamic model entropy. Two aggregation strategies, fixed-weight and learnable-weight, guide how samples are weighted during training. The framework is evaluated on medical datasets containing real recordings and TTS-generated speech, with a 5-gram KenLM language model applied for post-decoding correction, and it outperforms direct fine-tuning.
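To make the fixed-weight variant of the hybrid confidence idea concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation. The function names, the blend weight `alpha`, the use of normalized entropy as the dynamic signal, and the weighted-mean loss are all hypothetical choices. The static score is assumed to be a similarity value already scaled to [0, 1].

```python
import math


def normalized_entropy(probs):
    """Shannon entropy of a probability vector, scaled to [0, 1]."""
    if len(probs) < 2:
        return 0.0
    h = -sum(p * math.log(p) for p in probs if p > 0.0)
    return h / math.log(len(probs))


def hybrid_confidence(static_score, probs, alpha=0.5):
    """Fixed-weight blend of a static score and a dynamic score.

    static_score: assumed similarity of a sample to real speech, in [0, 1].
    probs: the model's output distribution for the sample (hypothetical input).
    alpha: assumed fixed weight on the static term.
    """
    # Low entropy means the model is certain, so confidence is 1 - entropy.
    dynamic_score = 1.0 - normalized_entropy(probs)
    return alpha * static_score + (1.0 - alpha) * dynamic_score


def weighted_loss(losses, confidences):
    """Confidence-weighted mean of per-sample losses."""
    total = sum(confidences)
    if total == 0.0:
        return 0.0
    return sum(l * c for l, c in zip(losses, confidences)) / total
```

In this sketch, a synthetic sample with a low static score and a high-entropy prediction contributes little to the loss. A learnable-weight variant would replace the constant `alpha` with a parameter updated during training.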

Key facts

  • Focus on Telugu and Kannada languages
  • Medical domain ASR
  • Hybrid confidence mechanism with static and dynamic metrics
  • Fixed-weight and learnable-weight aggregation strategies
  • Evaluation on real and TTS-generated synthetic speech
  • 5-gram KenLM language model for post-decoding correction
  • Addresses limited annotated data and morphological complexity
  • Proposed framework outperforms direct fine-tuning
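
The summary does not say how the language model is applied after decoding. One common option is n-best rescoring, sketched below in Python as an assumption rather than the reported method. The function `rescore_nbest`, the `lm_weight` parameter, and the toy English scorer are hypothetical stand-ins. In practice the scorer would be a trained 5-gram model for the target language.

```python
def rescore_nbest(hypotheses, lm_score, lm_weight=0.5):
    """Return the hypothesis text with the best combined score.

    hypotheses: list of (text, asr_log_score) pairs from the decoder.
    lm_score: callable mapping a sentence to a log-probability.
    lm_weight: assumed interpolation weight for the language model.
    """
    def combined(item):
        text, asr_score = item
        return asr_score + lm_weight * lm_score(text)

    return max(hypotheses, key=combined)[0]


# Toy stand-in for a language model: per-word log-probabilities (made up).
TOY_LOGPROBS = {"patient": -1.0, "has": -1.0, "fever": -1.5, "fewer": -4.0}


def toy_lm_score(sentence):
    """Sum of per-word log-probabilities; unseen words get a fixed penalty."""
    return sum(TOY_LOGPROBS.get(word, -6.0) for word in sentence.split())


nbest = [("patient has fewer", -3.0), ("patient has fever", -3.2)]
best = rescore_nbest(nbest, toy_lm_score)
```

Here the acoustic scores slightly favor "patient has fewer", but the language model term shifts the choice to "patient has fever".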
