ARTFEED — Contemporary Art Intelligence

IndicMedDialog: Multilingual Medical Dialogue Dataset for Indic Languages

ai-technology · 2026-05-14

IndicMedDialog is a multilingual medical dialogue dataset covering English and nine Indic languages. It extends the existing MDDial dataset with synthetic consultations generated by large language models. Translations were produced with TranslateGemma and verified by native speakers, and a script-aware post-processing pipeline was used to correct errors. A companion model, IndicMedLM, was fine-tuned through parameter-efficient adaptation and can optionally take patient pre-context as input. The model was evaluated against zero-shot multilingual baselines.
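The summary does not describe how the script-aware post-processing works. As a rough illustration of the general idea only, one common check is to flag characters that fall outside the script expected for a target language. The sketch below is an assumption, not the authors' pipeline: the function name `off_script_chars` and the language-to-script table are hypothetical, and the four languages shown are examples rather than a statement of which nine languages the dataset covers.

```python
import unicodedata

# Hypothetical, illustrative mapping from a language code to the prefix
# that Unicode character names use for that language's script.
EXPECTED_SCRIPT = {
    "hi": "DEVANAGARI",
    "bn": "BENGALI",
    "ta": "TAMIL",
    "te": "TELUGU",
}


def off_script_chars(text: str, lang: str) -> list[str]:
    """Return the letters and combining marks in `text` whose Unicode
    name does not start with the script expected for `lang`."""
    expected = EXPECTED_SCRIPT[lang]
    flagged = []
    for ch in text:
        # Only letters (L*) and marks (M*) carry script identity here;
        # digits, punctuation, spaces, and joiners are skipped.
        if unicodedata.category(ch)[0] not in ("L", "M"):
            continue
        if not unicodedata.name(ch, "").startswith(expected):
            flagged.append(ch)
    return flagged


# A Hindi word followed by leftover English text: only the Latin
# letters are reported.
print(off_script_chars("बुखार fever", "hi"))
```

A real pipeline would presumably go further, for example by repairing rather than only flagging the offending spans, and by allowing legitimate Latin-script terms such as drug names.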

Key facts

  • Dataset covers English and nine Indic languages
  • Extends MDDial with LLM-generated synthetic consultations
  • Translations performed using TranslateGemma
  • Native speakers verified translations
  • Script-aware post-processing pipeline corrects errors
  • IndicMedLM fine-tuned via parameter-efficient adaptation
  • Model incorporates optional patient pre-context
  • Evaluated against zero-shot multilingual baselines
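The optional patient pre-context can be pictured as an extra field that is prepended to the dialogue only when it is available. The following is a minimal sketch under that assumption; the function `build_prompt`, the "Patient context:" label, and the turn format are all hypothetical and are not taken from IndicMedLM.

```python
from typing import Optional


def build_prompt(
    turns: list[tuple[str, str]],
    pre_context: Optional[str] = None,
) -> str:
    """Assemble a model input from (speaker, utterance) turns, with an
    optional pre-context line placed before the dialogue."""
    lines = []
    if pre_context:
        # Hypothetical label; included only when pre-context is given.
        lines.append(f"Patient context: {pre_context}")
    for speaker, utterance in turns:
        lines.append(f"{speaker}: {utterance}")
    # Leave the next doctor turn open for the model to complete.
    lines.append("Doctor:")
    return "\n".join(lines)


turns = [("Patient", "I have had a fever for two days.")]
print(build_prompt(turns))
print(build_prompt(turns, pre_context="Age 40, no known allergies"))
```

Keeping the pre-context as a separate, optional argument means the same function serves both configurations, so runs with and without it can be compared directly.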
