KoALa-Bench: New Benchmark for Korean Audio Language Models
KoALa-Bench is a comprehensive benchmark for evaluating how well large audio language models (LALMs) understand Korean speech and how faithfully they use it. It comprises six tasks. Four cover core speech understanding: automatic speech recognition, speech translation, speech question answering, and speech instruction following. The other two evaluate speech faithfulness, targeting the common problem that LALMs do not fully leverage the speech modality. The benchmark also incorporates Korea-specific knowledge, such as listening questions from the Korean college scholastic ability test. It thereby helps fill the shortage of non-English benchmarks for LALMs, particularly for Korean.
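The task breakdown described above can be pictured as a small registry. This is only an illustrative sketch, not the benchmark's actual code or data format: the `Task` class, the identifier strings, and the `tasks_per_category` helper are all hypothetical. In particular, the summary does not name the two faithfulness tasks, so they appear here as placeholders.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    name: str      # hypothetical identifier, not an official benchmark name
    category: str  # "understanding" or "faithfulness"


# Four speech understanding tasks are named in the summary; the two
# faithfulness tasks are not, so placeholder names are used for them.
TASKS = [
    Task("automatic_speech_recognition", "understanding"),
    Task("speech_translation", "understanding"),
    Task("speech_question_answering", "understanding"),
    Task("speech_instruction_following", "understanding"),
    Task("faithfulness_task_a", "faithfulness"),  # placeholder
    Task("faithfulness_task_b", "faithfulness"),  # placeholder
]


def tasks_per_category(tasks):
    """Count how many tasks fall into each category."""
    return dict(Counter(task.category for task in tasks))


if __name__ == "__main__":
    print(tasks_per_category(TASKS))
```

Grouping by category in this way would let an evaluation script report speech understanding and speech faithfulness results separately, which mirrors the two-part split the benchmark describes.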
Key facts
- KoALa-Bench evaluates Korean speech understanding and faithfulness of LALMs.
- The benchmark includes six tasks: four for speech understanding and two for faithfulness.
- Tasks cover automatic speech recognition, speech translation, speech question answering, and speech instruction following.
- The speech faithfulness tasks address LALMs' failure to fully leverage the speech modality.
- The benchmark incorporates Korea-specific knowledge, such as listening questions from the Korean college scholastic ability test.
- It addresses the lack of non-English benchmarks for LALMs, especially for Korean.
- The work is published on arXiv with ID 2604.19782.
- The arXiv announcement type is "cross" (a cross-listing).
Entities
Institutions
- arXiv