Domain-Adaptive LLMs Improve Crisis Communication Readability
Researchers propose a domain-adaptive pipeline for crisis communication, a domain where curated parallel data is scarce. The pipeline expands a small reference corpus by retrieving and filtering data from general corpora, fine-tunes a small language model on the expanded data, and then applies preference optimization to bias outputs toward CEFR A2-level simplified English. Automatic and human evaluations show improved readability with adequacy maintained. The study suggests that simplified English combined with domain adaptation can serve as a practical lingua franca in emergencies when full multilingual coverage is unavailable.
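The corpus-expansion step can be pictured with a minimal sketch. The summary does not say how retrieval or filtering is done, so everything here is an assumption for illustration: bag-of-words cosine similarity stands in for the retriever, a length cap stands in for the filter, and the names `expand_corpus`, `threshold`, and `max_words` are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def expand_corpus(seed_sentences, general_pool, threshold=0.3, max_words=25):
    """Return general-pool sentences that resemble the seed corpus.

    threshold and max_words are hypothetical settings, not values from the study.
    """
    seed_vectors = [Counter(tokenize(s)) for s in seed_sentences]
    selected = []
    for candidate in general_pool:
        tokens = tokenize(candidate)
        # Filter step (assumed): drop empty or overly long candidates.
        if not tokens or len(tokens) > max_words:
            continue
        vec = Counter(tokens)
        # Retrieval step (assumed): keep candidates close to any seed sentence.
        best = max((cosine(vec, sv) for sv in seed_vectors), default=0.0)
        if best >= threshold:
            selected.append((best, candidate))
    # Highest-similarity candidates first.
    return [c for _, c in sorted(selected, key=lambda p: p[0], reverse=True)]


seed = [
    "Evacuate the area immediately due to flooding.",
    "Move to higher ground and avoid the river.",
]
pool = [
    "Residents should evacuate the area because of flooding.",
    "The stock market closed higher on Tuesday.",
    "Avoid the river and move to higher ground now.",
]
expanded = expand_corpus(seed, pool)
```

In this toy run the two crisis-related sentences are kept and the finance sentence is dropped. A real system would more likely use dense embeddings and stricter quality filters; the sketch only shows the shape of the retrieve-then-filter idea.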
Key facts
- Pipeline expands small reference corpus by retrieving and filtering data from general corpora
- Fine-tunes a small language model for crisis-domain translation
- Applies preference optimization to bias outputs toward CEFR A2-level English
- Automatic and human evaluations show improved readability with adequacy maintained
- Simplified English with domain adaptation proposed as lingua franca for emergency communication
- Addresses scarcity of curated parallel data in crisis communication
- Focuses on natural and human-induced disasters
- Published on arXiv under Computer Science > Computation and Language
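To make the preference-optimization point concrete, here is a minimal sketch of how readability-based preference pairs might be built. It is not the study's method: the Flesch-Kincaid grade is used only as a rough stand-in for CEFR A2 targeting, the syllable counter is a crude heuristic, and `make_preference_pair` and the `prompt`/`chosen`/`rejected` keys are hypothetical names following a common convention for preference data.

```python
import re


def count_syllables(word):
    """Rough heuristic: count groups of consecutive vowels (at least 1)."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def fk_grade(text):
    """Flesch-Kincaid grade level; lower means easier to read."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        return 0.0
    syllables = sum(count_syllables(w) for w in words)
    return (
        0.39 * len(words) / len(sentences)
        + 11.8 * syllables / len(words)
        - 15.59
    )


def make_preference_pair(source, candidate_a, candidate_b):
    """Label the easier-to-read candidate as 'chosen' (assumed criterion)."""
    if fk_grade(candidate_a) <= fk_grade(candidate_b):
        chosen, rejected = candidate_a, candidate_b
    else:
        chosen, rejected = candidate_b, candidate_a
    return {"prompt": source, "chosen": chosen, "rejected": rejected}


pair = make_preference_pair(
    "Inhabitants are advised to evacuate expeditiously.",
    "Residents are requested to evacuate the vicinity immediately.",
    "Leave the area now.",
)
```

Here the short sentence "Leave the area now." is marked as preferred. The sketch ranks on readability alone; since the study reports that adequacy is maintained, a realistic pair builder would presumably also check that the simpler candidate preserves the source meaning.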
Entities
Institutions
- arXiv