MIMO: New Framework for Multilingual Information Retrieval
Researchers propose MIMO (Multilingual Information Retrieval via Monolingual Objectives), a two-stage framework to improve cross-lingual retrieval. The motivation is twofold: existing embedding models optimized for multi-monolingual retrieval degrade in multilingual information retrieval (MLIR) settings, and contrastive learning can cause embeddings to cluster by language. MIMO uses the stable English semantic space of a teacher model as an anchor. It first initializes the student model's cross-lingual alignment through knowledge distillation, then jointly optimizes distillation and cross-lingual contrastive learning. The paper is available on arXiv.
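As a rough illustration of how a distillation objective and a contrastive objective can be combined into one training loss, here is a minimal Python sketch. It is not the paper's implementation: the mean-squared-error distillation term, the InfoNCE contrastive term with in-batch negatives, the weighting parameter `alpha`, the `temperature` value, and all function names are assumptions chosen for illustration.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def distillation_loss(student_vecs, teacher_vecs):
    """Mean squared error between student embeddings and the teacher's
    English embeddings of the parallel text (assumed loss form)."""
    total = 0.0
    for s, t in zip(student_vecs, teacher_vecs):
        total += sum((a - b) ** 2 for a, b in zip(s, t)) / len(s)
    return total / len(student_vecs)


def contrastive_loss(query_vecs, doc_vecs, temperature=0.05):
    """InfoNCE with in-batch negatives: query i should score highest
    against document i (assumed loss form)."""
    queries = [normalize(q) for q in query_vecs]
    docs = [normalize(d) for d in doc_vecs]
    total = 0.0
    for i, q in enumerate(queries):
        logits = [dot(q, d) / temperature for d in docs]
        m = max(logits)  # subtract the max for a stable log-sum-exp
        log_z = m + math.log(sum(math.exp(x - m) for x in logits))
        total += log_z - logits[i]
    return total / len(queries)


def joint_loss(student_queries, student_docs, teacher_anchors,
               alpha=0.5, temperature=0.05):
    """Weighted sum of both terms; alpha is a hypothetical mixing weight."""
    kd = distillation_loss(student_queries, teacher_anchors)
    cl = contrastive_loss(student_queries, student_docs, temperature)
    return alpha * kd + (1 - alpha) * cl
```

Under these assumptions, a distillation-only initialization stage corresponds to `alpha=1.0`, and a joint stage corresponds to any `alpha` strictly between 0 and 1. The distillation term keeps student embeddings tied to the teacher's English anchor while the contrastive term separates relevant from irrelevant documents.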
Key facts
- MIMO stands for Multilingual Information Retrieval via Monolingual Objectives
- It is a two-stage framework
- It uses a stable English semantic space from a teacher model
- It addresses the degradation of existing multi-monolingual embedding models in MLIR settings
- It initializes cross-lingual alignment through knowledge distillation
- It jointly optimizes distillation and cross-lingual contrastive learning
- The paper is on arXiv with ID 2605.31171
- It aims to improve retrieval discriminability
Entities
Institutions
- arXiv