CrossCult-KIBench: Benchmark for Cross-Cultural Knowledge in MLLMs
CrossCult-KIBench establishes a new standard for assessing how well cross-cultural knowledge can be inserted into Multimodal Large Language Models (MLLMs). The benchmark addresses a known weakness: because MLLMs are predominantly trained on English-language data, they often generate culturally insensitive or misaligned outputs. It comprises 9,800 image-grounded examples spanning 49 culturally significant visual scenarios across English, Chinese, and Arabic cultures, and supports evaluation in both single-insert and sequential-insert settings. The authors also propose Memory-Conditioned Knowledge Insertion (MCKI) as a baseline method. The work is available on arXiv under identifier 2605.06115.
Key facts
- CrossCult-KIBench is a benchmark for cross-cultural knowledge insertion in MLLMs.
- It includes 9,800 image-grounded cases.
- Covers 49 culturally relevant visual scenarios.
- Supports English, Chinese, and Arabic language-culture groups.
- Evaluates single-insert and sequential-insert settings.
- Proposes MCKI as a baseline method.
- Published on arXiv with ID 2605.06115.
- Addresses cultural misalignment in MLLMs.
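The distinction between the two evaluation settings can be sketched in toy form: in the single-insert setting each case is tested on a freshly edited model, while in the sequential-insert setting all facts are inserted into one model in order, so earlier edits must survive later ones. Everything below is a hypothetical illustration; `CaseRecord`, `ToyEditableModel`, and the key-value "memory" are stand-ins invented for this sketch, not the benchmark's actual schema or the MCKI method.

```python
# Toy sketch of single-insert vs. sequential-insert evaluation.
# All names here are hypothetical illustrations, not the benchmark's real API.
from dataclasses import dataclass, field

@dataclass
class CaseRecord:
    """One image-grounded case: a cultural fact to insert, keyed by (subject, relation)."""
    image_id: str
    fact: tuple  # (subject, relation, object)

@dataclass
class ToyEditableModel:
    """Stand-in for an MLLM whose knowledge can be edited in place."""
    memory: dict = field(default_factory=dict)

    def insert(self, fact):
        subj, rel, obj = fact
        self.memory[(subj, rel)] = obj

    def recall(self, subj, rel):
        return self.memory.get((subj, rel), "<unknown>")

def eval_single_insert(cases):
    """Each case gets a fresh model: insert one fact, then probe it."""
    correct = 0
    for case in cases:
        model = ToyEditableModel()
        model.insert(case.fact)
        subj, rel, obj = case.fact
        correct += (model.recall(subj, rel) == obj)
    return correct / len(cases)

def eval_sequential_insert(cases):
    """One shared model: insert every fact in order, then probe them all."""
    model = ToyEditableModel()
    for case in cases:
        model.insert(case.fact)
    correct = sum(model.recall(c.fact[0], c.fact[1]) == c.fact[2] for c in cases)
    return correct / len(cases)
```

In a real MLLM the sequential setting is the harder one, since successive edits can interfere with each other; the toy key-value memory above has no such interference, which is exactly the gap a benchmark like this is meant to measure.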
Entities
Institutions
- arXiv