DACLR: Dynamic Contrastive Learning Improves Multimodal Evidence Retrieval

other · 2026-05-28

Dynamic Adaptive Contrastive Learning for Evidence Retrieval (DACLR) is an approach to multimodal fact-checking that aims to retrieve evidence that is relevant to a specific claim, not merely similar to it. General-purpose multimodal retrieval techniques often return evidence that is semantically close to a claim yet irrelevant to it. DACLR first uses a Multimodal Large Language Model (MLLM) to convert both claims and multimodal evidence into text, extracting features at the event level. It then applies a two-stage recall-rerank retrieval method. By optimizing a contrastive loss and mining hard negative samples, DACLR is reported to significantly improve the model's ability to perceive events during retrieval. The paper is available on arXiv under ID 2605.27449.
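
The two-stage recall-rerank idea can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes claims and evidence are already embedded as plain vectors, uses cosine similarity for the recall stage, and takes a caller-supplied `rerank_score` function for the second stage. The names `cosine`, `recall_then_rerank`, `recall_k`, and `final_k` are hypothetical.

```python
import math
from typing import Callable, List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recall_then_rerank(
    claim_vec: Sequence[float],
    evidence_vecs: Sequence[Sequence[float]],
    rerank_score: Callable[[int], float],  # assumed: a finer relevance scorer
    recall_k: int = 3,
    final_k: int = 1,
) -> List[int]:
    """Return indices of the final_k evidence items after recall and rerank."""
    # Stage 1 (recall): keep the recall_k items most similar to the claim.
    by_similarity = sorted(
        range(len(evidence_vecs)),
        key=lambda i: cosine(claim_vec, evidence_vecs[i]),
        reverse=True,
    )
    candidates = by_similarity[:recall_k]
    # Stage 2 (rerank): reorder only the recalled candidates by relevance.
    reranked = sorted(candidates, key=rerank_score, reverse=True)
    return reranked[:final_k]
```

The recall stage is cheap and narrows the pool; the rerank stage can then afford a more expensive relevance judgment on the few surviving candidates. An item that scores well under the reranker but is not recalled in stage 1 never reaches stage 2.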

Key facts

DACLR stands for Dynamic Adaptive Contrastive Learning for Evidence Retrieval
DACLR addresses issues in multimodal fact-checking evidence retrieval
Existing methods retrieve evidence that is similar but not relevant to claims
DACLR uses a Multimodal Large Language Model (MLLM) to convert multimodal evidence and claims into text
DACLR extracts features at the event level
DACLR uses a two-stage recall-rerank retrieval method
DACLR optimizes a contrastive loss and mines hard negative samples
The paper is published on arXiv with ID 2605.27449
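
To make the contrastive loss and hard negative mining concrete, here is a minimal sketch. It assumes an InfoNCE-style objective, a commonly used contrastive loss, over dot-product similarities; this summary does not specify DACLR's exact loss or its dynamic adaptive mechanism, so none of that is modeled here. The names `mine_hard_negatives`, `info_nce`, and `temperature` are hypothetical.

```python
import math
from typing import List, Sequence


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot-product similarity between two vectors."""
    return sum(x * y for x, y in zip(a, b))


def mine_hard_negatives(
    claim: Sequence[float], negatives: List[Sequence[float]], k: int
) -> List[Sequence[float]]:
    """Return the k negatives most similar to the claim (the 'hard' ones)."""
    return sorted(negatives, key=lambda n: dot(claim, n), reverse=True)[:k]


def info_nce(
    claim: Sequence[float],
    positive: Sequence[float],
    negatives: List[Sequence[float]],
    temperature: float = 0.1,  # assumed value, chosen for illustration
) -> float:
    """InfoNCE loss: -log softmax of the positive among positive + negatives."""
    pos = dot(claim, positive) / temperature
    logits = [pos] + [dot(claim, n) / temperature for n in negatives]
    # Log-sum-exp with the max subtracted for numerical stability.
    m = max(logits)
    log_denominator = m + math.log(sum(math.exp(s - m) for s in logits))
    return log_denominator - pos
```

Negatives that sit close to the claim in embedding space contribute most to the loss, so training on them pushes the model to separate evidence that is similar but irrelevant from evidence that is actually relevant.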
