ARTFEED — Contemporary Art Intelligence

DACLR: Dynamic Contrastive Learning Improves Multimodal Evidence Retrieval

other · 2026-05-28

Dynamic Adaptive Contrastive Learning for Evidence Retrieval (DACLR) is an approach to multimodal fact-checking that aims to retrieve evidence that is relevant to a specific claim, not merely similar to it. General-purpose multimodal retrieval techniques often return evidence that is semantically close to a claim yet irrelevant to it. DACLR first uses a Multimodal Large Language Model (MLLM) to convert both claims and multimodal evidence into text, and extracts event-level features from the result. It then applies a two-stage recall-rerank retrieval method. By optimizing a contrastive loss and mining hard negative samples, DACLR is reported to improve the model's ability to perceive events during retrieval. The paper is available on arXiv under ID 2605.27449.
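The two-stage recall-rerank idea can be illustrated with a minimal sketch. It is not the paper's implementation: the bag-of-words `embed` function stands in for a learned encoder, `event_overlap` is a hypothetical reranking score, and all function names and parameters here are assumptions made for illustration. The sketch also assumes claims and evidence have already been converted to text.

```python
import math
from collections import Counter


def embed(text):
    """Toy bag-of-words vector; a stand-in for a learned encoder (assumption)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def recall(claim, corpus, k):
    """Stage 1: cheap similarity search that keeps the top-k candidates."""
    q = embed(claim)
    ranked = sorted(corpus, key=lambda doc: cosine(q, embed(doc)), reverse=True)
    return ranked[:k]


def event_overlap(claim, doc):
    """Hypothetical rerank score: fraction of claim tokens found in the document."""
    c = set(claim.lower().split())
    d = set(doc.lower().split())
    return len(c & d) / len(c) if c else 0.0


def rerank(claim, candidates, score_fn):
    """Stage 2: reorder the recalled candidates with a finer-grained score."""
    return sorted(candidates, key=lambda doc: score_fn(claim, doc), reverse=True)


def retrieve(claim, corpus, k=3, top_n=1):
    """Recall k candidates, rerank them, and return the best top_n."""
    return rerank(claim, recall(claim, corpus, k), event_overlap)[:top_n]
```

In this sketch the recall stage narrows the corpus cheaply, and only the surviving candidates are scored by the more selective second stage. In a real system the reranker would typically be a more expensive model than the one used for recall.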

Key facts

  • DACLR stands for Dynamic Adaptive Contrastive Learning for Evidence Retrieval
  • DACLR addresses issues in multimodal fact-checking evidence retrieval
  • Existing methods often retrieve evidence that is semantically similar to a claim but not relevant to it
  • DACLR uses a Multimodal Large Language Model (MLLM) to convert multimodal evidence and claims into text
  • DACLR extracts features at the event level
  • DACLR uses a two-stage recall-rerank retrieval method
  • DACLR optimizes contrastive loss and mines hard negative samples
  • The paper is available on arXiv with ID 2605.27449
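
To make the contrastive loss and hard-negative mining steps more concrete, here is a minimal sketch using the widely used InfoNCE formulation. The summary does not state which loss DACLR actually optimizes or how its negatives are selected, so the loss form, the dot-product similarity, the `temperature` value, and all function names below are assumptions.

```python
import math


def dot(u, v):
    """Dot-product similarity between two dense vectors."""
    return sum(a * b for a, b in zip(u, v))


def mine_hard_negatives(query, negatives, n):
    """Keep the n negatives most similar to the query (the hardest ones)."""
    return sorted(negatives, key=lambda v: dot(query, v), reverse=True)[:n]


def info_nce(query, positive, negatives, temperature=0.1):
    """InfoNCE loss: negative log-softmax of the positive among all candidates."""
    logits = [dot(query, positive) / temperature]
    logits += [dot(query, v) / temperature for v in negatives]
    m = max(logits)  # subtract the max for numerical stability
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum - logits[0]


# Toy vectors: the claim, one relevant evidence item, three irrelevant ones.
claim_vec = [1.0, 0.0]
evidence_vec = [1.0, 0.0]
irrelevant = [[0.9, 0.1], [0.0, 1.0], [-1.0, 0.0]]

hard = mine_hard_negatives(claim_vec, irrelevant, n=1)
loss_hard = info_nce(claim_vec, evidence_vec, hard)
loss_easy = info_nce(claim_vec, evidence_vec, [[-1.0, 0.0]])
```

In this toy setup the mined negative is the one closest to the claim, and it yields a larger loss than an easy negative. That larger loss is why hard negatives give a stronger training signal for separating similar-but-irrelevant evidence from relevant evidence.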

Entities

Institutions

  • arXiv
