ARTFEED — Contemporary Art Intelligence

RADD Framework Decouples Retrieval and Reranking for Multi-Modal Knowledge Graph Completion

other · 2026-04-30

Retrieval-Augmented Discrete Diffusion (RADD) is a new framework for multi-modal knowledge graph completion (MMKGC). Most current MMKGC models rely on a single embedding scorer for both global entity retrieval and the final decision, creating a bottleneck: high-recall global search and precise local disambiguation call for different inductive biases. RADD decouples these roles. A relation-aware multimodal KGE retriever serves as both the global retriever and a distillation teacher, while a conditional discrete denoiser generates entity identities at the shortlist level for reranking. Training combines KGE supervision, denoising cross-entropy, and temperature-scaled distillation. At inference, the Diff-Rerank procedure first builds a top-K shortlist with the retriever and then reranks it with the denoiser, so that recall precedes precision. The framework is described in a paper available on arXiv (2604.25693).
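The two-stage Diff-Rerank pipeline can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: `retriever_score` and `denoiser_score` are hypothetical stand-ins for the relation-aware KGE retriever and the conditional discrete denoiser, each assumed to return a score per candidate entity.

```python
import numpy as np

def diff_rerank(query, retriever_score, denoiser_score, num_entities, k=100):
    """Sketch of Diff-Rerank: global retrieval first, shortlist reranking second.

    retriever_score(query, ids) -> scores from the KGE retriever (fast, high recall)
    denoiser_score(query, ids)  -> scores from the discrete denoiser (precise, run
                                   only on the shortlist)
    Both callables are illustrative assumptions, not the paper's API.
    """
    all_ids = np.arange(num_entities)
    # Stage 1: the retriever scores every entity; keep the top-K (recall first).
    coarse = retriever_score(query, all_ids)
    shortlist = all_ids[np.argsort(-coarse)[:k]]
    # Stage 2: the denoiser rescores only the shortlist (precision second).
    fine = denoiser_score(query, shortlist)
    return shortlist[np.argsort(-fine)]
```

The point of the split is cost as well as bias: the expensive denoiser is only ever evaluated on K candidates, while the cheap retriever bounds what the denoiser can recover, which is why recall must precede precision.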

Key facts

  • RADD stands for Retrieval-Augmented Discrete Diffusion
  • It is designed for multi-modal knowledge graph completion (MMKGC)
  • Most MMKGC models use one embedding scorer for both retrieval and decision making
  • RADD decouples retrieval and reranking into separate components
  • A relation-aware multimodal KGE retriever acts as global retriever and distillation teacher
  • A conditional discrete denoiser performs shortlist-level entity-identity generation for reranking
  • Training uses KGE supervision, denoising cross-entropy, and temperature-scaled distillation
  • Inference uses Diff-Rerank: top-K shortlist then reranking
  • The paper is available on arXiv with ID 2604.25693
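  The three training terms listed above can be written schematically as a single objective. The symbols and weighting below are illustrative assumptions, not notation from the paper:

  $$
  \mathcal{L} = \mathcal{L}_{\mathrm{KGE}} + \mathcal{L}_{\mathrm{CE}} + \lambda\, \mathcal{L}_{\mathrm{KD}},
  \qquad
  \mathcal{L}_{\mathrm{KD}} = T^{2}\, \mathrm{KL}\!\left(\mathrm{softmax}\!\left(\tfrac{s_{\mathrm{teacher}}}{T}\right) \,\Big\|\, \mathrm{softmax}\!\left(\tfrac{s_{\mathrm{student}}}{T}\right)\right)
  $$

  where $\mathcal{L}_{\mathrm{KGE}}$ supervises the retriever, $\mathcal{L}_{\mathrm{CE}}$ is the denoising cross-entropy on the denoiser, and $\mathcal{L}_{\mathrm{KD}}$ is the standard temperature-scaled distillation loss transferring the retriever's (teacher's) score distribution to the denoiser.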

Entities

Institutions

  • arXiv

Sources