RADD Framework Decouples Retrieval and Reranking for Multi-Modal Knowledge Graph Completion
A new approach called Retrieval-Augmented Discrete Diffusion (RADD) has been introduced for multi-modal knowledge graph completion (MMKGC). Most current MMKGC models rely on a single embedding scorer for both exhaustive entity retrieval and final decision making, which creates a bottleneck: global high-recall search and precise local disambiguation call for different inductive biases. RADD separates these roles, pairing a relation-aware multimodal KGE retriever, which serves as both a global retriever and a distillation teacher, with a conditional discrete denoiser that generates entity identities at the shortlist level for reranking. Training combines KGE supervision, denoising cross-entropy, and temperature-scaled distillation. At inference, the Diff-Rerank procedure first builds a top-K shortlist with the retriever and then reranks it with the denoiser, so that recall precedes precision. The framework is described in a paper available on arXiv (2604.25693).
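The retrieve-then-rerank split can be illustrated with a minimal sketch. This is not the paper's implementation: the function names, the use of raw score arrays, and the denoiser scoring callback are all assumptions made for illustration; only the two-stage shape (global top-K retrieval, then shortlist reranking) comes from the description above.

```python
import numpy as np

def diff_rerank(retriever_scores, denoiser_score, k):
    """Hypothetical sketch of a Diff-Rerank-style pipeline.

    retriever_scores: score for every candidate entity (global, cheap)
    denoiser_score:   callable giving a score for one entity id (local, expensive)
    k:                shortlist size
    """
    # Stage 1 (recall): take the K highest-scoring entity ids globally.
    # argpartition avoids a full sort over the entire entity vocabulary.
    shortlist = np.argpartition(-retriever_scores, k)[:k]
    # Stage 2 (precision): rescore only the shortlist with the denoiser
    # and return the ids in descending denoiser-score order.
    return sorted(shortlist, key=lambda e: -denoiser_score(e))
```

The design point the sketch captures is that the expensive scorer is only ever called K times per query, regardless of vocabulary size.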
Key facts
- RADD stands for Retrieval-Augmented Discrete Diffusion
- It is designed for multi-modal knowledge graph completion (MMKGC)
- Most MMKGC models use one embedding scorer for both retrieval and decision making
- RADD decouples retrieval and reranking into separate components
- A relation-aware multimodal KGE retriever acts as global retriever and distillation teacher
- A conditional discrete denoiser performs shortlist-level entity-identity generation for reranking
- Training uses KGE supervision, denoising cross-entropy, and temperature-scaled distillation
- Inference uses Diff-Rerank: top-K shortlist then reranking
- The paper is available on arXiv with ID 2604.25693
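The three training terms listed above can be sketched as a single weighted objective. The exact loss forms and weights are not given in the summary, so everything below is an assumption: a cross-entropy form for the KGE and denoising terms, and the standard temperature-scaled KL formulation (with the usual tau-squared rescaling) for distillation from the retriever-teacher to the denoiser-student.

```python
import numpy as np

def softmax(z):
    z = z - z.max()          # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

def radd_loss(retriever_scores, denoiser_logits, teacher_logits,
              gold, tau=2.0, weights=(1.0, 1.0, 0.5)):
    """Hypothetical sketch of a three-part RADD-style objective.
    Loss forms, tau, and the weighting scheme are illustrative assumptions."""
    # 1) KGE supervision: cross-entropy of retriever scores vs. the gold entity
    l_kge = -np.log(softmax(retriever_scores)[gold])
    # 2) Denoising cross-entropy on the denoiser's entity-identity logits
    l_dn = -np.log(softmax(denoiser_logits)[gold])
    # 3) Temperature-scaled distillation: KL(teacher || student) at temperature
    #    tau, rescaled by tau^2 as in standard knowledge distillation
    p_t = softmax(teacher_logits / tau)
    p_s = softmax(denoiser_logits / tau)
    l_kd = (tau ** 2) * np.sum(p_t * (np.log(p_t) - np.log(p_s)))
    a, b, c = weights
    return a * l_kge + b * l_dn + c * l_kd
```

Note that the distillation term vanishes when student and teacher agree, so at convergence the objective reduces to the two supervised cross-entropy terms.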