AnchorDiff: Masked Diffusion Model for Radiology Report Generation
AnchorDiff is a novel masked-diffusion framework for radiology report generation (RRG) that integrates knowledge-graph-derived clinical anchors into diffusion language modeling. Unlike autoregressive models, which generate text left to right and suffer from sequence bias, AnchorDiff uses bidirectional context and iterative refinement to better ground reports in image-specific evidence. The framework introduces a topology-aware training strategy that uses RadGraph-derived entity hierarchies to designate clinically important tokens as anchors. This approach aims to mitigate the limitations of fixed-order autoregressive decoding and to reduce reliance on high-frequency report templates.
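To make the masking idea concrete, here is a minimal sketch of a forward masking step in which anchor tokens are treated differently from ordinary tokens. The summary above does not say how anchors enter the masking schedule, so the function name `mask_tokens`, the `anchor_scale` parameter, and the rule "anchors are masked with probability min(1, anchor_scale * t)" are all assumptions made for illustration, not the method described in the paper.

```python
import random

MASK = "[MASK]"  # placeholder mask symbol (hypothetical)


def mask_tokens(tokens, anchor_flags, t, anchor_scale=1.5, rng=None):
    """Illustrative forward masking step for a masked-diffusion model.

    Each token is independently replaced by MASK. Ordinary tokens are
    masked with probability t; tokens flagged as anchors are masked with
    probability min(1, anchor_scale * t). The anchor rule is an assumed
    stand-in for a topology-aware schedule, not the paper's definition.
    """
    if len(tokens) != len(anchor_flags):
        raise ValueError("tokens and anchor_flags must have the same length")
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must lie in [0, 1]")
    rng = rng or random.Random()
    masked = []
    for token, is_anchor in zip(tokens, anchor_flags):
        p = min(1.0, anchor_scale * t) if is_anchor else t
        # rng.random() is in [0, 1), so p = 0 never masks and p = 1 always masks.
        masked.append(MASK if rng.random() < p else token)
    return masked


if __name__ == "__main__":
    report = "no focal consolidation or pleural effusion".split()
    # Hypothetical anchor flags, e.g. tokens matched to graph entities.
    anchors = [False, False, True, False, True, True]
    print(mask_tokens(report, anchors, t=0.5, rng=random.Random(0)))
```

Masking anchors more often during training would push the model to reconstruct clinically important tokens from the image and the surrounding context. Whether AnchorDiff does this, or instead reweights the loss or orders the unmasking, is not stated in this summary.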
Key facts
- AnchorDiff is a masked-diffusion framework for RRG
- It integrates knowledge-graph-derived clinical anchors
- Uses bidirectional context and iterative refinement
- Introduces topology-aware training with RadGraph entity hierarchies
- Aims to mitigate sequence bias in autoregressive models
- Published on arXiv with ID 2605.17071
Entities
Institutions
- arXiv