RIHA: Hierarchical Alignment Transformer for Radiology Report Generation
Researchers have introduced RIHA (Report-Image Hierarchical Alignment Transformer), a new end-to-end framework for radiology report generation (RRG). RRG aims to automate the writing of diagnostic reports from medical images, easing radiologists' workload and reducing human error. A central difficulty is achieving fine-grained alignment between complex visual features and the hierarchical structure of long radiology reports. Existing approaches typically treat a report as a flat sequence of tokens and ignore its structured sections and semantic hierarchy, which makes cross-modal alignment harder and less precise. RIHA instead aligns images and reports at three levels (paragraph, sentence, and word) to improve the mapping between the two modalities. The paper is available on arXiv.
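To make the contrast between a flat sequence and a hierarchical report concrete, here is a minimal sketch that splits a report into paragraphs, sentences, and words. It is not taken from the paper: the function name `split_report`, the splitting rules (blank lines between paragraphs, sentence-final punctuation), and the sample report text are all illustrative assumptions.

```python
import re


def split_report(report: str) -> list[list[list[str]]]:
    """Return a nested list: paragraphs -> sentences -> words."""
    hierarchy = []
    # Assumption: paragraphs (report sections) are separated by blank lines.
    for paragraph in re.split(r"\n\s*\n", report.strip()):
        sentences = []
        # Assumption: a sentence ends with '.', '!' or '?' followed by whitespace.
        for sentence in re.split(r"(?<=[.!?])\s+", paragraph.strip()):
            words = re.findall(r"[A-Za-z0-9]+(?:[-'][A-Za-z0-9]+)*", sentence)
            if words:
                sentences.append(words)
        if sentences:
            hierarchy.append(sentences)
    return hierarchy


# Hypothetical two-section report, invented for illustration only.
report = """Findings: The heart size is normal. No pleural effusion is seen.

Impression: No acute cardiopulmonary disease."""

hierarchy = split_report(report)

# The flat view that discards paragraph and sentence boundaries.
flat = [word for paragraph in hierarchy for sentence in paragraph for word in sentence]
```

The nested `hierarchy` keeps the section and sentence boundaries that the `flat` list throws away; those boundaries are what a paragraph-, sentence-, and word-level alignment would need to work with.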
Key facts
- RIHA stands for Report-Image Hierarchical Alignment Transformer.
- It is an end-to-end framework for radiology report generation.
- RRG is the task of automatically generating diagnostic reports from medical images.
- The goal is to alleviate radiologists' workload and reduce human errors.
- A key challenge is fine-grained alignment between visual features and hierarchical report structure.
- Existing methods treat reports as flat sequences, overlooking their structured sections and semantic hierarchies.
- RIHA performs alignment at paragraph, sentence, and word levels.
- The paper is published on arXiv with ID 2604.27559.
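The summary does not say how RIHA computes its alignment, so the following is only a toy sketch of what scoring at three levels could look like. Mean pooling and cosine similarity are stand-ins chosen for simplicity, not the paper's mechanism, and every name here (`mean_pool`, `cosine`, `multilevel_alignment`) and every vector is hypothetical.

```python
import math


def mean_pool(vectors):
    """Average a list of equal-length vectors element-wise."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def multilevel_alignment(image_vec, report_vecs):
    """Score one image vector against a report at word, sentence, and paragraph level.

    report_vecs is nested as paragraphs -> sentences -> word vectors.
    Assumption: sentence and paragraph vectors are built by mean pooling.
    """
    sentence_vecs = [[mean_pool(words) for words in para] for para in report_vecs]
    paragraph_vecs = [mean_pool(sents) for sents in sentence_vecs]

    word_scores = [cosine(image_vec, w) for para in report_vecs for sent in para for w in sent]
    sentence_scores = [cosine(image_vec, s) for para in sentence_vecs for s in para]
    paragraph_scores = [cosine(image_vec, p) for p in paragraph_vecs]

    def average(xs):
        return sum(xs) / len(xs)

    return {
        "word": average(word_scores),
        "sentence": average(sentence_scores),
        "paragraph": average(paragraph_scores),
    }


# Toy 2-D embeddings, invented for illustration.
image_vec = [1.0, 0.0]
report_vecs = [
    [[[1.0, 0.0], [0.0, 1.0]]],  # paragraph 1: one sentence, two words
    [[[1.0, 0.0]]],              # paragraph 2: one sentence, one word
]
scores = multilevel_alignment(image_vec, report_vecs)
```

In this toy case the word-level score differs from the sentence- and paragraph-level scores, which illustrates why a model might want separate signals at each level instead of a single flat one.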
Entities
Institutions
- arXiv