FundusGround: A Benchmark for Interpretable Ophthalmic VQA
FundusGround is a benchmark for clinically interpretable ophthalmic Visual Question Answering (VQA) that grounds answers in spatially localized lesion evidence. Its three-step process yields 10,719 fundus images containing 15,595 annotated lesions, each localized on the Early Treatment Diabetic Retinopathy Study (ETDRS) grid for anatomical consistency. The grid gives a standardized mapping to nine retinal regions. From this structured data, 72,706 questions are generated in four formats, with the aim of making AI applications in ophthalmology more interpretable.
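The mapping from a lesion position to one of the nine ETDRS regions can be illustrated with a short sketch. This is not the benchmark's own code, which the summary does not describe. It assumes the standard ETDRS grid geometry (concentric circles of 1, 3, and 6 mm diameter centred on the fovea, with the two rings split into superior, inferior, nasal, and temporal quadrants), a known fovea position, a known millimetre-per-pixel scale, and the conventional image orientation in which a right eye's nasal side appears on the right of the image. The function name `etdrs_region`, its parameters, and the region labels are hypothetical.

```python
import math

# Standard ETDRS ring radii in millimetres (circle diameters of 1, 3, and 6 mm).
CENTER_R_MM, INNER_R_MM, OUTER_R_MM = 0.5, 1.5, 3.0


def etdrs_region(x, y, fovea_x, fovea_y, mm_per_pixel, eye="right"):
    """Return a hypothetical ETDRS region label for a pixel, or None if off-grid.

    Assumes image coordinates with y growing downward and the conventional
    orientation where a right eye's nasal side is on the image's right.
    """
    dx = (x - fovea_x) * mm_per_pixel
    dy = (fovea_y - y) * mm_per_pixel  # flip so that "up" in the image is positive
    r = math.hypot(dx, dy)

    if r <= CENTER_R_MM:
        return "central"
    if r > OUTER_R_MM:
        return None  # outside the 6 mm grid

    ring = "inner" if r <= INNER_R_MM else "outer"
    angle = math.degrees(math.atan2(dy, dx)) % 360  # 0 = image right, 90 = up

    if 45 <= angle < 135:
        quadrant = "superior"
    elif 225 <= angle < 315:
        quadrant = "inferior"
    else:
        # Horizontal quadrants: which side is nasal depends on eye laterality.
        points_right = angle < 45 or angle >= 315
        nasal_is_right = (eye == "right")
        quadrant = "nasal" if points_right == nasal_is_right else "temporal"

    return f"{ring}_{quadrant}"


if __name__ == "__main__":
    # Fovea at (500, 500), 0.01 mm per pixel: a lesion 100 px to the right is 1 mm away.
    print(etdrs_region(600, 500, 500, 500, 0.01, eye="right"))
```

One central region plus four quadrants in each of the two rings gives the nine regions mentioned above. In practice the fovea location and pixel scale would have to come from the annotations or from a separate detection step.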
Key facts
- FundusGround is a benchmark for clinically interpretable ophthalmic VQA
- It uses spatially-grounded lesion evidence
- 10,719 fundus images with 15,595 annotated lesions
- Lesions are localized using the ETDRS grid
- Lesion positions map to nine retinal regions
- 72,706 questions generated in four formats
- Focuses on interpretability over answer accuracy
- Aims to support clinical decision-making
Entities
Institutions
- arXiv