ARTFEED — Contemporary Art Intelligence

LUNGUAGE: Benchmark for Structured Chest X-ray Interpretation

other · 2026-04-30

Researchers have released LUNGUAGE, a benchmark dataset for structured radiology report generation that supports both single-report evaluation and longitudinal, patient-level assessment across multiple studies. The dataset contains 1,473 chest X-ray reports annotated and reviewed by experts, 186 of which carry longitudinal annotations capturing disease progression and the intervals between studies. A two-stage structuring framework converts generated reports into fine-grained, schema-aligned structured reports. The researchers also introduce LUNGUAGESCORE, an interpretable evaluation metric.

Key facts

  • LUNGUAGE is a benchmark dataset for structured radiology report generation.
  • It supports single-report evaluation and longitudinal patient-level assessment.
  • Contains 1,473 annotated chest X-ray reports reviewed by experts.
  • 186 reports carry longitudinal annotations covering disease progression and intervals between studies.
  • A two-stage structuring framework transforms reports into structured formats.
  • LUNGUAGESCORE is an interpretable metric for evaluation.
  • The dataset addresses limitations of existing coarse metrics.
  • Focuses on fine-grained clinical semantics and temporal dependencies.
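
To make the single-report and patient-level views above concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration only: the summary does not describe the LUNGUAGE schema, so the `Finding` and `StructuredReport` fields are hypothetical, and `finding_f1` is a generic set-overlap score standing in for a fine-grained comparison. It is not LUNGUAGESCORE.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date
from typing import Dict, List


@dataclass(frozen=True)
class Finding:
    # Hypothetical fields; the real LUNGUAGE schema is not given in this summary.
    entity: str    # e.g. "pleural effusion"
    location: str  # e.g. "left"
    status: str    # e.g. "present" or "absent"


@dataclass
class StructuredReport:
    patient_id: str
    study_date: date
    findings: List[Finding]


def patient_timelines(reports: List[StructuredReport]) -> Dict[str, List[StructuredReport]]:
    """Group reports by patient and order each group by study date."""
    timelines: Dict[str, List[StructuredReport]] = defaultdict(list)
    for report in reports:
        timelines[report.patient_id].append(report)
    for studies in timelines.values():
        studies.sort(key=lambda r: r.study_date)
    return dict(timelines)


def study_intervals(studies: List[StructuredReport]) -> List[int]:
    """Days between consecutive studies in a date-ordered timeline."""
    return [(b.study_date - a.study_date).days for a, b in zip(studies, studies[1:])]


def finding_f1(predicted: List[Finding], reference: List[Finding]) -> float:
    """Exact-match F1 over finding sets (an illustrative stand-in, not LUNGUAGESCORE)."""
    pred, ref = set(predicted), set(reference)
    if not pred and not ref:
        return 1.0
    true_positives = len(pred & ref)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(pred)
    recall = true_positives / len(ref)
    return 2 * precision * recall / (precision + recall)
```

Scoring individual findings rather than whole reports is what lets a fine-grained metric show which clinical statements matched, in contrast to a single coarse score. Grouping by patient gives the ordered timeline that a longitudinal evaluation would need.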

Entities

Institutions

  • arXiv

Sources