ARTFEED — Contemporary Art Intelligence

Agentic AI Matches Expert Consensus on Myeloma Treatment Decisions

ai-technology · 2026-04-29

A recent study posted to arXiv (2604.24473) examines whether LLM-based systems can reason over complex longitudinal medical records in a way that aligns with expert clinical judgement in multiple myeloma. The retrospective study analysed data from 811 patients at a tertiary centre spanning 2001 to 2026, comprising 44,962 documents and 1,334,677 laboratory values, with external validation on MIMIC-IV. It compared an agentic reasoning system against single-pass RAG, iterative RAG, and full-context input across 469 patient-question pairs drawn from 48 templates at three complexity levels. Reference labels came from double annotation by four oncologists, with final adjudication by a senior haematologist. Iterative RAG and full-context input converged on a shared reasoning path approaching expert agreement, and the study suggests that agentic reasoning over longitudinal records can support treatment decisions in complex diseases such as multiple myeloma.
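For readers unfamiliar with the retrieval strategies being compared, the difference between single-pass and iterative RAG can be illustrated with a toy sketch. This is not the study's implementation: the notes below are invented placeholders, simple keyword overlap stands in for whatever retriever the authors used, the function names are hypothetical, and no language model is called. The agentic system itself is not sketched here.

```python
from collections import Counter

def tokenize(text):
    """Lowercase a string and strip basic punctuation from each word."""
    words = (w.strip(".,;:()").lower() for w in text.split())
    return [w for w in words if w]

def score(query_terms, doc):
    """Count how many tokens in the document match a query term."""
    counts = Counter(tokenize(doc))
    return sum(counts[t] for t in query_terms)

def retrieve(query_terms, docs, k, exclude=()):
    """Return indices of up to k best-matching documents with a nonzero score."""
    candidates = [i for i in range(len(docs)) if i not in exclude]
    ranked = sorted(candidates, key=lambda i: score(query_terms, docs[i]), reverse=True)
    return [i for i in ranked[:k] if score(query_terms, docs[i]) > 0]

def single_pass(question, docs, k=3):
    """Single-pass retrieval: one lookup using only the question's own terms."""
    return retrieve(set(tokenize(question)), docs, k)

def iterative(question, docs, k=1, rounds=3):
    """Iterative retrieval: each round expands the query with terms from earlier hits."""
    terms = set(tokenize(question))
    seen = []
    for _ in range(rounds):
        hits = retrieve(terms, docs, k, exclude=seen)
        if not hits:
            break
        seen.extend(hits)
        for i in hits:
            terms.update(tokenize(docs[i]))
    return seen

# Invented toy notes (assumption: not drawn from the study's data).
docs = [
    "2019 note: patient started regimen alpha",
    "2021 note: regimen alpha stopped after relapse",
    "2022 lab: marker rising after relapse",
    "2020 note: routine dental visit",
]
question = "which regimen was started"

print(single_pass(question, docs, k=3))          # indices found in one lookup
print(iterative(question, docs, k=1, rounds=3))  # indices found across rounds
```

In this toy setup the iterative variant reaches the 2022 lab note, which shares no words with the question and is only found by following terms from earlier hits. That is the kind of multi-step linkage across a long record that single-pass retrieval can miss.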

Key facts

  • Study from arXiv (2604.24473) evaluates LLM-based clinical reasoning in multiple myeloma.
  • Retrospective analysis of 811 patients from a tertiary centre (2001–2026).
  • Dataset includes 44,962 documents and 1,334,677 laboratory values.
  • External validation performed on the MIMIC-IV dataset.
  • Agentic reasoning system compared against single-pass RAG, iterative RAG, and full-context input.
  • 469 patient-question pairs from 48 templates at three complexity levels.
  • Reference labels from double annotation by four oncologists with senior haematologist adjudication.
  • Iterative RAG and full-context input converged on a shared reasoning path approaching expert agreement.

Entities

Institutions

  • arXiv

Datasets

  • MIMIC-IV

Sources