ARTFEED — Contemporary Art Intelligence

MedMemoryBench: Benchmarking AI Memory for Personalized Healthcare Agents

ai-technology · 2026-05-13

MedMemoryBench is a benchmark for assessing the memory of AI agents built for personalized healthcare. It addresses a gap in existing benchmarks, which focus mainly on everyday open-domain conversations rather than critical medical scenarios. Motivated by the production requirements of a health management agent serving tens of millions of users, MedMemoryBench uses a human-AI collaborative pipeline to synthesize realistic long-term medical trajectories from clinically grounded synthetic patient archetypes. The dataset comprises approximately 2,000 sessions and 16,000 interaction turns, or roughly eight turns per session on average. It also introduces a streaming 'evaluate-while-constructing' protocol that assesses memory while the trajectory is being built, in contrast to conventional static evaluation.
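To make the streaming idea concrete, here is a minimal Python sketch of an evaluate-while-constructing loop. It is an illustration only, not the benchmark's actual code: the `Session` schema, the toy `KeyValueMemory`, the probe format, and the function name `evaluate_while_constructing` are all assumptions introduced for this example.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Session:
    """One session in a synthetic patient trajectory (hypothetical schema)."""
    turns: list[str]
    # Probes asked right after this session: (fact key, expected answer).
    probes: list[tuple[str, str]] = field(default_factory=list)


class KeyValueMemory:
    """Toy stand-in for an agent memory: stores 'key=value' turns, last write wins."""

    def __init__(self) -> None:
        self.store: dict[str, str] = {}

    def ingest(self, turn: str) -> None:
        if "=" in turn:
            key, value = turn.split("=", 1)
            self.store[key.strip()] = value.strip()

    def answer(self, key: str) -> str | None:
        return self.store.get(key)


def evaluate_while_constructing(memory: KeyValueMemory,
                                trajectory: list[Session]) -> list[float | None]:
    """Ingest sessions in order and score each probe as soon as it appears.

    Returns the cumulative accuracy after every session, so memory quality
    can be tracked as the trajectory grows instead of only at the end.
    """
    running: list[float | None] = []
    correct = total = 0
    for session in trajectory:
        for turn in session.turns:
            memory.ingest(turn)
        for key, expected in session.probes:
            total += 1
            correct += memory.answer(key) == expected
        running.append(correct / total if total else None)
    return running


# Made-up three-session trajectory, for illustration only.
trajectory = [
    Session(["medication=metformin 500 mg", "allergy=penicillin"],
            [("allergy", "penicillin")]),
    Session(["medication=metformin 1000 mg"],  # dosage update overwrites the old value
            [("medication", "metformin 1000 mg")]),
    Session(["How can I sleep better?"],
            [("allergy", "penicillin"), ("blood_type", "O+")]),  # blood type never stated
]

scores = evaluate_while_constructing(KeyValueMemory(), trajectory)
print(scores)
```

A static evaluation would report only the last value in `scores`. The per-session list shows where along the trajectory accuracy changes.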

Key facts

  • MedMemoryBench benchmarks agent memory in personalized healthcare
  • Existing benchmarks focus on daily open-domain conversations
  • Motivated by production requirements of a health management agent with tens of millions of users
  • Uses a human-AI collaborative pipeline to synthesize medical trajectories
  • Based on clinically grounded synthetic patient archetypes
  • Dataset includes approximately 2,000 sessions and 16,000 interaction turns
  • Introduces an 'evaluate-while-constructing' streaming assessment protocol
  • Published on arXiv with ID 2605.11814

Entities

Institutions

  • arXiv

Sources