ARTFEED — Contemporary Art Intelligence

MEMAUDIT: A New Protocol for Evaluating LLM Memory Writing

ai-technology · 2026-05-06

Researchers have introduced MEMAUDIT, an exact package-oracle evaluation protocol for budgeted long-term memory writing in LLM agents. The protocol fixes an experience stream, candidate memory representations, storage costs, semantic evidence units, future-query requirements, and a budget. Doing so turns write-time memory selection into a finite, auditable optimization problem with a certified denominator. Using a concave-over-modular semantic coverage objective under a storage constraint and a one-representation-per-experience constraint, it computes exact package optima through branch-and-bound with MILP certification. The protocol addresses a limitation of existing evaluations, which usually measure final question-answering accuracy and so entangle memory writing with retrieval, prompting, and reader reasoning. It has been tested on controlled exact packages, validity-heavy stress tests, and human-au.
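For readers unfamiliar with the term, a concave-over-modular objective under these two constraints can be written in the generic form below. This is a sketch of the standard shape of such problems, not the paper's own notation: the symbols $x_{e,r}$ (pick representation $r$ of experience $e$), $w_{q,e,r}$ (evidence weight toward query $q$), $c_{e,r}$ (storage cost), $B$ (budget), and $\phi_q$ (a concave, nondecreasing function) are all assumptions introduced here for illustration.

```latex
\begin{aligned}
\max_{x \in \{0,1\}^{n}} \quad
  & \sum_{q \in Q} \phi_q\!\Big( \sum_{e \in E} \sum_{r \in R_e} w_{q,e,r}\, x_{e,r} \Big) \\
\text{s.t.} \quad
  & \sum_{e \in E} \sum_{r \in R_e} c_{e,r}\, x_{e,r} \le B
  && \text{(storage budget)} \\
  & \sum_{r \in R_e} x_{e,r} \le 1 \quad \forall e \in E
  && \text{(one representation per experience)}
\end{aligned}
```

The inner double sum is modular (additive) in the selected items, and wrapping it in a concave $\phi_q$ gives diminishing returns for piling more evidence onto an already covered query.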

Key facts

  • MEMAUDIT is an exact package-oracle evaluation protocol for budgeted long-term LLM memory writing.
  • It fixes an experience stream, candidate memory representations, storage costs, semantic evidence units, future-query requirements, and a budget.
  • The protocol turns write-time memory selection into a finite auditable optimization problem with a certified denominator.
  • It uses a concave-over-modular semantic coverage objective under storage and one-representation-per-experience constraints.
  • Exact package optima are computed using branch-and-bound with MILP certification.
  • Existing evaluations usually measure final question-answering accuracy, which entangles memory writing with retrieval, prompting, and reader reasoning.
  • The protocol has been tested across controlled exact packages, validity-heavy stress tests, and human-au.
  • The paper is available on arXiv under ID 2605.02199.
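The facts above can be made concrete with a toy example. The sketch below is not the paper's implementation: the instance data, the function names (`coverage`, `exact_optimum`, `audit_ratio`), and the choice of `min(cap, x)` as the concave function are all hypothetical, and it uses a plain depth-first branch-and-bound without the MILP certification step the protocol describes. It only illustrates how an exact optimum can serve as the denominator when scoring a memory writer.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical toy types: a candidate is (storage_cost, {query: evidence_weight}).
Candidate = Tuple[int, Dict[str, float]]
Stream = List[List[Candidate]]      # one list of candidates per experience
Picks = List[Optional[int]]         # chosen candidate index, or None to skip


def coverage(chosen: List[Candidate], caps: Dict[str, float]) -> float:
    """Concave-over-modular score: per query, min(cap, summed evidence weight)."""
    total = 0.0
    for query, cap in caps.items():
        modular = sum(weights.get(query, 0.0) for _, weights in chosen)
        total += min(cap, modular)  # assumed concave function: saturating cap
    return total


def exact_optimum(stream: Stream, caps: Dict[str, float], budget: int) -> Tuple[float, Picks]:
    """Depth-first branch-and-bound: skip or pick one candidate per experience."""
    n = len(stream)
    # Optimistic bound: best standalone score per remaining experience.
    # This is valid because the score is monotone submodular with score([]) == 0.
    best_single = [max((coverage([c], caps) for c in cands), default=0.0) for cands in stream]
    suffix = [0.0] * (n + 1)
    for i in range(n - 1, -1, -1):
        suffix[i] = suffix[i + 1] + best_single[i]

    best_value = 0.0
    best_picks: Picks = [None] * n

    def dfs(i: int, chosen: List[Candidate], picks: Picks, spent: int) -> None:
        nonlocal best_value, best_picks
        value = coverage(chosen, caps)
        if value > best_value:
            best_value, best_picks = value, picks + [None] * (n - i)
        if i == n or value + suffix[i] <= best_value:
            return  # leaf reached, or this branch cannot beat the incumbent
        for r, cand in enumerate(stream[i]):
            if spent + cand[0] <= budget:
                dfs(i + 1, chosen + [cand], picks + [r], spent + cand[0])
        dfs(i + 1, chosen, picks + [None], spent)

    dfs(0, [], [], 0)
    return best_value, best_picks


def audit_ratio(picks: Picks, stream: Stream, caps: Dict[str, float], budget: int) -> float:
    """Score a writer's picks as a fraction of the exact optimum."""
    chosen = [stream[i][r] for i, r in enumerate(picks) if r is not None]
    if sum(cost for cost, _ in chosen) > budget:
        raise ValueError("writer exceeded the storage budget")
    optimum, _ = exact_optimum(stream, caps, budget)
    return coverage(chosen, caps) / optimum if optimum > 0 else 1.0


# Made-up instance: three experiences, three future queries, budget of 5 units.
stream: Stream = [
    [(3, {"q1": 2.0, "q2": 1.0}), (1, {"q1": 1.0})],  # detailed vs. compact form
    [(2, {"q2": 2.0}), (1, {"q2": 1.0})],
    [(2, {"q1": 1.0, "q3": 1.0})],
]
caps = {"q1": 2.0, "q2": 2.0, "q3": 1.0}

print(exact_optimum(stream, caps, 5))               # (5.0, [1, 0, 0])
print(audit_ratio([0, 0, None], stream, caps, 5))   # 0.8
```

In this toy instance, a writer that stores the two most detailed representations spends the whole budget yet reaches only 80% of the exact optimum, because the compact form of the first experience leaves room for the third. Scoring against the optimum isolates the write decision from retrieval and answering, which is the separation the protocol aims for.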

Entities

Institutions

  • arXiv

Sources