ARTFEED — Contemporary Art Intelligence

DeferMem: Reinforcement Learning for Long-Term Memory QA

ai-technology · 2026-05-23

DeferMem is a long-term memory framework for LLM agents that splits memory processing into two stages: high-recall candidate retrieval and query-conditioned evidence distillation. A lightweight segment-link structure organizes the raw conversational history, and broad candidates are retrieved from it at query time. A memory distiller, trained with the reinforcement learning algorithm DistillPO, then reduces these high-recall but noisy candidates to query-specific evidence. The design targets two problems in long histories: relevant evidence is scattered, and it is surrounded by irrelevant content. It improves answer accuracy without requiring memory to be pre-processed before the query is known.

Key facts

  • DeferMem is a long-term memory framework for LLM agents.
  • It decouples memory processing into high-recall candidate retrieval and query-conditioned evidence distillation.
  • It uses a lightweight segment-link structure to organize raw conversational history.
  • It retrieves broad candidates at query time.
  • A memory distiller is trained with DistillPO, a reinforcement learning algorithm.
  • The trained distiller reduces high-recall but noisy candidates to query-specific evidence.
  • The framework addresses evidence scattered across long conversational histories.
  • The framework improves answer accuracy without pre-processing memory before queries are known.
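
The two-stage flow listed above can be illustrated with a toy sketch. Everything in it is an assumption made for illustration, not DeferMem's actual implementation: the `Segment` class, the neighbour-only link scheme, the token-overlap retrieval, and the `distill` function are all hypothetical. In particular, `distill` is a simple keyword filter standing in for the learned, DistillPO-trained distiller, whose training details are not described here.

```python
from dataclasses import dataclass, field

# Hypothetical stopword list so that function words do not count as matches.
STOPWORDS = {"a", "i", "is", "my", "the", "to", "in", "at", "by",
             "what", "was", "that", "she"}


@dataclass
class Segment:
    """A run of consecutive raw turns; links point to neighbouring segments."""
    seg_id: int
    turns: list[str]
    links: list[int] = field(default_factory=list)


def tokens(text: str) -> set[str]:
    """Lowercased content words with surrounding punctuation stripped."""
    return {w.strip(".,?!").lower() for w in text.split()} - STOPWORDS


def build_segments(history: list[str], size: int = 2) -> list[Segment]:
    """Chunk raw history and link each segment to its neighbours (assumed scheme)."""
    segs = [Segment(i, history[start:start + size])
            for i, start in enumerate(range(0, len(history), size))]
    for s in segs:
        s.links = [j for j in (s.seg_id - 1, s.seg_id + 1) if 0 <= j < len(segs)]
    return segs


def retrieve_candidates(segs: list[Segment], query: str) -> list[Segment]:
    """High-recall stage: matching segments plus every segment they link to."""
    q = tokens(query)
    hits = {s.seg_id for s in segs if any(q & tokens(t) for t in s.turns)}
    expanded = hits | {j for i in hits for j in segs[i].links}
    return [segs[i] for i in sorted(expanded)]


def distill(candidates: list[Segment], query: str, min_overlap: int = 1) -> list[str]:
    """Stand-in for the learned distiller: keep only turns that overlap the query."""
    q = tokens(query)
    return [t for s in candidates for t in s.turns
            if len(q & tokens(t)) >= min_overlap]


history = [
    "I adopted a dog named Biscuit last spring.",
    "The weather was awful that week.",
    "My sister moved to Lisbon in June.",
    "She works at a museum there.",
    "Biscuit hates the vet, by the way.",
    "I need to renew my passport soon.",
]
query = "What is my dog called?"

segs = build_segments(history)                # no query needed at write time
candidates = retrieve_candidates(segs, query) # broad and noisy
evidence = distill(candidates, query)         # narrow and query-specific
print(len(candidates), "candidate segments ->", evidence)
```

In this toy run the retrieval stage returns two segments (four turns, three of them irrelevant), and the filter keeps only the turn that mentions the dog. The history is stored unprocessed, and all query-dependent work happens after the question arrives, which is the property the summary emphasizes.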
