ARTFEED — Contemporary Art Intelligence

MemRouter: Embedding-Based Memory Management for Long-Term Conversational Agents

ai-technology · 2026-05-04

A new system called MemRouter decouples memory admission from autoregressive LLM generation in long-term conversational agents. Instead of decoding a memory-management decision at every turn, it uses an embedding-based routing policy to decide which conversation turns to store in external memory. MemRouter encodes each turn together with its recent context, projects the embeddings through a frozen LLM backbone, and predicts whether to store the turn using lightweight classification heads; only 12 million parameters are trained. In a controlled comparison on the LoCoMo benchmark, with an identical retrieval pipeline, answer prompts, and QA backbone (Qwen2.5-7B), MemRouter outperformed an LLM-based memory manager on every question category, reaching an overall F1 score of 52.0 versus 45.6 with non-overlapping 95% confidence intervals. It also reduced p50 (median) memory-management latency.
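The write-side routing idea described above can be illustrated with a toy sketch. This is not MemRouter's code: the hash-based `embed` function, the random-projection `FrozenBackbone`, the single logistic `StoreHead`, and all dimensions and thresholds are hypothetical stand-ins chosen so the example runs with the Python standard library alone. It only shows the general shape of the approach: encode a turn with recent context, pass it through fixed weights, and train a small head that outputs a store/skip decision without any text generation.

```python
import hashlib
import math
import random
from typing import List

EMBED_DIM = 16   # hypothetical sizes, chosen only to keep the toy small
HIDDEN_DIM = 8


def embed(text: str) -> List[float]:
    """Stand-in encoder (assumption): hashed bag-of-words, L2-normalised."""
    vec = [0.0] * EMBED_DIM
    for token in text.lower().split():
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        vec[digest[0] % EMBED_DIM] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


class FrozenBackbone:
    """Fixed random projection standing in for a frozen LLM backbone."""

    def __init__(self, seed: int = 0) -> None:
        rng = random.Random(seed)
        self.weights = [[rng.gauss(0.0, 1.0) for _ in range(EMBED_DIM)]
                        for _ in range(HIDDEN_DIM)]

    def forward(self, x: List[float]) -> List[float]:
        # These weights are never updated anywhere in this sketch.
        return [math.tanh(sum(w * v for w, v in zip(row, x)))
                for row in self.weights]


class StoreHead:
    """Small logistic head: the only trainable parameters in the sketch."""

    def __init__(self) -> None:
        self.w = [0.0] * HIDDEN_DIM
        self.b = 0.0

    def prob(self, h: List[float]) -> float:
        z = sum(w * v for w, v in zip(self.w, h)) + self.b
        return 1.0 / (1.0 + math.exp(-z))

    def sgd_step(self, h: List[float], label: float, lr: float = 0.5) -> None:
        # One gradient step on the binary cross-entropy loss.
        err = self.prob(h) - label
        self.w = [w - lr * err * v for w, v in zip(self.w, h)]
        self.b -= lr * err


def route(turn: str, recent: List[str], backbone: FrozenBackbone,
          head: StoreHead, threshold: float = 0.5) -> bool:
    """Return True if the turn should be written to external memory."""
    context = " ".join(recent + [turn])
    return head.prob(backbone.forward(embed(context))) >= threshold
```

In this sketch the admission decision is a single forward pass plus a threshold, so its cost does not depend on decoding tokens. Training would call `sgd_step` on labelled turns while leaving `FrozenBackbone.weights` untouched.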

Key facts

  • MemRouter is a write-side memory router for long-term conversational agents.
  • It decouples memory admission from autoregressive LLM generation.
  • It uses an embedding-based routing policy instead of per-turn decoding.
  • Only 12 million parameters are trained.
  • Tested on the LoCoMo benchmark with Qwen2.5-7B as the QA backbone.
  • Overall F1 score: 52.0 for MemRouter vs 45.6 for the LLM-based memory manager.
  • The 95% confidence intervals of the two systems do not overlap.
  • Memory-management p50 (median) latency is reduced.
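The comparison above relies on two summary statistics: confidence intervals around a mean score and a p50 (median) latency. The summary does not say how the intervals were computed, so the sketch below assumes a percentile bootstrap over per-question scores; the function names and defaults are hypothetical and no figures from the evaluation are reproduced.

```python
import random
import statistics
from typing import List, Sequence, Tuple


def bootstrap_ci(scores: Sequence[float], n_resamples: int = 2000,
                 alpha: float = 0.05, seed: int = 0) -> Tuple[float, float]:
    """Percentile-bootstrap confidence interval for the mean of `scores`."""
    rng = random.Random(seed)
    n = len(scores)
    # Resample with replacement and record the mean of each resample.
    means: List[float] = sorted(
        statistics.fmean(rng.choices(scores, k=n)) for _ in range(n_resamples)
    )
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


def intervals_overlap(a: Tuple[float, float], b: Tuple[float, float]) -> bool:
    """True if the closed intervals a and b share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


def p50(latencies_ms: Sequence[float]) -> float:
    """p50 latency is the median of the observed latencies."""
    return statistics.median(latencies_ms)
```

Given two lists of per-question F1 scores, one per system, calling `bootstrap_ci` on each and passing the results to `intervals_overlap` checks whether the intervals are disjoint. Disjoint 95% intervals are a conservative sign that the gap is unlikely to be resampling noise.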

Entities

Institutions

  • arXiv
