RecMem: Efficient Memory Consolidation for Long-Running LLM Agents
RecMem is a memory system for long-running LLM agents that reduces token consumption by changing when memory consolidation occurs. Because LLMs have limited context windows, such agents organize user-agent interactions into external memory that can be retrieved later. Existing systems consolidate eagerly, invoking an LLM for every incoming interaction, which drives up token costs. RecMem instead stores incoming interactions in a subconscious layer, indexed by lightweight embeddings for retrieval, and invokes an LLM only when it detects sustained recurrence of semantically similar interactions. Extraction therefore runs only on semantically rich clusters, which the authors report improves efficiency without sacrificing accuracy.
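The recurrence trigger described above can be illustrated with a minimal sketch. This is not the paper's implementation: the class name RecurrenceBuffer, the parameters sim_threshold and min_recurrence, the running-mean centroid clustering, and the toy keyword embedding are all assumptions made for illustration, and the consolidate callback stands in for a real LLM call.

```python
import math
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

# Hypothetical toy vocabulary; a real system would use a learned embedding model.
VOCAB = ["coffee", "flight", "invoice"]


def toy_embed(text: str) -> List[float]:
    """Count vocabulary words in the text (stand-in for a lightweight embedding)."""
    words = text.lower().split()
    return [float(words.count(w)) for w in VOCAB]


def cosine(a: List[float], b: List[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


@dataclass
class Cluster:
    centroid: List[float]
    count: int = 0                                     # interactions ever assigned
    pending: List[str] = field(default_factory=list)   # not yet consolidated


class RecurrenceBuffer:
    """Assumed design: buffer interactions cheaply, call the LLM only on recurrence."""

    def __init__(self, embed: Callable[[str], List[float]],
                 consolidate: Callable[[List[str]], str],
                 sim_threshold: float = 0.8, min_recurrence: int = 3):
        self.embed = embed
        self.consolidate = consolidate          # stand-in for the LLM extraction step
        self.sim_threshold = sim_threshold
        self.min_recurrence = min_recurrence
        self.clusters: List[Cluster] = []
        self.raw: List[Tuple[List[float], str]] = []   # the "subconscious" store
        self.memories: List[str] = []
        self.llm_calls = 0

    def add(self, text: str) -> None:
        vec = self.embed(text)
        self.raw.append((vec, text))
        best = max(self.clusters, key=lambda c: cosine(c.centroid, vec), default=None)
        if best is None or cosine(best.centroid, vec) < self.sim_threshold:
            best = Cluster(centroid=list(vec))
            self.clusters.append(best)
        else:
            # Running mean keeps the centroid representative of the cluster.
            n = best.count
            best.centroid = [(c * n + v) / (n + 1) for c, v in zip(best.centroid, vec)]
        best.count += 1
        best.pending.append(text)
        # Sustained recurrence: enough similar interactions have accumulated.
        if len(best.pending) >= self.min_recurrence:
            self.memories.append(self.consolidate(best.pending))
            self.llm_calls += 1
            best.pending = []

    def retrieve(self, query: str, k: int = 3) -> List[str]:
        """Embedding-only retrieval over raw interactions; no LLM involved."""
        q = self.embed(query)
        ranked = sorted(self.raw, key=lambda item: cosine(item[0], q), reverse=True)
        return [text for _, text in ranked[:k]]


def fake_llm(texts: List[str]) -> str:
    """Placeholder for an LLM summarization call."""
    return f"summary of {len(texts)} interactions"


buf = RecurrenceBuffer(toy_embed, fake_llm)
for msg in ["coffee order please", "another coffee", "book flight", "coffee again"]:
    buf.add(msg)
```

In this sketch, four interactions lead to a single consolidation call, whereas an eager design would invoke the LLM four times. The one-off "book flight" interaction remains retrievable through its embedding but never reaches the LLM unless similar interactions recur.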
Key facts
- RecMem stands for Recurrence-based Memory Consolidation.
- It targets long-running LLM agents.
- Existing memory systems use eager consolidation, invoking LLMs for every interaction.
- RecMem uses a subconscious memory layer with lightweight embeddings.
- LLMs are invoked only when sustained recurrence of semantically similar interactions is observed.
- Recurrence-based consolidation extracts episodic and semantic memory.
- The approach reduces token consumption relative to eager consolidation.
- The paper is available on arXiv (ID 2605.16045).
Entities
Institutions
- arXiv