New Framework Models LLM Memory as Markov Matrix for Knowledge Expansion
A new research paper proposes a framework that models autoregressive language generation in large language models as a Markov process over tokens, with model memory represented by a Markov transition matrix. The framework targets catastrophic forgetting during continual knowledge incorporation: new knowledge is added by extending the state space, while existing transitions are preserved so that previously learned information is retained. The paper argues that large-scale weight updates may be unnecessary for acquiring small amounts of new knowledge, positioning the approach as a more sample-efficient alternative to parameter-update algorithms. The work is published on arXiv under identifier 2605.04308.
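To make the framing concrete, here is a minimal sketch (assuming Python with NumPy; the toy vocabulary, the matrix construction, and the `generate` helper are illustrative, not from the paper) of autoregressive generation viewed as a random walk on a Markov chain whose transition matrix plays the role of model memory:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy vocabulary: each token doubles as a Markov state. Note this collapses
# the state to the last token only; a faithful formalization of an LLM would
# treat longer contexts as states.
vocab = ["the", "cat", "sat", "<eos>"]
V = len(vocab)

# Row-stochastic transition matrix: P[i, j] = Pr(next token j | current token i).
P = rng.random((V, V))
P /= P.sum(axis=1, keepdims=True)

def generate(P, start, max_len=10, eos=V - 1):
    """Autoregressive generation as a random walk on the token chain."""
    seq = [start]
    while len(seq) < max_len and seq[-1] != eos:
        seq.append(rng.choice(V, p=P[seq[-1]]))
    return [vocab[i] for i in seq]

print(generate(P, start=0))
```

Under this view, "memory" is literally the set of learned rows: each row of P is a conditional next-token distribution.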
Key facts
- Paper proposes modeling LLM memory as a Markov transition matrix.
- Addresses catastrophic forgetting in continual knowledge incorporation.
- New knowledge corresponds to extending the state space (see the sketch after this list).
- Existing transitions are preserved to retain learned knowledge.
- Argues large-scale weight updates may be unnecessary for small knowledge additions.
- Published on arXiv with ID 2605.04308.
- Framework is presented as principled and sample-efficient relative to parameter-update algorithms.
- Targets long-term evolution of large language models.
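In this matrix view, incorporating a small amount of new knowledge grows the matrix rather than rewriting it. Below is a minimal sketch of such a state-space extension (again illustrative Python/NumPy; `extend_state_space` and the choice of `new_row` are assumptions for illustration, not the paper's construction):

```python
import numpy as np

def extend_state_space(P, new_row):
    """Grow a row-stochastic chain by one state without touching old transitions.

    Illustrative only: original rows are copied verbatim, so previously
    learned transitions are preserved exactly; only the new state's outgoing
    distribution must be learned. (Making the new state reachable from old
    states would require rerouting some probability mass, which this sketch
    deliberately omits.)
    """
    V = P.shape[0]
    P_new = np.zeros((V + 1, V + 1))
    P_new[:V, :V] = P          # existing transitions kept as-is: no forgetting
    P_new[V, :] = new_row      # new knowledge lives in the added row
    return P_new

# Toy 3-state chain.
rng = np.random.default_rng(0)
P = rng.random((3, 3))
P /= P.sum(axis=1, keepdims=True)

P2 = extend_state_space(P, new_row=np.full(4, 0.25))
assert np.allclose(P2.sum(axis=1), 1.0)   # still row-stochastic
assert np.array_equal(P2[:3, :3], P)      # old knowledge retained exactly
```

The assertions check the two properties the paper emphasizes: the extended matrix remains a valid Markov transition matrix, and the original transitions survive unchanged.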
Entities
Institutions
- arXiv (preprint repository hosting the paper)