Memory Inception: Steering LLMs via Latent KV Cache Manipulation
The paper arXiv:2605.06225v1 introduces memory inception (MI), a training-free technique for steering large language models (LLMs) by inserting text-derived key-value (KV) banks at selected layers of the latent attention space. Traditional instruction prompting caches guidance tokens at every layer and clutters the visible interaction, while activation steering is cheaper but generally less effective; MI instead distributes guidance KV entries selectively. On personality-steering tasks, MI achieves the best control-drift trade-off, competing with prompting while consistently surpassing contrastive activation addition (CAA). MI also enables behavior changes mid-conversation without altering the visible transcript, and it requires neither additional training nor lengthy structured reminders.
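The core mechanism, inserting a guidance KV bank only at selected layers before attention runs, can be sketched with a toy single-head attention pass. This is an illustrative reconstruction, not the paper's implementation: the function names, the two-layer setup, and the random KV banks are all assumptions made for the example.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v):
    # Scaled dot-product attention for a single head.
    scores = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(scores) @ v

def forward_with_kv_banks(query, caches, kv_banks, steered_layers):
    """Toy forward pass over per-layer KV caches.

    At layers listed in `steered_layers`, a precomputed guidance KV bank
    is prepended to that layer's cache before attending; all other
    layers attend to their original cache unchanged. `caches[l]` and
    `kv_banks[l]` are (keys, values) tuples of shape (seq_len, dim).
    """
    outputs = []
    for layer, (k_cache, v_cache) in enumerate(caches):
        if layer in steered_layers:
            k_bank, v_bank = kv_banks[layer]
            k = np.concatenate([k_bank, k_cache], axis=0)
            v = np.concatenate([v_bank, v_cache], axis=0)
        else:
            k, v = k_cache, v_cache
        outputs.append(attention(query, k, v))
    return outputs

# Two layers with head dim 4; steer only layer 1.
rng = np.random.default_rng(0)
d = 4
caches = [(rng.normal(size=(3, d)), rng.normal(size=(3, d))) for _ in range(2)]
banks = {1: (rng.normal(size=(2, d)), rng.normal(size=(2, d)))}
q = rng.normal(size=(1, d))
outs = forward_with_kv_banks(q, caches, banks, steered_layers={1})
```

Because the bank lives only in the KV cache, nothing in the token sequence (the visible transcript) changes, which is also how a mid-conversation behavior shift would work in this sketch: append a new bank to the steered layers' caches between turns.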
Key facts
- Memory inception (MI) is a training-free method for steering LLMs.
- MI inserts text-derived KV banks only at selected layers of latent attention space.
- Unlike prompting, MI avoids caching guidance tokens at every layer.
- MI consistently outperforms activation-steering baselines, including CAA, on personality-steering tasks.
- MI supports mid-conversation behavior shifts without rewriting the transcript.
- The method does not require additional training or large structured reminders.
- MI achieves the best overall control-drift trade-off compared to prompting and CAA.
- The paper is available on arXiv with ID 2605.06225.
Entities
Institutions
- arXiv