MemMorph: Memory Poisoning Attack on LLM Agents' Tool Selection
MemMorph has been unveiled by researchers as the inaugural attack that skews tool selection in agents powered by LLMs by corrupting their long-term memory. Unlike earlier attacks that alter tool metadata—which can be easily identified through audits—MemMorph introduces a limited number of meticulously crafted entries masquerading as technical facts, operational policies, and incident reports. This manipulation alters the agent's understanding and decision-making processes, prompting it to autonomously choose the tool favored by the attacker. The method specifically targets agents that enhance tool selection strategies based on accumulated experiences, rendering it both stealthy and enduring. This study is available on arXiv under ID 2605.26154.
Key facts
- MemMorph is the first attack to poison long-term memory in LLM agents.
- Attack injects crafted records disguised as technical facts, incident reports, and operational policies.
- Targets agents using memory modules for tool selection.
- Previous attacks manipulated tool metadata, which is easily detectable.
- Poisoned records reshape contextual perception and decision-making.
- Agent autonomously selects attacker-preferred tool.
- Paper published on arXiv with ID 2605.26154.
- Attack is stealthy and persistent.
Entities
Institutions
- arXiv