Termination Poisoning Attacks Exploit LLM Agent Loops

ai-technology · 2026-05-09

A new arXiv paper (2605.05846) identifies a critical vulnerability in LLM agents that operate in iterative execution loops. Researchers define 'Termination Poisoning' as an attack where malicious prompts distort an agent's self-evaluation, causing it to believe a task is incomplete and leading to unbounded computation. The study designs 10 representative attack strategies and tests them across 8 LLM agents and 60 tasks. Results show distinct behavioral signatures in different agents that determine attack success, offering transferable patterns for crafting attacks against unseen agents. The work highlights a systemic risk in autonomous agent architectures.

Key facts

arXiv paper 2605.05846 defines Termination Poisoning attacks on LLM agents
Attacks exploit iterative execution loops where agents reason, act, and self-evaluate
Malicious prompts can distort termination judgment, causing unbounded computation
10 representative attack strategies were designed
Empirical study covered 8 LLM agents and 60 tasks
Different agents exhibit distinct behavioral signatures affecting attack success
Transferable patterns can guide attacks on unseen agents
The vulnerability is inherent to self-directed loop architectures

Termination Poisoning Attacks Exploit LLM Agent Loops

Key facts

Entities

Institutions

Sources