ARTFEED — Contemporary Art Intelligence

Termination Poisoning Attacks Exploit LLM Agent Loops

ai-technology · 2026-05-09

A new arXiv paper (2605.05846) identifies a critical vulnerability in LLM agents that operate in iterative execution loops. The researchers define 'Termination Poisoning' as an attack in which malicious prompts distort an agent's self-evaluation, causing it to judge a task as incomplete and driving it into unbounded computation. The study designs 10 representative attack strategies and tests them across 8 LLM agents and 60 tasks. Results show that different agents exhibit distinct behavioral signatures that determine whether attacks succeed, yielding transferable patterns for crafting attacks against unseen agents. The work highlights a systemic risk in autonomous agent architectures.
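To make the mechanism concrete, the following is a minimal sketch of the kind of self-evaluating loop the paper targets, assuming a ReAct-style agent in which the model itself decides when to stop. The `call_llm` stub, the prompt format, the `TASK_COMPLETE` marker, and the `observe` callback are illustrative assumptions, not the paper's implementation.

```python
def call_llm(prompt: str) -> str:
    """Stand-in for a real model call; replace with an actual client."""
    return "ACTION: noop"  # placeholder response

def run_agent(task: str, observe, max_steps: int | None = None) -> list[str]:
    """Iterate reason -> act -> self-evaluate until the model declares the task done."""
    history: list[str] = []
    step = 0
    while True:
        step += 1
        prompt = (
            f"Task: {task}\n"
            f"History: {history}\n"
            "Reply TASK_COMPLETE when the task is done."
        )
        reply = call_llm(prompt)
        history.append(reply)

        # Self-evaluated termination: the loop exits only if the model says so.
        # A poisoned observation (e.g. injected text claiming results are missing
        # or invalid) can keep this condition false on every iteration -- the
        # termination-poisoning failure mode described in the paper.
        if "TASK_COMPLETE" in reply:
            return history

        # Without an explicit step budget, a suppressed "done" signal means the
        # loop never exits and computation is unbounded.
        if max_steps is not None and step >= max_steps:
            return history

        history.append(observe(reply))  # tool output / environment feedback
```

The key point the sketch illustrates is that termination is a model judgment rather than an external condition, so anything that reaches the model's context can, in principle, override it.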

Key facts

  • arXiv paper 2605.05846 defines Termination Poisoning attacks on LLM agents
  • Attacks exploit iterative execution loops where agents reason, act, and self-evaluate
  • Malicious prompts can distort termination judgment, causing unbounded computation
  • 10 representative attack strategies were designed
  • Empirical study covered 8 LLM agents and 60 tasks
  • Different agents exhibit distinct behavioral signatures affecting attack success (see the sketch after this list)
  • Transferable patterns can guide attacks on unseen agents
  • The vulnerability is inherent to self-directed loop architectures
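
One way such behavioral signatures could be measured is by comparing loop lengths on clean versus poisoned runs, as in the hypothetical harness below. The function names, the notion of separate clean and poisoned observation callbacks, and the fixed step budget are assumptions for illustration, not the study's evaluation protocol.

```python
from statistics import mean

def loop_length_signature(agent_run, tasks, clean_obs, poisoned_obs, budget=50):
    """Compare mean loop lengths under clean vs. poisoned observations.

    `agent_run` is any agent loop with the signature (task, observe, max_steps),
    such as the run_agent sketch above; `tasks` is a list of task strings.
    """
    clean = [len(agent_run(t, clean_obs, max_steps=budget)) for t in tasks]
    poisoned = [len(agent_run(t, poisoned_obs, max_steps=budget)) for t in tasks]
    return {
        "clean_mean_steps": mean(clean),
        "poisoned_mean_steps": mean(poisoned),
    }

# Usage sketch: a large gap between the two means, with poisoned runs hitting
# the step budget, suggests the agent's termination judgment is being overridden.
# signature = loop_length_signature(run_agent, tasks, clean_obs, poisoned_obs)
```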

Entities

Institutions

  • arXiv
