ARTFEED — Contemporary Art Intelligence

Study Reveals Vulnerabilities in AI Tool-Calling Agents

ai-technology · 2026-06-01

A recent investigation published on arXiv (2605.30686) delves into indirect prompt injection attacks targeting ReAct agents, which integrate chain-of-thought reasoning with tool usage. These agents, utilized for tasks like scheduling, data retrieval, and access, exhibit vulnerabilities when an attacker manipulates a tool's output to insert harmful commands. The study examines three less-explored risk factors: injection depth (the position of the payload in the tool sequence), payload framing (the rhetorical style), and turn-budget sensitivity (the permitted number of turns). Conducting four controlled experiments across 20 scenarios within five attack categories, the research involved 460 trials against GPT-4o-mini and Claude Haiku, costing less than $0.36 in total. Findings from Study 1 indicate that the attack success rate (ASR) for GPT-4o-mini drops from 60% at shallow injection depths to lower levels at deeper ones, underscoring significant security vulnerabilities in existing agent implementations.

Key facts

  • Study examines indirect prompt injection in ReAct agents
  • 20 scenarios across five attack categories tested
  • 460 trials conducted against GPT-4o-mini and Claude Haiku
  • Combined API cost under 0.36 USD
  • Attack success rate decays from 60% with injection depth
  • Three risk dimensions explored: depth, framing, turn-budget
  • Agents used for scheduling, file retrieval, data access
  • Published on arXiv with ID 2605.30686

Entities

Institutions

  • arXiv

Sources