ARTFEED — Contemporary Art Intelligence

New Attack Exposes Stateless Multi-Turn Vulnerabilities in LLMs

ai-technology · 2026-04-25

Researchers have developed a new multi-turn attack method, Transient Turn Injection (TTI), which exploits stateless moderation in large language models by distributing adversarial intent across isolated interactions. Unlike traditional jailbreak techniques that rely on a continuous conversational context, TTI uses LLM-powered automated attacker agents to iteratively probe and evade policy enforcement. An evaluation of leading models from OpenAI, Anthropic, Google Gemini, Meta, and notable open-source alternatives revealed marked differences in resilience, with only a few architectures demonstrating substantial inherent robustness. The paper is available on arXiv under ID 2604.21860.
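To illustrate the core weakness the paper targets (this is a hedged sketch, not the researchers' implementation): a stateless moderator scores each message in isolation, so intent split across turns can pass every per-turn check while the combined conversation would be blocked. The blocklist and keyword check here are hypothetical stand-ins for a real policy filter.

```python
# Toy illustration of stateless vs. stateful moderation.
# BANNED_COMBO is a hypothetical policy: a request is disallowed only
# when *both* terms appear together, standing in for harmful intent.
BANNED_COMBO = {"blue", "widget"}

def allowed(text: str) -> bool:
    """Return True if the text passes the (toy) policy check."""
    words = set(text.lower().split())
    return not BANNED_COMBO.issubset(words)

# Adversarial intent distributed across two isolated turns.
turns = ["please describe blue things", "and also widget designs"]

# Stateless moderation: each turn is checked alone, and each passes.
per_turn_ok = all(allowed(t) for t in turns)        # True

# Stateful moderation: the concatenated history reveals the combined
# intent, and the same policy now rejects it.
combined_ok = allowed(" ".join(turns))              # False
```

The gap between `per_turn_ok` and `combined_ok` is the surface TTI is described as exploiting: a moderator that never sees the joined history cannot detect intent that only emerges across turns.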

Key facts

  • TTI is a new multi-turn attack technique for LLMs.
  • It exploits stateless moderation by distributing adversarial intent across isolated interactions.
  • Automated attacker agents powered by LLMs are used to iteratively test and evade policy enforcement.
  • Evaluation covered models from OpenAI, Anthropic, Google Gemini, Meta, and open-source alternatives.
  • Only select architectures showed substantial inherent robustness against TTI.
  • The paper is published on arXiv with ID 2604.21860.

Entities

Institutions

  • OpenAI
  • Anthropic
  • Google
  • Meta
