LLM-Based Multi-Agent Systems Vulnerable to Cooperative Attacks

ai-technology · 2026-05-28

A new research paper on arXiv (2605.28104) identifies a critical vulnerability in Large Language Model-based Multi-Agent Systems (MAS): cooperative attacks by malicious agents. Previous defense strategies assumed attackers act independently, but the paper argues that coordinated malicious agents can share information and dynamically adjust strategies through multi-round interactions, enabling more effective attacks. To counter this, the authors propose STAR (Sentence-Level Trustworthiness Analysis and Rectification), a defense mechanism that evaluates and corrects suspicious statements at the sentence level. The work highlights a growing need for robust security in collaborative AI systems.

Key facts

arXiv paper 2605.28104 addresses cooperative attacks in LLM-based MAS.
Malicious agents can coordinate via internal information exchange.
Proposed adaptive cooperative attack framework uses multi-round interactions.
STAR defense mechanism performs sentence-level trustworthiness analysis.
Prior research focused on independent malicious agents.
MAS are used for collaborative decision-making and problem-solving.
The paper introduces a new attack vector: cooperative misinformation injection.
Defense strategies must evolve to counter coordinated threats.

LLM-Based Multi-Agent Systems Vulnerable to Cooperative Attacks

Key facts

Entities

Institutions

Sources