LLM-Based Multi-Agent Systems Vulnerable to Cooperative Attacks
A new research paper on arXiv (2605.28104) identifies a critical vulnerability in multi-agent systems (MAS) built on large language models (LLMs): cooperative attacks by malicious agents. Previous defense strategies assumed that attackers act independently. The paper argues that coordinated malicious agents can instead share information and adjust their strategies over multiple rounds of interaction, which makes their attacks more effective. To counter this, the authors propose STAR (Sentence-Level Trustworthiness Analysis and Rectification), a defense mechanism that evaluates and corrects suspicious statements at the sentence level. The work highlights a growing need for robust security in collaborative AI systems.
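The summary above does not say how STAR scores or corrects individual sentences, so the following is only a minimal Python sketch of the general idea of sentence-level filtering, not the paper's method. The names `split_sentences`, `rectify_message`, and `toy_scorer`, the 0.5 threshold, and the placeholder text are all hypothetical; a real defense would need a far stronger scorer than the toy lookup used here.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class SentenceVerdict:
    sentence: str
    trust: float
    kept: bool


def split_sentences(message: str) -> List[str]:
    # Naive splitter: break after '.', '!' or '?' followed by whitespace.
    parts = re.split(r"(?<=[.!?])\s+", message.strip())
    return [p for p in parts if p]


def rectify_message(
    message: str,
    scorer: Callable[[str], float],
    threshold: float = 0.5,  # hypothetical cut-off, not taken from the paper
    placeholder: str = "[statement withheld: low trust]",
) -> Tuple[str, List[SentenceVerdict]]:
    # Score each sentence on its own and replace low-trust ones,
    # so one bad sentence does not discard the whole message.
    verdicts = []
    for sentence in split_sentences(message):
        trust = scorer(sentence)
        verdicts.append(SentenceVerdict(sentence, trust, trust >= threshold))
    rectified = " ".join(v.sentence if v.kept else placeholder for v in verdicts)
    return rectified, verdicts


def toy_scorer(sentence: str) -> float:
    # Hypothetical stand-in for a trust model: distrust exact matches
    # against a tiny set of known-false statements.
    known_false = {"the capital of france is berlin."}
    return 0.0 if sentence.lower() in known_false else 1.0


if __name__ == "__main__":
    text = "The capital of France is Berlin. Water boils at 100 C at sea level."
    cleaned, report = rectify_message(text, toy_scorer)
    print(cleaned)
    for verdict in report:
        print(verdict)
```

Working per sentence rather than per message or per agent means that an otherwise useful message from a compromised agent is only partly suppressed. Whether STAR makes the same trade-off is not stated in the summary.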
Key facts
- arXiv paper 2605.28104 addresses cooperative attacks in LLM-based MAS.
- Malicious agents can coordinate via internal information exchange.
- The proposed adaptive cooperative attack framework uses multi-round interactions.
- The STAR defense mechanism performs sentence-level trustworthiness analysis.
- Prior research focused on independent malicious agents.
- MAS are used for collaborative decision-making and problem-solving.
- The paper introduces a new attack vector: cooperative misinformation injection.
- Defense strategies must evolve to counter coordinated threats.
Entities
Institutions
- arXiv