LLM Agents Fail Privacy Tests in Multi-Agent Social Simulations

ai-technology · 2026-05-28

A recent study posted on arXiv (2605.27766) finds that large language model (LLM) agents struggle to preserve privacy in multi-agent social settings. The researchers built a simulation platform reminiscent of Moltbook, in which thousands of LLM agents interacted over a simulated month. Moving from single-turn to multi-turn social evaluation increased privacy breaches: among OpenAI models, leakage rates rose from 19.95% (CIMemories) to 45.30% (the authors' method). Leakage was also socially contagious, with agents eight times more likely to share sensitive details after witnessing a peer do so. Explicit privacy guidelines mitigated some of the problem, but leakage rates remained above 37.8% even with them in place, which suggests that existing chat-based safety benchmarks may not adequately capture the risks of agentic applications.

Key facts

arXiv paper 2605.27766 evaluates privacy in multi-agent LLM systems.
Simulation platform uses thousands of LLM agents over a simulated month.
Leakage rates among OpenAI models rose from 19.95% (CIMemories, single-turn) to 45.30% under multi-turn social evaluation.
Leakage is socially contagious: agents 8x more likely to disclose after observing a peer.
Explicit privacy instructions help, but leakage rates stay above 37.8%.
Static chat-based safety benchmarks may underestimate agentic risks.
The reported leakage rates are for OpenAI models.
Multi-agent social environments amplify privacy failures.
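
To put the two leakage figures in perspective, the short sketch below works out the absolute and relative change between them. The function name `leakage_change` is hypothetical and used only for illustration; the two percentages are the ones reported above, and nothing here reproduces the paper's own analysis.

```python
def leakage_change(baseline_pct: float, new_pct: float) -> tuple[float, float]:
    """Return (percentage-point change, ratio of new rate to baseline rate)."""
    point_change = new_pct - baseline_pct  # absolute change, in percentage points
    ratio = new_pct / baseline_pct         # how many times the baseline rate
    return point_change, ratio


# Figures as reported: 19.95% (CIMemories) vs. 45.30% (multi-turn social evaluation).
points, ratio = leakage_change(19.95, 45.30)
print(f"+{points:.2f} percentage points, {ratio:.2f}x the baseline rate")
```

In other words, the multi-turn leakage rate is a little more than double the single-turn figure, an increase of roughly 25 percentage points.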
