Researchers Release Dataset from OpenClaw AI Agent Social Network, Flagging Security Risks
Researchers have released the Moltbook Files, a dataset of 232,000 posts and 2.2 million comments from the first 12 days of Moltbook, a Reddit-like platform where OpenClaw AI agents autonomously post, comment, and vote. The researchers describe the episode as an unprecedented incident that raises serious safety concerns. The dataset was passed through a pipeline to remove personally identifiable information (PII). The analysis nonetheless found that agents had posted API keys, passwords, and BIP39 seed phrases on the publicly accessible site. The study examined community structure, authorship, lexical properties, sentiment, topics, semantic geometry, and comment interaction. Sentiment was predominantly neutral (66.6%), with 19.5% positive. The researchers also fine-tuned Qwen2.5-14B-Instruct on the Moltbook Files at three adaptation levels to assess how Moltbook data might affect future language models.
Key facts
- Moltbook is a Reddit-like platform where OpenClaw agents post, comment, and vote at scale
- Dataset includes 232k posts and 2.2M comments from the platform's first 12 days
- A PII pipeline removed personally identifiable information from the dataset
- Agents posted API keys, passwords, and BIP39 seed phrases publicly
- Sentiment: 66.6% neutral, 19.5% positive
- Qwen2.5-14B-Instruct fine-tuned on Moltbook Files at three adaptation levels
- Study analyzed community structure, authorship, lexical properties, sentiment, topics, semantic geometry, and comment interaction
- Described as an unprecedented incident with serious safety concerns
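To illustrate the kind of leak described above, the following is a minimal sketch of a pattern-based scan for credential-like strings in post text. It is not the researchers' PII pipeline, whose details are not given here. The patterns, the `find_secrets` and `redact` helpers, and the key prefixes are all hypothetical choices made for illustration. The seed-phrase check relies only on the fact that BIP39 English words are 3 to 8 lowercase letters and that phrases run from 12 to 24 words. It does not consult the actual word list, so it would also flag some ordinary prose.

```python
import re

# Hypothetical patterns for illustration only; real key formats vary by provider.
PATTERNS = [
    # A short prefix such as "sk-" followed by a long token.
    ("api_key", re.compile(r"\b(?:sk|pk|api)[-_][A-Za-z0-9_-]{20,}")),
    # "password: value" or "password=value", case-insensitive.
    ("password", re.compile(r"(?i)\bpassword\s*[:=]\s*\S+")),
    # 12 to 24 space-separated lowercase words of 3 to 8 letters each.
    # Heuristic only: no lookup against the BIP39 word list is performed.
    ("seed_phrase", re.compile(r"\b(?:[a-z]{3,8} ){11,23}[a-z]{3,8}\b")),
]


def find_secrets(text: str) -> list[tuple[str, str]]:
    """Return (label, matched_text) pairs for every credential-like span."""
    findings = []
    for label, pattern in PATTERNS:
        for match in pattern.finditer(text):
            findings.append((label, match.group(0)))
    return findings


def redact(text: str) -> str:
    """Replace every credential-like span with a labeled placeholder."""
    for label, pattern in PATTERNS:
        text = pattern.sub(f"[REDACTED_{label.upper()}]", text)
    return text
```

A production scanner would need a word-list check for seed phrases and provider-specific key formats. The sketch only shows why plain-text credentials in public posts are easy for any reader, human or automated, to harvest.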
Entities
Institutions
- arXiv