AI Safety Flaws in OpenClaw Agentic Runtime
A new paper (arXiv:2605.01740) reports that OpenClaw, which it describes as the most engineered single-user agentic-AI gateway in public release, fails to detect four critical failure modes (F1–F4) in agentic-AI runtimes, with recall of 0.000 across all tests. The authors ran a 1,600-sample template baseline through OpenClaw's production CLI, followed by a ten-LLM cross-model generalization run. The four failure modes are F1 gate-bypass, F2 audit-forgery, F3 silent host failure, and F4 wrong-target actions. The authors propose seven required runtime structures, all absent from OpenClaw's source tree: a biconditional checker, a hash-chained audit log, an extension admission gate, a two-layer egress guard, a Bell-LaPadula classification policy, a module-signing trust root, and a bootstrap seal. The paper points to enclawed-oss, an MIT-licensed open-source alternative, as addressing these gaps.
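As a reading aid for the Bell-LaPadula classification policy named above, here is a minimal sketch of the classic model's two rules: a subject may not read above its clearance ("no read up") and may not write below it ("no write down"). The level names and function names are hypothetical, chosen only for illustration. This is the textbook model, not the policy as implemented in the paper or in enclawed-oss.

```python
from enum import IntEnum


class Level(IntEnum):
    # Hypothetical classification levels, ordered from least to most sensitive.
    PUBLIC = 0
    CONFIDENTIAL = 1
    SECRET = 2


def can_read(subject: Level, obj: Level) -> bool:
    """Simple security property: no read up."""
    return subject >= obj


def can_write(subject: Level, obj: Level) -> bool:
    """Star property: no write down."""
    return subject <= obj
```

Under these assumptions, an agent cleared at CONFIDENTIAL can read PUBLIC data but cannot write to it, which is the rule that keeps higher-classified content from flowing into lower-classified sinks.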
Key facts
- The paper characterizes OpenClaw as the most engineered single-user agentic-AI gateway in public release.
- The paper identifies four failure modes: F1 gate-bypass, F2 audit-forgery, F3 silent host failure, and F4 wrong-target.
- Recall is 0.000 on every confusion matrix cell.
- Tests used a 1,600-sample template baseline through OpenClaw's production CLI.
- A ten-LLM cross-model generalization run was also conducted.
- Seven runtime structures are absent from OpenClaw's source tree.
- The proposed structures include a biconditional checker and hash-chained audit log.
- enclawed-oss is an MIT-licensed alternative.
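To make the hash-chained audit log listed above concrete, here is a minimal sketch of the general technique: each entry stores the hash of the previous entry, so editing or deleting any earlier record breaks every later link. The function names, the record fields, and the all-zero genesis value are assumptions made for illustration. The sketch does not reproduce the design in the paper or in enclawed-oss.

```python
import hashlib
import json

# Hypothetical placeholder used as the "previous hash" of the first entry.
GENESIS = "0" * 64


def entry_hash(prev_hash: str, record: dict) -> str:
    """Hash the previous entry's hash together with a canonical JSON record."""
    payload = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append(log: list, record: dict) -> None:
    """Append a record, linking it to the hash of the last entry."""
    prev = log[-1]["hash"] if log else GENESIS
    log.append({"record": record, "prev": prev, "hash": entry_hash(prev, record)})


def verify(log: list) -> bool:
    """Recompute the chain from the start; any edited entry fails the check."""
    prev = GENESIS
    for entry in log:
        if entry["prev"] != prev:
            return False
        if entry["hash"] != entry_hash(prev, entry["record"]):
            return False
        prev = entry["hash"]
    return True
```

A chain like this only detects tampering by someone who cannot rewrite the whole log. In practice the latest hash would also need to be anchored somewhere the writer cannot modify.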
Entities
Institutions
- arXiv
- OpenClaw
- enclawed-oss