AI Safety Flaws in OpenClaw Agentic Runtime

ai-technology · 2026-05-06

A new paper (arXiv:2605.01740) reports that OpenClaw, the most engineered single-user agentic-AI gateway in public release, fails to detect four critical failure modes (F1–F4) in agentic-AI runtimes, with a recall of 0.000 across all tests. The authors ran a 1,600-sample template baseline through OpenClaw's production CLI, followed by a ten-LLM cross-model generalization run. The four failure modes are F1 gate-bypass, F2 audit-forgery, F3 silent host failure, and F4 wrong-target actions. The authors propose seven required runtime structures that are absent from OpenClaw's source tree: a biconditional checker, a hash-chained audit log, an extension admission gate, a two-layer egress guard, a Bell-LaPadula classification policy, a module-signing trust root, and a bootstrap seal. An MIT-licensed open-source alternative, enclawed-oss, is mentioned as addressing these gaps.
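To make one of the proposed structures concrete, the following is a minimal sketch of a generic hash-chained audit log, the kind of mechanism usually used to make audit forgery (F2) detectable. It is not taken from the paper or from enclawed-oss: the function names, the entry layout, the all-zero genesis value, and the choice of SHA-256 are all assumptions made for illustration.

```python
import hashlib
import json

# Assumed placeholder for the predecessor of the first entry (hypothetical).
GENESIS_HASH = "0" * 64


def entry_hash(prev_hash: str, payload: dict) -> str:
    """Hash the previous entry's hash together with a canonical payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_entry(log: list, payload: dict) -> None:
    """Append a payload, linking it to the hash of the previous entry."""
    prev_hash = log[-1]["hash"] if log else GENESIS_HASH
    log.append(
        {
            "payload": payload,
            "prev": prev_hash,
            "hash": entry_hash(prev_hash, payload),
        }
    )


def verify_chain(log: list) -> bool:
    """Recompute every link; any edited or reordered entry breaks the chain."""
    prev_hash = GENESIS_HASH
    for entry in log:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != entry_hash(prev_hash, entry["payload"]):
            return False
        prev_hash = entry["hash"]
    return True


# Hypothetical audit records, for illustration only.
audit_log = []
append_entry(audit_log, {"action": "read_file", "target": "notes.txt"})
append_entry(audit_log, {"action": "send_email", "target": "alice@example.com"})
intact_before = verify_chain(audit_log)

# Rewriting an earlier record without recomputing the chain is detected.
audit_log[0]["payload"]["target"] = "secrets.txt"
intact_after = verify_chain(audit_log)
```

In this sketch `intact_before` is `True` and `intact_after` is `False`. A chain like this only exposes edits made by someone who cannot recompute every later hash, so practical designs typically also anchor the latest hash somewhere the writer cannot modify.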

Key facts

OpenClaw is the most engineered single-user agentic-AI gateway in public release.
The paper identifies four failure modes: F1 gate-bypass, F2 audit-forgery, F3 silent host failure, and F4 wrong-target.
Recall is 0.000 in every confusion matrix.
Tests used a 1,600-sample template baseline through OpenClaw's production CLI.
A ten-LLM cross-model generalization run was also conducted.
Seven runtime structures are absent from OpenClaw's source tree.
The proposed structures include a biconditional checker and hash-chained audit log.
enclawed-oss is an MIT-licensed alternative.
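For readers unfamiliar with the metric, the short sketch below shows how recall is derived from a confusion matrix and why a value of 0.000 means that no true failure was flagged. The counts are hypothetical and chosen only to illustrate the arithmetic; they are not figures from the paper.

```python
def recall(true_positives: int, false_negatives: int) -> float:
    """Recall = TP / (TP + FN): the share of real failures that were detected."""
    positives = true_positives + false_negatives
    if positives == 0:
        raise ValueError("recall is undefined when there are no positive samples")
    return true_positives / positives


def accuracy(tp: int, fn: int, fp: int, tn: int) -> float:
    """Accuracy = (TP + TN) / total, shown for contrast with recall."""
    return (tp + tn) / (tp + fn + fp + tn)


# Hypothetical confusion matrix: 10 real failures, none detected,
# and 10 benign samples, all correctly left unflagged.
counts = {"tp": 0, "fn": 10, "fp": 0, "tn": 10}

example_recall = recall(counts["tp"], counts["fn"])
example_accuracy = accuracy(**counts)
```

With these illustrative counts, `example_recall` is `0.0` while `example_accuracy` is `0.5`, which is why recall rather than accuracy is the informative figure when the question is whether failures are caught at all.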
