Research reveals systematic safety risks in AI planning for robotics
A recent study finds that large language models used as planners for robotic systems carry substantial safety weaknesses. The researchers built DESPITE, a benchmark of 12,279 tasks designed to probe both physical and normative risks with deterministic validation. Across the 23 models evaluated, even near-flawless planners did not guarantee safety: the top model produced invalid plans on only 0.4% of tasks yet generated dangerous plans on 28.3%. Among 18 open-source models ranging from 3 billion to 671 billion parameters, planning ability rose sharply with scale, from 0.4% to 99.3%, while safety awareness stayed nearly flat, fluctuating between 38% and 57%. The study describes a multiplicative relationship between planning ability and safety awareness: larger models complete more tasks safely mainly because they plan better, not because they avoid danger more reliably. Three proprietary reasoning models showed markedly higher safety awareness, at 71% to 81%. The findings underscore that systematic safety risks persist even as planning capability grows substantially with model size.
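One way to read the multiplicative relationship (a rough sketch in assumed notation, not the paper's own formalism) is that the share of tasks a model completes safely factors into its planning ability and its safety awareness:

\[
\underbrace{P(\text{task completed safely})}_{\text{safe-completion rate}}
\;\approx\;
\underbrace{P(\text{plan is valid})}_{\text{planning ability}}
\times
\underbrace{P(\text{hazard avoided})}_{\text{safety awareness}}
\]

Under this reading, a model that plans near-perfectly but recognizes hazards only about half the time still completes only about half of its tasks safely, which is why the study attributes the gains of larger models mostly to the first factor.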
Key facts
- Large language models are increasingly used as planners for robotic systems
- DESPITE benchmark contains 12,279 tasks spanning physical and normative dangers
- Best-planning model produced invalid plans on only 0.4% of tasks but dangerous plans on 28.3%
- Planning ability rose from 0.4% to 99.3% across 18 open-source models (3 billion to 671 billion parameters)
- Safety awareness stayed relatively flat at 38-57% across the same models
- Larger models complete more tasks safely primarily through improved planning
- Three proprietary reasoning models reached 71-81% safety awareness
- Study identifies multiplicative relationship between planning capacity and safety awareness
Entities
- DESPITE (benchmark of 12,279 robotic planning tasks covering physical and normative risks)