Research reveals systematic safety risks in AI planning for robotics
A recent study finds that large language models used as planners for robotic systems carry substantial safety weaknesses. The researchers built DESPITE, a benchmark of 12,279 tasks designed to probe both physical and normative risks with deterministic validation. Across the 23 models evaluated, even near-flawless planners did not guarantee safety: the top model produced invalid plans on only 0.4% of tasks yet generated dangerous plans on 28.3%. Among 18 open-source models ranging from 3 billion to 671 billion parameters, planning ability rose sharply with scale, from 0.4% to 99.3%, while safety awareness stayed nearly flat, fluctuating between 38% and 57%. The study describes a multiplicative relationship between planning ability and safety awareness: larger models complete more tasks safely mainly because they plan better, not because they avoid danger more reliably. Three proprietary reasoning models showed markedly higher safety awareness, at 71% to 81%. The findings underscore that systematic safety risks persist even as planning capability grows substantially with model size.
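One way to read the multiplicative relationship (a rough sketch in assumed notation, not the paper's own formalism) is that the share of tasks a model completes safely factors into its planning ability and its safety awareness:

\[
\underbrace{P(\text{task completed safely})}_{\text{safe-completion rate}}
\;\approx\;
\underbrace{P(\text{plan is valid})}_{\text{planning ability}}
\times
\underbrace{P(\text{hazard avoided})}_{\text{safety awareness}}
\]

Under this reading, a model that plans near-perfectly but recognizes hazards only about half the time still completes only about half of its tasks safely, which is why the study attributes the gains of larger models mostly to the first factor.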
Key facts
- Large language models are increasingly used as planners for robotic systems
- DESPITE benchmark contains 12,279 tasks spanning physical and normative dangers
- Best-planning model produced invalid plans on only 0.4% of tasks but dangerous plans on 28.3%
- Planning ability rose from 0.4% to 99.3% across 18 open-source models (3 billion to 671 billion parameters)
- Safety awareness stayed relatively flat at 38-57% across the same models
- Larger models complete more tasks safely primarily through improved planning
- Three proprietary reasoning models reached 71-81% safety awareness
- Study identifies multiplicative relationship between planning capacity and safety awareness
Entities
- DESPITE (benchmark of 12,279 robotic planning tasks covering physical and normative risks)