New Benchmark Tests Creative Physical Intelligence in AI

ai-technology · 2026-05-27

Researchers have developed MM-CreativityBench, a benchmark that assesses how well large multimodal models (LMMs) can generate innovative, physically viable solutions in open-ended settings. Unlike traditional benchmarks that prioritize pattern recognition and straightforward question answering, it evaluates a model's ability to repurpose objects in unexpected ways, a key aspect of human intelligence. Each benchmark instance pairs a scenario image with structured views of candidate entities and their parts, enabling fine-grained, interactive assessment of how models examine scenes and recognize relevant affordances. The research, available on arXiv (2605.26396), highlights a shortfall in current AI capabilities and aims to advance creative physical intelligence in LMMs.

Key facts

MM-CreativityBench is a new benchmark for affordance-grounded creative tool use.
It evaluates large multimodal models (LMMs) in visually rich, physically constrained environments.
The benchmark tests whether AI can repurpose objects in non-obvious yet physically feasible ways.
Each instance includes a scenario image with structured views of candidate entities and their parts.
The work is published on arXiv with identifier 2605.26396.
Current benchmarks largely ignore creative problem-solving in open-ended environments.
The benchmark enables fine-grained, interactive evaluation of model behavior.
The research aims to advance creative physical intelligence in AI.
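
To make the instance structure described above more concrete, here is a minimal Python sketch of how a scenario image with structured views of entities and their parts might be represented. It is only an illustration under stated assumptions: the class names, field names, the task string, the affordance labels, and the example objects are all hypothetical and are not taken from the MM-CreativityBench paper or its data format.

```python
from dataclasses import dataclass, field


# All names below are hypothetical; the real benchmark schema may differ.
@dataclass
class Part:
    name: str                                             # e.g. "handle"
    affordances: list[str] = field(default_factory=list)  # assumed labels


@dataclass
class Entity:
    name: str
    parts: list[Part] = field(default_factory=list)


@dataclass
class Instance:
    scenario_image: str   # path to the scenario image (placeholder value)
    task: str             # assumed free-text description of the problem
    entities: list[Entity] = field(default_factory=list)


def find_parts_with_affordance(instance, affordance):
    """Return (entity name, part name) pairs whose part lists the affordance."""
    return [
        (entity.name, part.name)
        for entity in instance.entities
        for part in entity.parts
        if affordance in part.affordances
    ]


# Invented example: repurposing a coin as a makeshift screwdriver.
example = Instance(
    scenario_image="scene_001.png",
    task="tighten a flat-head screw without a screwdriver",
    entities=[
        Entity("coin", [Part("edge", ["fits_slot", "rigid"])]),
        Entity("spoon", [Part("handle", ["rigid", "lever"]),
                         Part("bowl", ["scoop"])]),
    ],
)

print(find_parts_with_affordance(example, "fits_slot"))  # [('coin', 'edge')]
```

Storing affordances at the part level rather than the object level mirrors the idea that a non-obvious use often depends on one component of an object, such as the edge of a coin, rather than on the object as a whole.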
