ARTFEED — Contemporary Art Intelligence

MirrorBench: A New Benchmark for Self-Centric Intelligence in MLLMs

ai-technology · 2026-04-24

Researchers have developed MirrorBench, a simulation-based benchmark for assessing self-centric intelligence in multimodal large language models (MLLMs). Inspired by the psychological Mirror Self-Recognition (MSR) test, MirrorBench uses a tiered framework of progressively challenging tasks, ranging from basic visual perception to high-level self-representation. Experiments on prominent MLLMs show that their performance falls significantly short of human capability even at the most fundamental level, underscoring critical shortcomings in self-awareness. The benchmark addresses a gap in existing evaluations, which concentrate mainly on perception of and interaction with external objects rather than on the model itself. The study is available on arXiv under identifier 2604.14785.
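A tiered evaluation of the kind described above can be sketched as a harness that scores a model separately at each difficulty level. This is a minimal illustration, not MirrorBench's actual protocol: the tier names, tasks, and exact-match scoring below are all assumptions for the sake of the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class Task:
    tier: str       # difficulty level; tier names here are illustrative only
    prompt: str     # question posed to the model (stands in for a multimodal input)
    expected: str   # reference answer

def evaluate_by_tier(model: Callable[[str], str], tasks: List[Task]) -> Dict[str, float]:
    """Run the model on every task and return per-tier accuracy."""
    correct: Dict[str, int] = {}
    total: Dict[str, int] = {}
    for task in tasks:
        total[task.tier] = total.get(task.tier, 0) + 1
        # Exact-match scoring is a simplification; real benchmarks often
        # use more forgiving answer matching or human/model judges.
        if model(task.prompt).strip().lower() == task.expected.lower():
            correct[task.tier] = correct.get(task.tier, 0) + 1
    return {tier: correct.get(tier, 0) / n for tier, n in total.items()}

# Toy usage with a stub "model" that always answers "yes".
tasks = [
    Task("perception", "Is there a mirror in the image?", "yes"),
    Task("self_representation", "Is the agent in the mirror you?", "yes"),
    Task("self_representation", "Does the reflection move when you move?", "no"),
]
scores = evaluate_by_tier(lambda prompt: "yes", tasks)
```

Reporting accuracy per tier rather than as a single aggregate is what lets a benchmark like this localize where models break down, e.g. passing low-level perception while failing self-representation.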

Key facts

  • MirrorBench is a simulation-based benchmark for MLLMs.
  • It is inspired by the Mirror Self-Recognition (MSR) test in psychology.
  • The benchmark uses a tiered framework of progressively challenging tasks.
  • Tasks range from basic visual perception to high-level self-representation.
  • Experiments show MLLMs perform substantially worse than humans even at the lowest level.
  • The benchmark addresses a lack of systematic evaluation of self-centric intelligence.
  • Current benchmarks mainly target perception and interaction with external objects.
  • The study is published on arXiv with identifier 2604.14785.

Entities

Institutions

  • arXiv
