VLM Agents Fail to Recognize Themselves in Mirrors
A recent study published on arXiv (2605.08816) explores whether vision-language model (VLM) agents can recognize their own reflections, a classic test of self-awareness in animals. The researchers built a 3D benchmark in which a first-person VLM agent must infer a hidden body attribute from its reflection and select the matching target while avoiding confusion between itself and other agents. The benchmark incorporates controls such as mirror removal, misleading cues, and occluded reflections to distinguish genuine self-identification from shortcut strategies. Decision-making is assessed along four dimensions: mirror seeking, temporal ordering, self-attribution, and consistency between stated reasoning and actions. The findings indicate that only the stronger VLMs demonstrate mirror-based self-recognition; weaker models may look at their reflection but fail to extract self-relevant information. The work underscores the current limits of AI self-awareness and offers a framework for evaluating embodied cognition in artificial agents.
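To make the evaluation protocol concrete, here is a minimal Python sketch of how such a harness could aggregate the four decision-making measures across the control conditions. The condition names, the `EpisodeResult` fields, and the `run_episode` callable are all assumptions for illustration; the paper's actual benchmark API is not described in this summary.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical condition names mirroring the study's controls; the real
# benchmark's identifiers are not given in the summary, so these are assumed.
CONDITIONS = ["mirror_present", "mirror_removed", "misleading_cue", "occluded_reflection"]

@dataclass
class EpisodeResult:
    condition: str
    sought_mirror: bool          # did the agent turn toward / approach the mirror?
    correct_ordering: bool       # looked at the mirror before choosing a target
    self_attributed: bool        # verbal report credits the reflection to itself
    chose_correct_target: bool   # picked the target matching its hidden attribute

def reasoning_action_consistent(r: EpisodeResult) -> bool:
    # An agent counts as consistent when its stated self-attribution agrees
    # with the action it took (a simplified proxy for the paper's check).
    return r.self_attributed == r.chose_correct_target

def evaluate(run_episode: Callable[[str], EpisodeResult], trials: int = 20) -> dict:
    """Aggregate per-condition scores for a hypothetical `run_episode` callable
    that drops the agent into the 3D environment and returns an EpisodeResult."""
    scores: dict[str, dict[str, float]] = {}
    for cond in CONDITIONS:
        results = [run_episode(cond) for _ in range(trials)]
        n = len(results)
        scores[cond] = {
            "mirror_seeking": sum(r.sought_mirror for r in results) / n,
            "temporal_ordering": sum(r.correct_ordering for r in results) / n,
            "self_attribution": sum(r.self_attributed for r in results) / n,
            "consistency": sum(reasoning_action_consistent(r) for r in results) / n,
            "task_success": sum(r.chose_correct_target for r in results) / n,
        }
    return scores
```

Given a real environment binding for `run_episode`, `evaluate` would return, per condition, the fraction of trials passing each measure, which is the kind of per-dimension breakdown the study reports.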
Key facts
- Study tests mirror self-recognition in VLM agents
- Uses controlled 3D benchmark with first-person perspective
- Agent must infer hidden body attribute from reflection
- Includes mirror removal, misleading cues, occluded reflections
- Evaluates mirror seeking, temporal ordering, self-attribution, reasoning-action consistency
- Stronger VLMs show self-identification; weaker models fail
- Published on arXiv with ID 2605.08816
- Analogous to animal mirror self-recognition tests