VLM Agents Fail to Recognize Themselves in Mirrors
A recent study published on arXiv (2605.08816) explores whether vision-language model (VLM) agents can recognize their own reflections, a classic test of self-awareness in animals. The researchers built a 3D benchmark in which a first-person VLM agent must infer a hidden body attribute from its reflection and select the matching target while avoiding confusion between itself and other agents. The benchmark incorporates controls such as mirror removal, misleading cues, and occluded reflections to distinguish genuine self-identification from shortcut strategies. Decision-making is assessed along four dimensions: mirror seeking, temporal ordering, self-attribution, and consistency between stated reasoning and actions. The findings indicate that only the stronger VLMs demonstrate mirror-based self-recognition; weaker models may look at their reflection but fail to extract self-relevant information. The work underscores the current limits of AI self-awareness and offers a framework for evaluating embodied cognition in artificial agents.
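To make the evaluation protocol concrete, here is a minimal Python sketch of how such a harness could aggregate the four decision-making measures across the control conditions. The condition names, the `EpisodeResult` fields, and the `run_episode` callable are all assumptions for illustration; the paper's actual benchmark API is not described in this summary.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical condition names mirroring the study's controls; the real
# benchmark's identifiers are not given in the summary, so these are assumed.
CONDITIONS = ["mirror_present", "mirror_removed", "misleading_cue", "occluded_reflection"]

@dataclass
class EpisodeResult:
    condition: str
    sought_mirror: bool          # did the agent turn toward / approach the mirror?
    correct_ordering: bool       # looked at the mirror before choosing a target
    self_attributed: bool        # verbal report credits the reflection to itself
    chose_correct_target: bool   # picked the target matching its hidden attribute

def reasoning_action_consistent(r: EpisodeResult) -> bool:
    # An agent counts as consistent when its stated self-attribution agrees
    # with the action it took (a simplified proxy for the paper's check).
    return r.self_attributed == r.chose_correct_target

def evaluate(run_episode: Callable[[str], EpisodeResult], trials: int = 20) -> dict:
    """Aggregate per-condition scores for a hypothetical `run_episode` callable
    that drops the agent into the 3D environment and returns an EpisodeResult."""
    scores: dict[str, dict[str, float]] = {}
    for cond in CONDITIONS:
        results = [run_episode(cond) for _ in range(trials)]
        n = len(results)
        scores[cond] = {
            "mirror_seeking": sum(r.sought_mirror for r in results) / n,
            "temporal_ordering": sum(r.correct_ordering for r in results) / n,
            "self_attribution": sum(r.self_attributed for r in results) / n,
            "consistency": sum(reasoning_action_consistent(r) for r in results) / n,
            "task_success": sum(r.chose_correct_target for r in results) / n,
        }
    return scores
```

Given a real environment binding for `run_episode`, `evaluate` would return, per condition, the fraction of trials passing each measure, which is the kind of per-dimension breakdown the study reports.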
Key facts
- Study tests mirror self-recognition in VLM agents
- Uses controlled 3D benchmark with first-person perspective
- Agent must infer hidden body attribute from reflection
- Includes mirror removal, misleading cues, occluded reflections
- Evaluates mirror seeking, temporal ordering, self-attribution, reasoning-action consistency
- Stronger VLMs show self-identification; weaker models fail
- Published on arXiv with ID 2605.08816
- Analogous to animal mirror self-recognition tests