ARTFEED — Contemporary Art Intelligence

Brain Alignment of Vision-Language and Action Models During Gameplay

ai-technology · 2026-05-20

A study published on arXiv (2605.19352) investigates how closely vision-language models (VLMs) and large-action models (LAMs) align with human brain activity during naturalistic gameplay. Using fMRI recordings from participants playing Atari-style video games, the researchers examined how action-focused and reasoning-focused prompts shape the models' internal representations. Both VLMs and LAMs showed significant alignment with brain activity. The work addresses a gap in brain-encoding research on interactive tasks, where earlier studies focused on passive tasks or reinforcement-learning agents.
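
For readers unfamiliar with brain-encoding analyses, the sketch below illustrates one common way such alignment is quantified: fit a regularized linear map from model features to voxel responses, then correlate predicted and measured responses on held-out timepoints. This summary does not state the study's actual pipeline, so the ridge-regression approach, the synthetic data, and every name here (`fit_ridge`, `voxelwise_pearson`, `encoding_score`, `alpha`) are illustrative assumptions, not the authors' method.

```python
import numpy as np

def fit_ridge(X, Y, alpha=1.0):
    """Closed-form ridge regression: W = (X^T X + alpha * I)^-1 X^T Y."""
    n_features = X.shape[1]
    gram = X.T @ X + alpha * np.eye(n_features)
    return np.linalg.solve(gram, X.T @ Y)

def voxelwise_pearson(Y_true, Y_pred):
    """Pearson correlation between measured and predicted responses, per voxel."""
    yt = Y_true - Y_true.mean(axis=0)
    yp = Y_pred - Y_pred.mean(axis=0)
    denom = np.linalg.norm(yt, axis=0) * np.linalg.norm(yp, axis=0)
    return (yt * yp).sum(axis=0) / denom

def encoding_score(features, voxels, n_train, alpha=1.0):
    """Train on the first n_train timepoints, score on the remainder."""
    X_tr, X_te = features[:n_train], features[n_train:]
    Y_tr, Y_te = voxels[:n_train], voxels[n_train:]
    # Standardize features using training statistics only, to avoid leakage.
    mu = X_tr.mean(axis=0)
    sd = X_tr.std(axis=0) + 1e-8
    X_tr = (X_tr - mu) / sd
    X_te = (X_te - mu) / sd
    # Center the responses so the linear map needs no intercept term.
    y_mu = Y_tr.mean(axis=0)
    W = fit_ridge(X_tr, Y_tr - y_mu, alpha)
    return voxelwise_pearson(Y_te, X_te @ W + y_mu)

# Synthetic stand-ins (assumption): "features" plays the role of model
# activations per fMRI timepoint, "voxels" the recorded brain responses.
rng = np.random.default_rng(0)
n_timepoints, n_features, n_voxels = 400, 16, 50
features = rng.standard_normal((n_timepoints, n_features))
true_weights = rng.standard_normal((n_features, n_voxels))
noise = 0.5 * rng.standard_normal((n_timepoints, n_voxels))
voxels = features @ true_weights + noise

scores = encoding_score(features, voxels, n_train=300)
print(f"mean held-out correlation: {scores.mean():.3f}")
```

In a setup like this, a higher held-out correlation for one model's features than another's would be read as stronger alignment with the recorded activity. Real analyses typically add steps omitted here, such as hemodynamic delay modeling and cross-validated selection of `alpha`.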

Key facts

  • Study published on arXiv with ID 2605.19352
  • Uses fMRI recordings from participants playing Atari-style video games
  • Compares vision-language models (VLMs) and large-action models (LAMs)
  • Examines action-focused and reasoning-focused prompts
  • Both model families exhibit significant brain alignment
  • Addresses gap in interactive brain-encoding studies
  • Previous studies limited to reinforcement-learning agents or passive tasks
  • Research at intersection of neuroscience and machine learning

Entities

Institutions

  • arXiv

Sources