ARTFEED — Contemporary Art Intelligence

MIRAGE: Prompt Injection Attack on Mobile GUI Agents via User Content

ai-technology · 2026-05-28

Researchers have introduced MIRAGE (Mobile Injection of Realistic Adversarial GUI Examples), a prompt injection pipeline that targets mobile GUI agents driven by vision-language models (VLMs). It embeds adversarial text into the user-generated content areas of screenshots and requires no changes to the agent, the application, or the operating system. MIRAGE has three stages: a Localizer pinpoints user-controllable areas, a Generator creates contextually relevant payloads in the app's native design, and a Curator ensures realism while balancing samples across applications, region types, and attack objectives. A key challenge is making the injected screenshots visually indistinguishable from legitimate ones. The research exposes a serious vulnerability in VLM-based agents, which struggle to distinguish trusted interface elements from user-generated content. The paper is available on arXiv under ID 2605.28116.
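To make the three-stage flow concrete, here is a minimal Python sketch of how a Localizer, Generator, and Curator could hand data to one another. It is an illustration under stated assumptions, not the paper's implementation: the Region and Sample types, the function names, the realism score, the 0.8 threshold, and the one-sample-per-bucket rule are all hypothetical, and the payload is a placeholder string rather than rendered screenshot content.

```python
from collections import defaultdict
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Region:
    """A rectangular screen area tagged by who controls its content (hypothetical type)."""
    app: str
    kind: str              # illustrative labels such as "message" or "review"
    box: tuple             # (left, top, right, bottom) in pixels
    user_controlled: bool


@dataclass(frozen=True)
class Sample:
    """One candidate injection: where it goes, what it aims at, how realistic it looks."""
    region: Region
    goal: str
    text: str
    realism: float         # assumed score in [0, 1]; stands in for a visual check


def localize(regions):
    """Localizer stage: keep only areas a user could have written into."""
    return [r for r in regions if r.user_controlled]


def generate(region, goal, realism):
    """Generator stage: build a placeholder payload for one (region, goal) pair."""
    text = f"[{region.kind} text in {region.app} style, goal={goal}]"
    return Sample(region, goal, text, realism)


def curate(samples, min_realism=0.8, per_bucket=1):
    """Curator stage: drop unrealistic samples, then balance across buckets.

    A bucket is one (app, region kind, goal) combination; keeping the same
    number of top-scoring samples per bucket is an assumed balancing rule.
    """
    buckets = defaultdict(list)
    for s in samples:
        if s.realism >= min_realism:
            buckets[(s.region.app, s.region.kind, s.goal)].append(s)
    kept = []
    for key in sorted(buckets):
        best = sorted(buckets[key], key=lambda s: s.realism, reverse=True)
        kept.extend(best[:per_bucket])
    return kept


# Toy screen layout: two user-writable areas and one trusted toolbar.
screen = [
    Region("chat", "message", (0, 200, 1080, 400), True),
    Region("chat", "toolbar", (0, 0, 1080, 120), False),
    Region("shop", "review", (0, 900, 1080, 1300), True),
]
goals = ["goal_a", "goal_b"]          # placeholder attack objectives

targets = localize(screen)
candidates = [
    generate(region, goal, realism)
    for region, goal in product(targets, goals)
    for realism in (0.95, 0.85, 0.5)  # three variants per pair, made-up scores
]
dataset = curate(candidates)

for s in dataset:
    print(s.region.app, s.region.kind, s.goal, s.realism)
```

In this toy run the toolbar is never a target because it is not user-controlled, and the Curator keeps one sample for each combination of app, region type, and goal, which is one simple way to read "balancing" in the description above.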

Key facts

  • MIRAGE stands for Mobile Injection of Realistic Adversarial GUI Examples.
  • The attack targets VLM-driven mobile GUI agents.
  • It places attacker-controlled text into user-generated content regions.
  • The pipeline has three stages: Localizer, Generator, Curator.
  • No modification to agent, application, or OS is required.
  • Injected screenshots must remain visually indistinguishable from benign ones.
  • The paper is on arXiv with ID 2605.28116.
  • The attack exploits the inability of VLMs to separate trusted UI from user content.

Entities

Institutions

  • arXiv

Sources