ARTFEED — Contemporary Art Intelligence

Co-Evolving Proposer and Visual Critic via Reinforcement Learning for GUI Grounding

other · 2026-04-25

A new reinforcement learning framework, Propose-then-Critic, co-evolves a proposer and a visual critic to improve GUI grounding, the task of mapping natural language instructions to precise pixel coordinates. The approach targets visually homogeneous elements and dense layouts by replacing static self-consistency strategies with a learnable selection mechanism: the critic evaluates candidate proposals rendered on the screenshot and selects the target. A maturity-aware adaptive co-evolutionary reinforcement learning scheme jointly optimizes both components to overcome the disparity between grounding and critiquing capabilities. The paper is available on arXiv as 2604.21268.
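To make the contrast between static self-consistency and a learnable selector concrete, here is a minimal Python sketch. It is not the paper's implementation: the function names, the grid-cell voting rule, and the toy distance-based critic are all assumptions introduced for illustration, and the toy critic only stands in for a trained visual critic that would score candidates drawn on the screenshot.

```python
from collections import Counter
from typing import Callable, Sequence, Tuple

Point = Tuple[int, int]  # (x, y) pixel coordinates proposed for a click target


def self_consistency_select(candidates: Sequence[Point], cell: int = 20) -> Point:
    """Static baseline (assumed form): vote by grid cell, return the first
    candidate that falls in the most populated cell."""
    bins = Counter((x // cell, y // cell) for x, y in candidates)
    best_bin, _ = bins.most_common(1)[0]
    return next(p for p in candidates if (p[0] // cell, p[1] // cell) == best_bin)


def critic_select(candidates: Sequence[Point],
                  critic_score: Callable[[Point], float]) -> Point:
    """Learnable selection (sketch): return the candidate the critic scores highest."""
    return max(candidates, key=critic_score)


def toy_critic(point: Point) -> float:
    """Hypothetical stand-in for a trained critic. A real critic would look at
    the screenshot with the candidate marked on it; this one simply rewards
    closeness to a known target centre."""
    target = (300, 50)
    return -((point[0] - target[0]) ** 2 + (point[1] - target[1]) ** 2) ** 0.5


# Three proposals cluster on a look-alike button; one lands on the real target.
candidates = [(100, 50), (104, 52), (102, 48), (300, 50)]

voted = self_consistency_select(candidates)
chosen = critic_select(candidates, toy_critic)
print("self-consistency picks:", voted)
print("critic picks:", chosen)
```

In this toy case the vote follows the majority cluster even though it sits on the wrong element, while the critic can recover the minority proposal. That is the kind of failure in dense, visually similar layouts that a learned selector is meant to address.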

Key facts

  • arXiv paper 2604.21268 proposes the Propose-then-Critic framework for GUI grounding.
  • Framework co-evolves a proposer and a visual critic via reinforcement learning.
  • Replaces static self-consistency strategies with a learnable selection mechanism.
  • Addresses visually homogeneous elements and dense layouts in GUI grounding.
  • Uses maturity-aware adaptive co-evolutionary reinforcement learning.
  • Critiques proposals rendered on screenshots to select the optimal target.
  • Overcomes disparity between grounding and critiquing capabilities.
  • Listed on arXiv as a cross-listing (announcement type: cross).

Entities

Institutions

  • arXiv
