IntentScore: AI Reward Model Boosts Computer-Use Agents by 6.9%

ai-technology · 2026-05-25

Researchers have introduced IntentScore, a reward model that scores the quality of candidate actions for Computer-Use Agents (CUAs). These agents use large language models to carry out GUI tasks in desktop environments, but they often make mistakes that cannot be undone. IntentScore is trained on 398,000 offline GUI interaction steps spanning three operating systems, with two training objectives: contrastive alignment and margin ranking. By embedding planning intent in the action encoder, it reaches 97.5% pairwise discrimination accuracy on a held-out evaluation set. Deployed as a re-ranker for Agent S3 on OSWorld, an environment not seen during training, IntentScore improved the task success rate by 6.9%. The work is described in arXiv:2604.05157.
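To make the re-ranking step concrete, here is a minimal sketch of how a reward model could be used to pick one action from several candidates proposed by an agent. It is not the paper's implementation: `rerank`, `toy_score`, and the action strings are hypothetical names chosen for illustration, and the toy scorer (word overlap with the planning intent) merely stands in for a learned model.

```python
from typing import Callable, Sequence


def rerank(
    intent: str,
    candidates: Sequence[str],
    score_fn: Callable[[str, str], float],
) -> str:
    """Return the candidate action that the scorer rates highest for the intent."""
    if not candidates:
        raise ValueError("need at least one candidate action")
    return max(candidates, key=lambda action: score_fn(intent, action))


def toy_score(intent: str, action: str) -> float:
    """Hypothetical stand-in for a learned reward model.

    Counts how many words of the planning intent appear in the action string.
    A real reward model would also condition on the screen state.
    """
    return float(sum(word in action.lower() for word in intent.lower().split()))


# Hypothetical planning intent and candidate actions sampled from an agent.
intent = "open the settings menu"
candidates = [
    "click(close_button)",
    "click(settings_icon)",
    "type('hello')",
]

chosen = rerank(intent, candidates, toy_score)
print(chosen)
```

The agent itself is unchanged in this setup: it still proposes the candidates, and the scorer only decides which one gets executed.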

Key facts

IntentScore is a plan-aware reward model for Computer-Use Agents
Trained on 398K offline GUI interaction steps across three operating systems
Uses contrastive alignment and margin ranking objectives
Achieves 97.5% pairwise discrimination accuracy
Deployed as re-ranker for Agent S3 on OSWorld
Improves task success rate by 6.9%
Described in arXiv:2604.05157
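
The margin ranking objective and the pairwise discrimination metric listed above can be illustrated with a short sketch. It assumes the standard hinge-style formulation of margin ranking loss, which may differ from the exact loss in the paper; the function names, the margin value, and the scores are made up for illustration.

```python
from typing import Sequence


def margin_ranking_loss(
    pos_scores: Sequence[float],
    neg_scores: Sequence[float],
    margin: float = 0.5,
) -> float:
    """Mean hinge loss: penalize pairs where the preferred action does not
    outscore the rejected one by at least `margin`."""
    if not pos_scores or len(pos_scores) != len(neg_scores):
        raise ValueError("score lists must be non-empty and of equal length")
    losses = [max(0.0, margin - (p - n)) for p, n in zip(pos_scores, neg_scores)]
    return sum(losses) / len(losses)


def pairwise_accuracy(
    pos_scores: Sequence[float],
    neg_scores: Sequence[float],
) -> float:
    """Fraction of pairs in which the preferred action gets the higher score."""
    if not pos_scores or len(pos_scores) != len(neg_scores):
        raise ValueError("score lists must be non-empty and of equal length")
    correct = sum(p > n for p, n in zip(pos_scores, neg_scores))
    return correct / len(pos_scores)


# Made-up scores for three (preferred, rejected) action pairs.
pos = [0.9, 0.4, 0.7]
neg = [0.2, 0.5, 0.1]

loss = margin_ranking_loss(pos, neg, margin=0.5)
accuracy = pairwise_accuracy(pos, neg)
print(f"loss={loss:.3f} accuracy={accuracy:.3f}")
```

In this toy data the second pair is ranked the wrong way round, so it is the only pair that contributes to the loss and the only one counted as an error in the accuracy.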
