ARTFEED — Contemporary Art Intelligence

ToolCUA: Optimal GUI-Tool Path Orchestration for Computer Use Agents

ai-technology · 2026-05-13

A recent publication on arXiv presents ToolCUA, an end-to-end agent that learns optimal GUI-Tool path selection for Computer Use Agents (CUAs). CUAs combine atomic GUI actions (such as clicking and typing) with high-level tool calls (such as API-driven file operations). With this hybrid action space, however, they often struggle to decide whether to continue with GUI actions or switch to tools, which leads to inefficient execution paths. The problem has three sources: a scarcity of high-quality interleaved GUI-Tool trajectories, the difficulty and brittleness of collecting real tool trajectories, and a lack of trajectory-level guidance for path selection. ToolCUA's Interleaved GUI-Tool Trajectory Scaling Pipeline repurposes abundant static GUI trajectories and synthesizes a grounded tool library, producing diverse GUI-Tool trajectories without manual engineering or real tool-trajectory collection. The paper is available on arXiv under the identifier 2605.12481.
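To make the hybrid action space concrete, here is a minimal Python sketch of two ways to complete the same task: one using only GUI actions, one using a single tool call. The class names (GuiAction, ToolCall), the save_file tool, and the step-count comparison are all illustrative assumptions; none of them come from the ToolCUA paper, and real path selection would weigh far more than path length.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass(frozen=True)
class GuiAction:
    """An atomic GUI step such as a click or a keystroke sequence (hypothetical type)."""
    kind: str    # e.g. "click" or "type"
    target: str  # UI element or text payload


@dataclass(frozen=True)
class ToolCall:
    """A high-level tool invocation, e.g. an API-driven file operation (hypothetical type)."""
    name: str
    args: Tuple[Tuple[str, str], ...]


Step = Union[GuiAction, ToolCall]


def count_tool_calls(path: List[Step]) -> int:
    """Count how many steps in a path are tool calls rather than GUI actions."""
    return sum(isinstance(step, ToolCall) for step in path)


def shorter_path(a: List[Step], b: List[Step]) -> List[Step]:
    """Return the path with fewer steps; ties go to the first argument.

    Step count is a crude stand-in for execution efficiency, used here
    only to illustrate why the choice of path matters.
    """
    return a if len(a) <= len(b) else b


# Saving a file through the GUI takes four atomic actions...
gui_only: List[Step] = [
    GuiAction("click", "File menu"),
    GuiAction("click", "Save As"),
    GuiAction("type", "report.pdf"),
    GuiAction("click", "Save"),
]

# ...while an assumed save_file tool does the same in one call.
with_tool: List[Step] = [ToolCall("save_file", (("path", "report.pdf"),))]

best = shorter_path(gui_only, with_tool)
print(len(gui_only), len(with_tool), count_tool_calls(best))
```

An agent that always stays in the GUI would take the four-step path here, while one that always reaches for tools may fail when no suitable tool exists; deciding between them per task is the path-selection problem the article describes.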

Key facts

  • ToolCUA is an end-to-end Computer Use Agent (CUA).
  • It learns optimal GUI-Tool path selection.
  • CUAs use both atomic GUI actions and high-level tool calls.
  • With this hybrid action space, agents struggle to decide when to stay with GUI actions and when to switch to tools.
  • Scarcity of high-quality interleaved trajectories is a challenge.
  • The Interleaved GUI-Tool Trajectory Scaling Pipeline repurposes static GUI trajectories.
  • The pipeline also synthesizes a grounded tool library.
  • The paper is on arXiv with ID 2605.12481.
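One way to picture how a static GUI trajectory could be repurposed into an interleaved one is to replace a recognizable run of GUI steps with a single tool call. The Python sketch below does exactly that and nothing more. It is an assumption-laden illustration, not the paper's pipeline: the ToolSpec type, the string encoding of steps, and the exact-match rule are all hypothetical, and this summary does not describe how ToolCUA actually builds or grounds its tool library.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class ToolSpec:
    """A hypothetical tool-library entry: a tool name plus the GUI steps it stands in for."""
    name: str
    pattern: Tuple[str, ...]


def interleave(trajectory: Sequence[str], tool: ToolSpec) -> List[str]:
    """Replace every exact occurrence of tool.pattern in the trajectory with one tool step.

    Steps are plain strings such as "click:Save". Steps that do not match
    the pattern are copied through unchanged, so the result mixes GUI
    steps and tool steps.
    """
    n = len(tool.pattern)
    if n == 0:
        return list(trajectory)  # an empty pattern would match everywhere; skip it

    out: List[str] = []
    i = 0
    while i < len(trajectory):
        if tuple(trajectory[i:i + n]) == tool.pattern:
            out.append(f"tool:{tool.name}")
            i += n  # jump past the GUI steps the tool replaces
        else:
            out.append(trajectory[i])
            i += 1
    return out


# A static, GUI-only trajectory (made-up example data).
static_trajectory = [
    "click:File menu",
    "click:Save As",
    "type:report.pdf",
    "click:Save",
    "click:Close",
]

# Assumed library entry: save_file covers the four-step "Save As" routine.
save_file = ToolSpec(
    name="save_file",
    pattern=("click:File menu", "click:Save As", "type:report.pdf", "click:Save"),
)

mixed = interleave(static_trajectory, save_file)
print(mixed)
```

The appeal of this kind of rewriting is that it needs no live tool execution: the GUI trajectory already exists, and only the substitution is new. A real pipeline would have to check that the tool reproduces the effect of the GUI steps it replaces, which this sketch does not attempt.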

Entities

Institutions

  • arXiv

Sources