Executable Agentic Memory Boosts GUI Agent Performance
Researchers have introduced Executable Agentic Memory (EAM), a structured knowledge graph (KG) that shifts GUI planning from unstructured generation to efficient retrieval and execution. Its sample-efficient memory construction pipeline uses state-aware depth-first search (DFS) and action-group mining. For planning, a lightweight Q-function model guides Monte Carlo Tree Search (MCTS) over the KG. The study establishes theoretical bias-consistency for the Q-model and derives sample complexity bounds for path recovery. EAM improves on UI-TARS-7B by up to 19.6% on AndroidWorld while cutting token costs by a factor of 6. The paper is available on arXiv (ID 2605.12294).
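The memory construction step can be pictured with a minimal sketch. It is not the paper's pipeline: the toy app `TOY_APP`, the function name `build_memory_graph`, and the reading of "state-aware" as "expand each distinct screen state only once" are all assumptions made for illustration, and action-group mining is not shown.

```python
from collections import defaultdict

# Hypothetical toy app: each screen state maps its available actions
# to the state that action leads to. Not taken from the paper.
TOY_APP = {
    "home": {"tap_settings": "settings", "tap_search": "search"},
    "settings": {"tap_wifi": "wifi", "back": "home"},
    "search": {"back": "home"},
    "wifi": {"back": "settings"},
}

def build_memory_graph(start, transitions):
    """Explore the app depth-first, expanding each distinct state once.

    Assumption: "state-aware" is modeled here as deduplicating by state,
    so revisiting a known screen costs no further exploration.
    """
    graph = defaultdict(dict)  # state -> {action: next_state}
    visited = set()
    stack = [start]
    while stack:
        state = stack.pop()
        if state in visited:
            continue  # already expanded this screen
        visited.add(state)
        for action, next_state in transitions[state].items():
            graph[state][action] = next_state
            if next_state not in visited:
                stack.append(next_state)
    return dict(graph)

memory = build_memory_graph("home", TOY_APP)
print(sorted(memory))
```

Storing edges as `state -> {action: next_state}` means a later planner can look up an action sequence in the graph rather than generate one from scratch, which is the retrieval-over-generation idea the summary describes.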
Key facts
- EAM is a structured Knowledge Graph for GUI planning.
- Memory construction uses state-aware DFS and action-group mining.
- Value-guided graph search employs MCTS with a lightweight Q-function model.
- Theoretical bias-consistency and sample complexity bounds are derived.
- EAM outperforms UI-TARS-7B by up to 19.6% on AndroidWorld.
- Token costs are reduced by 6×.
- Paper available on arXiv with ID 2605.12294.
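As a rough illustration of value-guided search over a knowledge graph, the sketch below runs a simplified MCTS-style loop in which a Q-function supplies the initial value of each untried edge. Everything here is an assumption for illustration: the graph `KG`, the hand-set scores in `q_value` (a stand-in for a learned Q-model), the UCB-style scoring rule, and the function name `plan`. It is not the search procedure from the paper.

```python
import math

# Hypothetical knowledge graph: state -> {action: next_state}.
KG = {
    "home": {"tap_settings": "settings", "tap_search": "search"},
    "settings": {"tap_wifi": "wifi", "back": "home"},
    "search": {"back": "home"},
    "wifi": {},
}

def q_value(state, action):
    """Stand-in for a learned Q-model; the scores are made up."""
    table = {("home", "tap_settings"): 0.8, ("settings", "tap_wifi"): 0.9}
    return table.get((state, action), 0.1)

def plan(kg, start, goal, simulations=50, c=1.0, max_depth=5):
    visits = {}  # (state, action) -> number of times the edge was taken
    value = {}   # (state, action) -> running mean of observed rewards

    def score(state, action, total):
        # Q-model value as the prior for untried edges, plus an exploration bonus.
        n = visits.get((state, action), 0)
        mean = value.get((state, action), q_value(state, action))
        return mean + c * math.sqrt(math.log(total + 1) / (n + 1))

    for _ in range(simulations):
        state, path, seen = start, [], {start}
        while state != goal and len(path) < max_depth:
            # Skip actions that lead back to a state already on this path.
            options = [a for a, nxt in kg[state].items() if nxt not in seen]
            if not options:
                break
            total = sum(visits.get((state, a), 0) for a in options)
            action = max(options, key=lambda a: score(state, a, total))
            path.append((state, action))
            state = kg[state][action]
            seen.add(state)
        reward = 1.0 if state == goal else 0.0
        for key in path:  # back up the outcome along the path taken
            n = visits.get(key, 0) + 1
            visits[key] = n
            old = value.get(key, q_value(*key))
            value[key] = old + (reward - old) / n

    # Read out the plan by following the most-visited edge from each state.
    state, actions = start, []
    while state != goal and kg[state] and len(actions) < max_depth:
        action = max(kg[state], key=lambda a: visits.get((state, a), 0))
        actions.append(action)
        state = kg[state][action]
    return actions

print(plan(KG, "home", "wifi"))
```

Because the search only walks edges already stored in the graph, each simulation is a cheap lookup rather than a model generation call; the Q-model's role in this sketch is limited to steering which edges get tried first.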
Entities
Institutions
- arXiv