VLAA-GUI: Modular Framework for GUI Automation with Stop, Recover, and Search
VLAA-GUI is a modular framework for autonomous GUI agents that targets two failure modes: early stopping, where the agent declares a task complete prematurely, and repetitive loops, where it keeps repeating actions that do not make progress. It has three components. The Completeness Verifier enforces UI-observable success criteria and checks each completion claim against decision rules, rejecting any claim that lacks direct visual evidence. The Loop Breaker applies multi-tier filtering: it switches the interaction mode after repeated failures, forces a strategy change when the same screen state keeps recurring, and ties reflection signals to those strategy changes. The Search Agent searches online for unfamiliar workflows when needed. Together, these components aim to make GUI automation agents more reliable and efficient.
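The tiered behaviour described for the Loop Breaker can be illustrated with a minimal Python sketch. This is not the framework's implementation: the class name `LoopBreaker`, the `Advice` values, the `observe` signature, and both thresholds are hypothetical, since the summary gives no concrete values or interfaces.

```python
from collections import Counter
from dataclasses import dataclass, field
from enum import Enum


class Advice(Enum):
    CONTINUE = "continue"
    SWITCH_MODE = "switch_interaction_mode"
    CHANGE_STRATEGY = "change_strategy"


@dataclass
class LoopBreaker:
    # Hypothetical thresholds; the source does not state concrete numbers.
    max_action_failures: int = 2
    max_screen_repeats: int = 3
    action_failures: Counter = field(default_factory=Counter)
    screen_counts: Counter = field(default_factory=Counter)

    def observe(self, action: str, succeeded: bool, screen_hash: str,
                reflection_flags_stuck: bool = False) -> Advice:
        """Record one agent step and return advice for the next one."""
        if not succeeded:
            self.action_failures[action] += 1
        self.screen_counts[screen_hash] += 1

        # A reflection signal that the agent is stuck maps directly
        # to a strategy change.
        if reflection_flags_stuck:
            return Advice.CHANGE_STRATEGY
        # The same screen state recurring suggests no progress is being made.
        if self.screen_counts[screen_hash] >= self.max_screen_repeats:
            return Advice.CHANGE_STRATEGY
        # The same action failing repeatedly: try another interaction mode,
        # for example a keyboard shortcut instead of a click.
        if self.action_failures[action] >= self.max_action_failures:
            return Advice.SWITCH_MODE
        return Advice.CONTINUE


breaker = LoopBreaker()
a1 = breaker.observe("click:save", succeeded=False, screen_hash="s1")
a2 = breaker.observe("click:save", succeeded=False, screen_hash="s2")
a3 = breaker.observe("hotkey:ctrl+s", succeeded=True, screen_hash="s1")
a4 = breaker.observe("hotkey:ctrl+s", succeeded=True, screen_hash="s1")
```

In this sketch the second failed click triggers a mode switch, and the third sighting of screen `s1` triggers a strategy change. Checking the stronger signals first is a design choice of the sketch, not something the source specifies.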
Key facts
- VLAA-GUI is a modular framework for GUI automation
- Addresses early stopping and repetitive loops
- Includes Completeness Verifier, Loop Breaker, and Search Agent
- Completeness Verifier enforces UI-observable success criteria
- Loop Breaker applies multi-tier filtering: mode switches after repeated failures, strategy changes after recurring screen states
- Search Agent searches online for unfamiliar workflows
- Framework aims to improve reliability and efficiency
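The Completeness Verifier's rule, accepting a completion claim only when every success criterion is directly observable in the UI, might be sketched as follows. Everything here is an assumption made for illustration: the names `Criterion` and `verify_completion` are hypothetical, and plain substring matching on visible text stands in for whatever visual evidence check the framework actually uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Criterion:
    description: str    # human-readable success condition
    expected_text: str  # text that must be visible on screen (assumed proxy)


def verify_completion(criteria, visible_texts):
    """Return (accepted, missing) for a completion claim.

    The claim is accepted only if there is at least one criterion and
    every criterion's expected text appears in the visible UI text.
    """
    visible = [t.lower() for t in visible_texts]
    missing = [
        c.description
        for c in criteria
        if not any(c.expected_text.lower() in t for t in visible)
    ]
    accepted = bool(criteria) and not missing
    return accepted, missing


criteria = [
    Criterion("file saved under the new name", "report_final.docx"),
    Criterion("export confirmation shown", "Export complete"),
]

# Only one criterion is visible, so the claim is rejected.
rejected = verify_completion(criteria, ["report_final.docx - Writer"])

# Both criteria are visible, so the claim is accepted.
accepted = verify_completion(
    criteria, ["report_final.docx - Writer", "Export complete"]
)
```

An empty criteria list is rejected on purpose in this sketch, so that a claim with nothing to check cannot pass by default.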