Online Reinforcement Learning Agent for Web Navigation
A new research paper introduces OpAgent, an autonomous web agent trained with online reinforcement learning to navigate real-world websites. Conventional methods rely on static datasets and supervised fine-tuning; their offline trajectories fail to capture stochastic state transitions and real-time feedback, which causes distributional shift. OpAgent instead learns through direct, iterative interaction with unconstrained web environments. The approach also features hierarchical multi-task fine-tuning on datasets categorized by functional primitive (Planning, Acting, and Grounding), which establishes a Vision-Language Model with strong instruction-following capabilities.
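To make the idea of learning from live interaction concrete, here is a minimal sketch on a toy problem. It is not OpAgent's training code: the three-page `ToyWebEnv`, its action names, the 20% "click has no effect" noise, and all hyperparameters are assumptions made up for illustration, and tabular Q-learning is swapped in only because it fits in a few lines (this summary does not say which online RL algorithm the paper uses). The point it shows is that the agent updates from transitions it experiences itself, including the stochastic ones a fixed offline trajectory would not contain.

```python
import random

ACTIONS = ["back", "search", "open_item"]  # hypothetical click actions
STATES = ["home", "results"]               # hypothetical non-terminal pages


class ToyWebEnv:
    """Three-page toy site with stochastic transitions (illustrative only)."""

    def __init__(self, rng):
        self.rng = rng
        self.state = "home"

    def reset(self):
        self.state = "home"
        return self.state

    def step(self, action):
        """Apply one click and return (next_state, reward, done)."""
        # Assumed noise: 20% of clicks have no effect (e.g. a slow page load).
        if self.rng.random() < 0.2:
            return self.state, 0.0, False
        if self.state == "home" and action == "search":
            self.state = "results"
        elif self.state == "results" and action == "back":
            self.state = "home"
        elif self.state == "results" and action == "open_item":
            self.state = "item"
            return self.state, 1.0, True  # task solved, episode ends
        return self.state, 0.0, False


def greedy_action(q, state):
    """Name of the highest-valued action in `state` (first one wins ties)."""
    values = q[state]
    return ACTIONS[values.index(max(values))]


def train(episodes=2000, alpha=0.1, gamma=0.9, epsilon=0.3, max_steps=20, seed=0):
    """Tabular Q-learning driven entirely by the agent's own interactions."""
    rng = random.Random(seed)
    env = ToyWebEnv(rng)
    q = {s: [0.0] * len(ACTIONS) for s in STATES}
    for _ in range(episodes):
        state = env.reset()
        for _ in range(max_steps):
            # Epsilon-greedy: mostly exploit current estimates, sometimes explore.
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = greedy_action(q, state)
            next_state, reward, done = env.step(action)
            # Update immediately from the transition that was just experienced.
            future = 0.0 if done else max(q[next_state])
            a = ACTIONS.index(action)
            q[state][a] += alpha * (reward + gamma * future - q[state][a])
            if done:
                break
            state = next_state
    return q
```

Calling `q = train()` and then `greedy_action(q, "home")` and `greedy_action(q, "results")` shows which click the learned table prefers on each page. A real web agent would replace the table with a Vision-Language Model policy and the toy site with a browser, but the interact-then-update loop has the same shape.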
Key facts
- Paper title: OpAgent: Operator Agent for Web Navigation
- arXiv ID: 2602.13559v2
- Proposes online reinforcement learning for web agents
- Uses hierarchical multi-task fine-tuning
- Datasets categorized by functional primitives: Planning, Acting, Grounding
- Built on a Vision-Language Model (VLM)
- Addresses the distributional shift introduced by offline training methods
- Learns through direct iterative interactions with websites
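The grouping of training data by functional primitive can be sketched as follows. This is an assumption-laden illustration, not the paper's pipeline: the `Sample` schema, the sampling weights, and the helper names `group_by_primitive` and `mixed_batches` are hypothetical, and the summary does not specify how the hierarchy or staging of the fine-tuning is arranged. The sketch only shows one plausible way to tag examples as Planning, Acting, or Grounding and draw mixed multi-task batches from them.

```python
import random
from collections import defaultdict
from dataclasses import dataclass

# The three functional primitives named in the summary.
PRIMITIVES = ("planning", "acting", "grounding")


@dataclass(frozen=True)
class Sample:
    """Hypothetical training example; the real schema is not given here."""
    primitive: str  # one of PRIMITIVES
    prompt: str
    target: str


def group_by_primitive(samples):
    """Bucket samples by primitive, rejecting unknown labels."""
    groups = defaultdict(list)
    for s in samples:
        if s.primitive not in PRIMITIVES:
            raise ValueError(f"unknown primitive: {s.primitive!r}")
        groups[s.primitive].append(s)
    return dict(groups)


def mixed_batches(groups, weights, batch_size, num_batches, seed=0):
    """Yield batches that mix primitives according to assumed weights."""
    rng = random.Random(seed)
    names = [n for n in PRIMITIVES if groups.get(n)]
    if not names:
        raise ValueError("no samples to draw from")
    w = [weights[n] for n in names]
    for _ in range(num_batches):
        # Pick a primitive for each slot, then a random sample from its bucket.
        chosen = rng.choices(names, weights=w, k=batch_size)
        yield [rng.choice(groups[n]) for n in chosen]
```

For example, passing weights such as `{"planning": 1.0, "acting": 2.0, "grounding": 1.0}` would draw Acting examples about twice as often as the others; those numbers are placeholders, not values from the paper.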
Entities
Institutions
- arXiv