OpenWebRL: Demystifying Online Multi-turn Reinforcement Learning for Visual Web Agents
Summary
OpenWebRL is an open framework for training visual web agents with online multi-turn reinforcement learning directly on real websites, addressing the scalability bottleneck of static datasets. The framework provides a full training pipeline: live-browser infrastructure, supervised initialization, multimodal context management, and efficient multi-turn policy optimization. OpenWebRL-4B, trained with this pipeline, achieves 67.0% success on Online-Mind2Web and 64.0% on DeepShop using only 0.4K initialization trajectories and 2.2K RL training tasks. This sets a new bar for open-source agents and is competitive with proprietary systems such as OpenAI CUA and Gemini CUA. The work also systematically studies the key design choices that make online RL effective.
Key takeaway
For ML engineers developing visual web agents, OpenWebRL offers a practical open-source path around the data scalability problem. Consider adopting its online multi-turn RL framework to train agents directly on live websites, which reduces reliance on expensive curated datasets. Agents trained this way, such as OpenWebRL-4B, reach 67.0% success on Online-Mind2Web, which is competitive with proprietary systems at a much smaller data budget.
Key insights
OpenWebRL enables training visual web agents with online multi-turn RL on live websites, overcoming static dataset limitations.
Principles
- Online RL scales web agent training
- Multi-turn policy optimization is crucial
- Systematic design choices improve reasoning
Method
OpenWebRL provides a full training pipeline: live-browser infrastructure, supervised initialization, multimodal context management, trajectory-level success judging, and multi-turn policy optimization.
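The following is a minimal sketch of how three of these pieces could fit together in one episode: context management, a multi-turn rollout, and a trajectory-level judge. It is not OpenWebRL's code. Every name here (`Turn`, `Trajectory`, `build_context`, `rollout`, `per_turn_rewards`, and the toy policy, environment, and judge) is hypothetical, screenshots are replaced by strings, and the rule of keeping only the most recent screenshots and copying one trajectory-level reward to every turn is an assumption made for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterator, List


@dataclass
class Turn:
    observation: str  # stand-in for a browser screenshot
    action: str


@dataclass
class Trajectory:
    task: str
    turns: List[Turn] = field(default_factory=list)
    reward: float = 0.0


def build_context(task: str, turns: List[Turn], max_screenshots: int) -> List[str]:
    """Keep every past action but only the most recent past screenshots (assumed rule)."""
    context = [f"TASK: {task}"]
    first_kept = max(0, len(turns) - max_screenshots)
    for i, turn in enumerate(turns):
        if i >= first_kept:
            context.append(f"OBS: {turn.observation}")
        context.append(f"ACT: {turn.action}")
    return context


def rollout(
    task: str,
    policy: Callable[[List[str]], str],
    env_step: Callable[[str], str],
    judge: Callable[[Trajectory], float],
    first_obs: str,
    max_turns: int = 5,
    max_screenshots: int = 2,
) -> Trajectory:
    """Run one multi-turn episode, then score the whole trajectory once."""
    traj = Trajectory(task=task)
    obs = first_obs
    for _ in range(max_turns):
        # The current observation is always appended after the trimmed history.
        context = build_context(task, traj.turns, max_screenshots) + [f"OBS: {obs}"]
        action = policy(context)
        traj.turns.append(Turn(observation=obs, action=action))
        if action == "stop":
            break
        obs = env_step(action)
    traj.reward = judge(traj)  # trajectory-level success signal
    return traj


def per_turn_rewards(traj: Trajectory) -> List[float]:
    """Copy the single trajectory-level reward to every turn (assumed credit rule)."""
    return [traj.reward] * len(traj.turns)


# Toy stand-ins so the sketch runs without a browser or a model.
_pages: Iterator[str] = iter(["page-1", "page-2"])


def toy_policy(context: List[str]) -> str:
    return "stop" if context[-1] == "OBS: page-2" else "click"


def toy_env_step(action: str) -> str:
    return next(_pages)


def toy_judge(traj: Trajectory) -> float:
    return 1.0 if traj.turns[-1].observation == "page-2" else 0.0


traj = rollout("find item", toy_policy, toy_env_step, toy_judge, first_obs="page-0")
print(len(traj.turns), traj.reward, per_turn_rewards(traj))
```

In a real setup the string observations would be screenshots from a live browser, the policy would be the vision-language model being trained, and the judge would decide task success from the full trajectory. The per-turn rewards would then feed whichever policy-gradient update the training recipe uses.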
In practice
- Train agents directly on live websites
- Use 0.4K initialization trajectories and 2.2K RL training tasks
- Achieve competitive benchmark performance
Topics
- Reinforcement Learning
- Web Agents
- Visual Agents
- Online Learning
- Multi-turn RL
- OpenWebRL
- Live Websites
Best for: Research Scientist, AI Scientist, Machine Learning Engineer, AI Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by Artificial Intelligence.