Mango: Multi-Agent Web Navigation via Global-View Optimization
Summary
Purdue University researchers introduce Mango, a multi-agent web navigation method that improves efficiency on complex websites by choosing better starting points. Unlike traditional web agents, which begin from a root URL, Mango builds a global view of a website through lightweight crawling and keyword-based search to identify relevant candidate URLs. It then formulates URL selection as a multi-armed bandit (MAB) problem and uses Thompson Sampling to prioritize candidate URLs, adaptively allocating the navigation budget among them. An episodic memory component stores navigation history and reflections to prevent redundant exploration. Evaluated on the WebVoyager and WebWalkerQA benchmarks, Mango achieved a 63.6% success rate with GPT-5-mini on WebVoyager, outperforming the best baseline by 7.3%, and a 52.5% success rate on WebWalkerQA, surpassing the best baseline by 26.8%. The system generalizes across various LLM backbones, and its code is open source.
Key takeaway
Research scientists developing LLM-based web agents should consider integrating global website structure analysis and adaptive URL selection. Moving beyond root-URL-only exploration and applying techniques such as Thompson Sampling can significantly improve success rates and efficiency on complex web navigation tasks, even if the agent takes more actions overall because it solves harder, long-horizon problems. Explore open-source implementations like Mango to accelerate your agent development.
Key insights
Global website structure analysis and adaptive URL prioritization significantly improve web navigation efficiency for LLM agents.
Principles
- Prioritize relevant entry points over root URL exploration.
- Dynamically allocate navigation budget using Thompson Sampling.
- Utilize episodic memory to learn from past navigation attempts.
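As a rough illustration of the episodic memory principle, the sketch below stores one record per navigation attempt and exposes the failed URLs and recent reflections to later attempts. The class names, fields, and prompt format are assumptions made for this example and are not taken from the Mango codebase.

```python
from dataclasses import dataclass, field


@dataclass
class Episode:
    """One navigation attempt (hypothetical record layout)."""
    url: str
    success: bool
    reflection: str


@dataclass
class EpisodicMemory:
    """Append-only log of past attempts, queried before the next one."""
    episodes: list = field(default_factory=list)

    def record(self, url, success, reflection):
        # Store the outcome and a short natural-language reflection.
        self.episodes.append(Episode(url, success, reflection))

    def dead_ends(self):
        # URLs that have failed and never succeeded.
        succeeded = {e.url for e in self.episodes if e.success}
        failed = {e.url for e in self.episodes if not e.success}
        return failed - succeeded

    def as_prompt_context(self, limit=5):
        # Render the most recent episodes as text an agent prompt could include.
        lines = []
        for e in self.episodes[-limit:]:
            status = "success" if e.success else "failed"
            lines.append(f"- {e.url}: {status}; {e.reflection}")
        return "\n".join(lines)
```

A navigation loop could call `dead_ends()` to skip start URLs that have already failed and pass `as_prompt_context()` to the agent so it does not repeat earlier mistakes.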
Method
Mango constructs a global website view via crawling and search, models URL selection as a Multi-Armed Bandit problem with Thompson Sampling, and uses a reflection agent with episodic memory to update URL probabilities and avoid dead ends.
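The bandit formulation can be sketched with a Beta-Bernoulli Thompson Sampler, a minimal version of the general technique rather than Mango's actual implementation. It assumes each candidate URL is an arm, each navigation attempt yields a binary success signal, and an initial relevance score in [0, 1] seeds the prior. The class name `UrlBandit` and the prior scheme are hypothetical.

```python
import random


class UrlBandit:
    """Beta-Bernoulli Thompson Sampling over candidate start URLs (illustrative)."""

    def __init__(self, priors, seed=None):
        # priors: {url: relevance score in [0, 1]} (hypothetical input format).
        self.rng = random.Random(seed)
        # Assumed prior: Beta(1 + s, 2 - s), so more relevant URLs start more optimistic.
        self.params = {url: [1.0 + s, 2.0 - s] for url, s in priors.items()}

    def select(self):
        # Draw one sample per arm from its posterior and pick the largest draw.
        samples = {
            url: self.rng.betavariate(a, b) for url, (a, b) in self.params.items()
        }
        return max(samples, key=samples.get)

    def update(self, url, success):
        # A success increments alpha; a failure increments beta.
        if success:
            self.params[url][0] += 1.0
        else:
            self.params[url][1] += 1.0

    def mean(self, url):
        # Posterior mean success probability for this URL.
        a, b = self.params[url]
        return a / (a + b)
```

Each round, the agent would call `select()` to choose a start URL, spend part of its navigation budget there, and then call `update()` with the outcome. URLs that keep failing are sampled less often, which is how the budget shifts toward promising entry points.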
In practice
- Implement lightweight web crawling for initial URL discovery.
- Apply BM25 for initial URL relevance scoring.
- Integrate reflection agents for dynamic feedback and learning.
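To make the BM25 step concrete, here is a small standard-library sketch of Okapi BM25 that ranks crawled pages against a task query. The tokenizer, the `k1` and `b` defaults, the non-negative IDF variant, and the `{url: text}` input format are assumptions chosen for illustration. The source does not specify how Mango configures BM25.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase alphanumeric tokens; a deliberately simple tokenizer.
    return re.findall(r"[a-z0-9]+", text.lower())


def bm25_rank(query, docs, k1=1.5, b=0.75):
    """Rank docs ({url: text}) against query; returns [(url, score)], best first."""
    tokens = {url: tokenize(text) for url, text in docs.items()}
    n_docs = len(tokens)
    avg_len = sum(len(t) for t in tokens.values()) / n_docs if n_docs else 0.0

    # Document frequency: number of docs containing each term.
    df = Counter()
    for t in tokens.values():
        df.update(set(t))

    query_terms = set(tokenize(query))
    scores = {}
    for url, t in tokens.items():
        tf = Counter(t)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            # Non-negative IDF variant: ln((N - n + 0.5) / (n + 0.5) + 1).
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            # Term-frequency saturation with document-length normalization.
            norm = tf[term] + k1 * (1.0 - b + b * len(t) / avg_len)
            score += idf * tf[term] * (k1 + 1.0) / norm
        scores[url] = score
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

The resulting scores could be used to shortlist candidate URLs before any agent navigates, for example by keeping only the top few entries. In practice the page text would come from the lightweight crawl, such as titles, URL paths, or anchor text.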
Topics
- Multi-Agent Web Navigation
- Global-View Optimization
- Thompson Sampling
- Multi-Armed Bandit Problem
- Episodic Memory
Best for: Research Scientist, AI Scientist, Machine Learning Engineer, AI Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by cs.CL updates on arXiv.org.