Mango: Multi-Agent Web Navigation via Global-View Optimization

2026-04-22 · Source: cs.CL updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Robotics & Autonomous Systems · Depth: Expert, extended

Summary

Purdue University researchers introduce Mango, a multi-agent web navigation method that improves efficiency on complex websites by choosing better starting points for navigation. Unlike traditional web agents that begin from a root URL, Mango builds a global view of a website through lightweight crawling and keyword-based search to identify relevant candidate URLs. It then formulates URL selection as a multi-armed bandit (MAB) problem and uses Thompson Sampling to prioritize candidate URLs dynamically, adaptively allocating the navigation budget among them. An episodic memory component stores navigation history and reflections to prevent redundant exploration. Evaluated on the WebVoyager and WebWalkerQA benchmarks, Mango achieved a 63.6% success rate with GPT-5-mini on WebVoyager, outperforming the best baseline by 7.3%, and a 52.5% success rate on WebWalkerQA, surpassing the best baseline by 26.8%. The system generalizes across various LLM backbones, and its code is open source.

Key takeaway

If you develop LLM-based web agents, consider integrating global website structure analysis and adaptive URL selection mechanisms. By moving beyond root-URL-only exploration and employing techniques like Thompson Sampling, you can improve success rates and efficiency on complex web navigation tasks, although solving harder, long-horizon problems may come with a higher action count. Explore open-source implementations like Mango to accelerate your agent development.

Key insights

Global website structure analysis and adaptive URL prioritization significantly improve web navigation efficiency for LLM agents.

Principles

Prioritize relevant entry points over root URL exploration.
Dynamically allocate navigation budget using Thompson Sampling.
Utilize episodic memory to learn from past navigation attempts.

Method

Mango constructs a global website view via crawling and search, models URL selection as a Multi-Armed Bandit problem with Thompson Sampling, and uses a reflection agent with episodic memory to update URL probabilities and avoid dead ends.
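
The bandit step described above can be illustrated with a minimal sketch. It assumes a Beta-Bernoulli form of Thompson Sampling, with each candidate URL as an arm whose prior is seeded from a relevance score in (0, 1) and whose posterior is updated from a reward in [0, 1] supplied by a reflection step. The summary does not specify Mango's actual posterior, reward definition, or parameters, so the class name UrlBandit, the prior_strength parameter, and the reward scale are all hypothetical.

```python
import random


class UrlBandit:
    """Illustrative Beta-Bernoulli Thompson Sampling over candidate start URLs.

    Hypothetical sketch: not Mango's implementation.
    """

    def __init__(self, priors, prior_strength=2.0, seed=None):
        # priors maps each URL to an assumed relevance score in (0, 1),
        # for example a normalized retrieval score.
        self.rng = random.Random(seed)
        self.alpha = {}
        self.beta = {}
        for url, p in priors.items():
            p = min(max(p, 0.01), 0.99)  # keep both Beta parameters positive
            self.alpha[url] = 1.0 + prior_strength * p
            self.beta[url] = 1.0 + prior_strength * (1.0 - p)

    def select(self):
        # Draw one sample per arm from its posterior and pick the largest.
        samples = {
            url: self.rng.betavariate(self.alpha[url], self.beta[url])
            for url in self.alpha
        }
        return max(samples, key=samples.get)

    def update(self, url, reward):
        # reward in [0, 1]: 1.0 means the attempt looked promising,
        # 0.0 means a dead end (assumed to come from a reflection step).
        self.alpha[url] += reward
        self.beta[url] += 1.0 - reward

    def mean(self, url):
        # Posterior mean estimate of how useful this URL is as a start point.
        return self.alpha[url] / (self.alpha[url] + self.beta[url])
```

In a navigation loop, each unit of budget would call select(), run a navigation attempt from the returned URL, and pass the resulting reward to update(). URLs that repeatedly lead to dead ends are then sampled less often, while uncertain URLs still get occasional attempts.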

In practice

Implement lightweight web crawling for initial URL discovery.
Apply BM25 for initial URL relevance scoring.
Integrate reflection agents for dynamic feedback and learning.
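
The BM25 scoring step in the list above can be sketched in plain Python. This is a standard Okapi BM25 formula with commonly used defaults (k1 = 1.5, b = 0.75). What text Mango actually indexes for each URL is not stated in this summary, so the choice to score a short text per crawled page, the tokenizer, and the function names are assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase alphanumeric tokens; a deliberately simple, assumed tokenizer.
    return re.findall(r"[a-z0-9]+", text.lower())


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Return one Okapi BM25 score per document for the given query."""
    tokenized = [tokenize(doc) for doc in docs]
    n_docs = len(tokenized)
    if n_docs == 0:
        return []
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs or 1.0

    # Document frequency: number of documents containing each term.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    query_terms = set(tokenize(query))
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            idf = math.log(
                1.0 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5)
            )
            # Term-frequency saturation with document-length normalization.
            norm = tf[term] + k1 * (1.0 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1.0) / norm
        scores.append(score)
    return scores
```

Given a mapping from crawled URLs to page text, the scores can be sorted to shortlist candidate start URLs, and a normalized version of them could serve as the initial preference over URLs before any navigation feedback arrives.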

Topics

Multi-Agent Web Navigation
Global-View Optimization
Thompson Sampling
Multi-Armed Bandit Problem
Episodic Memory

Best for: Research Scientist, AI Scientist, Machine Learning Engineer, AI Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.CL updates on arXiv.org.