ExpSeek: Self-Triggered Experience Seeking for Web Agents

Source: cs.AI updates on arXiv.org · Field: Technology & Digital (Artificial Intelligence & Machine Learning, Robotics & Autonomous Systems) · Depth: Expert, extended

Summary

ExpSeek is a framework that improves web agents by letting them seek experience proactively at the step level, rather than receiving it passively as global context injected up front. It decides when to intervene by estimating step-level entropy thresholds from the model's intrinsic signals, and it generates experience content tailored to each step. Experiments with Qwen3-8B and Qwen3-32B on four web agent benchmarks (GAIA, WebWalkerQA, xbench-DeepSearch, Seal-Hard) show absolute improvements of 9.3% and 7.5%, respectively. The framework uses a small 4B experience model to boost the larger agent models, and the results validate entropy as an effective self-triggering signal. ExpSeek also generalizes well across tasks, outperforming traditional passive injection methods by 6.7% and 6.0% and raising pass@3 by 12.9% and 8.8%.

Key takeaway

For research scientists developing web agents, ExpSeek offers an alternative to traditional passive experience injection. Consider integrating step-level, entropy-triggered guidance to improve agent reliability and performance in dynamic, noisy web environments. This approach lets agents adapt to changing contexts and can improve accuracy even when the experience model is smaller and more efficient than the agent it guides.

Key insights

ExpSeek uses step-level entropy to proactively trigger and generate tailored experience guidance for web agents, significantly improving performance.
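
To make the trigger idea concrete, here is a minimal sketch of an entropy-based check for a single agent step. The summary only states that ExpSeek uses step-level entropy from the model's intrinsic signals. Everything else is an assumption made for illustration: the mean-over-tokens aggregation, the function names (`token_entropy`, `step_entropy`, `should_seek_experience`), the toy probability lists, and the threshold value of 0.8.

```python
import math
from typing import List, Sequence


def token_entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of one next-token probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def step_entropy(token_dists: List[Sequence[float]]) -> float:
    """Average token entropy over one agent step.

    Assumption: a step's entropy is the mean of its per-token entropies.
    The source does not specify the aggregation.
    """
    if not token_dists:
        return 0.0
    return sum(token_entropy(d) for d in token_dists) / len(token_dists)


def should_seek_experience(token_dists: List[Sequence[float]], threshold: float) -> bool:
    """Trigger experience seeking when the step looks uncertain."""
    return step_entropy(token_dists) > threshold


# Toy distributions (hypothetical): one confident step, one uncertain step.
confident_step = [[0.97, 0.01, 0.01, 0.01], [0.94, 0.02, 0.02, 0.02]]
uncertain_step = [[0.25, 0.25, 0.25, 0.25], [0.40, 0.30, 0.20, 0.10]]

THRESHOLD = 0.8  # arbitrary value for this sketch, not taken from the paper

print(should_seek_experience(confident_step, THRESHOLD))
print(should_seek_experience(uncertain_step, THRESHOLD))
```

In a real agent loop the distributions would come from the model's own output probabilities at each step, and the threshold would be estimated from data rather than set by hand.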

Principles

Method

ExpSeek estimates step-level entropy thresholds via logistic regression and bootstrap resampling to determine intervention timing. It constructs an experience base of error/analysis/guidance triplets, then uses an experience model to generate contextualized guidance dynamically.
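
The threshold estimation described above can be sketched as follows. This is an illustrative reading, not the authors' implementation. It assumes each training step has an entropy value and a binary label (1 if the step was erroneous), fits a one-feature logistic regression, takes the entropy at which the predicted error probability crosses 0.5 as the threshold, and repeats the fit on bootstrap resamples to get an interval. The synthetic data, the small ridge term, the Newton solver, and the 5th/95th percentile interval are all assumptions.

```python
import math
import random
from typing import List, Tuple


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic_1d(xs: List[float], ys: List[int],
                    iters: int = 25, l2: float = 1e-3) -> Tuple[float, float]:
    """Fit p(error | x) = sigmoid(w * x + b) with Newton's method.

    The small L2 term only keeps the 2x2 Hessian well conditioned
    (an assumption of this sketch, not something stated in the source).
    """
    w, b = 0.0, 0.0
    for _ in range(iters):
        gw, gb = l2 * w, l2 * b          # gradient of the penalized loss
        hww, hwb, hbb = l2, 0.0, l2      # Hessian entries
        for x, y in zip(xs, ys):
            p = sigmoid(w * x + b)
            gw += (p - y) * x
            gb += (p - y)
            s = p * (1.0 - p)
            hww += s * x * x
            hwb += s * x
            hbb += s
        det = hww * hbb - hwb * hwb
        w -= (gw * hbb - gb * hwb) / det
        b -= (gb * hww - gw * hwb) / det
    return w, b


def bootstrap_threshold(xs: List[float], ys: List[int],
                        n_boot: int = 200, seed: int = 0) -> Tuple[float, float, float]:
    """Return (lower, median, upper) of the p = 0.5 entropy boundary."""
    rng = random.Random(seed)
    n = len(xs)
    estimates = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        w, b = fit_logistic_1d([xs[i] for i in idx], [ys[i] for i in idx])
        if w > 1e-9:  # keep only fits where higher entropy means more errors
            estimates.append(-b / w)
    estimates.sort()
    last = len(estimates) - 1
    return (estimates[int(0.05 * last)],
            estimates[int(0.50 * last)],
            estimates[int(0.95 * last)])


# Synthetic data (hypothetical): erroneous steps tend to have higher entropy.
data_rng = random.Random(42)
ok_steps = [max(0.0, data_rng.gauss(0.5, 0.3)) for _ in range(40)]
bad_steps = [max(0.0, data_rng.gauss(1.1, 0.3)) for _ in range(40)]
entropies = ok_steps + bad_steps
labels = [0] * 40 + [1] * 40

lower, median, upper = bootstrap_threshold(entropies, labels)
print(f"threshold interval: [{lower:.3f}, {upper:.3f}], median {median:.3f}")
```

One plausible use of the interval, again as an assumption: skip intervention below the lower bound, always intervene above the upper bound, and treat the band in between as a region where the trigger may be applied probabilistically.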

Best for: Research Scientist, AI Scientist, Machine Learning Engineer, AI Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.AI updates on arXiv.org.