Beyond Domains: Reusing Web Skills via Transferable Interaction Patterns
Summary
SkillMigrator is an LLM web agent designed to improve skill reuse and reduce operating cost. Conventional LLM web agents act as tool callers that emit low-level primitive actions, and they incur high latency and cost on benchmarks such as Mind2Web and WebArena because they require many policy-facing LLM completions. Prior systems introduced web skills to wrap repeated interactions, but they relied on instruction similarity or coarse site metadata, which limited how well skills transferred across websites. SkillMigrator instead learns reusable web skills and transfers them by matching layout structure rather than specific element references. It stores each induced skill as a transferable interaction pattern (TIP): the skill itself plus a structural sketch of the snapshot captured at induction time. At execution time, it retrieves TIPs by layout similarity and grounds their element references on the live page. On both WebArena and Mind2Web, this reduces the average LLM-action count on successful trajectories by 8-10% at a matched success rate.
Key takeaway
If you are a Machine Learning Engineer building LLM-powered web agents and you face high inference costs and limited skill reuse across websites, consider a skill transfer mechanism based on layout structure. An approach like SkillMigrator's, which matches interaction patterns by layout structure rather than specific element references, reduced the average LLM-action count on successful trajectories by 8-10% on WebArena and Mind2Web at a matched success rate. Fewer LLM actions mean lower operating cost, and skills learned on one site can be reused on others.
Key insights
Matching skills by layout structure, rather than by specific element references, lets web skills be reused across different websites.
Principles
- Web skills can be generalized across sites via layout structure.
- Store skills as "transferable interaction patterns" with structural sketches.
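The two principles above can be made concrete with a small sketch. The summary does not say how SkillMigrator represents a structural sketch, so everything here is an assumption for illustration: the sketch is taken to be the set of root-to-element tag paths in a page, and the names `TIP`, `structural_sketch`, and `_PathCollector` are hypothetical. The point it shows is that ids, classes, and text are discarded, so two pages with the same layout yield the same sketch.

```python
from dataclasses import dataclass
from html.parser import HTMLParser


class _PathCollector(HTMLParser):
    """Collects tag paths such as 'form/input', ignoring attributes and text."""

    # Elements without a closing tag; they are recorded but never pushed.
    VOID = {"input", "img", "br", "hr", "meta", "link"}

    def __init__(self):
        super().__init__()
        self.stack = []
        self.paths = set()

    def handle_starttag(self, tag, attrs):
        self.paths.add("/".join(self.stack + [tag]))
        if tag not in self.VOID:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag in self.stack:
            # Pop up to and including the nearest matching open tag.
            while self.stack and self.stack.pop() != tag:
                pass


def structural_sketch(html: str) -> frozenset:
    """Assumed sketch format: the set of tag paths found in the snapshot."""
    collector = _PathCollector()
    collector.feed(html)
    return frozenset(collector.paths)


@dataclass(frozen=True)
class TIP:
    """Hypothetical record: a skill plus the sketch of its induction-time page."""
    name: str
    steps: tuple       # abstract steps, e.g. ("type:form/input", "click:form/button")
    sketch: frozenset


# Two sites with different ids, names, and labels but the same layout.
page_a = '<form id="search"><input name="q"><button>Go</button></form>'
page_b = '<form class="find"><input name="query"><button>Search</button></form>'

search_tip = TIP(
    name="search",
    steps=("type:form/input", "click:form/button"),
    sketch=structural_sketch(page_a),
)
same_layout = search_tip.sketch == structural_sketch(page_b)
```

Because the steps refer to positions in the layout rather than to `id="search"` or `name="q"`, the stored skill is not tied to the site it was induced on.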
Method
SkillMigrator learns skills, stores them as Transferable Interaction Patterns (TIPs) with structural sketches, then retrieves TIPs by layout similarity and grounds references on live pages.
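The retrieve-then-ground step can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: Jaccard similarity over sets of tag paths stands in for the unspecified layout similarity measure, the 0.5 threshold is arbitrary, and `StoredTIP`, `retrieve`, and `ground` are hypothetical names. The live page is modeled as a plain dict from tag path to a selector string.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoredTIP:
    """Hypothetical stored pattern: a skill name, its sketch, and its targets."""
    name: str
    sketch: frozenset      # tag paths seen at induction time
    target_paths: tuple    # tag paths the skill acts on, in order


def jaccard(a, b):
    """Set overlap in [0, 1]; a stand-in for the layout similarity measure."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def retrieve(library, live_sketch, threshold=0.5):
    """Return TIPs whose sketch is similar enough to the live page, best first."""
    scored = [(jaccard(tip.sketch, live_sketch), tip) for tip in library]
    scored = [(score, tip) for score, tip in scored if score >= threshold]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [tip for _, tip in scored]


def ground(tip, live_elements):
    """Map each abstract target to a live selector; None if any is missing."""
    grounded = []
    for path in tip.target_paths:
        if path not in live_elements:
            return None
        grounded.append(live_elements[path])
    return grounded


library = [
    StoredTIP("search",
              frozenset({"form", "form/input", "form/button"}),
              ("form/input", "form/button")),
    StoredTIP("paginate",
              frozenset({"nav", "nav/ul", "nav/ul/li", "nav/ul/li/a"}),
              ("nav/ul/li/a",)),
]

# Live page on a site the skills were never induced on (assumed selectors).
live_elements = {
    "form": "#find",
    "form/input": "#find input",
    "form/button": "#find button",
    "footer": "footer",
}

candidates = retrieve(library, frozenset(live_elements))
selectors = ground(candidates[0], live_elements) if candidates else None
```

Returning `None` from `ground` when a target is missing gives the agent a clear signal to fall back to primitive actions instead of running a skill that does not fit the page.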
In practice
- Implement layout-based skill transfer for web agents.
- Reduce LLM inference costs in web automation.
Topics
- LLM Web Agents
- Web Skill Transfer
- Layout Structure Matching
- Transferable Interaction Patterns
- WebArena Benchmark
- Mind2Web Benchmark
Best for: Research Scientist, AI Scientist, Machine Learning Engineer, AI Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by Artificial Intelligence.