Adaptability of MolmoWeb
Summary
MolmoWeb is adaptable: it can learn new web tasks, including tasks on websites it has not seen before. When it initially fails at a task, such as one supplied by a microbiologist on a website outside its training data, it can be fine-tuned on a small number of task trajectories. Researchers found that collecting 10 to 50 trajectories, through either a synthetic data generation pipeline or a human data collection tool, was enough for MolmoWeb to adapt. The human data collection tool is a Chrome extension that records mouse and keyboard interactions and captures screenshots, actions, accessibility trees, and other metadata. After 25 trajectories were collected for a task involving varying latitude and longitude, MolmoWeb completed the task, showing that it can generalize from limited demonstrations.
Key takeaway
For AI engineers building web automation agents: if MolmoWeb struggles with a novel web task, collect 10 to 50 trajectories of that task using either the synthetic data pipeline or the human data collection tool. Fine-tuning MolmoWeb on this targeted data can significantly improve its performance on previously unseen websites and task structures.
Key insights
MolmoWeb adapts to new web tasks on unseen sites through fine-tuning on a small number of demonstrations.
Principles
- Few-shot learning enhances model adaptability.
- Human demonstrations improve model performance.
Method
Collect 10 to 50 task trajectories using synthetic generation or the Chrome extension-based human data collection tool, then fine-tune MolmoWeb on these demonstrations.
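The method above can be pictured with a minimal Python sketch of how recorded trajectories might be turned into fine-tuning examples. This is an illustration under stated assumptions, not the actual MolmoWeb data format or training code: the `Step` and `Trajectory` classes, their field names, and both helper functions are hypothetical. Only the kinds of data captured (screenshots, actions, accessibility trees) and the 10 to 50 range come from the summary.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Step:
    # Hypothetical fields mirroring what the summary says is captured.
    screenshot_path: str      # screenshot taken before the action
    accessibility_tree: str   # serialized accessibility tree at this step
    action: Dict              # e.g. {"type": "click", "x": 120, "y": 340}


@dataclass
class Trajectory:
    task: str                 # natural-language task description
    url: str                  # page where the task starts
    steps: List[Step] = field(default_factory=list)


def to_training_examples(trajectories: List[Trajectory]) -> List[Dict]:
    """Flatten trajectories into one example per step.

    Each example pairs the observation and prior actions with the
    action the demonstrator took next (the supervision target).
    """
    examples = []
    for traj in trajectories:
        history: List[Dict] = []
        for step in traj.steps:
            examples.append({
                "task": traj.task,
                "screenshot": step.screenshot_path,
                "accessibility_tree": step.accessibility_tree,
                "previous_actions": list(history),  # copy, not alias
                "target_action": step.action,
            })
            history.append(step.action)
    return examples


def check_trajectory_count(trajectories: List[Trajectory],
                           low: int = 10, high: int = 50) -> bool:
    """Return True if the count falls in the 10 to 50 range from the summary."""
    return low <= len(trajectories) <= high
```

Flattening to one example per step is a design choice made for this sketch; the summary does not say how demonstrations are batched during fine-tuning.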
In practice
- Use the Chrome extension to record web interaction data.
- Fine-tune on 10 to 50 trajectories per new task; 25 were enough in the latitude and longitude example.
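To make the latitude and longitude example concrete, the sketch below generates 25 task variants that a demonstrator or a synthetic pipeline could then work through. It is a hypothetical illustration: the summary does not describe the actual task, so the instruction template, the function name, and the uniform sampling of coordinates are all assumptions.

```python
import random
from typing import Dict, List


def make_task_variants(n: int = 25, seed: int = 0) -> List[Dict]:
    """Generate n task variants with different coordinates.

    The instruction wording is a placeholder, not the real task.
    """
    rng = random.Random(seed)  # seeded so the variant set is reproducible
    variants = []
    for _ in range(n):
        lat = round(rng.uniform(-90.0, 90.0), 4)
        lon = round(rng.uniform(-180.0, 180.0), 4)
        variants.append({
            "latitude": lat,
            "longitude": lon,
            "instruction": (
                f"Look up the record at latitude {lat}, longitude {lon}."
            ),
        })
    return variants


if __name__ == "__main__":
    for variant in make_task_variants()[:3]:
        print(variant["instruction"])
```

Varying the inputs across demonstrations, rather than repeating one fixed pair of coordinates, gives the model examples of the task structure itself, which is the kind of generalization the summary describes.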
Topics
- MolmoWeb
- Web Automation
- Model Fine-tuning
- Data Collection
- Human Data Collection
Best for: AI Engineer, Machine Learning Engineer, Data Scientist
Editorial summary, takeaway, and curation by AIssential. Original article published by Ai2.