Adaptability of MolmoWeb

Source: Ai2 · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Data Science & Analytics · Depth: Intermediate, quick

Summary

MolmoWeb can learn new web-based tasks, including tasks on websites it was not trained on. When it initially fails at a specific task, such as one supplied by a microbiologist on a site absent from its training data, it can be fine-tuned on a small number of task trajectories. Researchers found that collecting 10 to 50 trajectories, through either a synthetic data generation pipeline or a human data collection tool, was enough for MolmoWeb to adapt. The human data collection tool is a Chrome extension that records mouse and keyboard interactions and captures screenshots, actions, accessibility trees, and other metadata. In one example, a task whose latitude and longitude inputs varied between runs, MolmoWeb completed the task after fine-tuning on 25 collected trajectories, which shows that it can generalize from a limited set of demonstrations.
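To make the recorded data concrete, here is a minimal Python sketch of how one trajectory could be represented. The summary only says that the extension captures screenshots, actions, accessibility trees, and other metadata; the class names, field names, action format, and JSON layout below are illustrative assumptions, not the actual schema used by Ai2's tool.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class Step:
    """One recorded interaction (hypothetical schema, not Ai2's format)."""
    screenshot_path: str       # screenshot captured at this step
    action: dict               # e.g. {"type": "click", "x": 120, "y": 340}
    accessibility_tree: str    # serialized accessibility tree of the page
    metadata: dict = field(default_factory=dict)  # any other recorded details


@dataclass
class Trajectory:
    """A full demonstration of one task on one website."""
    task: str
    url: str
    steps: list = field(default_factory=list)

    def to_json(self) -> str:
        # asdict() recurses into the nested Step dataclasses.
        return json.dumps(asdict(self))


# Example: a one-step trajectory for a made-up task on a placeholder URL.
traj = Trajectory(
    task="Look up the sample at latitude 47.6, longitude -122.3",
    url="https://example.org",
)
traj.steps.append(
    Step(
        screenshot_path="step_000.png",
        action={"type": "click", "x": 120, "y": 340},
        accessibility_tree="<root/>",
    )
)
record = json.loads(traj.to_json())
```

Writing one JSON object per trajectory keeps each demonstration self-contained, which makes it easy to count, inspect, or filter demonstrations before fine-tuning.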

Key takeaway

For AI engineers building web automation agents: if MolmoWeb struggles with a novel web task, collect 10 to 50 trajectories of that task, using either the synthetic data pipeline or the human data collection tool. Fine-tuning MolmoWeb on these targeted demonstrations can substantially improve its performance on previously unseen websites and task structures.

Key insights

MolmoWeb adapts to new web tasks on unseen sites through fine-tuning on a small number of demonstrations.

Principles

Method

Collect 10 to 50 task trajectories, using either synthetic generation or the Chrome extension-based human data collection tool, then fine-tune MolmoWeb on these demonstrations.
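As a rough illustration of planning such a collection run, the Python sketch below generates task instructions with varying latitude and longitude and checks that the planned count falls inside the 10 to 50 range reported in the summary. The instruction template, function names, and the use of random coordinates are assumptions made for this example; the summary does not describe how the original task variants were chosen.

```python
import random

# Range of trajectory counts reported in the summary.
MIN_TRAJECTORIES = 10
MAX_TRAJECTORIES = 50


def make_task_variants(template: str, n: int, seed: int = 0) -> list:
    """Fill a task template with n random latitude/longitude pairs.

    The template is hypothetical; the real task wording is not given
    in the summary.
    """
    rng = random.Random(seed)  # seeded so the plan is reproducible
    tasks = []
    for _ in range(n):
        lat = round(rng.uniform(-90.0, 90.0), 4)
        lon = round(rng.uniform(-180.0, 180.0), 4)
        tasks.append(template.format(lat=lat, lon=lon))
    return tasks


def in_reported_range(n: int) -> bool:
    """True if n lies within the 10 to 50 trajectories cited above."""
    return MIN_TRAJECTORIES <= n <= MAX_TRAJECTORIES


# Plan 25 demonstrations, matching the count used in the summary's example.
tasks = make_task_variants(
    "Find the record for latitude {lat}, longitude {lon}.", n=25
)
ready = in_reported_range(len(tasks))
```

Each generated instruction would then be demonstrated once, by a person using the recording tool or by a synthetic pipeline, to produce one trajectory per variant.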

Topics

Best for: AI Engineer, Machine Learning Engineer, Data Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by Ai2.