Ferret-UI Lite: Lessons from Building Small On-Device GUI Agents

Source: Apple Machine Learning Research · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Robotics & Autonomous Systems · Depth: Expert, quick

Summary

Ferret-UI Lite, developed by Zhen Yang et al. and published in February 2026, is a compact, end-to-end GUI agent designed to run on-device across mobile, web, and desktop platforms. The 3-billion-parameter model was built with techniques tailored to small models: curating a diverse mixture of real and synthetic GUI data, strengthening inference-time performance through chain-of-thought reasoning and visual tool-use, and applying reinforcement learning with custom rewards. Ferret-UI Lite is competitive with other small-scale GUI agents. On GUI grounding it scores 91.6% on ScreenSpot-V2, 53.3% on ScreenSpot-Pro, and 61.2% on OSWorld-G. On GUI navigation it reaches success rates of 28.0% on AndroidWorld and 19.8% on OSWorld. The work shares methods and lessons learned from developing compact, on-device GUI agents.

Key takeaway

For AI scientists developing on-device GUI agents, Ferret-UI Lite demonstrates that competitive performance is achievable with compact models. Consider combining diverse data mixtures, chain-of-thought reasoning, and visual tool-use with reinforcement learning to improve your agent's efficiency and accuracy on resource-constrained platforms.

Key insights

Ferret-UI Lite is a compact, end-to-end GUI agent optimized for on-device performance across diverse platforms.

Method

The Ferret-UI Lite agent was built by curating diverse GUI data, strengthening inference with chain-of-thought reasoning and visual tool-use, and applying reinforcement learning with custom-designed rewards.
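
To make two of these ingredients concrete, here is a minimal Python sketch of what a visual tool-use step and a grounding reward could look like. It is not the authors' implementation: this summary does not specify the tool or the reward, so the zoom-in crop, the binary point-in-box reward, and every name below (`Box`, `zoom_crop`, `crop_to_screen`, `grounding_reward`) are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned box in pixel coordinates (hypothetical helper type)."""
    left: float
    top: float
    right: float
    bottom: float

    def contains(self, x: float, y: float) -> bool:
        return self.left <= x <= self.right and self.top <= y <= self.bottom


def zoom_crop(cx: float, cy: float, screen_w: float, screen_h: float,
              scale: float = 0.5) -> Box:
    """Assumed zoom-in tool: a window of `scale` times the screen size,
    centered on a coarse guess (cx, cy) and clamped to stay on screen."""
    w, h = screen_w * scale, screen_h * scale
    left = min(max(cx - w / 2, 0.0), screen_w - w)
    top = min(max(cy - h / 2, 0.0), screen_h - h)
    return Box(left, top, left + w, top + h)


def crop_to_screen(x: float, y: float, crop: Box,
                   crop_w: float, crop_h: float) -> tuple[float, float]:
    """Map a point predicted on the resized crop image (crop_w x crop_h
    pixels) back to full-screen pixel coordinates."""
    sx = (crop.right - crop.left) / crop_w
    sy = (crop.bottom - crop.top) / crop_h
    return crop.left + x * sx, crop.top + y * sy


def grounding_reward(pred_xy: tuple[float, float], target: Box) -> float:
    """Assumed verifiable reward: 1.0 if the predicted click lands inside
    the target element's bounding box, otherwise 0.0."""
    return 1.0 if target.contains(*pred_xy) else 0.0
```

In this sketch the agent would first make a coarse prediction on the full screenshot, call `zoom_crop` around it, predict again on the enlarged crop, and convert the refined point back with `crop_to_screen` before it is scored. A binary point-in-box reward is easy to verify automatically, which is one reason it is a common choice for grounding tasks; the rewards actually used for Ferret-UI Lite may differ.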

Best for: AI Scientist, Research Scientist, AI Researcher, AI Engineer, Machine Learning Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Apple Machine Learning Research.