Introducing the New Browser Automation Tool with Toolboxes in Foundry
Summary
The new Browser Automation Tool with Toolboxes in Foundry lets AI agents interact with web interfaces, covering tasks where reasoning alone falls short. It is available as an MCP-native tool in Toolboxes (Public Preview) for hosted agents and uses Playwright workspaces, now generally available, as its execution model, providing headless browser infrastructure with faster automation and real-time visibility. Live View (Public Preview) enables real-time detection of issues such as selector drift and navigation failures, while Take Control (Public Preview) lets a human intervene on non-deterministic paths such as CAPTCHAs. The tool also supports private website browsing (Private Preview) for internal portals and authenticated flows, integrates observability in Foundry Control Plane, and allows a choice of open-source reasoning layers. Together these support end-to-end agent workflows, enterprise-grade systems, and human-in-the-loop experiences for tasks such as form filling and web research.
Key takeaway
For AI engineers building agentic workflows that interact with web interfaces, the updated Browser Automation Tool in Foundry combines Playwright-powered browser execution with human-in-the-loop controls. It can automate complex web tasks, including private sites (Private Preview). Use Live View to debug runs in real time and Take Control to handle unpredictable UI elements such as CAPTCHAs. Evaluate the tool if your agents need to complete end-to-end workflows that APIs alone cannot cover.
Key insights
The Browser Automation Tool extends AI agents' web interaction capabilities with human-in-the-loop controls and robust infrastructure.
Principles
- AI agents need web interaction beyond APIs.
- Observability improves automation reliability.
- Human judgment enhances edge-case handling.
Method
Define the browser workflow in the Hosted Agent setup, run it with Live View, detect and resolve issues as they appear, use Take Control on complex branches, then continue the automation and capture the outcomes.
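The control flow of this method can be sketched in plain Python. This is a minimal illustration, not the Foundry or Playwright API: `Step`, `NeedsHuman`, `RunLog`, `run_workflow`, and the `operator` callback are all hypothetical names invented here, and the steps only mutate a dictionary rather than driving a real browser. The sketch shows automated steps running in order, a handoff to a human when a step cannot proceed deterministically, and a log that captures the outcome of each step.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


class NeedsHuman(Exception):
    """Hypothetical signal: a step cannot proceed deterministically (e.g. a CAPTCHA)."""


@dataclass
class Step:
    name: str
    action: Callable[[Dict[str, str]], None]  # mutates the shared workflow state


@dataclass
class RunLog:
    """Stand-in for captured outcomes; not a real observability API."""
    events: List[str] = field(default_factory=list)

    def record(self, message: str) -> None:
        self.events.append(message)


def run_workflow(
    steps: List[Step],
    state: Dict[str, str],
    take_control: Callable[[Step, Dict[str, str]], None],
    log: RunLog,
) -> Dict[str, str]:
    """Run steps in order; hand a step to a human when it raises NeedsHuman."""
    for step in steps:
        try:
            step.action(state)
            log.record(f"auto:{step.name}")
        except NeedsHuman as reason:
            # The human completes this step, then automation continues.
            log.record(f"handoff:{step.name}:{reason}")
            take_control(step, state)
            log.record(f"resumed:{step.name}")
    return state


# Example steps (simulated; a real workflow would drive a browser here).
def open_form(state: Dict[str, str]) -> None:
    state["page"] = "form"


def pass_captcha(state: Dict[str, str]) -> None:
    if "captcha" not in state:
        raise NeedsHuman("captcha")


def submit(state: Dict[str, str]) -> None:
    state["submitted"] = "yes"


def operator(step: Step, state: Dict[str, str]) -> None:
    # Simulates a person resolving the CAPTCHA during the handoff.
    state["captcha"] = "solved-by-human"


log = RunLog()
final = run_workflow(
    [Step("open_form", open_form), Step("pass_captcha", pass_captcha), Step("submit", submit)],
    {},
    operator,
    log,
)
```

In this sketch the human finishes the blocked step and the loop moves on to the next one. Retrying the step automatically after the handoff would be an equally reasonable design, depending on how the workflow is defined.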
In practice
- Automate high-volume form filling.
- Scale web-based research flows.
- Integrate human-in-the-loop for CAPTCHAs.
Topics
- Browser Automation
- AI Agents
- Playwright Workspaces
- Foundry Platform
- Live View Debugging
- Human-in-the-Loop AI
Best for: AI Architect, AI Product Manager, AI Engineer, MLOps Engineer, Automation Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published on the Microsoft Foundry Blog.