Best practices for computer and browser use with Claude
Summary
Anthropic has released best practices for integrating Claude's latest models (the Claude 4.6 family and Opus 4.7) with computer and browser use capabilities, enabling complex agentic systems for software applications and workflow automation. Key recommendations include pre-downscaling screenshots to fit API limits (a 1568-pixel maximum long edge and 1.15 megapixels total for the 4.6 family; 2576 pixels and 3.75 megapixels for Opus 4.7) to preserve click accuracy, with 1280x720 for the 4.6 family or 1080p for Opus 4.7 as starting resolutions. The article also details thinking effort levels for different models and scenarios, with "medium" for Claude 4.6 and "high" for Opus 4.7 often being the sweet spots. It further covers built-in prompt injection defenses, context management strategies such as cache breakpoints and LLM-based compaction, and experimental features such as batch tools and an advisor tool. Finally, it introduces a "Teach Mode" concept, in which demonstrations are recorded and replayed, to improve reliability and unlock new workflows.
Key takeaway
For AI engineers building computer and browser automation with Claude, pre-downscale screenshots so they fit within the API limits (1280x720 is a good starting resolution for the 4.6 family, 1080p for Opus 4.7) and scale the coordinates Claude returns back to the native screen resolution. Set adaptive thinking effort to "medium" for Claude 4.6 or "high" for Opus 4.7 to balance accuracy and cost. Implement robust context management with cache breakpoints and server-side compaction to maintain performance and control token usage in long-running agent sessions.
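The downscaling and coordinate scaling described above can be sketched in a few lines. This is a minimal illustration, not Anthropic's implementation: the function names and dictionary keys are hypothetical, the limits are the figures quoted in the summary, and one megapixel is assumed to mean 1,000,000 pixels.

```python
import math

# Limits quoted in the summary: (max long edge in pixels, max total megapixels).
# The keys are illustrative labels, not real model identifiers.
LIMITS = {
    "claude-4.6-family": (1568, 1.15),
    "opus-4.7": (2576, 3.75),
}


def fit_to_limits(width, height, max_long_edge, max_megapixels):
    """Return (new_width, new_height, scale) satisfying both limits.

    The aspect ratio is preserved and the image is never upscaled.
    Assumes 1 megapixel = 1,000,000 pixels.
    """
    edge_scale = max_long_edge / max(width, height)
    area_scale = math.sqrt(max_megapixels * 1_000_000 / (width * height))
    scale = min(1.0, edge_scale, area_scale)
    # Floor so rounding can never push the result over a limit.
    new_width = max(1, math.floor(width * scale))
    new_height = max(1, math.floor(height * scale))
    return new_width, new_height, scale


def to_screen_coords(x, y, scale):
    """Map a point in the downscaled image back to native screen pixels."""
    return round(x / scale), round(y / scale)
```

For example, a 3840x2160 capture sent to a 4.6-family model would be resized with `fit_to_limits(3840, 2160, *LIMITS["claude-4.6-family"])`, and every click coordinate the model returns would pass through `to_screen_coords` with the same `scale` before being sent to the operating system.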
Key insights
Optimizing image resolution, thinking effort, and context management is crucial for reliable Claude-powered computer and browser automation.
Principles
- Click accuracy hinges on matching image resolution to API limits.
- Adaptive thinking optimizes reasoning for task complexity.
- Context management is vital for long-running agent cost and latency.
Method
Pre-downscale screenshots to fit the API limits, place text instructions before images, and scale the returned coordinates back to the native screen resolution. Use adaptive thinking with the effort level suited to each model. Manage context with cache breakpoints, rolling buffers, and LLM-based compaction.
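One way to picture the rolling buffer is as a pass that keeps only the most recent screenshots in the conversation history and replaces older ones with a short text placeholder. The sketch below assumes a simplified message shape (a list of dicts whose "content" is a list of blocks with a "type" key); the function name, the placeholder text, and the default of three retained screenshots are illustrative choices, not values from the article.

```python
def prune_old_screenshots(messages, keep_last=3, placeholder="[screenshot omitted]"):
    """Return a copy of messages with all but the newest keep_last images replaced.

    Assumes each message is a dict with a "content" list of block dicts, and
    that image blocks have "type" == "image". The input is not mutated.
    """
    remaining = keep_last
    pruned = []
    # Walk newest to oldest so the most recent screenshots are the ones kept.
    for message in reversed(messages):
        new_blocks = []
        for block in reversed(message["content"]):
            if block.get("type") == "image":
                if remaining > 0:
                    remaining -= 1
                    new_blocks.append(block)
                else:
                    new_blocks.append({"type": "text", "text": placeholder})
            else:
                new_blocks.append(block)
        new_blocks.reverse()
        pruned.append({**message, "content": new_blocks})
    pruned.reverse()
    return pruned
```

Rewriting earlier messages changes the prompt prefix, so a buffer like this probably interacts with cache breakpoints; pruning in occasional batches rather than on every turn is one way to limit that.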
In practice
- Start with 1280x720 resolution for Claude 4.6 family.
- Set Opus 4.7 thinking effort to "high" for most tasks.
- Implement human-in-the-loop for high-stakes agent actions.
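The human-in-the-loop point can be illustrated with a small approval gate placed in front of the agent's action executor. Everything here is a hypothetical sketch: the keyword list is a crude stand-in for a real risk policy, and `execute` and `approve` are caller-supplied callbacks, not part of any Claude API.

```python
# Illustrative keywords only; a real deployment would need its own policy.
HIGH_STAKES_KEYWORDS = ("purchase", "delete", "payment", "send email")


def requires_approval(action_description, keywords=HIGH_STAKES_KEYWORDS):
    """Return True if the described action matches any high-stakes keyword."""
    text = action_description.lower()
    return any(keyword in text for keyword in keywords)


def run_action(action_description, execute, approve):
    """Run an action, asking a human first when it looks high-stakes.

    execute(description) performs the action; approve(description) returns
    True or False from a human reviewer. Both are supplied by the caller.
    """
    if requires_approval(action_description) and not approve(action_description):
        return "skipped"
    execute(action_description)
    return "executed"
```

Low-risk actions such as scrolling pass straight through, while anything matching the policy waits for an explicit yes from the reviewer.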
Topics
- Claude Computer Use
- Screenshot Optimization
- Adaptive Thinking
- Prompt Injection Defense
- LLM Context Management
Best for: AI Engineer, Software Engineer, MLOps Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by Claude Blog.