Best practices for computer and browser use with Claude

2026-05-13 · Source: Claude Blog · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Software Development & Engineering, Robotics & Autonomous Systems · Depth: Advanced, extended

Summary

Anthropic has released best practices for integrating Claude's latest models (Claude 4.6 family and Opus 4.7) with computer and browser use capabilities, enabling complex agentic systems for software applications and workflow automation. Key recommendations include pre-downscaling screenshots to fit API limits (1568 pixels max long edge, 1.15 megapixels total for 4.6 family; 2576 pixels and 3.75 megapixels for Opus 4.7) to ensure click accuracy, with 1280x720 or 1080p for Opus 4.7 as starting resolutions. The article also details optimal thinking effort levels for different models and scenarios, with "medium" for Claude 4.6 and "high" for Opus 4.7 often being sweet spots. Furthermore, it emphasizes built-in prompt injection defenses, context management strategies like cache breakpoints and LLM-based compaction, and experimental features such as batch tools and an advisor tool. A "Teach Mode" concept, where demonstrations are recorded and replayed, is introduced to improve reliability and unlock new workflows.

Key takeaway

For AI Engineers building computer and browser automation with Claude, prioritize pre-downscaling screenshots to API limits (e.g., 1280x720 for 4.6 family, 1080p for Opus 4.7) and ensure coordinate scaling. Configure adaptive thinking to "medium" for Claude 4.6 or "high" for Opus 4.7 to balance accuracy and cost. Implement robust context management with cache breakpoints and server-side compaction to maintain performance and manage token usage in long-running agent sessions.

Key insights

Optimizing image resolution, thinking effort, and context management are crucial for reliable Claude-powered computer and browser automation.

Principles

Click accuracy hinges on matching image resolution to API limits.
Adaptive thinking optimizes reasoning for task complexity.
Context management is vital for long-running agent cost and latency.

Method

Pre-downscale screenshots to API limits, place text instructions before images, and scale coordinates. Use adaptive thinking with optimal effort levels. Implement cache breakpoints, rolling buffers, and LLM-based compaction for context management.

In practice

Start with 1280x720 resolution for Claude 4.6 family.
Set Opus 4.7 thinking effort to "high" for most tasks.
Implement human-in-the-loop for high-stakes agent actions.

Topics

Claude Computer Use
Screenshot Optimization
Adaptive Thinking
Prompt Injection Defense
LLM Context Management

Code references

anthropics/claude-quickstarts

Best for: AI Engineer, Software Engineer, MLOps Engineer

Related on AIssential

Open in AIssential →

Editorial summary, takeaway, and curation by AIssential. Original article published by Claude Blog.