We spent 2 hours working in the future

Source: METR · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Emerging Technologies & Innovation · Depth: Advanced, long

Summary

METR ran a 2-hour tabletop exercise to simulate AI-augmented workflows 12-18 months ahead, assuming AIs with a ~200-hour time horizon. Three researchers played themselves and worked with these advanced agents, while the simulated world otherwise operated on Feb 2026 technology. The goal was to identify emerging workflows, bottlenecks, and potential speedups. Participants estimated a 3-5x productivity uplift relative to working with Feb 2026 models. Key observations: agents implemented ideas instantly, which shifted human focus to understanding results and checking work, and long tasks had to be sequenced so that agents could run them overnight. Prioritization and organization emerged as significant bottlenecks. Expected workflows involve declarative instructions, speculative execution, and agents generating "proofs of correctness." Project timelines are predicted to be dominated by serial human tasks, peer review, and data collection; in one hypothetical 6-week project, agent work accounted for only 8 hours.
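
To make the last figure concrete, here is a rough back-of-the-envelope sketch of how small the agent's share of that hypothetical project is. The 6 weeks and 8 hours come from the summary; the 40-hour working week and the Amdahl-style bound are assumptions added for illustration, not figures from METR.

```python
# Assumption: a 40-hour working week. The source only states
# "6 weeks" and "8 hours of agent work".
WORK_HOURS_PER_WEEK = 40


def agent_share(project_weeks: float, agent_hours: float,
                hours_per_week: float = WORK_HOURS_PER_WEEK) -> float:
    """Fraction of the project's working hours spent on agent execution."""
    total_hours = project_weeks * hours_per_week
    return agent_hours / total_hours


def max_speedup_from_faster_agents(share: float) -> float:
    """Amdahl-style upper bound on overall speedup if agent work took zero time."""
    return 1.0 / (1.0 - share)


share = agent_share(project_weeks=6, agent_hours=8)
bound = max_speedup_from_faster_agents(share)

print(f"agent share of timeline: {share:.1%}")    # agent share of timeline: 3.3%
print(f"speedup bound from faster agents: {bound:.3f}x")  # 1.034x
```

Under these assumptions, making the agents infinitely fast would shorten the project by only a few percent, which is consistent with the claim that serial human tasks, peer review, and data collection dominate the timeline.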

Key takeaway

If you are a Director of AI/ML planning future R&D, recognize that highly capable AIs will shift project bottlenecks from execution to human ideation, organization, and serial feedback. Design declarative workflows now, along with strategies for managing multiple parallel agent-driven projects. Prepare your teams for a future in which human roles resemble those of principal investigators, with an emphasis on critical review and strategic direction; junior staff may find it hard to contribute effectively without deep domain experience.

Key insights

Advanced AIs accelerate execution, shifting human bottlenecks to ideation, organization, and serial feedback loops.

Method

Researchers play themselves, keeping their current priorities but with simulated access to future AIs (200-hour time horizon). A gamemaster tracks each researcher's actions and agent interactions in a spreadsheet, and the game advances in simulated half-day turns.
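
The bookkeeping described above could be sketched roughly as follows. This is a minimal illustration of a turn-based log with half-day turns, not METR's actual spreadsheet; every class, field, and actor name here is hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Action:
    actor: str          # hypothetical label, e.g. "researcher_a" or "agent_1"
    description: str    # what the actor did or was asked to do
    est_hours: float    # gamemaster's estimate of simulated hours required


@dataclass
class Turn:
    index: int                                  # 0, 1, 2, ...; one simulated half-day each
    actions: list[Action] = field(default_factory=list)

    @property
    def label(self) -> str:
        # Two turns per simulated day: even index = AM, odd index = PM.
        day, half = divmod(self.index, 2)
        return f"day {day + 1} {'AM' if half == 0 else 'PM'}"


class GameLog:
    """Spreadsheet-like log that the gamemaster fills in turn by turn."""

    def __init__(self) -> None:
        self.turns: list[Turn] = []

    def record(self, turn_index: int, actor: str,
               description: str, est_hours: float) -> None:
        # Create any missing turns so indices stay contiguous.
        while len(self.turns) <= turn_index:
            self.turns.append(Turn(index=len(self.turns)))
        self.turns[turn_index].actions.append(
            Action(actor, description, est_hours))

    def rows(self) -> list[tuple[str, str, str, float]]:
        # Flatten to one row per action, as in a spreadsheet.
        return [(t.label, a.actor, a.description, a.est_hours)
                for t in self.turns for a in t.actions]


log = GameLog()
log.record(0, "researcher_a", "specify experiment declaratively", 1.0)
log.record(0, "agent_1", "implement and run experiment", 0.5)
log.record(3, "researcher_a", "review overnight results", 2.0)

for row in log.rows():
    print(row)
```

Keeping one row per action per turn makes it easy to see afterwards where simulated time went, for example how much was spent by humans reviewing results versus agents executing tasks.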

Best for: AI Product Manager, AI Scientist, Research Scientist, Director of AI/ML

Editorial summary, takeaway, and curation by AIssential. Original article published by METR.