AIs can now often do massive easy-to-verify SWE tasks
Summary
The author has significantly shortened their AI timelines and now assigns almost 30% probability to full AI R&D automation by the end of 2028, nearly double their previous estimate of 15%. The update is driven by stronger-than-expected performance from models like Opus 4.5, Opus 4.6, and Codex 5.2 on "easy-and-cheap-to-verify software engineering (SWE) tasks that don't require much novel ideation" (ESNI tasks). The author now expects AIs to reach a 50%-reliability time horizon of years to decades on ESNI tasks by the end of 2026, well above their prior expectations. Key factors include AIs already completing large easy-and-cheap-to-verify SWE (ES) tasks with moderate scaffolding, an anticipated substantial scale-up of training compute in 2026, and a scaffolding overhang larger than previously thought. Together these suggest that AI progress in 2026 will be faster than in 2025, and superexponential progress has been observed in reliability on ESNI tasks.
Key takeaway
Research scientists working on AI development should build advanced scaffolding and robust, cheap-to-verify evaluation loops into their workflows. This plays to current AI strengths in iterative problem-solving, and may shorten project timelines and let AIs autonomously complete significant portions of well-specified research tasks. AIs excel at ESNI tasks, but human judgment and ideation remain critical for less well-defined or ideation-heavy research areas.
Key insights
AI progress on verifiable software tasks is accelerating superexponentially, shortening timelines for AI R&D automation.
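To make "superexponential" concrete, here is a minimal sketch that compares a time horizon with a constant doubling time (plain exponential growth) against one whose doubling time shrinks after every doubling (one simple form of superexponential growth). The function name `months_to_reach` and every number used (a 1-hour starting horizon, a 6-month doubling time, a halving of the doubling time) are illustrative assumptions, not figures from the article.

```python
def months_to_reach(target_hours, start_hours, doubling_months, shrink=1.0):
    """Months needed for the horizon to grow from start_hours to target_hours.

    Each doubling takes `shrink` times as long as the previous one:
    shrink == 1.0 is plain exponential growth, shrink < 1.0 is superexponential.
    All parameters are hypothetical and chosen only for illustration.
    """
    months, horizon, step = 0.0, start_hours, doubling_months
    while horizon < target_hours:
        horizon *= 2      # one more doubling of the time horizon
        months += step    # time spent on this doubling
        step *= shrink    # the next doubling is faster when shrink < 1.0
    return months


# Ten doublings (1 hour -> 1024 hours) under each assumption.
exponential = months_to_reach(1024, 1, 6.0)
superexponential = months_to_reach(1024, 1, 6.0, shrink=0.5)
print(f"constant doubling time:  {exponential:.1f} months")
print(f"shrinking doubling time: {superexponential:.1f} months")
```

With a constant doubling time the ten doublings take 60 months. With a doubling time that halves each step they take just under 12 months, and the total stays below 12 months no matter how many further doublings are added. That bounded total is what separates superexponential from exponential growth in this toy model.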
Principles
- Easy-to-verify tasks enable iterative AI self-correction.
- Scaffolding significantly enhances AI task performance.
- AI R&D acceleration is self-reinforcing.
Method
AIs can develop test suites and iteratively optimize solutions against them, allowing for continuous progress on easy-and-cheap-to-verify tasks even with initial errors.
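The loop described above can be sketched as follows. This is a toy illustration under stated assumptions, not the article's setup: the list `PROPOSALS` stands in for successive model attempts, and the names `passed`, `iterate_against_tests`, `TESTS`, and the "square a number" task are all hypothetical.

```python
from typing import Callable, Iterable, List, Optional, Tuple

Candidate = Callable[[int], int]
TestCase = Tuple[int, int]  # (input, expected output)


def passed(candidate: Candidate, tests: List[TestCase]) -> int:
    """Count the test cases a candidate passes; exceptions count as failures."""
    count = 0
    for arg, expected in tests:
        try:
            if candidate(arg) == expected:
                count += 1
        except Exception:
            pass
    return count


def iterate_against_tests(
    proposals: Iterable[Candidate], tests: List[TestCase]
) -> Tuple[Optional[Candidate], int, int]:
    """Keep the best-scoring candidate and stop once every test passes.

    Returns (best candidate, its score, number of attempts used).
    """
    best, best_score, attempts = None, -1, 0
    for candidate in proposals:
        attempts += 1
        score = passed(candidate, tests)
        if score > best_score:
            best, best_score = candidate, score
        if best_score == len(tests):
            break
    return best, best_score, attempts


# Hypothetical task: compute n squared. The first two attempts are wrong,
# but the cheap test suite lets the loop recover from them.
TESTS: List[TestCase] = [(0, 0), (1, 1), (2, 4), (3, 9)]
PROPOSALS: List[Candidate] = [
    lambda n: n + n,   # passes (0, 0) and (2, 4) only
    lambda n: n,       # passes (0, 0) and (1, 1) only
    lambda n: n * n,   # passes every test
    lambda n: n ** 3,  # never reached
]

best, score, attempts = iterate_against_tests(PROPOSALS, TESTS)
print(f"passed {score}/{len(TESTS)} tests after {attempts} attempts")
```

Because each check is cheap, early mistakes cost only extra attempts, which is why this pattern suits easy-and-cheap-to-verify tasks.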
In practice
- Implement robust test suites for AI-driven development.
- Use scaffolding to improve AI task completion.
- Focus on well-specified, iterative software projects for AI.
Topics
- AI Timelines
- AI R&D Automation
- Easy-to-Verify SWE Tasks
- AI Scaffolding
- Superexponential Progress
Best for: Research Scientist, AI Scientist, AI Engineer, Director of AI/ML
Editorial summary, takeaway, and curation by AIssential. Original article published by Redwood Research blog.