It's time to be right.

Source: Marc Brooker's Blog · Field: Technology & Digital (Artificial Intelligence & Machine Learning, Software Development & Engineering) · Depth: Intermediate, medium

Summary

Speaking at AI Dev 26, the author argues that the future opportunity for agentic AI in software development and knowledge work will be constrained more by defect rate than by raw capability. The hypothesis is illustrated with a two-by-two matrix that classifies defects by frequency (high or low) and seriousness (high or low); the low-frequency, low-seriousness quadrant is identified as the one that unlocks widespread adoption. The article calls for shifting attention from the "right tail" of positive outcomes to the "left tail" of defects, which currently receives too little attention. AWS is addressing this through correct-by-construction tools such as Hydro and Cedar, spec-driven development with Kiro, code reasoning via Strata and Lean, autoformalization in Bedrock AR Checks, and deterministic agent steering with Strands Steering. The author also advocates industry-wide changes: benchmarks that capture failure severity, an end-to-end view of agent success that goes beyond code patching, and a research program into agentic AI failure modes.
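To make the frequency/seriousness matrix and the idea of a severity-aware benchmark concrete, here is a minimal Python sketch. It is not taken from the original article: the thresholds, the 0 to 1 severity scale, and the names `TaskResult`, `quadrant`, and `severity_weighted_cost` are all assumptions chosen for illustration.

```python
from dataclasses import dataclass

# Hypothetical cut-offs; the source does not define numeric thresholds.
FREQUENCY_THRESHOLD = 0.05    # failure rate above this counts as "high" frequency
SERIOUSNESS_THRESHOLD = 0.5   # mean failure severity above this counts as "high"


@dataclass
class TaskResult:
    passed: bool
    severity: float = 0.0  # assumed scale: 0.0 (harmless) to 1.0 (worst case)


def defect_frequency(results):
    """Fraction of tasks that failed."""
    if not results:
        raise ValueError("need at least one result")
    return sum(1 for r in results if not r.passed) / len(results)


def mean_seriousness(results):
    """Average severity over failed tasks only (0.0 if nothing failed)."""
    failures = [r for r in results if not r.passed]
    if not failures:
        return 0.0
    return sum(r.severity for r in failures) / len(failures)


def quadrant(results):
    """Place a run in one of the four frequency/seriousness quadrants."""
    freq = "high" if defect_frequency(results) > FREQUENCY_THRESHOLD else "low"
    sev = "high" if mean_seriousness(results) > SERIOUSNESS_THRESHOLD else "low"
    return f"{freq} frequency, {sev} seriousness"


def severity_weighted_cost(results):
    """Expected severity per task: a score that a plain pass rate hides."""
    if not results:
        raise ValueError("need at least one result")
    return sum(r.severity for r in results if not r.passed) / len(results)


# Two hypothetical agents with similar pass rates but very different left tails.
agent_a = [TaskResult(True)] * 98 + [TaskResult(False, 0.2), TaskResult(False, 0.4)]
agent_b = [TaskResult(True)] * 99 + [TaskResult(False, 1.0)]

print(quadrant(agent_a))
print(quadrant(agent_b))
```

In this sketch, the second agent has the better pass rate yet lands in a worse quadrant, which is the kind of difference a pass/fail benchmark would not surface.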

Key takeaway

If you are an AI engineer deploying agentic systems, recognize that defect rates, not just capabilities, will dictate real-world adoption and business success. Prioritize tools and processes that reduce both the frequency and the seriousness of defects, such as correct-by-construction languages, spec-driven development, and formal code reasoning. Shift your focus from solely maximizing agent performance to rigorously understanding and mitigating failure modes, so that agents are broadly useful and customers can trust them.

Key insights

Agentic AI adoption hinges on minimizing defect rates and seriousness, not just maximizing capabilities.

Method

Improve agentic AI correctness through correct-by-construction tools (Hydro, Cedar), spec-driven development (Kiro), formal code reasoning (Strata, Lean), autoformalization (Bedrock AR Checks), and deterministic steering (Strands Steering).

Best for: CTO, VP of Engineering/Data, AI Architect, AI Engineer, Machine Learning Engineer, Director of AI/ML

Editorial summary, takeaway, and curation by AIssential. Original article published by Marc Brooker's Blog.