What's blocking AGI? Jerry Tworek

2026-05-13 · Source: ARC Prize · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Emerging Technologies & Innovation · Depth: Expert, extended

Summary

Jerry Tworek, former OpenAI researcher and current CEO of a stealth startup, discusses the definition, measurement, and future of intelligence in AI. He defines intelligence as the ability to adapt and learn on the fly, emphasizing open-ended problem-solving in new environments and contrasting it with fixed, programmed capabilities like chess or arithmetic. Tworek argues that the "rate of learning" (slope) matters more than prior experience or initial skill level (y-intercept) when assessing intelligence. He critiques current AI benchmarks, noting that they become obsolete once models are trained on them, and advocates fresh, irreducible tasks, such as inventing new science or general game-playing, as better measures. Tworek also suggests that architectural innovation, particularly train-time recurrence, is a critical, underexplored path to AGI, as opposed to relying solely on more tokens or existing meta-learning approaches.

Key takeaway

AI scientists and research scientists focused on advancing AGI should consider shifting research focus from solely scaling existing Transformer architectures to exploring fundamental architectural innovations like train-time recurrence. Current benchmarks are insufficient; prioritize evaluation methods built on continuously fresh, irreducible tasks that gauge a model's adaptive intelligence and learning rate, not its ability to optimize for known datasets.

Key insights

True intelligence is open-ended adaptability and rapid learning in novel, unseen environments.

Principles

Intelligence is defined by adaptability, not fixed computation.
Learning rate (slope) is more indicative of intelligence than initial skill (y-intercept).
Benchmarks become obsolete once models are trained on them.
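
The slope versus y-intercept distinction can be made concrete with a small sketch. The scores below are invented for illustration and do not come from the talk; `fit_line`, `strong_prior`, and `fast_learner` are hypothetical names. The code fits a straight line to each learner's scores on a new task and compares the fitted slope (rate of learning) with the fitted intercept (initial skill).

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

# Hypothetical scores on an unseen task after 0..4 practice episodes.
episodes = [0, 1, 2, 3, 4]
strong_prior = [0.60, 0.62, 0.63, 0.65, 0.66]  # starts high, improves slowly
fast_learner = [0.20, 0.35, 0.50, 0.64, 0.80]  # starts low, improves quickly

prior_slope, prior_intercept = fit_line(episodes, strong_prior)
fast_slope, fast_intercept = fit_line(episodes, fast_learner)

print(f"strong prior: slope={prior_slope:.3f}, intercept={prior_intercept:.3f}")
print(f"fast learner: slope={fast_slope:.3f}, intercept={fast_intercept:.3f}")
```

Under this framing the second learner would rank higher despite its lower starting score, because its fitted slope is roughly ten times larger. A linear fit is only one possible way to summarize a learning curve and is assumed here for simplicity.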

Method

Measure intelligence using fresh, irreducible tasks like inventing new science or generalizing across diverse, unseen games, rather than static, solvable benchmarks.

In practice

Prioritize architectural innovation over scaling existing Transformer models.
Explore train-time recurrence for more robust, generalizable AI.
Design benchmarks that generate novel, unseen problems.
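
One way to read "benchmarks that generate novel, unseen problems" is a procedural generator that builds each task from a seed, so evaluation can always draw seeds that were never available during training. The sketch below is a toy under that assumption, not a benchmark proposed in the talk: the hidden linear rule is far from an "irreducible" task, and `make_task`, `evaluate`, and both solvers are hypothetical names. It only illustrates why a solver that memorizes seen inputs fails on fresh tasks while one that adapts from the examples succeeds.

```python
import random

def make_task(seed, n_examples=5):
    """Build one toy task: infer a hidden rule y = a*x + b from examples."""
    rng = random.Random(seed)
    a, b = rng.randint(2, 9), rng.randint(0, 9)
    xs = rng.sample(range(100), n_examples + 1)  # distinct inputs
    examples = [(x, a * x + b) for x in xs[:-1]]
    query = xs[-1]  # never appears among the examples
    return examples, query, a * query + b

def evaluate(solver, seeds):
    """Fraction of freshly generated tasks the solver answers correctly."""
    seeds = list(seeds)
    correct = 0
    for seed in seeds:
        examples, query, answer = make_task(seed)
        correct += solver(examples, query) == answer
    return correct / len(seeds)

def memorizing_solver(examples, query):
    """Baseline: can only look up inputs it has already seen."""
    return dict(examples).get(query)

def adaptive_solver(examples, query):
    """Infers the rule from two examples, then applies it to the query."""
    (x0, y0), (x1, y1) = examples[0], examples[1]
    a = (y1 - y0) // (x1 - x0)  # exact, since y is linear in x
    return a * query + (y0 - a * x0)

fresh_seeds = range(1000, 1020)  # seeds assumed unseen during development
print("memorizing:", evaluate(memorizing_solver, fresh_seeds))
print("adaptive:  ", evaluate(adaptive_solver, fresh_seeds))
```

Because each task is regenerated from its seed, publishing the generator does not leak answers in the way a static test set does, although a model could still learn the task family itself. That limitation is one reason the source points to harder targets such as unseen games or new science.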

Topics

Defining Intelligence
AGI Benchmarking
Test-Time Compute
Architectural Innovation
Train-Time Recurrence

Best for: AI Scientist, Research Scientist, Director of AI/ML


Editorial summary, takeaway, and curation by AIssential. Original article published by ARC Prize.