Dertouzos Distinguished Lecturer: Richard Sutton

· Source: MIT CSAIL · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Robotics & Autonomous Systems, Emerging Technologies & Innovation · Depth: Intermediate, extended

Summary

Richard Sutton, a co-architect of modern reinforcement learning and co-author of the foundational "Sutton and Barto" textbook, presented "OAK" (Options And Knowledge), his proposed architecture for achieving superintelligence from experience. The vision follows "The Bitter Lesson," which holds that general methods that leverage computation have historically outperformed approaches that encode human knowledge. OAK extends the consensus agent architecture with "options," which provide temporal abstraction, and "knowledge," meaning learned beliefs about the consequences of executing those options. Under the architecture, an agent continuously learns policies, generates new state features, creates subproblems from highly ranked features (specifically "reward-respecting subproblems" of feature attainment), and learns transition models for the resulting options. Sutton acknowledges that reliable continual deep learning and meta-learning are crucial prerequisites that are still missing for realizing OAK at scale.
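To make "options" and "knowledge about option consequences" concrete, here is a minimal sketch, not taken from the lecture. It assumes a hypothetical six-state corridor with a fixed per-step cost and a hand-written option; the names `Option`, `step`, `option_model`, and `go_to_4` are illustrative only. An option is represented as a policy plus a termination condition, and its model summarizes what happens when it runs: the discounted reward collected, the discount accumulated, and the state where it stops.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

N_STATES = 6        # assumed toy corridor with states 0..5
GAMMA = 0.9         # assumed discount factor
STEP_REWARD = -1.0  # assumed cost per primitive step


def step(state: int, action: int) -> Tuple[int, float]:
    """Deterministic corridor dynamics; action is -1 (left) or +1 (right)."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    return next_state, STEP_REWARD


@dataclass
class Option:
    policy: Callable[[int], int]       # maps a state to a primitive action
    terminates: Callable[[int], bool]  # True in states where the option stops


def option_model(option: Option, start: int, max_steps: int = 100):
    """Roll the option out once and summarize its consequences.

    Returns (discounted reward, accumulated discount, terminal state).
    A single rollout suffices here only because the toy world is deterministic.
    """
    state, total, discount = start, 0.0, 1.0
    for _ in range(max_steps):
        if option.terminates(state):
            break
        state, reward = step(state, option.policy(state))
        total += discount * reward
        discount *= GAMMA
    return total, discount, state


# Hand-written option for illustration: walk toward state 4 and stop there.
go_to_4 = Option(policy=lambda s: 1 if s < 4 else -1,
                 terminates=lambda s: s == 4)

if __name__ == "__main__":
    print(option_model(go_to_4, start=1))
```

In OAK as summarized above, both the option and its model would be learned from experience rather than written by hand; the sketch only shows the kind of object each one is.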

Key takeaway

AI scientists and machine learning engineers designing general intelligence systems should prioritize architectures that let agents discover their own abstractions and skills from raw experience. Avoid building in extensive domain-dependent knowledge, since doing so limits scalability and future progress. Instead, focus on developing reliable continual deep learning and meta-learning, which OAK needs in order to support open-ended, autonomous growth in an agent's complexity and conceptual structures.

Key insights

OAK proposes a domain-independent AI architecture that learns abstractions and skills from experience, scaling with computation.

Method

OAK continuously learns policies, generates new state features, creates subproblems from ranked features, learns option solutions and transition models, and plans with these "jumpy" models, all at runtime.
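The planning step can be illustrated with a small sketch, again under stated assumptions rather than as Sutton's implementation. It assumes a toy corridor with a goal at state 5 and a reward of 1 for reaching it, and it represents every model as a hypothetical table entry `(reward, discount, next_state)`. One-step models describe primitive actions; a "jumpy" model describes a multi-step option that runs all the way to the goal. Value iteration backs up through whichever models it is given.

```python
from typing import Dict, List, Tuple

GAMMA = 0.9  # assumed discount factor
GOAL = 5     # assumed terminal state of a corridor with states 0..5

# (state, option name) -> (discounted reward, discount, next state)
Model = Dict[Tuple[int, str], Tuple[float, float, int]]


def primitive_models() -> Model:
    """One-step models: reward 1 for stepping into the goal, else 0."""
    models: Model = {}
    for s in range(GOAL):
        models[(s, "left")] = (0.0, GAMMA, max(s - 1, 0))
        models[(s, "right")] = (1.0 if s + 1 == GOAL else 0.0, GAMMA, s + 1)
    return models


def jumpy_models() -> Model:
    """Multi-step model of a hypothetical 'go_to_goal' option."""
    models: Model = {}
    for s in range(GOAL):
        k = GOAL - s  # number of primitive steps the option takes
        models[(s, "go_to_goal")] = (GAMMA ** (k - 1), GAMMA ** k, GOAL)
    return models


def plan(models: Model, tol: float = 1e-9,
         max_sweeps: int = 1000) -> Tuple[List[float], int]:
    """Synchronous value iteration; returns (values, sweeps performed)."""
    values = [0.0] * (GOAL + 1)  # the terminal state keeps value 0
    for sweep in range(1, max_sweeps + 1):
        new_values = list(values)
        for s in range(GOAL):
            new_values[s] = max(r + d * values[s2]
                                for (s0, _), (r, d, s2) in models.items()
                                if s0 == s)
        delta = max(abs(a - b) for a, b in zip(values, new_values))
        values = new_values
        if delta < tol:
            return values, sweep
    return values, max_sweeps


if __name__ == "__main__":
    print(plan(primitive_models()))
    print(plan({**primitive_models(), **jumpy_models()}))
```

In this toy setup both calls arrive at the same values, but the run that includes the jumpy model needs fewer sweeps, because a single backup carries value across the whole corridor. The models here are written out analytically for brevity, whereas OAK would learn them at runtime.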

Best for: Research Scientist, AI Scientist, AI Student, Machine Learning Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by MIT CSAIL.