Playing Games With TheoryCoder | ARC Prize @ MIT

· Source: ARC Prize · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Robotics & Autonomous Systems · Depth: Advanced, long

Summary

Sam Gershman's lab at Harvard has developed theory-based reinforcement learning (RL) systems, exemplified by TheoryCoder, to make AI learning more efficient and flexible, particularly in video game environments. The goal is for AI systems to learn games the way humans do: by constructing domain-specific intuitive theories that capture a game's causal laws. Theory-based RL, a form of model-based RL, learns those causal laws and uses them for planning and exploration, and it learns significantly more efficiently than deep learning systems. TheoryCoder addresses the problem of inferring and using theories efficiently by separating low-level game mechanics (written as Python functions) from high-level abstract predicates and operators (written in PDDL), which enables efficient hierarchical planning. The system uses large language models (LLMs) to infer theories from gameplay history and revises them as new observations arrive. In games such as "Baba Is You" and "Sokoban", it has shown near-perfect success rates with fewer API calls than direct LLM planning.
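The revision step described above can be pictured as a prediction check: replay observed transitions through the current theory and collect the cases where it was wrong. The following is a minimal sketch under stated assumptions, not TheoryCoder's actual code. The corridor world, `naive_model`, `revised_model`, and `find_prediction_errors` are all hypothetical names, and the revised model is written by hand to stand in for what an LLM might propose after seeing the mismatch.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical toy world: a 1-D corridor where the state is the player's x position.
State = int
Transition = Tuple[State, str, State]  # (state, action, observed next state)


def naive_model(state: State, action: str) -> State:
    """Initial theory (assumed): moving right always succeeds."""
    return state + 1 if action == "right" else state - 1


def revised_model(state: State, action: str) -> State:
    """Hand-written stand-in for a revised theory: a wall blocks movement past x = 3."""
    nxt = state + 1 if action == "right" else state - 1
    return state if nxt > 3 else nxt


def find_prediction_errors(
    model: Callable[[State, str], State], transitions: List[Transition]
) -> List[Dict[str, object]]:
    """Return every observed transition that the model predicts incorrectly."""
    errors = []
    for state, action, observed in transitions:
        predicted = model(state, action)
        if predicted != observed:
            errors.append(
                {"state": state, "action": action,
                 "predicted": predicted, "observed": observed}
            )
    return errors


# Gameplay history: the last move bumps into the wall, so the position stays at 3.
replay = [(1, "right", 2), (2, "right", 3), (3, "right", 3)]

errors = find_prediction_errors(naive_model, replay)
remaining = find_prediction_errors(revised_model, replay)
```

In this sketch, `errors` holds the single wall collision that the naive theory gets wrong, and `remaining` is empty once the revised theory is substituted. In a system like the one summarized here, the mismatches would be the evidence handed to the LLM when asking for a revised theory.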

Key takeaway

AI scientists and machine learning engineers developing agents for complex, dynamic environments such as video games should consider a theory-based reinforcement learning approach. With hierarchical abstraction, as demonstrated by TheoryCoder, this method learns more efficiently and scales better than direct LLM planning. Teams should focus on building systems that infer and revise intermediate causal theories, which enables robust, human-like adaptive behavior with fewer computational resources.

Key insights

Theory-based reinforcement learning with hierarchical abstraction offers scalable, human-like efficiency for complex sequential decision problems.

Method

TheoryCoder uses LLMs to infer low-level game mechanics (as Python functions) and high-level abstractions (in PDDL), then combines the two for efficient hierarchical planning and revises its theory when new observations contradict it.
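To make the two-level idea concrete, here is a minimal sketch of hierarchical planning on a toy Sokoban-like grid. It rests on several assumptions: the level layout, `step`, `plan_to`, `box_at`, `player_at`, and `hierarchical_plan` are hypothetical names, the low-level rules are hand-written rather than LLM-inferred, and the high-level layer is an ordered list of Python predicates standing in for the subgoals a PDDL planner would produce. Breadth-first search is used for simplicity and is not claimed to be the planner the authors use.

```python
from collections import deque
from typing import Callable, FrozenSet, List, Optional, Tuple

Pos = Tuple[int, int]
State = Tuple[Pos, FrozenSet[Pos]]  # (player position, box positions)

MOVES = {"up": (0, -1), "down": (0, 1), "left": (-1, 0), "right": (1, 0)}
WIDTH, HEIGHT = 5, 3  # hypothetical open level with no interior walls


def in_bounds(p: Pos) -> bool:
    return 0 <= p[0] < WIDTH and 0 <= p[1] < HEIGHT


def step(state: State, action: str) -> State:
    """Low-level theory: predict the next state under Sokoban-like push rules."""
    (px, py), boxes = state
    dx, dy = MOVES[action]
    target = (px + dx, py + dy)
    if not in_bounds(target):
        return state  # blocked by the level boundary
    if target in boxes:
        beyond = (target[0] + dx, target[1] + dy)
        if not in_bounds(beyond) or beyond in boxes:
            return state  # the box cannot be pushed
        boxes = (boxes - {target}) | {beyond}
    return (target, boxes)


def plan_to(
    state: State, goal_test: Callable[[State], bool], max_nodes: int = 10_000
) -> Optional[Tuple[List[str], State]]:
    """Breadth-first search over the low-level model until goal_test holds."""
    frontier = deque([(state, [])])
    seen = {state}
    while frontier and len(seen) <= max_nodes:
        current, path = frontier.popleft()
        if goal_test(current):
            return path, current
        for action in MOVES:
            nxt = step(current, action)
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, path + [action]))
    return None


def box_at(pos: Pos) -> Callable[[State], bool]:
    return lambda s: pos in s[1]


def player_at(pos: Pos) -> Callable[[State], bool]:
    return lambda s: s[0] == pos


def hierarchical_plan(
    state: State, subgoals: List[Callable[[State], bool]]
) -> Optional[Tuple[List[str], State]]:
    """Solve each high-level subgoal in order with a short low-level search."""
    actions: List[str] = []
    for goal_test in subgoals:
        result = plan_to(state, goal_test)
        if result is None:
            return None
        path, state = result
        actions.extend(path)
    return actions, state


start: State = ((0, 1), frozenset({(2, 1)}))
result = hierarchical_plan(start, [box_at((4, 1)), player_at((0, 0))])
```

Splitting the task into subgoals keeps each low-level search short: the planner only has to reach the next abstract condition instead of searching the full action sequence in one pass.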

Best for: AI Scientist, Machine Learning Engineer, Research Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by ARC Prize.