Google Goes to EXTREMES: In-Context Symbolic AI
Summary
Google DeepMind's latest research, a preprint published in late December 2025, introduces an "in-context symbolic AI" approach that makes Large Language Models (LLMs) carry out symbolic programming purely at inference time. The method aims to improve LLM planning by placing intrinsic self-critique and Planning Domain Definition Language (PDDL) definitions directly in the model's context window. LLMs are probabilistic and are generally considered weak at rigid logic, but the study reports that once a constrained "universe" is defined with PDDL rules, predicates, and actions, they can reach nearly 90% accuracy on tasks like the Tower of Hanoi. The process runs through plan generation, self-critique against the PDDL definitions, state tracking, verification, and revision, and it often ensembles several critiques with majority voting to improve reliability. The results suggest that LLMs have latent reasoning capabilities that emerge in specific, highly constrained runtime environments.
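To make "state tracking" and "verification" concrete, here is a minimal Python sketch of what checking a Tower of Hanoi plan step by step amounts to. It is an external stand-in, not the paper's method: in the approach described above, the LLM performs these checks itself inside the context window. The names `verify_plan`, `check_precondition`, and `apply_move`, and the peg labels A, B, C, are assumptions made for illustration.

```python
from typing import Dict, List, Optional, Tuple

State = Dict[str, List[int]]  # peg name -> disks bottom-to-top; larger int = larger disk
Move = Tuple[str, str]        # (source peg, target peg)


def initial_state(n_disks: int) -> State:
    """All disks start on peg A, largest at the bottom."""
    return {"A": list(range(n_disks, 0, -1)), "B": [], "C": []}


def check_precondition(state: State, move: Move) -> Optional[str]:
    """Return None if the move is legal, else a readable description of the violation."""
    src, dst = move
    if src not in state or dst not in state:
        return f"unknown peg in move {move}"
    if src == dst:
        return "source and target are the same peg"
    if not state[src]:
        return f"peg {src} is empty"
    if state[dst] and state[dst][-1] < state[src][-1]:
        return f"disk {state[src][-1]} cannot sit on smaller disk {state[dst][-1]}"
    return None


def apply_move(state: State, move: Move) -> State:
    """Return the successor state; the input state is not mutated."""
    src, dst = move
    new_state = {peg: disks[:] for peg, disks in state.items()}
    new_state[dst].append(new_state[src].pop())
    return new_state


def verify_plan(n_disks: int, plan: List[Move]) -> Tuple[bool, str]:
    """Check every step's precondition, then the goal; report the first failure."""
    state = initial_state(n_disks)
    for step, move in enumerate(plan, start=1):
        violation = check_precondition(state, move)
        if violation is not None:
            return False, f"step {step}: {violation}"
        state = apply_move(state, move)
    if state["C"] != list(range(n_disks, 0, -1)):
        return False, "plan is legal but does not reach the goal"
    return True, "ok"
```

Because the verifier reports the first failing step instead of a bare pass/fail, its message can be fed back as revision feedback.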
Key takeaway
If you are an AI scientist or research scientist building autonomous agents, stateless prompt structures are a likely cause of low success rates and hallucination in complex reasoning tasks. Formalize your domain with explicit rules such as PDDL, modify system prompts to demand the resulting state after every action, and implement an intrinsic critique or compiler loop. This forces the LLM to verify preconditions and track state, which makes its behavior more consistent and its plans more accurate, at the cost of additional compute.
Key insights
LLMs can perform symbolic reasoning with high accuracy when rigorously constrained by in-context PDDL definitions.
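As a rough illustration of what an "in-context PDDL definition" could look like, the sketch below places a STRIPS-style Tower of Hanoi domain and a two-disk problem into a prompt string. The PDDL follows the classic textbook encoding of the puzzle and is not taken from the preprint; the function name `build_context` and the closing instruction text are assumptions.

```python
# A classic STRIPS encoding of Tower of Hanoi (illustrative, not from the preprint).
HANOI_DOMAIN = """\
(define (domain hanoi)
  (:requirements :strips)
  (:predicates (clear ?x) (on ?x ?y) (smaller ?x ?y))
  (:action move
    :parameters (?disc ?from ?to)
    :precondition (and (smaller ?disc ?to) (on ?disc ?from)
                       (clear ?disc) (clear ?to))
    :effect (and (clear ?from) (on ?disc ?to)
                 (not (on ?disc ?from)) (not (clear ?to)))))
"""

# Two disks (d1 is the smaller one) stacked on peg1; the goal is the same stack on peg3.
HANOI_PROBLEM = """\
(define (problem hanoi-2)
  (:domain hanoi)
  (:objects peg1 peg2 peg3 d1 d2)
  (:init
    (smaller d1 peg1) (smaller d1 peg2) (smaller d1 peg3)
    (smaller d2 peg1) (smaller d2 peg2) (smaller d2 peg3)
    (smaller d1 d2)
    (clear peg2) (clear peg3) (clear d1)
    (on d2 peg1) (on d1 d2))
  (:goal (and (on d2 peg3) (on d1 d2))))
"""


def build_context(domain: str, problem: str) -> str:
    """Concatenate domain, problem, and an output instruction into one prompt."""
    instruction = (
        "Using only the actions defined above, produce a plan as a list of "
        "(move ?disc ?from ?to) steps. Check every precondition before each step."
    )
    return f"{domain}\n{problem}\n{instruction}"


prompt = build_context(HANOI_DOMAIN, HANOI_PROBLEM)
```

The resulting `prompt` would be sent to whichever model client you use; no specific API is assumed here.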
Principles
- Intrinsic verification improves probabilistic system reliability.
- Massively restricting solution space enables self-correction.
- Process-based verification is more effective than outcome-based verification.
Method
DeepMind's method forces LLMs to run symbolic programs by injecting PDDL domain definitions into the context window, enabling explicit precondition checks, state tracking, and an intrinsic self-critique loop with majority voting for plan validation.
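The majority-voting step can be sketched in a few lines of Python. This assumes each critique is reduced to a boolean verdict and that ties count as a rejection; neither detail is confirmed by the summary above. The `Critic` type is hypothetical and would, in practice, wrap repeated critique samples from the same model.

```python
from collections import Counter
from typing import Callable, Sequence

Critic = Callable[[str], bool]  # hypothetical: True means "this plan looks valid"


def majority_vote(verdicts: Sequence[bool]) -> bool:
    """Accept only on a strict majority of 'valid' verdicts; ties are rejected."""
    if not verdicts:
        raise ValueError("need at least one verdict")
    counts = Counter(verdicts)
    return counts[True] > counts[False]


def ensemble_critique(plan: str, critics: Sequence[Critic]) -> bool:
    """Run every critic on the plan and combine the verdicts by majority vote."""
    return majority_vote([critic(plan) for critic in critics])
```

Using an odd number of critiques avoids ties altogether, so the tie-breaking assumption never comes into play.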
In practice
- Formalize your domain with invariants for autonomous agents.
- Modify prompts to require state transition outputs.
- Implement a compiler loop for intrinsic critique.
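The three steps above can be sketched as a small generate, critique, revise loop. This is a minimal sketch under stated assumptions: `propose` stands in for an LLM call, `critique` stands in for the intrinsic critique (which could be the same model prompted to check preconditions), and the toy door domain exists only so the example runs without a model. None of these names come from the original research.

```python
from typing import Callable, List, Optional, Tuple

Plan = List[str]
Proposer = Callable[[str], Plan]             # hypothetical LLM call: prompt -> plan
Critic = Callable[[Plan], Tuple[bool, str]]  # plan -> (is_valid, feedback)


def build_prompt(domain_rules: str, task: str, feedback: Optional[str]) -> str:
    """Restate the rules, demand per-step state, and include any prior rejection."""
    parts = [
        "Domain rules:",
        domain_rules,
        "Task:",
        task,
        "After every action, print the full resulting state.",
    ]
    if feedback:
        parts += ["Your previous plan was rejected:", feedback]
    return "\n".join(parts)


def plan_with_critique(
    domain_rules: str,
    task: str,
    propose: Proposer,
    critique: Critic,
    max_rounds: int = 3,
) -> Tuple[Optional[Plan], int]:
    """Loop until the critic accepts a plan or the round budget runs out."""
    feedback: Optional[str] = None
    for round_no in range(1, max_rounds + 1):
        plan = propose(build_prompt(domain_rules, task, feedback))
        ok, feedback = critique(plan)
        if ok:
            return plan, round_no
    return None, max_rounds


# Toy stand-ins so the sketch runs without a model.
def toy_propose(prompt: str) -> Plan:
    """Forget to unlock the door on the first try; fix it once feedback appears."""
    if "rejected" in prompt:
        return ["unlock door", "open door"]
    return ["open door"]


def toy_critique(plan: Plan) -> Tuple[bool, str]:
    """Check the goal and the single precondition of the toy domain."""
    if "open door" not in plan:
        return False, "goal not reached: door is never opened"
    if "unlock door" not in plan[: plan.index("open door")]:
        return False, "precondition failed: door must be unlocked before it is opened"
    return True, "ok"


plan, rounds = plan_with_critique(
    "A locked door must be unlocked before it is opened.",
    "Open the door.",
    toy_propose,
    toy_critique,
)
```

Each extra round costs another model call, which is the compute overhead the takeaway mentions; `max_rounds` caps it.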
Topics
- In-Context Symbolic AI
- LLM Planning
- PDDL
- Intrinsic Self-Critique
- Compute Inefficiency
Best for: AI Scientist, Research Scientist, AI Researcher, AI Engineer, Machine Learning Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by Discover AI.