Google Goes to EXTREMES: In-Context Symbolic AI

· Source: Discover AI · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Robotics & Autonomous Systems · Depth: Expert, long

Summary

Google DeepMind's latest research, a preprint published in late December 2025, introduces an "in-context symbolic AI" approach that forces Large Language Models (LLMs) to perform symbolic programming purely through inference. The method aims to improve LLM planning by placing intrinsic self-critique and Planning Domain Definition Language (PDDL) definitions directly in the LLM's context window. LLMs are probabilistic and are traditionally considered weak at rigid logic, but the study shows that once a constrained "universe" is defined with PDDL rules, predicates, and actions, LLMs can reach nearly 90% accuracy on tasks like the Tower of Hanoi. The process runs through plan generation, self-critique against the PDDL definitions, state tracking, verification, and revision, and it often uses an ensemble of critiques with majority voting to improve reliability. The results suggest that LLMs possess latent reasoning capabilities that emerge only in specific, highly constrained runtime environments.
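The precondition checks and state tracking described above can be illustrated outside the model. The following is a minimal Python sketch, not the preprint's code: the peg names, the `validate_plan` helper, and the choice of peg "C" as the goal are all assumptions made for illustration. It shows the kind of step-by-step verification the LLM is asked to carry out in context.

```python
from typing import Dict, List, Tuple

Move = Tuple[str, str]            # (source peg, destination peg)
State = Dict[str, List[int]]      # peg name -> disks, bottom to top

def initial_state(n_disks: int) -> State:
    # Hypothetical setup: all disks start on peg "A", largest at the bottom.
    return {"A": list(range(n_disks, 0, -1)), "B": [], "C": []}

def check_preconditions(state: State, move: Move) -> str:
    """Return an empty string if the move is legal, else the reason it is not."""
    src, dst = move
    if src not in state or dst not in state:
        return f"unknown peg in move {move}"
    if src == dst:
        return "source and destination are the same peg"
    if not state[src]:
        return f"peg {src} is empty"
    if state[dst] and state[dst][-1] < state[src][-1]:
        return f"disk {state[src][-1]} cannot sit on smaller disk {state[dst][-1]}"
    return ""

def apply_move(state: State, move: Move) -> State:
    # Apply the action's effects to a copy, so earlier states stay inspectable.
    src, dst = move
    new_state = {peg: disks[:] for peg, disks in state.items()}
    new_state[dst].append(new_state[src].pop())
    return new_state

def validate_plan(plan: List[Move], n_disks: int) -> Tuple[bool, str]:
    """Check every precondition, track state, then test the goal (all disks on "C")."""
    state = initial_state(n_disks)
    for step, move in enumerate(plan, start=1):
        reason = check_preconditions(state, move)
        if reason:
            return False, f"step {step}: {reason}"
        state = apply_move(state, move)
    if state["C"] != list(range(n_disks, 0, -1)):
        return False, "goal not reached"
    return True, "plan valid"
```

Calling `validate_plan` on a candidate plan returns a pass/fail flag plus the first violated precondition, which is the kind of feedback a revision step can act on.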

Key takeaway

If you are an AI Scientist or Research Scientist developing autonomous agents, stateless prompt structures are likely contributing to low success rates and increased hallucination in complex reasoning tasks. Formalize your domain with explicit rules such as PDDL, modify system prompts to demand the resulting state after every action, and implement an intrinsic critique or compiler-style loop. This forces the LLM to verify preconditions and track state, which makes its behavior more deterministic and more accurate at the cost of additional computational overhead.
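One way to picture this advice is a system prompt that embeds the domain definition and requires a state report after each action. The sketch below is an assumption throughout: the PDDL text is an illustrative Tower of Hanoi domain written for this example (not taken from the preprint), and `build_system_prompt` is a hypothetical helper, not part of any library.

```python
# Illustrative PDDL domain (an assumption, not the preprint's own definition).
# Convention used here: (smaller ?x ?y) means ?x may be placed on ?y.
HANOI_DOMAIN = """(define (domain hanoi)
  (:requirements :strips)
  (:predicates (clear ?x) (on ?x ?y) (smaller ?x ?y))
  (:action move
    :parameters (?disk ?from ?to)
    :precondition (and (smaller ?disk ?to) (on ?disk ?from)
                       (clear ?disk) (clear ?to))
    :effect (and (clear ?from) (on ?disk ?to)
                 (not (on ?disk ?from)) (not (clear ?to)))))"""

def build_system_prompt(domain_pddl: str) -> str:
    """Assemble a prompt that puts the domain in context and demands state tracking."""
    rules = [
        "You are a planner operating strictly inside the PDDL domain below.",
        "For every action, first list each precondition and whether it holds.",
        "After every action, print the full resulting state as a set of predicates.",
        "If any precondition fails, stop and revise the plan from that step.",
    ]
    return "\n".join(rules) + "\n\nDOMAIN:\n" + domain_pddl

if __name__ == "__main__":
    print(build_system_prompt(HANOI_DOMAIN))
```

The problem instance (objects, initial state, goal) would be supplied in the user message; it is left out here to keep the sketch short.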

Key insights

LLMs can perform symbolic reasoning with high accuracy when rigorously constrained by in-context PDDL definitions.

Principles

Method

DeepMind's method forces LLMs to run symbolic programs by injecting PDDL domain definitions into the context window, enabling explicit precondition checks, state tracking, and an intrinsic self-critique loop with majority voting for plan validation.
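The critique loop with majority voting can be sketched as plain control flow. This is a minimal illustration under stated assumptions: `generate` and each entry in `critics` stand in for separate LLM calls, the round limit is arbitrary, and ties are treated as rejection. None of these choices are confirmed by the source.

```python
from collections import Counter
from typing import Callable, List, Optional, Sequence

Plan = List[str]
Generator = Callable[[Optional[str]], Plan]   # feedback (or None) -> candidate plan
Critic = Callable[[Plan], bool]               # candidate plan -> accept / reject

def majority_vote(verdicts: Sequence[bool]) -> bool:
    """Accept only on a strict majority of True verdicts (a tie counts as reject)."""
    counts = Counter(verdicts)
    return counts[True] > counts[False]

def plan_with_self_critique(
    generate: Generator,
    critics: Sequence[Critic],
    max_rounds: int = 3,
) -> Optional[Plan]:
    """Generate a plan, collect critiques, and revise until the ensemble accepts."""
    feedback: Optional[str] = None
    for _ in range(max_rounds):
        plan = generate(feedback)
        verdicts = [critic(plan) for critic in critics]
        if majority_vote(verdicts):
            return plan
        rejected = verdicts.count(False)
        feedback = f"{rejected} of {len(verdicts)} critiques rejected the plan; revise it."
    return None  # no plan was accepted within the round limit
```

Using an odd number of critics avoids ties, and a single noisy critique cannot overturn two that agree, which is the reliability benefit attributed to ensembling.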

Best for: AI Scientist, Research Scientist, AI Researcher, AI Engineer, Machine Learning Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Discover AI.