Google's Warning: ICL Context is Inert

2026-02-06 · Source: Discover AI · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Natural Language Processing · Depth: Advanced, long

Summary

A study from Google DeepMind, Brown University, and New York University, published February 4, 2026, reports that large language models (LLMs) struggle to use representations they acquire through in-context learning (ICL). The models encode complex topologies, such as 5x5 grids, in their residual streams with high fidelity (up to 85% distance correlation), yet they fail at adaptive world modeling tasks that require deploying those representations. On a 16-state 1D chain, accuracy reached 60%; on 2D grids (4x4, 5x5), it fell below 50% for one-step tasks and under 20% for two-step and three-step tasks. The limitation persists across open-weight LLMs (4B to 27B parameters) and extends to large proprietary reasoning models such as "GPT5": even with extensive chain-of-thought prompting (up to 5,000 tokens), accuracy on 2D grids collapses to less than 10%. The core issue appears to be that the self-attention mechanism cannot interpret and act on these well-structured internal representations, which leaves them functionally inert for complex reasoning.
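To make the "distance correlation" figure concrete, here is a minimal sketch of one way such a score could be computed. It assumes the score is a Pearson correlation between pairwise grid distances (Manhattan) and pairwise distances between per-state vectors (Euclidean); the summary does not specify the study's exact statistic. The vectors are synthetic stand-ins for residual-stream activations, and all function names are hypothetical.

```python
import itertools
import math
import random


def grid_coords(rows, cols):
    """List every (row, col) cell of a rows x cols grid."""
    return [(r, c) for r in range(rows) for c in range(cols)]


def manhattan(a, b):
    """Shortest-path distance between two cells on a 4-connected grid."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def euclidean(u, v):
    """Straight-line distance between two vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def distance_agreement(states, vectors):
    """Correlate graph distances with representation distances over all state pairs."""
    pairs = list(itertools.combinations(range(len(states)), 2))
    graph_d = [manhattan(states[i], states[j]) for i, j in pairs]
    rep_d = [euclidean(vectors[i], vectors[j]) for i, j in pairs]
    return pearson(graph_d, rep_d)


# Assumption: noisy grid coordinates stand in for real model activations.
rng = random.Random(0)
states = grid_coords(5, 5)
vectors = [[r + rng.gauss(0, 0.3), c + rng.gauss(0, 0.3)] for r, c in states]
score = distance_agreement(states, vectors)
print(f"distance agreement on a 5x5 grid: {score:.2f}")
```

A high score here only says the geometry of the vectors mirrors the grid. It says nothing about whether a model can use that geometry to answer questions, which is the gap the study describes.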

Key takeaway

For AI scientists developing or deploying LLMs for tasks that require spatial or topological reasoning, this research points to a fundamental limitation in current ICL and self-attention mechanisms. Models may encode complex "maps" internally yet fail to "read" or act on them for multi-step inference. Consider architectural innovations beyond standard self-attention, or specialized training for non-linear reasoning, to overcome this functional inertness, especially for applications in chemistry, finance, or physics.

Key insights

LLMs encode complex topologies internally but cannot functionally utilize these in-context learned representations for multi-step reasoning.

Principles

Internal representations can be mathematically perfect yet functionally inert.
Self-attention mechanisms are optimized for linear sequences, not complex topologies.

Method

The study used an "adaptive world modeling" task: after seeing few-shot examples, LLMs had to predict the outcome of novel steps on various topologies (1D chains, 2D grids). Accuracy was measured separately for 1-step, 2-step, and 3-step queries.
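The task described above can be sketched roughly as follows. This is an illustrative reconstruction, not the study's code: the state labels, move names, prompt format, and every function name are assumptions. States get shuffled arbitrary labels so that the topology can only be inferred from the in-context transitions, and a 1D chain is modeled as a grid with a single row.

```python
import random

# Assumed action vocabulary for a 4-connected grid.
MOVES = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


def make_grid(rows, cols, rng):
    """Map each cell to a shuffled label so labels carry no positional hint."""
    labels = [f"s{i}" for i in range(rows * cols)]
    rng.shuffle(labels)
    cells = [(r, c) for r in range(rows) for c in range(cols)]
    return dict(zip(cells, labels))


def legal_moves(cell, grid):
    """Return (move_name, next_cell) pairs that stay inside the grid."""
    out = []
    for name, (dr, dc) in MOVES.items():
        nxt = (cell[0] + dr, cell[1] + dc)
        if nxt in grid:
            out.append((name, nxt))
    return out


def random_walk(grid, length, rng):
    """Generate (from_label, move, to_label) transitions as few-shot context."""
    cell = rng.choice(sorted(grid))
    steps = []
    for _ in range(length):
        name, nxt = rng.choice(legal_moves(cell, grid))
        steps.append((grid[cell], name, grid[nxt]))
        cell = nxt
    return steps


def make_query(grid, k, rng):
    """Pick a start state and k legal moves; return start, moves, true end label."""
    cell = rng.choice(sorted(grid))
    start = grid[cell]
    moves = []
    for _ in range(k):
        name, cell = rng.choice(legal_moves(cell, grid))
        moves.append(name)
    return start, moves, grid[cell]


def build_prompt(walk, start, moves):
    """Render the context transitions followed by one open k-step question."""
    lines = [f"{a} {m} -> {b}" for a, m, b in walk]
    lines.append(f"{start} {' '.join(moves)} -> ?")
    return "\n".join(lines)


rng = random.Random(0)
grid = make_grid(5, 5, rng)            # 5x5 grid; make_grid(1, 16, rng) gives a 16-state chain
walk = random_walk(grid, 200, rng)     # in-context examples
start, moves, answer = make_query(grid, 2, rng)
prompt = build_prompt(walk, start, moves)
```

Varying `k` in `make_query` yields the 1-step, 2-step, and 3-step conditions, and `answer` holds the ground-truth label to score a model's completion against.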

In practice

Avoid relying on ICL for complex, multi-step reasoning in non-linear domains.
Evaluate LLM performance on tasks requiring topological understanding beyond simple pattern extension.
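One way to act on these recommendations is to report accuracy separately for each step count, so that a drop from one-step to multi-step queries stays visible. The sketch below assumes exact-match scoring and a hypothetical `predict` callable standing in for a model call; the example prompts and labels are invented for illustration.

```python
from collections import defaultdict


def accuracy_by_steps(examples, predict):
    """Score (prompt, expected, k) examples and return accuracy keyed by step count k."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for prompt, expected, k in examples:
        totals[k] += 1
        if predict(prompt).strip() == expected:
            hits[k] += 1
    return {k: hits[k] / totals[k] for k in sorted(totals)}


# Invented examples: (prompt, expected answer, number of steps).
examples = [
    ("s3 right -> ?", "s7", 1),
    ("s7 down -> ?", "s1", 1),
    ("s3 right down -> ?", "s1", 2),
    ("s3 right down left -> ?", "s9", 3),
]


def lookup_predict(prompt):
    """Stand-in for a model: it only recalls one-step transitions it has memorized."""
    table = {"s3 right -> ?": "s7", "s7 down -> ?": "s1"}
    return table.get(prompt, "unknown")


report = accuracy_by_steps(examples, lookup_predict)
print(report)  # {1: 1.0, 2: 0.0, 3: 0.0}
```

The stand-in predictor answers every one-step query and no multi-step query, which a single aggregate accuracy of 50% would hide.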

Topics

In-Context Learning
LLM Limitations
Representation Learning
Self-Attention Mechanism
Adaptive World Modeling

Best for: AI Scientist, Research Scientist, AI Researcher, Machine Learning Engineer, Deep Learning Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Discover AI.