What's Missing Between LLMs and AGI - Vishal Misra & Martin Casado

Source: The a16z Show · Field: Technology & Digital (Artificial Intelligence & Machine Learning, Mathematics & Computational Sciences) · Depth: Advanced, extended

Summary

Vishal Misra's research indicates that large language models (LLMs), specifically transformers, update their predictions through a precise, mathematically predictable Bayesian process. His team built a "Bayesian wind tunnel" to test this both empirically and mathematically, and reports that transformers update their beliefs to within 10^-3 bits of the exact Bayesian answer, outperforming Mamba, LSTMs, and MLPs. LLMs excel at correlation, which Misra associates with Shannon entropy, but he argues they lack the post-training plasticity and the causal understanding, which he associates with Kolmogorov complexity, that Artificial General Intelligence (AGI) requires. He proposes that AGI needs architectures capable of continual learning that move beyond pattern matching to build causal models. His illustration is the "Einstein test": an LLM trained only on pre-1916 physics would, he argues, fail to derive relativity.

Key takeaway

AI scientists and ML engineers designing next-generation systems should recognize that current LLMs, while adept at Bayesian updating and correlation, lack post-training plasticity and causal reasoning. To advance toward AGI, prioritize architectures that enable continual learning and build causal models that go beyond Shannon-entropy-style correlation, rather than relying solely on larger models or more training data.

Key insights

LLMs, particularly transformers, perform precise Bayesian updating but lack the plasticity and causal reasoning that Misra considers essential for AGI.

Method

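The "Bayesian wind tunnel" tests blank architectures on tasks that cannot be memorized and whose Bayesian posteriors are known analytically, so each model's predictions can be compared directly against the exact answer to measure how precisely it updates.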
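
To make the idea of an "analytically known posterior" concrete, here is a minimal sketch in Python. It is not Misra's benchmark: the coin-flip task, the Beta prior, the use of KL divergence as the gap measure, and the function names (bayes_predictive, kl_bits, mean_gap_bits, frequency_model) are all assumptions chosen for illustration. The pattern it shows is the one described above: compute the exact Bayesian prediction in closed form, then score a model by how many bits its prediction deviates from it.

```python
import math


def bayes_predictive(flips, alpha=1.0, beta=1.0):
    """Exact posterior predictive P(next flip = 1) under a Beta(alpha, beta) prior."""
    heads = sum(flips)
    return (alpha + heads) / (alpha + beta + len(flips))


def kl_bits(p, q):
    """KL(Bernoulli(p) || Bernoulli(q)) measured in bits."""
    total = 0.0
    for pi, qi in ((p, q), (1.0 - p, 1.0 - q)):
        if pi > 0.0:  # terms with zero probability contribute nothing
            total += pi * math.log2(pi / qi)
    return total


def mean_gap_bits(model, flips, alpha=1.0, beta=1.0):
    """Average gap in bits between a model's predictions and the exact
    posterior predictive, evaluated after every prefix of the sequence."""
    gaps = []
    for t in range(len(flips) + 1):
        prefix = flips[:t]
        exact = bayes_predictive(prefix, alpha, beta)
        gaps.append(kl_bits(exact, model(prefix)))
    return sum(gaps) / len(gaps)


def frequency_model(prefix):
    """Hypothetical stand-in for a trained network: a smoothed frequency
    estimate that implicitly assumes a different prior than the true one."""
    return (sum(prefix) + 0.5) / (len(prefix) + 1.0)


if __name__ == "__main__":
    flips = [1, 0, 1, 1, 0, 1, 1, 1]
    print("exact updater gap (bits):", mean_gap_bits(bayes_predictive, flips))
    print("stand-in model gap (bits):", mean_gap_bits(frequency_model, flips))
```

In a real experiment of this kind, frequency_model would be replaced by the predictive distribution read off a trained network, and a small average gap would indicate that the network's updates closely track Bayes' rule.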

Best for: Research Scientist, AI Scientist, Machine Learning Engineer, AI Student

Editorial summary, takeaway, and curation by AIssential. Original article published by The a16z Show.