Measuring progress toward AGI: A cognitive framework

2026-03-17 · Source: Google DeepMind · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Data Science & Analytics, Research Methodology & Innovation · Depth: Intermediate, medium

Summary

On March 17, 2026, Google DeepMind introduced a cognitive framework and a Kaggle hackathon aimed at empirically measuring progress toward Artificial General Intelligence (AGI). Its paper, "Measuring Progress Toward AGI: A Cognitive Taxonomy," proposes a scientific foundation rooted in psychology, neuroscience, and cognitive science, and identifies 10 cognitive abilities it considers crucial for general intelligence in AI systems: Perception, Generation, Attention, Learning, Memory, Reasoning, Metacognition, Executive functions, Problem solving, and Social cognition. To put the framework into practice, Google DeepMind launched a Kaggle hackathon with a $200,000 prize pool, inviting the research community to design evaluations for five of these abilities: Learning, Metacognition, Attention, Executive functions, and Social cognition. Submissions are open from March 17 to April 16, with results announced on June 1.

Key takeaway

AI and research scientists focused on AGI development should consider integrating Google DeepMind's cognitive taxonomy into their evaluation strategies. The framework offers a structured way to benchmark AI systems against human cognitive abilities and to track progress empirically. Submit to the Kaggle hackathon by April 16 to contribute new evaluations for the five targeted abilities and compete for a share of the $200,000 prize pool.

Key insights

A cognitive taxonomy and evaluation protocol are proposed to empirically measure AGI progress against human capabilities.

Principles

AGI evaluation requires a broad cognitive taxonomy.
Benchmark AI against human performance distributions.
Held-out test sets guard against data contamination.

Method

Evaluate AI systems across the 10 cognitive abilities using targeted tasks, collect human baselines on the same tasks, then map AI performance onto the human performance distribution for each ability.
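
The mapping step can be illustrated with a small sketch. It is not the paper's protocol: it assumes each ability yields a single task score per participant and uses a percentile rank (ties counted as half) as one plausible way to place an AI score within the human distribution. The function names and all numbers are hypothetical.

```python
from bisect import bisect_left, bisect_right


def percentile_rank(human_scores, ai_score):
    """Percent of human scores below ai_score, counting ties as half.

    Assumption: a percentile rank is only one possible way to express
    "performance relative to the human distribution".
    """
    if not human_scores:
        raise ValueError("human_scores must not be empty")
    ordered = sorted(human_scores)
    below = bisect_left(ordered, ai_score)
    ties = bisect_right(ordered, ai_score) - below
    return 100.0 * (below + 0.5 * ties) / len(ordered)


def cognitive_profile(human_baselines, ai_scores):
    """Map each ability to the AI system's percentile within the human sample."""
    return {
        ability: percentile_rank(human_baselines[ability], score)
        for ability, score in ai_scores.items()
    }


# Made-up scores for illustration only; not data from the paper.
human_baselines = {
    "Attention": [0.55, 0.60, 0.70, 0.80, 0.90],
    "Metacognition": [0.40, 0.50, 0.60, 0.70, 0.80],
}
ai_scores = {"Attention": 0.75, "Metacognition": 0.50}

profile = cognitive_profile(human_baselines, ai_scores)
for ability, pct in profile.items():
    print(f"{ability}: {pct:.1f}th percentile of the human sample")
```

Reporting one percentile per ability, rather than a single aggregate, keeps the result a profile, so a system that is strong on one ability and weak on another is not averaged into a misleading middle score.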

In practice

Design evaluations for Learning, Metacognition, Attention, Executive functions, or Social cognition.
Use Kaggle's Community Benchmarks to test evaluations.

Topics

Artificial General Intelligence
Cognitive Science
AI Evaluation
Kaggle Hackathon
Cognitive Taxonomy
Benchmarking

Best for: AI Scientist, Research Scientist, Data Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by Google DeepMind.