AI IQ is here: a new site scores frontier AI models on the human IQ scale. The results are already dividing tech.

· Source: VentureBeat · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Data Science & Analytics, Emerging Technologies & Innovation · Depth: Intermediate, medium

Summary

AI IQ, a new startup project, assigns estimated intelligence quotient (IQ) and emotional intelligence (EQ) scores to more than 50 large language models (LLMs) and plots them on a bell curve and scatter plots. Launched by Ryan Shea, co-founder of Stacks, the platform aggregates 12 benchmarks into four reasoning dimensions: abstract, mathematical, programmatic, and academic. Benchmark scores are mapped using hand-calibrated difficulty curves that compress the ceiling for easier benchmarks, and the composite IQ is a straight average of the four dimensions. As of mid-May 2026, OpenAI's GPT-5.5 leads with an IQ near 136, closely followed by Anthropic's Opus 4.7 (IQ 132, EQ 132) and Google's Gemini 3.1 Pro (IQ 131). The platform also reports an "Effective Cost" metric, which shows that top-tier models such as GPT-5.5 and Opus 4.7 cost more than $30 and $50 per task, respectively, while models such as DeepSeek-V3.2 offer respectable IQs (112-120) for $1-$5 per task.

Key takeaway

AI engineers evaluating LLMs for enterprise deployment should use a multi-dimensional assessment that covers not only cognitive performance (IQ) but also emotional intelligence (EQ) and, critically, effective cost. Because the intelligence gap between high-cost and mid-range models is narrowing, model routing becomes the practical way to balance performance and budget: send only complex tasks to expensive models and handle routine workloads with more economical ones.
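The routing idea can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the model names, scores, and costs are hypothetical placeholders loosely in the ranges quoted in the summary, and treating task difficulty as a required score on the same scale is a simplification, not something AI IQ or the original article prescribes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelOption:
    name: str             # hypothetical label, not a real model ID
    est_iq: float         # estimated IQ-style capability score
    cost_per_task: float  # effective cost in USD per task


def route(required_score: float, options: list[ModelOption]) -> ModelOption:
    """Pick the cheapest model whose score meets the required level.

    If no model qualifies, fall back to the highest-scoring one.
    """
    capable = [m for m in options if m.est_iq >= required_score]
    if capable:
        return min(capable, key=lambda m: m.cost_per_task)
    return max(options, key=lambda m: m.est_iq)


# Illustrative values only; not taken from the AI IQ site.
OPTIONS = [
    ModelOption("frontier-model", est_iq=135.0, cost_per_task=30.0),
    ModelOption("mid-range-model", est_iq=115.0, cost_per_task=3.0),
]

print(route(110.0, OPTIONS).name)  # routine task
print(route(130.0, OPTIONS).name)  # complex task
```

In this sketch the routine task goes to the cheaper model and the complex task to the expensive one. A production router would also need some way to estimate task difficulty up front, which is outside the scope of this example.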

Key insights

AI IQ provides a unified framework for benchmarking LLMs across IQ, EQ, and cost, despite methodological debates.

Method

AI IQ aggregates 12 benchmarks into four reasoning dimensions (abstract, mathematical, programmatic, and academic) and takes the straight average of the four dimension scores as the composite IQ. EQ is an equally weighted (50/50) composite of EQ-Bench 3 Elo and Arena Elo scores, with a bias correction for Anthropic models.
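The arithmetic described above can be sketched as follows. This is a simplified illustration, not the site's implementation: it assumes the per-dimension and per-source scores have already been mapped onto the IQ and EQ scales, it does not model the hand-calibrated difficulty curves or the Anthropic bias correction (the source does not specify either), and the function names and example numbers are hypothetical.

```python
from statistics import mean

DIMENSIONS = ("abstract", "mathematical", "programmatic", "academic")


def composite_iq(dimension_scores: dict[str, float]) -> float:
    """Straight (unweighted) average of the four dimension scores."""
    missing = [d for d in DIMENSIONS if d not in dimension_scores]
    if missing:
        raise ValueError(f"missing dimensions: {missing}")
    return mean(dimension_scores[d] for d in DIMENSIONS)


def composite_eq(eq_bench_score: float, arena_score: float) -> float:
    """50/50 blend of two scores assumed to be on the EQ scale already."""
    return 0.5 * eq_bench_score + 0.5 * arena_score


# Hypothetical dimension scores, for illustration only.
example = {
    "abstract": 120.0,
    "mathematical": 130.0,
    "programmatic": 126.0,
    "academic": 124.0,
}

print(composite_iq(example))       # 125.0
print(composite_eq(130.0, 134.0))  # 132.0
```

Because the composite is an unweighted mean, a model that is weak in one dimension loses exactly a quarter of that shortfall in its overall score.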

Best for: AI Engineer, Machine Learning Engineer, NLP Engineer, Director of AI/ML, AI Architect, AI Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by VentureBeat.