Best AI Models Today (AGI-2 TEST)

2026-05-07 · Source: Discover AI · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Emerging Technologies & Innovation · Depth: Intermediate, quick

Summary

The ARC-AGI-2 leaderboard, as updated on May 7, 2026, ranks large language models against a human panel baseline of 100%. The top-performing model is GPT 5.5 XH High at 85%. Other GPT entries include GPT 5.5 Pro High at 84.66% and GPT 5.5 Medium at 70%. Gemini 3.1 Pro also performs strongly, and a Gemini 3 Deep Think model from February 2026 scored 84.6%, but at a cost of $13, compared with about $1.87 or under a dollar for the GPT models. Anthropic's Claude 4.7 Max achieved 75% at a cost of $7, and other Claude models, including 4.6 High and 4.7 High, also appear on the leaderboard.

Key takeaway

For AI engineers evaluating large language models for deployment, GPT 5.5 XH High is the first candidate to consider, given its leading 85% on the ARC-AGI-2 leaderboard. Compare its cost against alternatives such as Gemini 3 Deep Think, which scores about the same (84.6%) at a much higher cost ($13). The final choice should balance benchmark performance against budget.

Key insights

GPT 5.5 XH High leads the ARC-AGI-2 leaderboard at 85%, and the GPT entries are listed at lower costs (about $1.87 or under a dollar) than Gemini 3 Deep Think ($13) or Claude 4.7 Max ($7).

Principles

Performance varies widely across models, from 70% to 85% among the entries cited here.
Cost-performance ratio is a critical evaluation metric.
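
One way to make the cost-performance principle concrete is to compute score per dollar and check which entries are dominated on both axes. The sketch below is illustrative only: the `Entry` type and both helper functions are hypothetical names, the scores and costs are the figures quoted in the summary, and pairing the $1.87 cost with the top-scoring entry is an assumption, since the summary gives that cost only for the model family in general.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    model: str
    score_pct: float  # leaderboard score, in percent
    cost_usd: float   # reported cost, in US dollars


def points_per_dollar(entry: Entry) -> float:
    """Score percentage points obtained per dollar of reported cost."""
    return entry.score_pct / entry.cost_usd


def pareto_front(entries: list[Entry]) -> list[Entry]:
    """Keep entries that no other entry beats on score and cost at once."""
    front = []
    for e in entries:
        dominated = any(
            o.score_pct >= e.score_pct
            and o.cost_usd <= e.cost_usd
            and (o.score_pct > e.score_pct or o.cost_usd < e.cost_usd)
            for o in entries
        )
        if not dominated:
            front.append(e)
    return front


ENTRIES = [
    # Assumption: the $1.87 figure is paired with the top entry for illustration.
    Entry("GPT 5.5 XH High", 85.0, 1.87),
    Entry("Gemini 3 Deep Think", 84.6, 13.0),
    Entry("Claude 4.7 Max", 75.0, 7.0),
]

if __name__ == "__main__":
    for e in sorted(ENTRIES, key=points_per_dollar, reverse=True):
        print(f"{e.model}: {points_per_dollar(e):.1f} points per dollar")
    print("Not dominated:", [e.model for e in pareto_front(ENTRIES)])
```

Under these assumed pairings, a single entry that is both cheaper and higher scoring dominates the other two, so the Pareto check and the ratio ranking agree. With real leaderboard rows the two views can diverge, which is why it helps to look at both.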

In practice

Shortlist GPT 5.5 XH High for top performance.
Evaluate Gemini 3 Deep Think when high performance justifies its higher cost.
Consider Claude 4.7 Max for a balance of performance and cost.

Topics

ARC-AGI-2 Leaderboard
GPT 5.5 Series
Gemini 3 Deep Think
Anthropic Claude
AI Model Performance

Best for: CTO, VP of Engineering/Data, AI Engineer, AI Scientist, Machine Learning Engineer, Director of AI/ML

Editorial summary, takeaway, and curation by AIssential. Original article published by Discover AI.