What the Freakiness of 2025 in AI Tells Us About 2026
Summary
The year 2025 saw significant advances in AI, particularly in reasoning models such as Gemini 3 Pro, which posted stronger benchmark results but also raised questions about output diversity. "Playable worlds" emerged with models like Genie 3, which can generate consistent, dynamic environments from text or image prompts. Generative AI became more realistic with models such as Veo 3.1 and Sora 2, which brought "AI slop" into the mainstream and eroded public trust in digital content. Despite these concerns, there was encouraging news beyond frontier models, including Google's DolphinGemma, a model aimed at decoding dolphin vocalizations. Public perception of AI remained mixed, with a slight net positive overall but strongly negative sentiment towards AI art. Governments worldwide began enlisting AI, from assisting prime ministers to military applications, often with mixed results. OpenAI's GPT-5, though highly anticipated and pitched as a "PhD-level expert", faced criticism for persistent hallucinations, while GPT-4.5 quietly passed the Turing test. Chinese and open-weight models, such as GLM 4.7 and Nvidia's Nemotron 3, showed significant performance gains, challenging the dominance of frontier models. The METR Time Horizons benchmark gained prominence for measuring the length of tasks that models can complete, though its limited generalization and small sample size were noted.
Key takeaway
AI Scientists and Research Scientists evaluating the trajectory of AI development should recognize that, while scaling continues to yield returns, the "single axis of intelligence" view is likely incomplete. Focus on developing models that learn continuously and on addressing the "jagged capabilities" problem, rather than solely optimizing for benchmarks that can be gamed. Your research should also explore automated information discovery, which represents a significant next step beyond current LLM capabilities and offers practical benefits in areas like code optimization and scientific discovery.
Key insights
AI progress in 2025 highlighted advances in both reasoning and generative capabilities, alongside declining public trust in digital content and capabilities that remain uneven across domains.
Principles
- Scaling parameters and data yields significant AI improvements.
- AI aptitude is spiky, showing impressive gains in specific domains.
- Data quality is crucial; poor data can degrade LLM capabilities.
Method
Automated information discovery combines LLMs with automated tests and evolutionary loops, allowing models to propose, evaluate, and save improved programs, accelerating scientific and engineering tasks.
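The propose, evaluate, and save loop described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the method used by any particular system: the "program" is reduced to a list of polynomial coefficients, the automated tests are input/output pairs for a toy target (y = 2x + 1), and `propose` is a hypothetical stand-in that randomly perturbs a candidate where a real system would call an LLM. All names here (`Candidate`, `TEST_CASES`, `propose`, `evaluate`, `evolve`) are invented for the example.

```python
import random
from typing import List, Tuple

# Assumption: a "program" is reduced to polynomial coefficients [c0, c1, ...].
Candidate = List[float]

# Automated tests: (input, expected output) pairs for the toy target y = 2x + 1.
TEST_CASES: List[Tuple[float, float]] = [(x, 2.0 * x + 1.0) for x in range(-3, 4)]


def run_program(candidate: Candidate, x: float) -> float:
    """Evaluate the candidate polynomial at x."""
    return sum(c * x ** i for i, c in enumerate(candidate))


def evaluate(candidate: Candidate) -> float:
    """Score a candidate on the automated tests (higher is better)."""
    error = sum((run_program(candidate, x) - y) ** 2 for x, y in TEST_CASES)
    return -error


def propose(parent: Candidate, rng: random.Random) -> Candidate:
    """Hypothetical stand-in for an LLM proposal: perturb one coefficient."""
    child = list(parent)
    i = rng.randrange(len(child))
    child[i] += rng.gauss(0.0, 0.5)
    return child


def evolve(
    seed: Candidate,
    generations: int,
    rng: random.Random,
    archive_size: int = 5,
) -> Tuple[float, Candidate]:
    """Evolutionary loop: propose a variant, test it, keep it if it earns a place."""
    archive = [(evaluate(seed), seed)]  # sorted best-first
    for _ in range(generations):
        _, parent = rng.choice(archive)      # pick a saved program to build on
        child = propose(parent, rng)         # propose
        score = evaluate(child)              # evaluate with automated tests
        if len(archive) < archive_size or score > archive[-1][0]:
            archive.append((score, child))   # save the improved program
            archive.sort(key=lambda item: item[0], reverse=True)
            del archive[archive_size:]       # drop the weakest entries
    return archive[0]


if __name__ == "__main__":
    best_score, best = evolve([0.0, 0.0], generations=500, rng=random.Random(0))
    print(f"best score: {best_score:.4f}, coefficients: {best}")
```

The design choice worth noting is the small archive: keeping several scored candidates, rather than only the single best, lets later proposals build on different saved programs. In a real pipeline the scoring function would run an actual test suite or benchmark, and the proposal step would be a model call.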
In practice
- Use frontier models for rapid upskilling in unfamiliar domains.
- Implement automated discovery for optimizing code and algorithms.
- Prioritize high-quality data curation for continuous LLM pre-training.
Topics
- Reasoning Models
- Generative AI
- AI Benchmarking
- Open-weight Models
- Automated Discovery
Best for: AI Scientist, Research Scientist, AI Engineer, AI Researcher, Data Scientist
Editorial summary, takeaway, and curation by AIssential. Original article published by AI Explained.