😸 OpenAI solved 5 of 10 "impossible" problems

2026-02-15 · Source: The Neuron · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Emerging Technologies & Innovation, Software Development & Engineering · Depth: Intermediate, long

Summary

An unreleased OpenAI model has reportedly solved at least five of the ten "First Proof" research-level math problems, a set of unpublished challenges that top mathematicians created to test AI reasoning on material that cannot have appeared in training data. Publicly available models such as ChatGPT and Gemini solved only two. Concurrently, on February 13, OpenAI published a physics preprint in which GPT-5.2 proposed a formula for gluon interactions, a problem physicists had considered impossible for decades; Harvard and Cambridge researchers subsequently verified the formula. Taken together, the two results point to a marked advance in AI's scientific reasoning. Separately, over the years OpenAI has quietly removed the word "safely" and its commitment to "openly share" from the mission statement in its IRS filings.

Key takeaway

For AI scientists and researchers evaluating the frontier of AI capabilities, these developments signal a rapid shift from AI assisting science to AI actively performing breakthrough scientific discovery. Expect AI models, including unreleased ones, to tackle increasingly complex, novel problems in mathematics and physics, potentially accelerating research timelines. Consider how your research roadmap might integrate AI for hypothesis generation or problem-solving in areas previously deemed intractable for automated systems.

Key insights

OpenAI's unreleased models are demonstrating advanced scientific reasoning, solving complex math and physics problems previously thought impossible for AI.

Principles

AI can achieve breakthroughs in fundamental science.
Unseen problems test true AI reasoning, not pattern matching.

Method

OpenAI's internal model tackled the 10 unpublished math problems over one week. Human oversight was limited to reviewing the model's outputs and having them expanded; humans gave no input on proof strategy.

In practice

Include negative examples (outputs the model should avoid) in prompts; this reportedly improves accuracy by 20%.
Give the model an output template to follow; this reportedly yields significant quality gains.
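
As a rough illustration of the two tips above, the following Python sketch assembles a prompt that lists negative examples and ends with an output template. The function name build_prompt, the section wording, and the sample task are all hypothetical choices made for this sketch; they do not come from the article or from any particular vendor's API, and the reported gains are not something this snippet measures.

```python
def build_prompt(task, negative_examples, output_template):
    """Assemble a prompt with negative examples and an output template.

    All names and phrasing here are illustrative assumptions,
    not a documented prompt format.
    """
    lines = [f"Task: {task}", ""]
    if negative_examples:
        # Show the model what a bad answer looks like.
        lines.append("Do NOT produce answers like these:")
        for i, example in enumerate(negative_examples, start=1):
            lines.append(f"{i}. {example}")
        lines.append("")
    # Pin down the expected shape of the response.
    lines.append("Respond using exactly this template:")
    lines.append(output_template.strip())
    return "\n".join(lines)


prompt = build_prompt(
    task="Summarize the preprint abstract in two sentences.",
    negative_examples=[
        "A summary that adds claims not present in the abstract.",
        "A bulleted list instead of prose.",
    ],
    output_template="""
Summary: <two sentences>
Confidence: <low | medium | high>
""",
)
print(prompt)
```

The resulting string can be passed to whichever model client you already use. Keeping prompt assembly in one small function makes it easy to compare variants, for example with and without the negative examples, on your own evaluation set.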

Topics

OpenAI Research
AI Agent Development
Prompt Engineering
AI Ethics & Governance
Global AI Adoption

Code references

google-gemini/gemini-skills

Best for: Machine Learning Engineer, NLP Engineer, AI Scientist, AI Engineer, Prompt Engineer, Tech Journalist

Editorial summary, takeaway, and curation by AIssential. Original article published by The Neuron.