I Solved an 'Impossible' Math Problem with AI

Source: Siraj Raval · Field: Technology & Digital (Artificial Intelligence & Machine Learning, Software Development & Engineering, Emerging Technologies & Innovation) · Depth: Intermediate, long

Summary

An engineer used three AI models (Claude Opus 4.5, GPT-5.1 Codex, Gemini 3 Pro) together with the Lean 4 theorem prover to attack three of Paul Erdős's unsolved mathematical problems within 24 hours. The experiment produced a machine-verified proof of a non-trivial special case of Erdős Problem 124, showing that every integer greater than two can be written as a sum of distinct powers of two and three. Erdős Problem 379 was also formally proven, while Erdős Problem 64, which concerns cycles in networks, proved too complex for the current AI-human loop. The project highlights a significant reduction in the cost and expertise required for formal mathematical verification, making advanced theorem proving accessible to non-specialists.

Key takeaway

For AI engineers and research scientists interested in formal mathematics, this experiment shows that AI, when paired with a formal verification system such as Lean 4, can dramatically reduce the time and specialized knowledge needed to prove theorems. Consider exploring AI-assisted formalization for specific, well-structured mathematical conjectures: the tools are now accessible enough for non-mathematicians to contribute to formalizing mathematical knowledge.

Key insights

AI-assisted formal verification significantly lowers the barrier to entry for advanced mathematical theorem proving.

Method

The method is an iterative loop: feed a problem to the AI, receive Lean 4 code, compile it, feed Lean's errors back to the AI, and repeat until the code compiles with no errors and no remaining "sorry" statements (Lean's placeholder for a step that has not yet been proven).
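The loop described above can be sketched in a few lines of Python. This is a minimal illustration, not the author's actual tooling: `prove_loop`, `generate`, `check`, and `max_rounds` are hypothetical names introduced here. The model call and the Lean compiler call are passed in as plain functions, so the sketch does not assume any particular API.

```python
from typing import Callable, Optional, Tuple

# Hypothetical signatures (assumptions, not from the original article):
#   generate(problem, feedback) -> candidate Lean 4 source from an AI model
#   check(source)               -> (compiled_ok, compiler_output) from Lean
Generate = Callable[[str, Optional[str]], str]
Check = Callable[[str], Tuple[bool, str]]


def prove_loop(problem: str, generate: Generate, check: Check,
               max_rounds: int = 10) -> Optional[str]:
    """Return Lean source that compiles without `sorry`, or None if the budget runs out."""
    feedback: Optional[str] = None  # first round has no compiler feedback
    for _ in range(max_rounds):
        source = generate(problem, feedback)
        ok, output = check(source)
        # Crude substring test: a `sorry` inside a comment would also match.
        if ok and "sorry" not in source:
            return source
        # Feed the compiler errors (or the leftover placeholder) back to the model.
        feedback = output if not ok else "Proof still contains `sorry`; fill every gap."
    return None
```

In a real setup, `check` would write the candidate to a file and invoke the Lean toolchain on it, and `generate` would wrap a call to whichever model is in use. Capping the number of rounds matters because, as with Problem 64, the loop is not guaranteed to converge.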

Best for: AI Engineer, AI Researcher, Research Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by Siraj Raval.