Formal Conjectures: An Open and Evolving Benchmark for Verified Discovery in Mathematics

2026-05-13 · Source: Artificial Intelligence · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Mathematics & Computational Sciences · Depth: Expert, quick

Summary

Formal Conjectures is a new, evolving benchmark of 2,615 mathematical problem statements formalized in Lean 4, designed to evaluate advanced automated reasoning systems. The dataset includes 1,029 open research conjectures, which serve as a zero-contamination benchmark for proof discovery because no known proofs exist to leak into training data, and 836 solved problems for proof autoformalization. The problems are sourced from active mathematical research, and the repository is meant to support collaboration between mathematicians and AI systems. The benchmark has already led to new mathematical discoveries, including the resolution of open research conjectures. The project pursues correctness through a collaborative open-source model in which AI-generated proofs and disproofs act as an auditing mechanism for the formal statements. A standardized evaluation setup and baseline results on frozen subsets are provided to measure progress in automated reasoning on research-level mathematics.

Key takeaway

For AI Scientists developing automated reasoning systems, this benchmark is a resource for evaluating capabilities on research-level mathematics. Integrate Formal Conjectures into your evaluation pipelines to measure progress in proof discovery and autoformalization and to identify weaknesses, using its open conjectures as a zero-contamination test set.

Key insights

Formal Conjectures offers a Lean 4 benchmark for evaluating automated reasoning on research-level mathematics.

Principles

Zero-contamination benchmarks are crucial for proof discovery.
AI-generated proofs can audit formalization correctness.

Method

The project uses a collaborative open-source framework where mathematicians formalize problems and AI systems attempt solutions, with AI outputs iteratively improving benchmark fidelity.
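
As a rough illustration of what a formalized open problem can look like, the sketch below states the twin prime conjecture in Lean 4. It assumes Mathlib is available; the theorem name and the choice of conjecture are illustrative and are not taken from the benchmark itself, whose own conventions this summary does not describe. The `sorry` placeholder marks the missing proof that an AI system would be asked to supply.

```lean
import Mathlib

/-- Twin prime conjecture (illustrative statement, not copied from the benchmark):
there are infinitely many primes `p` such that `p + 2` is also prime. -/
theorem twin_prime_conjecture_sketch :
    {p : ℕ | p.Prime ∧ (p + 2).Prime}.Infinite := by
  -- Open problem: no proof is known, so the body is left as `sorry`.
  sorry
```

A statement like this can be type-checked and reviewed by mathematicians before any proof exists, which is what allows open conjectures to be used as evaluation targets.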

In practice

Use Lean 4 for formalizing mathematical problems.
Integrate AI systems for auditing formalizations.
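
A minimal sketch of how a machine-found disproof can expose a formalization error, assuming Lean 4 with Mathlib. The statement below is a hypothetical, deliberately flawed rendering of Goldbach's conjecture that omits the condition n > 2; it is not drawn from the benchmark. Because 0 is even but is not a sum of two primes, the statement as written is false, and a short disproof flags the missing hypothesis.

```lean
import Mathlib

/-- Hypothetical mis-formalization: "every even number is a sum of two primes",
with the intended hypothesis `2 < n` accidentally left out. -/
def flawedGoldbach : Prop :=
  ∀ n : ℕ, Even n → ∃ p q : ℕ, p.Prime ∧ q.Prime ∧ p + q = n

/-- The flawed statement is refuted by the counterexample `n = 0`. -/
theorem flawedGoldbach_false : ¬ flawedGoldbach := by
  intro h
  -- Specialize the claim to n = 0, which is even.
  obtain ⟨p, q, hp, _, hpq⟩ := h 0 even_zero
  -- A prime is positive, so p + q = 0 is impossible.
  have hpos : 0 < p := hp.pos
  omega
```

An easy disproof of this kind suggests that the formal statement, rather than the underlying mathematics, needs correcting.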

Topics

Formal Conjectures
Automated Reasoning Systems
Lean 4
Mathematical Proof Discovery
Proof Autoformalization

Best for: AI Scientist, Research Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by Artificial Intelligence.