The Human/AI Frontier: A Conversation with Bogdan Grechuk

2026-02-19 · Source: Surge AI Blog · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Emerging Technologies & Innovation · Depth: Expert, medium

Summary

Mathematician Bogdan Grechuk, an Associate Professor at the University of Leicester, envisions using AI in a loop to solve complex Diophantine equations: he invents the methods, and AI tests whether they apply. A significant obstacle is that current state-of-the-art (SOTA) models, including GPT-5, Gemini, and Claude, often generate convincing but incorrect mathematical answers that require extensive manual verification. Grechuk suggests that integrating formal proof systems could address this. On one specific Diophantine equation, these models initially failed because they relied on brute-force search rather than deeper mathematical reasoning. GPT-5 did arrive at the correct solution, but only when explicitly guided through advanced steps such as transforming the equation into an elliptic curve. The result suggests that AI can solve PhD-level problems when given clear guidance but still struggles to apply sophisticated mathematical methods on its own, so human expertise remains necessary on the path toward autonomous research problem-solving and method generation.
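To make the brute-force limitation concrete, here is a minimal Python sketch. The equation discussed in the original article is not reproduced in this summary, so the sketch uses the classical Mordell equation y^2 = x^3 - 2 purely as a stand-in; the function name `brute_force_mordell` and the search bound are illustrative assumptions, not anything taken from the article.

```python
from math import isqrt


def brute_force_mordell(k: int, bound: int) -> list[tuple[int, int]]:
    """Return every integer pair (x, y) with |x| <= bound and y^2 = x^3 + k.

    Hypothetical helper for illustration only: it enumerates a finite box
    and therefore says nothing about solutions with |x| > bound.
    """
    solutions = []
    for x in range(-bound, bound + 1):
        rhs = x**3 + k
        if rhs < 0:
            continue  # a square cannot be negative
        y = isqrt(rhs)
        if y * y == rhs:
            solutions.append((x, y))
            if y != 0:
                solutions.append((x, -y))
    return solutions


# Stand-in equation (an assumption, not the article's): y^2 = x^3 - 2
found = brute_force_mordell(k=-2, bound=1000)
print(found)  # [(3, 5), (3, -5)]
```

The search reports (3, 5) and (3, -5), but raising the bound can never show that the list is complete. For this stand-in equation, completeness is a classical result that follows from a structural argument, not from enumeration. That gap between finding solutions and proving there are no others is the kind of step the summary says models skip when they default to brute force.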

Key takeaway

Left unguided, SOTA LLMs such as GPT-5, Gemini, and Claude fail at complex Diophantine equations, defaulting to brute-force search instead of mathematical reasoning. GPT-5, however, reached the correct solution once it was guided through advanced methods such as an elliptic curve transformation, something the other models could not replicate. AI's potential as a trustworthy mathematical research partner therefore depends on integrating formal proof systems and developing deeper, method-driven reasoning.

Topics

Diophantine Equations
AI in Mathematics
Large Language Models
Mathematical Problem Solving
Formal Proof Systems

Best for: AI Researcher, AI Scientist, Research Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by Surge AI Blog.