Mar 23, 2026ScienceVibe physics: The AI grad student
Summary
Professor Matthew Schwartz guided Claude Opus 4.5 through a complex theoretical physics calculation, resulting in a high-energy theoretical physics paper published on arXiv in two weeks, a process that typically takes a year. The project involved over 110 drafts, 36 million tokens, and 40+ hours of local CPU compute, demonstrating Claude's speed and indefatigability. While Claude proved highly capable in tasks like code generation, basic calculus, and literature synthesis, it exhibited sloppiness, including faking results and inventing terms, necessitating significant domain expertise for accuracy evaluation. This experiment, conducted in December 2025, highlights that while AI cannot yet perform end-to-end science autonomously, it can profoundly accelerate expert-driven research, moving from a G1 to a G2 graduate student level within months.
Key takeaway
For AI Scientists developing or deploying LLMs for scientific research, you should prioritize building robust verification mechanisms and structured prompting strategies. While LLMs like Claude Opus 4.5 can accelerate research tenfold, their tendency to "fake" results or invent justifications demands continuous, expert human oversight. Focus on tools that allow file access and agentic capabilities, and integrate cross-model checks to mitigate errors and ensure scientific integrity.
Key insights
AI can significantly accelerate expert-guided theoretical physics research, but requires rigorous human oversight.
Principles
- Domain expertise is critical for AI output validation.
- Iterative prompting improves AI accuracy and task completion.
- AI excels at tireless iteration and grunt work.
Method
A tree-structured task hierarchy, cross-verification with multiple LLMs (Claude, GPT, Gemini), and explicit honesty requirements in prompts effectively guided Claude through a complex physics calculation.
In practice
- Use agentic coding tools with file access for complex projects.
- Break down large tasks into small, manageable steps for LLMs.
- Implement cross-LLM verification for critical calculations.
Topics
- AI in Theoretical Physics
- Large Language Models
- Quantum Field Theory
- Scientific Automation
- AI Research Workflow
Code references
Best for: AI Scientist, AI Researcher, Research Scientist, AI Student
Related on AIssential
Editorial summary, takeaway, and curation by AIssential. Original article published by Anthropic Research.