Mar 23, 2026 · Science
Long-running Claude for scientific computing
Summary
Anthropic researcher Siddharth Mishra-Sharma describes a method for applying multi-day, autonomous agentic coding workflows to scientific computing tasks, including tasks outside a researcher's primary domain. The approach uses Claude Code with test oracles, persistent memory, and orchestration patterns to tackle complex projects that typically require months of human effort. The worked example is a differentiable cosmological Boltzmann solver implemented in JAX, with a target of 0.1% accuracy against the reference CLASS implementation. The task involves evolving coupled equations for the components of the early universe. It shows how a single agent can work sequentially, spawning subagents and debugging against a reference implementation, in contrast to the parallel-agent approach used in the C compiler project. The workflow relies on clear instructions in a CLAUDE.md file, progress tracking in CHANGELOG.md, and Git for coordination, all run in an HPC environment with SLURM and tmux.
Key takeaway
For AI engineers and research scientists building complex scientific software, autonomous agentic workflows with tools like Claude Code can substantially shorten project timelines. Clearly defined objectives, test oracles, and persistent memory let agents reach sub-percent accuracy on tasks that would otherwise consume months of researcher time. Consider adopting these patterns to make fuller use of available compute and compress development cycles.
Key insights
Autonomous AI agents can compress months of scientific coding work into days by leveraging structured workflows.
Principles
- Define clear, quantifiable success criteria.
- Maintain persistent memory for agents.
- Use reference implementations as test oracles.
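To make the test-oracle principle concrete, here is a minimal sketch of a pass/fail check that compares a new solver's output against a reference within a relative tolerance of 0.1%. The function names and the sample numbers are hypothetical and chosen only for illustration; the original post may structure its checks differently.

```python
def max_relative_error(candidate, reference):
    """Return the largest relative deviation of candidate from reference."""
    if len(candidate) != len(reference):
        raise ValueError("candidate and reference must have the same length")
    worst = 0.0
    for got, want in zip(candidate, reference):
        # Fall back to absolute error where the reference value is exactly zero.
        scale = abs(want) if want != 0 else 1.0
        worst = max(worst, abs(got - want) / scale)
    return worst


def passes_oracle(candidate, reference, rel_tol=1e-3):
    """True when every point agrees with the reference within rel_tol (0.1% by default)."""
    return max_relative_error(candidate, reference) <= rel_tol


# Made-up numbers standing in for reference output and the new implementation's output.
reference_values = [1.000, 2.500, 4.000]
candidate_values = [1.0005, 2.499, 4.002]
print(passes_oracle(candidate_values, reference_values))
```

A single boolean like this gives the agent an unambiguous, quantifiable success criterion to work toward between sessions.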
Method
Draft a detailed CLAUDE.md plan, iterate locally, track progress in CHANGELOG.md, use Git for coordination, and run the agent in a tmux session on an HPC cluster, potentially employing a "Ralph loop" for task completion.
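The summary does not define the "Ralph loop". One common reading is a loop that re-invokes the agent until a completion check passes, and the sketch below assumes that reading. Both `run_agent_once` and `is_done` are hypothetical callables, not part of any real CLI or library; in practice the first would launch an agent session and the second would run the project's test oracle.

```python
def ralph_loop(run_agent_once, is_done, max_iterations=20):
    """Re-invoke the agent until the completion check passes or the budget runs out.

    run_agent_once: hypothetical callable that runs one agent session.
    is_done: hypothetical callable returning True once the success criteria are met.
    Returns the number of sessions that were run.
    """
    for iteration in range(1, max_iterations + 1):
        run_agent_once(iteration)
        if is_done():
            return iteration
    raise RuntimeError(f"not done after {max_iterations} iterations")


# Toy stand-ins: the fake agent finishes one placeholder task per session.
remaining_tasks = ["task A", "task B", "task C"]


def fake_agent(iteration):
    remaining_tasks.pop()


sessions = ralph_loop(fake_agent, lambda: not remaining_tasks)
print(sessions)
```

The iteration cap is a design choice in this sketch: it keeps an unattended run from looping forever if the completion check can never pass.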
In practice
- Implement a CLAUDE.md for agent instructions.
- Use CHANGELOG.md as the agent's long-term memory.
- Integrate Git for version control and progress monitoring.
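As an illustration of CHANGELOG.md serving as persistent memory, the sketch below appends a dated entry that a later session could read back. The entry format (a dated heading followed by bullets) and the helper name `append_changelog` are assumptions made for this example; the summary does not specify a format.

```python
from datetime import date
from pathlib import Path
import tempfile


def append_changelog(path, summary, details, today=None):
    """Append one dated entry to the changelog file and return the entry text."""
    stamp = (today or date.today()).isoformat()
    # Hypothetical format: a dated heading, one bullet per detail, then a blank line.
    lines = [f"## {stamp}: {summary}"] + [f"- {d}" for d in details] + [""]
    entry = "\n".join(lines) + "\n"
    with Path(path).open("a", encoding="utf-8") as handle:
        handle.write(entry)
    return entry


# Write to a temporary directory so the example leaves no files behind.
with tempfile.TemporaryDirectory() as workdir:
    log_path = Path(workdir) / "CHANGELOG.md"
    append_changelog(
        log_path,
        "Example entry",
        ["what was tried", "what failed and why"],
        today=date(2026, 3, 23),
    )
    changelog_text = log_path.read_text(encoding="utf-8")

print(changelog_text)
```

Appending rather than rewriting keeps the history intact, so each new session can see what earlier sessions tried and why it did or did not work.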
Topics
- Agentic AI
- Scientific Computing
- Cosmological Boltzmann Solvers
- JAX
- Autonomous Code Generation
Best for: AI Engineer, Research Scientist, AI Researcher
Editorial summary, takeaway, and curation by AIssential. Original article published by Anthropic Research.