Mar 23, 2026 · Science · Long-running Claude for scientific computing

2026-03-18 · Source: Anthropic Research · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Software Development & Engineering, Data Science & Analytics · Depth: Intermediate, medium

Summary

Anthropic researcher Siddharth Mishra-Sharma details a method for applying multi-day, autonomous agentic coding workflows to scientific computing tasks, even outside a researcher's primary domain. The approach uses Claude Code together with test oracles, persistent memory, and orchestration patterns to tackle projects that typically require months of human effort. The post's concrete example is implementing a differentiable cosmological Boltzmann solver in JAX, targeting 0.1% accuracy against the reference CLASS implementation. The task involves evolving coupled equations for the components of the early universe. It shows how a single agent can work sequentially, spawning subagents and debugging against a reference implementation, in contrast to the parallel-agent approach used in the C compiler project. The workflow relies on clear instructions in a CLAUDE.md file, progress tracking in CHANGELOG.md, and Git for coordination, all run in an HPC environment using SLURM and tmux.

Key takeaway

For AI Engineers or Research Scientists developing complex scientific software, adopting autonomous agentic workflows with tools like Claude Code can dramatically accelerate project timelines. By clearly defining objectives, establishing test oracles, and implementing persistent memory, you can enable agents to achieve sub-percent accuracy on tasks that would otherwise consume months of researcher time. Consider integrating these patterns to maximize compute utilization and compress development cycles.

Key insights

Autonomous AI agents can compress months of scientific coding work into days by leveraging structured workflows.

Principles

Define clear, quantifiable success criteria.
Maintain persistent memory for agents.
Use reference implementations as test oracles.
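
As an illustration of the first and third principles, a test oracle can be reduced to a single pass/fail check: compare the new code's output against the reference output and require the worst relative error to stay under a stated tolerance. The sketch below assumes both outputs are plain sequences of floats sampled at the same points; the function names and the `floor` guard are hypothetical and do not come from the original post. The default tolerance of `1e-3` corresponds to the 0.1% target mentioned in the summary.

```python
from typing import Sequence


def max_relative_error(
    candidate: Sequence[float],
    reference: Sequence[float],
    floor: float = 1e-30,
) -> float:
    """Return the largest pointwise relative error of candidate vs. reference.

    `floor` (an assumption of this sketch) avoids division by zero where the
    reference value is exactly zero.
    """
    if len(candidate) != len(reference):
        raise ValueError("candidate and reference must have the same length")
    if len(reference) == 0:
        raise ValueError("nothing to compare")
    return max(
        abs(c - r) / max(abs(r), floor) for c, r in zip(candidate, reference)
    )


def passes_oracle(
    candidate: Sequence[float],
    reference: Sequence[float],
    rel_tol: float = 1e-3,
) -> bool:
    """True when every point agrees with the reference to within rel_tol."""
    return max_relative_error(candidate, reference) <= rel_tol


# Made-up numbers, purely for illustration.
reference_values = [1.0, 2.0, 4.0]
close_values = [1.0005, 1.999, 4.002]   # each point is off by about 0.05%
far_values = [1.0, 2.01, 4.0]           # the middle point is off by 0.5%

close_ok = passes_oracle(close_values, reference_values)
far_ok = passes_oracle(far_values, reference_values)
```

A single boolean like this gives an agent an unambiguous success criterion to work toward, which is the point of defining the target quantitatively up front.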

Method

Draft a detailed CLAUDE.md plan, iterate locally, track progress in CHANGELOG.md, use Git for coordination, and run the agent in a tmux session on an HPC cluster, potentially employing a "Ralph loop" for task completion.
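
The "Ralph loop" is not spelled out in this summary. As the pattern is commonly described, the agent is re-invoked with the same instructions until it reports that the task is complete or an iteration budget runs out. The following is a minimal sketch under that assumption. `ralph_loop`, `run_agent_once`, the `DONE` marker, and the stub agent are all hypothetical names for illustration; a real setup would replace the stub with a call that launches the agent and returns its output.

```python
from typing import Callable


def ralph_loop(
    run_agent_once: Callable[[str], str],
    prompt: str,
    done_marker: str = "DONE",
    max_iterations: int = 20,
) -> int:
    """Re-run the agent with the same prompt until its output contains done_marker.

    Returns the number of iterations used. Raises RuntimeError if the budget
    is exhausted first, so a stuck run fails loudly instead of looping forever.
    """
    for iteration in range(1, max_iterations + 1):
        output = run_agent_once(prompt)
        if done_marker in output:
            return iteration
    raise RuntimeError(f"task not completed after {max_iterations} iterations")


def make_stub_agent(finish_on_call: int) -> Callable[[str], str]:
    """Build a stand-in agent that reports completion on the given call number."""
    calls = {"n": 0}

    def stub(prompt: str) -> str:
        calls["n"] += 1
        return "DONE" if calls["n"] >= finish_on_call else "still working"

    return stub


iterations_used = ralph_loop(make_stub_agent(3), "Follow CLAUDE.md until all tests pass.")
```

Capping the number of iterations is a design choice of this sketch: on a shared cluster, a bounded loop keeps a stalled job from consuming its whole allocation.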

In practice

Implement a CLAUDE.md for agent instructions.
Use CHANGELOG.md as the agent's long-term memory.
Integrate Git for version control and progress monitoring.
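
To make the CHANGELOG.md item above concrete, here is one possible shape for a persistent progress log: each work session appends a dated entry, so a fresh agent session can read the file and pick up where the last one stopped. The entry layout (a dated heading followed by bullet points) and the helper name `append_changelog_entry` are assumptions of this sketch, not the format used in the original post.

```python
from datetime import date
from pathlib import Path
from typing import Optional, Sequence, Union


def append_changelog_entry(
    path: Union[str, Path],
    summary: str,
    details: Sequence[str],
    today: Optional[date] = None,
) -> str:
    """Append one dated entry to the changelog file and return the text written.

    The file is opened in append mode, so earlier entries are never rewritten
    and the log stays a faithful record across sessions.
    """
    stamp = (today or date.today()).isoformat()
    lines = [f"## {stamp}: {summary}", ""]
    lines += [f"- {item}" for item in details]
    entry = "\n".join(lines) + "\n\n"
    with Path(path).open("a", encoding="utf-8") as handle:
        handle.write(entry)
    return entry
```

A call might look like `append_changelog_entry("CHANGELOG.md", "Short summary of the session", ["what changed", "what still fails"])`. Committing the file with Git after each entry would then give both the agent and the researcher a shared, versioned view of progress.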

Topics

Agentic AI
Scientific Computing
Cosmological Boltzmann Solvers
JAX
Autonomous Code Generation

Best for: AI Engineer, Research Scientist, AI Researcher

Editorial summary, takeaway, and curation by AIssential. Original article published by Anthropic Research.