Long-running Claude for scientific computing

Mar 23, 2026 · Science

Source: Anthropic Research · Field: Technology & Digital (Artificial Intelligence & Machine Learning, Software Development & Engineering, Data Science & Analytics) · Depth: Intermediate, medium

Summary

Anthropic researcher Siddharth Mishra-Sharma describes a method for running multi-day, autonomous agentic coding workflows on scientific computing tasks, including tasks outside a researcher's primary domain. The approach uses Claude Code together with test oracles, persistent memory, and orchestration patterns to tackle projects that typically require months of human effort. The worked example is a differentiable cosmological Boltzmann solver implemented in JAX, targeting 0.1% accuracy against the reference CLASS implementation. The task involves evolving coupled equations for the components of the early universe, and it shows how a single agent can work sequentially, spawning subagents and debugging against a reference implementation, in contrast to the parallel-agent approach used in the C compiler project. The workflow relies on clear instructions in a CLAUDE.md file, progress tracking in CHANGELOG.md, and Git for coordination, all run in an HPC environment using SLURM and tmux.
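The idea of a test oracle with a 0.1% accuracy target can be illustrated with a minimal sketch. This is not code from the original post: the function names `max_relative_error` and `passes_oracle` are hypothetical, and the candidate and reference values are assumed to be plain sequences of floats sampled at the same points.

```python
from typing import Sequence


def max_relative_error(candidate: Sequence[float], reference: Sequence[float]) -> float:
    """Return the largest |candidate - reference| / |reference| over all points."""
    if len(candidate) != len(reference):
        raise ValueError("candidate and reference must have the same length")
    worst = 0.0
    for c, r in zip(candidate, reference):
        if r == 0.0:
            # Relative error is undefined here; a real oracle would need an absolute tolerance.
            raise ValueError("reference value is zero; relative error undefined")
        worst = max(worst, abs(c - r) / abs(r))
    return worst


def passes_oracle(candidate: Sequence[float], reference: Sequence[float],
                  rel_tol: float = 1e-3) -> bool:
    """Hypothetical oracle: accept only if every point is within rel_tol (0.1% by default)."""
    return max_relative_error(candidate, reference) <= rel_tol
```

An agent could run a check of this kind after each change, treating the reference implementation's output as ground truth and a failed check as a signal to keep debugging.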

Key takeaway

For AI engineers and research scientists developing complex scientific software, autonomous agentic workflows built on tools like Claude Code can substantially shorten project timelines. With clearly defined objectives, test oracles, and persistent memory in place, agents can reach sub-percent accuracy on tasks that would otherwise consume months of researcher time. Consider adopting these patterns to maximize compute utilization and compress development cycles.

Key insights

Autonomous AI agents can compress months of scientific coding work into days by leveraging structured workflows.

Method

Draft a detailed CLAUDE.md plan, iterate locally, track progress in CHANGELOG.md, use Git for coordination, and run the agent in a tmux session on an HPC cluster, potentially employing a "Ralph loop" for task completion.
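The loop-and-log part of this method can be sketched as follows. This is an illustration under stated assumptions, not the author's implementation: it assumes a "Ralph loop" means re-invoking the agent until a completion check passes, and the names `ralph_loop`, `append_changelog`, `run_agent`, and `is_done` are hypothetical. The agent call is passed in as a plain callable so that no particular CLI or API is assumed.

```python
from pathlib import Path
from typing import Callable, Optional


def append_changelog(path: Path, entry: str) -> None:
    """Append one bullet to the changelog so progress survives across agent sessions."""
    with path.open("a", encoding="utf-8") as f:
        f.write(f"- {entry}\n")


def ralph_loop(run_agent: Callable[[str], str],
               is_done: Callable[[], bool],
               prompt: str,
               changelog: Path,
               max_iterations: int = 10) -> Optional[int]:
    """Re-run the agent until is_done() passes or the iteration budget runs out.

    run_agent is a hypothetical callable that takes a prompt and returns a short
    summary of what the agent did. Returns the iteration on which the check
    passed, or None if the budget was exhausted.
    """
    for i in range(1, max_iterations + 1):
        summary = run_agent(prompt)
        append_changelog(changelog, f"iteration {i}: {summary}")
        if is_done():  # e.g. the test oracle passes
            return i
    return None
```

Keeping the completion check separate from the agent's own report means the loop stops on verified results rather than on the agent's claim of being finished.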

Best for: AI Engineer, Research Scientist, AI Researcher

Editorial summary, takeaway, and curation by AIssential. Original article published by Anthropic Research.