Andrej Karpathy's new open source 'autoresearch' lets you run hundreds of AI experiments a night — with revolutionary implications
Summary
Andrej Karpathy, former Tesla AI lead and OpenAI co-founder, released "autoresearch," a 630-line open-source Python script under the MIT License, designed to automate the scientific method using AI agents. The system is an autonomous optimization loop: an AI agent hypothesizes an improvement to a training script, modifies the code, runs an experiment within a fixed compute budget (e.g., 5 minutes on a GPU), and evaluates the result by validation loss in bits per byte (val_bpb). If the loss improves, the change is kept; otherwise, it is reverted. In one overnight run, Karpathy's agent completed 126 experiments, reducing loss from 0.9979 to 0.9697. Over two days, it made approximately 700 autonomous changes and found 20 additive improvements that together reduced "Time to GPT-2" by 11%, from 2.02 hours to 1.80 hours. The project has inspired distributed applications, such as Hyperspace AI's peer-to-peer network of 35 agents, which ran 333 experiments and rediscovered ML milestones like RMSNorm and tied embeddings in 17 hours. Marketing professionals are also exploring the same loop as a way to automate thousands of experiments per year.
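The reported figures can be sanity-checked with a few lines of arithmetic. The sketch below uses only the numbers quoted in the summary; the helper name `relative_reduction` is an illustrative choice, not something from the autoresearch code.

```python
def relative_reduction(before: float, after: float) -> float:
    """Fractional reduction going from `before` to `after`."""
    return (before - after) / before


# Figures quoted in the summary above.
time_to_gpt2 = relative_reduction(2.02, 1.80)   # hours, two-day run
val_bpb = relative_reduction(0.9979, 0.9697)    # overnight run, 126 experiments
kept_fraction = 20 / 700                        # improvements kept / changes tried (approximate)

print(f"Time to GPT-2 reduction: {time_to_gpt2:.1%}")   # 10.9%
print(f"val_bpb reduction:       {val_bpb:.1%}")        # 2.8%
print(f"Share of changes kept:   {kept_fraction:.1%}")  # 2.9%
```

The 10.9% result matches the rounded 11% in the summary, and the roughly 3% keep rate shows that most of the agent's proposed changes were reverted.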
Key takeaway
For AI engineers and research scientists who want to accelerate model development and optimization, autonomous research agents like Karpathy's autoresearch can dramatically increase experimental throughput. Your team could run hundreds of experiments overnight, surfacing performance gains and architectural insights that human researchers might miss. Consider deploying these agents to iteratively refine training scripts, explore hyperparameter spaces, or even rediscover established ML techniques, and shift your own focus to defining robust experimental constraints rather than executing experiments by hand.
Key insights
AI agents can autonomously conduct and optimize experiments, accelerating research and discovery across diverse fields.
Principles
- Automate the scientific method with AI agents.
- Distribute research tasks across peer networks.
- Optimize for performance per compute.
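The fixed compute budget mentioned in the summary (for example, 5 minutes per experiment) is what makes results comparable across experiments: every candidate gets the same wall-clock time, so a lower loss means better performance per unit of compute. A minimal sketch of such a budgeted run follows, assuming a hypothetical `step` callable that performs one training step; the names `run_with_budget` and `clock` are illustrative and are not taken from autoresearch.

```python
import time
from typing import Callable


def run_with_budget(
    step: Callable[[], None],
    budget_seconds: float,
    clock: Callable[[], float] = time.monotonic,
) -> int:
    """Call `step` repeatedly until the wall-clock budget is spent.

    Returns the number of steps completed. The clock is injectable so the
    function can be exercised without actually waiting.
    """
    deadline = clock() + budget_seconds
    steps = 0
    while clock() < deadline:
        step()
        steps += 1
    return steps


# Usage: a very short budget with a no-op step (hypothetical stand-in for training).
completed = run_with_budget(lambda: None, budget_seconds=0.01)
print(f"completed {completed} steps within the budget")
```

In a real setup the budget check would sit inside the training loop, and the validation metric would be computed once the deadline passes.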
Method
An AI agent reads the training script, forms a hypothesis about how to improve it, modifies the code, runs an experiment, and evaluates the result (e.g., validation loss). It keeps the change if the metric improves and reverts it otherwise, then repeats.
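The keep-or-revert loop described above can be sketched in a few lines. This is a toy illustration under stated assumptions, not Karpathy's implementation: the "agent" is replaced by a random perturbation (`propose_change`), the training run by a cheap analytic function (`toy_val_loss`), and the "code" being edited by a small dictionary of settings. All of these names are hypothetical.

```python
import random
from typing import Callable, Dict, List, Tuple

Config = Dict[str, float]


def toy_val_loss(config: Config) -> float:
    """Stand-in for a fixed-budget training run that returns a validation loss."""
    return (config["lr"] - 0.3) ** 2 + (config["wd"] - 0.1) ** 2 + 0.9


def propose_change(config: Config, rng: random.Random) -> Config:
    """Stand-in for the agent's hypothesis: nudge one setting on a copy."""
    candidate = dict(config)
    key = rng.choice(sorted(candidate))
    candidate[key] += rng.uniform(-0.1, 0.1)
    return candidate


def keep_or_revert_loop(
    config: Config,
    evaluate: Callable[[Config], float],
    n_experiments: int,
    seed: int = 0,
) -> Tuple[Config, List[float]]:
    """Run experiments, keeping a change only if it lowers the loss."""
    rng = random.Random(seed)
    best = dict(config)
    best_loss = evaluate(best)
    history = [best_loss]
    for _ in range(n_experiments):
        candidate = propose_change(best, rng)
        loss = evaluate(candidate)
        if loss < best_loss:
            best, best_loss = candidate, loss  # keep the change
        # Otherwise the candidate is discarded, which reverts to `best`.
        history.append(best_loss)
    return best, history


start = {"lr": 0.8, "wd": 0.5}
best, history = keep_or_revert_loop(start, toy_val_loss, n_experiments=126)
print(f"loss after {len(history) - 1} experiments: {history[0]:.4f} -> {history[-1]:.4f}")
```

Because a change is accepted only when the loss strictly improves, the best-so-far loss never increases. This greedy design is simple to audit, but it cannot accept a temporarily worse change that would enable a later gain.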
In practice
- Apply autonomous loops to marketing A/B testing.
- Use distributed agents for rapid ML milestone rediscovery.
- Focus human effort on experimental design, not execution.
Topics
- Autoresearch
- Autonomous AI Agents
- Automated Experimentation
- Machine Learning Optimization
- Distributed AI Systems
Best for: AI Scientist, Research Scientist, AI Engineer, AI Researcher, Machine Learning Engineer, AI Product Manager
Editorial summary, takeaway, and curation by AIssential. Original article published by VentureBeat.