Andrej Karpathy says humans are now the bottleneck in AI research with easy-to-measure results

· Source: The Decoder · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Software Development & Engineering, Robotics & Autonomous Systems · Depth: Expert, extended

Summary

Andrej Karpathy argues that humans are now the bottleneck in AI research wherever results can be measured objectively, and that autonomous agents should take over tasks such as hyperparameter tuning. As evidence, he cites an agent that outperformed his own manual GPT-2 optimizations. He describes a "state of AI psychosis" in which agents enable "macro actions" and sharply increase "token throughput" in software development, shifting the work from writing code directly to delegating it. Beyond coding, Karpathy built a home automation system, "Dobby the elf claw," that shows how agents can unify disparate smart home devices through natural language and APIs; he suggests a future in which agents drive tool usage and push industries toward API-first designs. He envisions "auto research," in which agents autonomously optimize models and even meta-optimize the research process itself, but cautions that current AI capabilities are "jagged": they excel at verifiable tasks and struggle in nuanced, less measurable domains. Karpathy also discusses the future of open-source AI and its narrowing gap to frontier models, as well as the long-term potential of robotics, emphasizing the interface between the digital and physical worlds.

Key takeaway

Andrej Karpathy argues that human intuition is now the bottleneck in AI research. He shows that autonomous "auto research" agents can find GPT-2 hyperparameter settings (e.g., weight decay, Adam betas) that he overlooked despite decades of experience. This enables significantly more efficient model optimization and broader agentic applications such as home automation. However, it also exposes challenges in non-verifiable domains and calls for rethinking research abstractions for an "agent-first" web.
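
To make the idea of tuning against an objective metric concrete, here is a minimal sketch of an automated search over weight decay and the Adam betas. It is not Karpathy's agent or setup: plain random search stands in for the agent, `toy_validation_loss` is a hypothetical stand-in for a real GPT-2 training run, and the search ranges and the "hand-tuned" baseline values are assumptions chosen only for illustration.

```python
import random
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Config:
    weight_decay: float
    beta1: float
    beta2: float


def toy_validation_loss(cfg: Config) -> float:
    """Hypothetical stand-in for a real training run.

    A real loop would train the model with `cfg` and return a measured
    validation loss. This bowl-shaped function only makes the sketch runnable.
    """
    return (
        (cfg.weight_decay - 0.1) ** 2
        + (cfg.beta1 - 0.9) ** 2
        + (cfg.beta2 - 0.95) ** 2
    )


def propose(rng: random.Random) -> Config:
    """Sample one candidate configuration from assumed search ranges."""
    return Config(
        weight_decay=rng.uniform(0.0, 0.3),
        beta1=rng.uniform(0.8, 0.99),
        beta2=rng.uniform(0.9, 0.999),
    )


def search(baseline: Config, trials: int, seed: int = 0) -> Tuple[Config, float]:
    """Start from the baseline and keep whichever configuration scores lowest."""
    rng = random.Random(seed)
    best_cfg, best_loss = baseline, toy_validation_loss(baseline)
    for _ in range(trials):
        candidate = propose(rng)
        loss = toy_validation_loss(candidate)
        if loss < best_loss:  # the objective metric decides, not intuition
            best_cfg, best_loss = candidate, loss
    return best_cfg, best_loss


if __name__ == "__main__":
    # Assumed hand-picked starting point, for illustration only.
    hand_tuned = Config(weight_decay=0.0, beta1=0.9, beta2=0.999)
    best_cfg, best_loss = search(hand_tuned, trials=200)
    print(best_cfg, best_loss)
```

The loop only works because the loss is a single verifiable number; the less measurable domains mentioned above have no equivalent of the `loss < best_loss` comparison.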

Best for: AI Scientist, Research Scientist, AI Researcher, Machine Learning Engineer, AI Architect

Editorial summary, takeaway, and curation by AIssential. Original article published by The Decoder.