🔬 Automating Science: World Models, Scientific Taste, Agent Loops - Andrew White
Summary
Andrew White, co-founder of Future House and Edison Scientific, discusses how AI is changing scientific discovery, drawing on his experience red-teaming GPT-4 for chemistry and co-founding companies that automate science. He describes the ChemCrow project, which combined GPT-4 with cloud lab automation and led to White House briefings over bioweapon concerns. White introduces "Cosmos," an end-to-end autonomous research system that generates hypotheses, runs experiments, analyzes data, and updates its world model; he emphasizes that putting data analysis in the loop was the breakthrough. He argues that traditional molecular dynamics (MD) and density functional theory (DFT) are overrated for complex real-world systems, citing the success of AlphaFold on desktop GPUs over D. E. Shaw Research's custom silicon for protein folding. The discussion also covers the difficulty of capturing scientific "taste" in AI, the "Ether Zero" reward hacking saga, and the future role of human scientists as AI accelerates discovery.
Key takeaway
For CTOs and R&D leaders evaluating AI's role in scientific discovery: current LLMs can already automate significant portions of the scientific method, particularly in empirical fields like biology. Your teams should focus on integrating AI agents with real-world data analysis and experimental feedback loops, rather than relying solely on human-defined "scientific taste" or computationally intensive first-principles simulations. Expect AI models to exploit loopholes in reward systems, which calls for robust verifiers and comprehensive data catalogs.
Key insights
AI agents like Cosmos are automating scientific discovery by integrating literature, data analysis, and experimental feedback loops.
Principles
- End-to-end feedback loops improve AI-driven hypothesis generation.
- Machine learning on experimental data often outperforms first-principles simulations.
- Scientific automation faces bottlenecks in logistical information, not just intelligence.
Method
Cosmos employs a "world model" (distilled memory system) that iteratively refines hypotheses through literature search, data analysis, and experiment design, with data analysis being critical for model updates.
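The loop described above can be illustrated with a toy sketch. This is not the Cosmos implementation: the `WorldModel` class, the `research_loop` function, the equal-weight belief update, and the "test the most uncertain hypothesis" rule are all assumptions made for illustration, and the literature and analysis steps are stubbed with fixed numbers.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class WorldModel:
    """Hypothetical distilled memory: one confidence value per hypothesis."""
    beliefs: Dict[str, float]
    log: List[str] = field(default_factory=list)

    def most_uncertain(self) -> str:
        # Stand-in for experiment design: test the hypothesis closest to 0.5.
        return min(self.beliefs, key=lambda h: abs(self.beliefs[h] - 0.5))

    def update(self, hypothesis: str, evidence: float, note: str) -> None:
        # Blend prior belief with new evidence (equal weights are an assumption).
        self.beliefs[hypothesis] = 0.5 * self.beliefs[hypothesis] + 0.5 * evidence
        self.log.append(note)


def research_loop(
    hypotheses: List[str],
    search_literature: Callable[[str], float],
    analyze_data: Callable[[str], float],
    cycles: int,
) -> WorldModel:
    # Seed beliefs from the literature, then refine them with data analysis.
    wm = WorldModel({h: search_literature(h) for h in hypotheses})
    for cycle in range(cycles):
        target = wm.most_uncertain()      # design the next experiment
        evidence = analyze_data(target)   # run it and analyze the result
        wm.update(target, evidence, f"cycle {cycle}: tested {target!r}")
    return wm


# Toy stubs standing in for real literature search and data analysis.
def toy_literature(h: str) -> float:
    return {"A": 0.5, "B": 0.9}[h]


def toy_analysis(h: str) -> float:
    return {"A": 1.0, "B": 0.0}[h]


result = research_loop(["A", "B"], toy_literature, toy_analysis, cycles=2)
print(result.beliefs, result.log)
```

In this sketch only the data-analysis step changes the stored beliefs after initialization, which mirrors the point that analysis results are what drive updates to the world model.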
In practice
- Prioritize data analysis in AI-driven scientific workflows.
- Use verifiable rewards for training chemistry models, anticipating reward hacking.
- Consider AI for logistical tasks in lab operations.
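To make the point about verifiable rewards and reward hacking concrete, here is a minimal sketch of a loose verifier next to a stricter one. The task (produce a molecular formula with a target carbon count), the function names, and the small element table are hypothetical and chosen only for illustration; they are not taken from the episode.

```python
import re
from typing import Dict, Optional

KNOWN_ELEMENTS = {"H", "C", "N", "O", "S", "P", "F", "Cl", "Br", "I"}  # toy subset
TOKEN = re.compile(r"([A-Z][a-z]?)(\d*)")


def parse_formula(formula: str) -> Optional[Dict[str, int]]:
    """Parse a flat formula such as 'C6H12O6'; return None if anything is malformed."""
    counts: Dict[str, int] = {}
    pos = 0
    while pos < len(formula):
        m = TOKEN.match(formula, pos)
        if m is None or m.group(1) not in KNOWN_ELEMENTS:
            return None  # unknown symbol or stray text
        n = int(m.group(2) or 1)
        if n == 0:
            return None  # a zero count is not a valid formula
        counts[m.group(1)] = counts.get(m.group(1), 0) + n
        pos = m.end()
    return counts or None  # an empty answer is invalid


def naive_reward(answer: str, target_carbons: int) -> float:
    # Loophole: any text containing the substring "C<n>" is rewarded.
    return 1.0 if f"C{target_carbons}" in answer else 0.0


def strict_reward(answer: str, target_carbons: int) -> float:
    # The whole answer must parse, and the carbon count must match exactly.
    counts = parse_formula(answer.strip())
    if counts is None:
        return 0.0
    return 1.0 if counts.get("C", 0) == target_carbons else 0.0


for candidate in ["C6H12O6", "C6 trust me", "C60"]:
    print(candidate, naive_reward(candidate, 6), strict_reward(candidate, 6))
```

The design choice is to validate the entire answer rather than look for a favorable pattern inside it, so malformed or padded outputs earn nothing. A real chemistry verifier would need far more than this (for example, structure validity checks), which this sketch does not attempt.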
Topics
- AI for Scientific Discovery
- Autonomous Research Systems
- Chemistry AI
- Protein Folding
- AI Safety
Best for: CTO, AI Engineer, AI Scientist, Research Scientist
Editorial summary, takeaway, and curation by AIssential. Original article published by Latent Space: The AI Engineer Podcast.