From discovery to design: in conversation with Ali Madani (Profluent)
Summary
Profluent CEO Ali Madani discusses the company's AI-first approach to protein design, highlighted by a $2.25 billion deal with Eli Lilly to develop AI-designed gene editors for therapeutic use. Profluent, an AI lab, builds foundation models that design proteins from scratch, in contrast to traditional discovery methods. Its work is rooted in language models similar to GPT: transformer architectures trained on over 100 billion curated protein sequences (20 trillion tokens) to generate novel proteins. Madani emphasizes the "sequence first" paradigm, arguing that it captures complex function and dynamics better than structure-based approaches. The OpenCRISPR initiative demonstrated that AI can design molecules for human genome editing, achieving functionality that rivals the product of millions of years of evolution. The Lilly deal specifically targets large gene insertion, a problem previously intractable without AI, and aims for efficient, specific delivery of kilobase-scale genetic payloads. Madani also cites Verve Therapeutics' in vivo base editing success as a significant, scalable milestone.
Key takeaway
For AI Scientists or Directors of AI/ML evaluating new therapeutic design paradigms, Profluent's success with Eli Lilly demonstrates that AI-first generative models can move beyond accelerating discovery to enabling novel biological solutions. You should explore sequence-based language models for de novo protein design, especially for complex functions or large gene insertions, as this approach offers significant advantages over traditional or structure-only methods. Consider how your team can integrate similar foundation model capabilities to tackle previously intractable problems in drug development.
Key insights
AI-driven language models can design novel, functional proteins from scratch, accelerating biological evolution for therapeutic applications.
Principles
- Sequence-first models, in Madani's view, capture complex protein function and dynamics better than structure-based approaches.
- AI can enable solutions for previously intractable biological design problems.
- Scaling data and model parameters is crucial for advancing protein design capabilities.
Method
Profluent trains transformer-based language models on vast protein sequence data (100B+ sequences, 20T+ tokens) using masked language modeling or next-token prediction, then layers in metadata and laboratory feedback for alignment.
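The two pretraining objectives named above can be illustrated with a minimal sketch of how training examples might be built from a single protein sequence. This is not Profluent's pipeline: the per-residue vocabulary, the special tokens, the 15% masking rate, the `IGNORE` label value, and the toy sequence are all assumptions chosen for illustration.

```python
import random

# Hypothetical vocabulary: one token per standard amino acid plus special tokens.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
SPECIALS = ["<pad>", "<bos>", "<eos>", "<mask>"]
VOCAB = {tok: i for i, tok in enumerate(SPECIALS + list(AMINO_ACIDS))}
IGNORE = -100  # assumed label meaning "compute no loss at this position"


def encode(seq):
    """Map a protein sequence to token ids, wrapped in <bos>/<eos>."""
    return [VOCAB["<bos>"]] + [VOCAB[a] for a in seq] + [VOCAB["<eos>"]]


def next_token_example(ids):
    """Causal objective: at each position, the target is the following token."""
    return ids[:-1], ids[1:]


def masked_example(ids, mask_prob=0.15, rng=None):
    """Masked objective: hide random residues and ask the model to recover them."""
    rng = rng or random.Random(0)
    inputs = list(ids)
    labels = [IGNORE] * len(ids)
    for i in range(1, len(ids) - 1):  # leave <bos> and <eos> untouched
        if rng.random() < mask_prob:
            labels[i] = ids[i]          # the model must predict the original residue
            inputs[i] = VOCAB["<mask>"]  # and it only sees a mask token here
    return inputs, labels


ids = encode("MKTAYIAKQR")  # arbitrary toy sequence, not a real design
causal_inputs, causal_targets = next_token_example(ids)
masked_inputs, masked_labels = masked_example(ids, rng=random.Random(42))
```

In the causal case the model reads left to right and is scored at every position, which suits generation from scratch; in the masked case it sees context on both sides and is scored only at the hidden positions.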
In practice
- Design de novo gene editors for precise human genome modification.
- Optimize multi-attribute protein properties like activity and specificity.
- Expand the range of mutations addressable by base editing 10x beyond current methods.
Topics
- AI in Biology
- Protein Design
- Gene Editing
- Foundation Models
- Eli Lilly Partnership
- Therapeutic Development
Best for: Research Scientist, Entrepreneur, AI Scientist, Director of AI/ML, Investor
Editorial summary, takeaway, and curation by AIssential. Original article published by Air Street Press.