From discovery to design: in conversation with Ali Madani (Profluent)

Source: Air Street Press · Field: Science & Research (Artificial Intelligence & Machine Learning, Life Sciences & Biology, Pharmaceuticals & Biotechnology) · Depth: Intermediate, extended

Summary

Profluent CEO Ali Madani discusses the company's AI-first approach to protein design, highlighted by a $2.25 billion deal with Eli Lilly to develop AI-designed gene editors for therapeutic use. Profluent is an AI lab that builds foundation models to design proteins from scratch, in contrast to traditional discovery methods. Its work is rooted in language models similar to GPT: transformer architectures trained on over 100 billion curated protein sequences (20 trillion tokens) generate novel proteins. Madani emphasizes a "sequence first" paradigm, arguing that it captures complex function and dynamics better than structure-based approaches. The OpenCRISPR initiative demonstrated that AI can design molecules that edit the human genome, with functionality that rivals editors shaped by millions of years of evolution. The Lilly deal specifically targets large gene insertion, a problem Madani describes as intractable without AI, and aims for efficient, specific delivery of kilobase-scale genetic payloads. Madani also points to Verve Therapeutics' in vivo base editing success as a significant, scalable milestone.

Key takeaway

For AI Scientists or Directors of AI/ML evaluating new therapeutic design paradigms, Profluent's deal with Eli Lilly suggests that AI-first generative models can move beyond accelerating discovery to enabling novel biological solutions. Explore sequence-based language models for de novo protein design, especially for complex functions or large gene insertions, where Madani argues the approach captures function and dynamics better than traditional or structure-only methods. Consider how your team can integrate similar foundation model capabilities to tackle previously intractable problems in drug development.

Key insights

AI-driven language models can design novel, functional proteins from scratch, reaching functionality for therapeutic applications that natural evolution took millions of years to produce.

Method

Profluent trains transformer-based language models on vast protein sequence data (100B+ sequences, 20T+ tokens) using masked language modeling or next-token prediction, then layers on metadata and laboratory feedback for alignment.
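
The two training objectives named above can be illustrated with a minimal sketch. This is not Profluent's code: the function names, the one-token-per-residue vocabulary, and the 15% mask rate are all assumptions chosen for illustration. It only shows how a single protein sequence is turned into training targets under each objective; the transformer itself, the metadata conditioning, and the laboratory-feedback alignment are out of scope.

```python
import random

# Assumption: one token per residue over the 20 standard amino acids.
AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")
MASK = "<mask>"  # hypothetical mask token


def tokenize(sequence):
    """Split a protein sequence into per-residue tokens."""
    tokens = list(sequence.upper())
    unknown = [t for t in tokens if t not in AMINO_ACIDS]
    if unknown:
        raise ValueError(f"unknown residues: {unknown}")
    return tokens


def next_token_pairs(tokens):
    """Next-token prediction: each prefix is the context, the following residue is the target."""
    return [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]


def mask_tokens(tokens, mask_rate=0.15, seed=0):
    """Masked language modeling: hide random residues and record what the model must recover.

    mask_rate=0.15 is an illustrative default, not a value from the article.
    """
    rng = random.Random(seed)
    masked = list(tokens)
    targets = {}  # position -> original residue
    for i, tok in enumerate(tokens):
        if rng.random() < mask_rate:
            masked[i] = MASK
            targets[i] = tok
    return masked, targets


if __name__ == "__main__":
    toks = tokenize("MKTAYIAK")  # made-up example sequence
    for context, target in next_token_pairs(toks)[:3]:
        print("".join(context), "->", target)
    masked, targets = mask_tokens(toks, mask_rate=0.3, seed=1)
    print(masked, targets)
```

In both cases the model is trained to assign high probability to the held-out residues. The difference is what it may look at: only the preceding residues for next-token prediction, or the residues on both sides of each gap for masked language modeling.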

Best for: Research Scientist, Entrepreneur, AI Scientist, Director of AI/ML, Investor

Editorial summary, takeaway, and curation by AIssential. Original article published by Air Street Press.