AF2BIND: predicting small-molecule binding sites using the pair representation of AlphaFold2

Source: Machine learning : nature.com subject feeds · Field: Technology & Digital: Artificial Intelligence & Machine Learning, Data Science & Analytics, Computational Structural Biology · Depth: Expert, extended

Summary

AF2BIND is a logistic regression model for de novo prediction of small-molecule binding sites in proteins, built on features from the pretrained AlphaFold2 (AF2) neural network. Unlike traditional methods, it does not rely on homology modeling, multiple sequence alignments (MSAs), or prior knowledge of a pocket-compatible ligand. Instead, it uses AF2's internal pair representation, augmented with 20 "bait" amino acids supplied as individual chains to tease out ligand-binding signals. The model recovers 66% of binding residues and reaches a ROC AUC of 0.936, outperforming single-representation features such as those from ESM2 and ESM1-IF. Applied to the human proteome, AF2BIND identified over 20,000 binding sites, including thousands not assigned by homology-based methods or P2Rank; many of these are shallow or surface-exposed and potentially druggable.
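To make the ROC AUC figure concrete, here is a minimal sketch of how such a score can be computed for per-residue predictions. It uses the standard rank-based definition: the probability that a randomly chosen binding residue receives a higher score than a randomly chosen non-binding residue. The labels and scores are made-up toy values, not AF2BIND output, and `roc_auc` is an illustrative helper rather than part of any AF2BIND code.

```python
import numpy as np

def roc_auc(labels, scores):
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    pos = scores[labels]      # scores of true binding residues
    neg = scores[~labels]     # scores of non-binding residues
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (pos.size * neg.size)

# Toy example (hypothetical values): 1 = residue contacts a ligand.
labels = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.2, 0.7, 0.6, 0.1, 0.4]
auc = roc_auc(labels, scores)
print(f"ROC AUC = {auc:.3f}")
```

In this toy case, 8 of the 9 positive/negative pairs are ordered correctly. A value of 0.5 corresponds to random ranking and 1.0 to perfect separation.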

Key takeaway

For AI Scientists and Research Scientists focused on drug discovery, AF2BIND offers an interpretable tool for identifying novel small-molecule binding sites. Consider adding it to early-stage drug discovery pipelines to uncover de novo ligandable sites, especially in proteins where homology-based methods or pocket finders fall short; this could accelerate the identification of new therapeutic targets.

Key insights

AlphaFold2's internal representations can be repurposed to accurately predict de novo small-molecule binding sites.

Principles

Method

AF2BIND uses AF2's pair representation, augmented with 20 "bait" amino acids, as input to a logistic regression model. It predicts the probability of each residue contacting a small-molecule ligand, without requiring MSAs or ligand knowledge.
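The method described above can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the authors' implementation: the pair tensor is random stand-in data rather than real AF2 output, the layout (target residues first, then the 20 bait residues) is assumed, using both the target-to-bait and bait-to-target slices as features is an assumption, the channel count is shrunk for the demo, and the weights are random rather than trained. The names `residue_features` and `predict_binding_proba` are hypothetical.

```python
import numpy as np

N_BAIT = 20  # one bait residue per standard amino acid (from the source)

def residue_features(pair, n_target):
    """Build one feature vector per target residue from a pair tensor.

    pair: array of shape (n_target + N_BAIT, n_target + N_BAIT, channels).
    Assumed layout: target residues first, then the 20 bait residues.
    """
    t2b = pair[:n_target, n_target:, :]                     # (L, 20, C) target -> bait
    b2t = pair[n_target:, :n_target, :].transpose(1, 0, 2)  # (L, 20, C) bait -> target
    feats = np.concatenate([t2b, b2t], axis=-1)             # (L, 20, 2C)
    return feats.reshape(n_target, -1)                      # (L, 40C)

def predict_binding_proba(features, weights, bias):
    """Logistic regression: sigmoid of a linear function of the features."""
    logits = features @ weights + bias
    return 1.0 / (1.0 + np.exp(-logits))

# Stand-in data: random numbers in place of a real AF2 pair representation.
rng = np.random.default_rng(0)
n_target, channels = 50, 8  # small channel count for the demo only
pair = rng.normal(size=(n_target + N_BAIT, n_target + N_BAIT, channels))

X = residue_features(pair, n_target)
w = rng.normal(scale=0.01, size=X.shape[1])  # untrained, illustrative weights
p_bind = predict_binding_proba(X, w, bias=0.0)
print(X.shape, p_bind.shape)
```

Because the model is linear in the features, each weight maps back to a specific bait amino acid and pair channel, which is one way a model of this kind can stay interpretable.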

Best for: AI Scientist, Research Scientist, AI Researcher, AI Data Scientist, Machine Learning Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Machine learning : nature.com subject feeds.