EvoStruct: Bridging Evolutionary and Structural Priors for Antibody CDR Design via Protein Language Model Adaptation

Source: Takara TLDR - Daily AI Papers · Field: Science & Research — Life Sciences & Biology, Mathematics & Computational Sciences · Depth: Expert, quick

Summary

EvoStruct is a method for antibody Complementarity-Determining Region (CDR) design that targets "vocabulary collapse", a failure mode common in existing equivariant graph neural network (GNN) approaches. These GNNs over-predict a small set of amino acids, such as tyrosine and glycine, because they learn residue distributions de novo from limited structural data and so miss evolutionary substitution patterns. EvoStruct addresses this by bridging a frozen protein language model (PLM) with 3D structural context from an E(3)-equivariant GNN through a cross-attention adapter. Progressive PLM unfreezing and R-Drop consistency regularization are used specifically to counter vocabulary collapse. On the CHIMERA-Bench dataset, EvoStruct achieved the highest amino acid recovery and the lowest perplexity, improving sequence recovery by 16% and reducing perplexity by 43% relative to GNN baselines. It also recovered 2.3x greater amino acid diversity and showed the highest binding-pair correlation with ground truth.
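To make the reported metrics concrete, here is a minimal sketch of how amino acid recovery and perplexity are commonly computed, plus one simple way to quantify vocabulary collapse. The function names, the toy sequences, and the choice of Shannon entropy as a diversity measure are assumptions for illustration; the source does not state how EvoStruct or CHIMERA-Bench defines diversity.

```python
import math
from collections import Counter


def amino_acid_recovery(predicted, native):
    """Fraction of positions where the predicted residue equals the native one."""
    if len(predicted) != len(native) or not native:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(p == n for p, n in zip(predicted, native))
    return matches / len(native)


def perplexity(native_probs):
    """exp of the mean negative log-probability the model gave each native residue."""
    nll = -sum(math.log(p) for p in native_probs) / len(native_probs)
    return math.exp(nll)


def residue_entropy(sequences):
    """Shannon entropy (in bits) of residue usage pooled over a set of designs.

    Hypothetical diversity proxy: low entropy means a few residues dominate.
    """
    counts = Counter("".join(sequences))
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Toy data, not taken from the paper: a made-up native loop and a
# "collapsed" design that uses only glycine and tyrosine.
native = "ARDYYGSSYFDY"
collapsed = "GYGYYGYGYYGY"

print("recovery:", amino_acid_recovery(collapsed, native))
print("entropy native:", residue_entropy([native]))
print("entropy collapsed:", residue_entropy([collapsed]))
```

A collapsed design can still score a moderate recovery on tyrosine-rich loops, which is why a diversity measure is reported alongside recovery and perplexity.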

Key takeaway

For research scientists building antibody CDR design models, EvoStruct shows one way to overcome vocabulary collapse. GNN-only methods may over-predict common amino acids, which limits functional diversity. Consider hybrid architectures that bridge protein language models with structural GNNs, combined with progressive unfreezing and consistency regularization; the reported results suggest this yields higher amino acid diversity and better binding-pair correlation.
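As an illustration of the consistency regularization mentioned above, the sketch below follows the general R-Drop formulation: the same input is passed through the model twice with different dropout masks, and a symmetric KL term penalizes disagreement between the two predicted distributions. The function names and the weight `alpha` are assumptions; the source does not give EvoStruct's exact loss or hyperparameters.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions with strictly positive q."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def r_drop_loss(probs_a, probs_b, target_index, alpha=1.0):
    """R-Drop style loss for one residue position.

    probs_a, probs_b: softmax outputs over amino acids from two forward
    passes of the same input with different dropout masks.
    alpha: hypothetical weight on the consistency term.
    """
    # Negative log-likelihood of the native residue under both passes.
    nll = -(math.log(probs_a[target_index]) + math.log(probs_b[target_index]))
    # Symmetric KL divergence between the two passes.
    consistency = 0.5 * (
        kl_divergence(probs_a, probs_b) + kl_divergence(probs_b, probs_a)
    )
    return nll + alpha * consistency


# Toy three-class example: the two passes disagree slightly.
pass_a = [0.7, 0.2, 0.1]
pass_b = [0.5, 0.3, 0.2]
print(r_drop_loss(pass_a, pass_b, target_index=0))
```

When the two passes agree exactly, the consistency term is zero and the loss reduces to the summed negative log-likelihood.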

Key insights

EvoStruct integrates PLMs and GNNs to enhance antibody CDR design diversity and accuracy by leveraging evolutionary and structural priors.

Method

EvoStruct bridges a frozen protein language model (PLM) with 3D structural context from an E(3)-equivariant GNN via a cross-attention adapter, using progressive PLM unfreezing and R-Drop consistency regularization.
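The bridging step can be pictured with a minimal single-head cross-attention sketch in plain Python. It assumes that PLM residue embeddings act as queries and GNN node features act as keys and values, with a residual connection back onto the PLM states; the source does not specify which side provides the queries, the number of heads, or the projection shapes, so every name and dimension here is hypothetical.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention_adapter(plm_states, gnn_states, w_q, w_k, w_v):
    """Fuse structural context into sequence states (assumed design).

    plm_states: L x d_plm residue embeddings from the language model.
    gnn_states: N x d_gnn node features from the structural encoder.
    w_q, w_k, w_v: projection matrices; w_v must map back to d_plm.
    """
    q = matmul(plm_states, w_q)                    # L x d
    k = matmul(gnn_states, w_k)                    # N x d
    v = matmul(gnn_states, w_v)                    # N x d_plm
    scale = math.sqrt(len(q[0]))
    k_t = [list(col) for col in zip(*k)]           # d x N
    scores = matmul(q, k_t)                        # L x N
    weights = [softmax([s / scale for s in row]) for row in scores]
    context = matmul(weights, v)                   # L x d_plm
    # Residual connection: structural context is added to the PLM states.
    fused = [
        [h + c for h, c in zip(h_row, c_row)]
        for h_row, c_row in zip(plm_states, context)
    ]
    return fused, weights


# Toy inputs: 2 residues, 3 structural nodes, identity projections.
identity = [[1.0, 0.0], [0.0, 1.0]]
plm_states = [[1.0, 0.0], [0.0, 1.0]]
gnn_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
fused, weights = cross_attention_adapter(
    plm_states, gnn_states, identity, identity, identity
)
print(weights)
```

In a real model the projections would be learned and the adapter would sit inside or on top of the PLM layers; this sketch only shows the data flow from structural features into sequence representations.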

Best for: AI Scientist, Research Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by Takara TLDR - Daily AI Papers.