Learning the PTM code through a coarse-to-fine mechanism-aware framework

2026-05-15 · Source: Machine learning : nature.com subject feeds · Field: Science & Research — Life Sciences & Biology, Artificial Intelligence & Machine Learning · Depth: Expert, short

Summary

COMPASS-PTM is a mechanism-aware, coarse-to-fine learning framework for deciphering the combinatorial "code" of post-translational modifications (PTMs). It unifies residue-level multi-label PTM prediction with enzyme-substrate assignment by jointly modeling PTM patterns and their catalytic regulators. Built on protein language models, it integrates physicochemical descriptors and a crosstalk-aware prompting mechanism to learn biologically coherent patterns of cooperative and antagonistic modifications, and it addresses the dual long-tail distribution inherent in PTM data. Across multiple proteome-scale benchmarks the model is reported to outperform existing baselines, with a 122% relative improvement in F1-score for multi-label site prediction and a 54% gain in zero-shot enzyme assignment. COMPASS-PTM also generalizes interpretably: it recovers canonical kinase motifs and links missense variants to PTM disruptions and enzyme-substrate network rewiring.
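To make the relative-improvement figures concrete, the arithmetic can be sketched as follows. The baseline F1 of 0.30 is a hypothetical value chosen only for illustration; the summary reports relative gains, not the absolute scores used here.

```python
def relative_improvement(baseline: float, new: float) -> float:
    """Relative gain of `new` over `baseline`, as a percentage."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (new - baseline) / baseline * 100.0


def score_after_gain(baseline: float, gain_pct: float) -> float:
    """Score implied by a relative gain of `gain_pct` percent."""
    return baseline * (1.0 + gain_pct / 100.0)


# Hypothetical baseline F1; NOT a number taken from the article.
baseline_f1 = 0.30
implied_f1 = score_after_gain(baseline_f1, 122.0)
print(round(implied_f1, 3))
```

Read this way, a 122% relative improvement means the score is multiplied by 2.22. It is not a gain of 122 percentage points, which would be impossible for a metric bounded by 1.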

Key takeaway

For AI Scientists and Research Scientists working on protein function and cellular signaling, COMPASS-PTM offers a single framework that predicts PTM sites and their regulatory enzymes together. Consider adopting this mechanism-aware, coarse-to-fine approach to improve the accuracy and interpretability of PTM analyses, especially when dealing with combinatorial PTM codes and long-tail data distributions. The payoff could be a more precise picture of protein regulation and disease mechanisms.

Key insights

COMPASS-PTM unifies PTM site prediction and enzyme assignment by modeling PTM patterns and catalytic regulators.

Principles

Integrate physicochemical descriptors for biological coherence.
Address dual long-tail distributions in PTM data.
Couple statistical learning with explicit biochemical knowledge.
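The summary does not say how COMPASS-PTM handles its long-tail distributions. As a rough illustration of one common remedy for label imbalance in multi-label training, the sketch below computes inverse-frequency positive-class weights and applies them in a binary cross-entropy term. The function names, the toy annotations, and the choice of weighting scheme are all assumptions, not the authors' method.

```python
import math
from collections import Counter


def positive_class_weights(label_sets, all_labels):
    """Per-label weight = (#negatives / #positives), a common
    heuristic for imbalanced multi-label training."""
    n = len(label_sets)
    counts = Counter(label for labels in label_sets for label in set(labels))
    weights = {}
    for label in all_labels:
        pos = counts.get(label, 0)
        # Labels never observed get weight 1.0 instead of dividing by zero.
        weights[label] = (n - pos) / pos if pos else 1.0
    return weights


def weighted_bce(prob, target, pos_weight, eps=1e-12):
    """Binary cross-entropy for one label, with up-weighted positives."""
    prob = min(max(prob, eps), 1.0 - eps)
    return -(pos_weight * target * math.log(prob)
             + (1 - target) * math.log(1.0 - prob))


# Toy residue annotations (hypothetical): one PTM type dominates.
sites = [{"phospho"}] * 8 + [{"phospho", "acetyl"}] + [{"sumo"}]
weights = positive_class_weights(sites, ["phospho", "acetyl", "sumo"])
print(weights)
```

With these toy counts the rare labels receive a weight of 9 while the dominant label receives about 0.11, so a missed rare modification costs far more than a missed common one.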

Method

COMPASS-PTM uses a coarse-to-fine learning framework built on protein language models. It integrates physicochemical descriptors and a crosstalk-aware prompting mechanism to jointly model PTM patterns and enzyme-substrate assignments.
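The summary does not spell out what the coarse and fine stages are. One generic reading of "coarse-to-fine" is a hierarchical factorization in which a coarse head scores broad PTM groups and a fine head scores specific types within each group. The sketch below illustrates only that generic idea; the group names, probabilities, and threshold are hypothetical and are not taken from the article.

```python
def coarse_to_fine_scores(coarse_probs, fine_given_coarse):
    """Combine P(group) with P(type | group) into a per-type score."""
    scores = {}
    for group, p_group in coarse_probs.items():
        for ptm_type, p_cond in fine_given_coarse[group].items():
            scores[ptm_type] = p_group * p_cond
    return scores


def predict_labels(scores, threshold=0.5):
    """Multi-label decision: keep every type whose score clears the threshold."""
    return sorted(t for t, s in scores.items() if s >= threshold)


# Hypothetical head outputs for a single lysine residue.
coarse = {"acylation": 0.9, "ubiquitin-like": 0.7}
fine = {
    "acylation": {"acetylation": 0.8, "succinylation": 0.3},
    "ubiquitin-like": {"ubiquitination": 0.9, "SUMOylation": 0.2},
}
scores = coarse_to_fine_scores(coarse, fine)
labels = predict_labels(scores)
print(labels)
```

Because each type is thresholded independently, one residue can carry several labels at once, which is what a residue-level multi-label formulation requires.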

In practice

Predict multi-label PTM sites with high accuracy.
Perform zero-shot enzyme-substrate assignment.
Interpret mechanistic links of missense variants.
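As a rough picture of what zero-shot enzyme-substrate assignment can look like, the sketch below ranks candidate enzymes by cosine similarity to a site representation in a shared embedding space, so an enzyme without labeled substrates can still be scored. This is a generic retrieval-style illustration, not the procedure described in the article; the enzyme names and the 3-dimensional vectors are made up.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank_enzymes(site_vec, enzyme_vecs):
    """Rank candidate enzymes by similarity to the site, best first."""
    scored = [(name, cosine(site_vec, vec)) for name, vec in enzyme_vecs.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy vectors standing in for learned embeddings (hypothetical values).
site = [0.9, 0.1, 0.0]
enzymes = {
    "kinase_A": [1.0, 0.0, 0.0],
    "kinase_B": [0.0, 1.0, 0.0],
    "kinase_C": [0.5, 0.5, 0.0],
}
ranking = rank_enzymes(site, enzymes)
print([name for name, _ in ranking])
```

In a real pipeline the vectors would come from a trained model, and the ranking would be evaluated against held-out enzymes that never appeared with labeled substrates during training.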

Topics

Post-translational Modifications
COMPASS-PTM
Enzyme-Substrate Assignment
Protein Language Models
Multi-label PTM Prediction

Best for: AI Scientist, Research Scientist, Machine Learning Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Machine learning : nature.com subject feeds.