Structure-informed deep generation enables de novo metabolite annotation in untargeted metabolomics
Summary
MetGenX is a structure-informed encoder-decoder neural network for de novo metabolite annotation in untargeted metabolomics: it generates metabolite structures directly from MS2 spectra. The work, published on April 20, 2026, reformulates the spectrum-to-structure task as a structure-to-structure generation problem, which improves accuracy and chemical space coverage. In independent tests, MetGenX achieved a top-1 accuracy of 55.9% on 1388 NIST MS2 spectra and 68.5% on 1681 spectra from real biological samples, outperforming existing in silico tools. Its structure-informed design gives robust performance in both positive and negative ionization modes without retraining. A multi-step annotation workflow built on MetGenX identified two previously uncharacterized metabolites in mouse liver untargeted metabolomics data; neither was present in major human metabolome databases.
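To make the top-1 accuracy figures above concrete, here is a minimal sketch of how a top-k accuracy score is commonly computed for ranked structure candidates. It is not taken from the MetGenX paper or code. The function name `top_k_accuracy`, the toy identifiers, and the rule that a hit means exact equality of a canonical structure identifier are all assumptions made for illustration; the summary does not state how the authors define a structural match.

```python
from typing import Sequence


def top_k_accuracy(
    ranked_candidates: Sequence[Sequence[str]],
    truths: Sequence[str],
    k: int = 1,
) -> float:
    """Fraction of spectra whose true structure appears in the top k candidates.

    Assumption: structures are compared as exact strings, e.g. a canonical
    identifier. The actual matching criterion used for MetGenX may differ.
    """
    if len(ranked_candidates) != len(truths):
        raise ValueError("need one candidate list per ground-truth structure")
    if not truths:
        raise ValueError("cannot score an empty test set")
    if k < 1:
        raise ValueError("k must be at least 1")

    hits = sum(
        1
        for candidates, truth in zip(ranked_candidates, truths)
        if truth in candidates[:k]
    )
    return hits / len(truths)


# Hypothetical toy data: one ranked candidate list per MS2 spectrum.
predictions = [
    ["A", "B", "C"],  # truth "A" is ranked first
    ["D", "E", "F"],  # truth "E" is ranked second
    ["G", "H", "I"],  # truth "X" is not generated at all
    ["J", "K", "L"],  # truth "J" is ranked first
]
truths = ["A", "E", "X", "J"]

top1 = top_k_accuracy(predictions, truths, k=1)
top3 = top_k_accuracy(predictions, truths, k=3)
print(f"top-1: {top1:.2f}, top-3: {top3:.2f}")
```

In this toy set, two of four spectra are correct at rank 1 and three of four within the top 3. By the same definition, a reported top-1 accuracy of 55.9% means that for 55.9% of the test spectra the highest-ranked generated structure was the correct one.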
Key takeaway
For metabolomics researchers who struggle to identify unknown metabolites, MetGenX offers a structure-informed deep generation tool that reached 55.9% to 68.5% top-1 accuracy in independent tests. Consider integrating it into your untargeted metabolomics workflows to strengthen de novo annotation and to speed up the discovery of novel chemical entities, especially compounds that are absent from existing databases.
Key insights
MetGenX uses a structure-informed deep neural network to generate metabolite structures directly from MS2 spectra, improving annotation accuracy.
Principles
- Reformulate spectrum-to-structure as structure-to-structure generation.
- Structure-informed design ensures robust performance across ionization modes.
Method
MetGenX employs an encoder-decoder neural network to generate metabolite structures from MS2 spectra, leveraging a structure-informed approach to enhance accuracy and chemical space coverage.
In practice
- Apply MetGenX for de novo metabolite annotation.
- Utilize MetGenX for discovering uncharacterized chemical entities.
Topics
- MetGenX
- Metabolite Annotation
- Untargeted Metabolomics
- MS2 Spectra
- Encoder-Decoder Neural Network
Best for: AI Scientist, Research Scientist
Editorial summary, takeaway, and curation by AIssential. Original article published by Machine learning : nature.com subject feeds.