PromoterAtlas: decoding regulatory sequences across Gammaproteobacteria using a transformer model

Source: Machine learning : nature.com subject feeds · Field: Science & Research — Life Sciences & Biology, Artificial Intelligence & Machine Learning, Synthetic Biology · Depth: Expert, quick

Summary

PromoterAtlas is a 1.8-million-parameter transformer model for decoding regulatory sequences across Gammaproteobacteria. It was trained on 9 million regulatory sequences from 3371 species, which addresses a limitation of earlier bacterial promoter prediction tools: they were often constrained by small datasets and species-specific training. The model accurately recognizes diverse regulatory elements across species, including ribosomal binding sites, various bacterial promoters, transcription factor binding sites, and terminators. It also functions as a whole-genome promoter annotation tool for Gammaproteobacteria, with validations supporting its predictions for different sigma (σ) factors. Its embeddings reflect cross-species evolutionary relationships, cluster promoters by σ factor identity, and can be used to predict transcription and translation levels.

Key takeaway

For synthetic biologists and bacterial geneticists working with Gammaproteobacteria, PromoterAtlas offers a cross-species tool for interpreting and engineering bacterial regulatory sequences. Consider using it for whole-genome promoter annotation and for predicting gene expression levels. Because it decodes diverse regulatory elements across species, it can inform experimental design and support more precise genetic modifications.

Key insights

PromoterAtlas is a transformer model that decodes bacterial regulatory sequences across thousands of Gammaproteobacteria species.

Method

PromoterAtlas, a 1.8M-parameter transformer, was trained on 9M regulatory sequences from 3371 gammaproteobacterial species to recognize diverse regulatory elements and annotate whole genomes.
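This summary does not say how sequences are presented to the model. As a minimal sketch, assuming a nucleotide-level vocabulary (a common choice for DNA transformers, not confirmed here), a regulatory sequence could be turned into integer token IDs as follows. The names `VOCAB`, `tokenize`, and `one_hot` are hypothetical and are not part of PromoterAtlas.

```python
# Hypothetical nucleotide-level tokenizer. The vocabulary and function names are
# illustrative assumptions, not the PromoterAtlas implementation.
VOCAB = {"A": 0, "C": 1, "G": 2, "T": 3, "N": 4}  # N stands in for any ambiguous base


def tokenize(seq: str) -> list[int]:
    """Map a DNA string to one integer token ID per nucleotide."""
    return [VOCAB.get(base, VOCAB["N"]) for base in seq.upper()]


def one_hot(tokens: list[int]) -> list[list[int]]:
    """Expand token IDs into one-hot rows of length len(VOCAB)."""
    size = len(VOCAB)
    return [[1 if i == t else 0 for i in range(size)] for t in tokens]


# TTGACA is the Escherichia coli σ70 -35 consensus hexamer, used only as sample input.
ids = tokenize("TTGACA")
rows = one_hot(ids)
```

A per-nucleotide vocabulary keeps single-base resolution, which matters when the elements of interest (promoter hexamers, ribosomal binding sites) are only a few bases long.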

Best for: AI Scientist, Research Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by Machine learning : nature.com subject feeds.