Prediction and functional interpretation of inter-chromosomal genome architecture from DNA sequence with TwinC
Summary
TwinC, an interpretable convolutional neural network, reliably predicts inter-chromosomal (trans) DNA contacts, an under-characterized aspect of 3D genome folding. Current models primarily focus on intra-chromosomal (cis) folding, overlooking these trans-genome interactions. TwinC achieved an AUROC of 0.80 on a cross-chromosomal test set derived from in situ and intact Hi-C experiments in heart tissue. The model was also trained using in situ Hi-C data from the GM12878 cell line and validated with orthogonal DNA SPRITE assays in the same cell type. Mechanistic analysis revealed that TwinC learns the significance of compartments, chromatin accessibility, clustered transcription factor binding, and G-quadruplexes in forming these trans contacts, thereby illuminating their role in gene regulation.
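To make the reported AUROC of 0.80 concrete, here is a minimal sketch of how AUROC can be computed for a contact classifier. It is not the authors' evaluation code. The function name `auroc` and the toy labels and scores are illustrative assumptions. AUROC is the probability that a randomly chosen contacting pair receives a higher score than a randomly chosen non-contacting pair, so 0.5 corresponds to chance and 1.0 to perfect ranking.

```python
def auroc(labels, scores):
    """Pairwise AUROC: fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 1 = the two loci are in contact, 0 = they are not.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.7, 0.8, 0.6, 0.3, 0.1]
toy_auroc = auroc(labels, scores)  # 7 of 9 positive/negative pairs are ordered correctly
```

The quadratic loop is fine for a toy example; for genome-scale test sets a rank-based implementation would be the practical choice.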
Key takeaway
AI scientists and research scientists developing 3D genome folding models should consider adding inter-chromosomal (trans) contact prediction, which most current models omit. TwinC's interpretation analysis links these previously overlooked contacts to compartments, chromatin accessibility, clustered transcription factor binding, and G-quadruplexes, giving a more complete picture of gene regulation. Consider using the publicly available TwinC code and data in your own research.
Key insights
TwinC predicts inter-chromosomal DNA contacts, revealing their mechanistic drivers and role in gene regulation.
Principles
- Trans-genome organization is crucial for 3D genome folding.
- Compartments and G-quadruplexes influence trans contacts.
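One way to follow up on the G-quadruplex finding is to check whether a sequence region contains a putative G-quadruplex motif. The sketch below is not part of TwinC. It uses the commonly cited pattern of four runs of at least three guanines separated by loops of 1 to 7 bases, and the name `find_g4` is a hypothetical helper. It scans only the strand it is given, so C-rich motifs on the opposite strand would need a reverse-complement pass.

```python
import re

# Putative G-quadruplex pattern: four G-runs (>= 3 G) separated by loops of 1-7 bases.
# This is a widely used heuristic, not a definition taken from the TwinC article.
G4_PATTERN = re.compile(r"G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}")


def find_g4(seq):
    """Return (start, end, motif) for each non-overlapping match on the given strand."""
    return [(m.start(), m.end(), m.group()) for m in G4_PATTERN.finditer(seq.upper())]


# Toy sequence containing a telomere-like repeat (illustrative only).
hits = find_g4("ATGGGTTAGGGTTAGGGTTAGGGAT")
```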
Method
TwinC is a convolutional neural network that predicts trans contacts from DNA sequence. It is trained on proximity ligation-dependent chromatin conformation data (Hi-C) and validated against an orthogonal, ligation-independent assay (DNA SPRITE).
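As a rough illustration of predicting a contact from a pair of sequences, the toy sketch below one-hot encodes two DNA segments, applies the same convolutional filter to each, and combines the pooled activations into a contact probability. This is not TwinC's architecture. The shared-filter "twin" arrangement is an assumption suggested by the model's name, and the filter, weight, and bias values are hypothetical, hand-picked numbers rather than learned parameters.

```python
import math

BASES = "ACGT"


def one_hot(seq):
    """Encode DNA as rows of 4 floats (A, C, G, T); unknown bases such as N become all zeros."""
    return [[1.0 if b == base else 0.0 for base in BASES] for b in seq.upper()]


def conv_max(x, kernel):
    """Slide one filter along the encoded sequence and return the maximum activation."""
    k = len(kernel)
    if len(x) < k:
        raise ValueError("sequence is shorter than the filter")
    return max(
        sum(x[i + j][c] * kernel[j][c] for j in range(k) for c in range(4))
        for i in range(len(x) - k + 1)
    )


def twin_score(seq_a, seq_b, kernel, weight, bias):
    """Score a sequence pair: same filter on both inputs, product of features, sigmoid."""
    fa = conv_max(one_hot(seq_a), kernel)
    fb = conv_max(one_hot(seq_b), kernel)
    logit = weight * fa * fb + bias
    return 1.0 / (1.0 + math.exp(-logit))


# Hypothetical filter that responds to the 3-mer "GGG" (column order: A, C, G, T).
ggg_filter = [[0.0, 0.0, 1.0, 0.0]] * 3

p_both = twin_score("ACGGGTA", "TTGGGCA", ggg_filter, weight=0.5, bias=-2.0)
p_one = twin_score("ACATATA", "TTGGGCA", ggg_filter, weight=0.5, bias=-2.0)
```

In this toy setup the pair in which both segments contain the motif scores higher than the pair in which only one does. Because the filter is shared and the features are multiplied, the score does not depend on the order of the two inputs.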
In practice
- Use TwinC to predict trans contacts from DNA sequence.
- Analyze TwinC's interpretations for regulatory insights.
Topics
- TwinC Model
- Inter-chromosomal Genome Architecture
- Convolutional Neural Networks
- 3D Genome Folding
- Hi-C Assays
Best for: AI Scientist, Research Scientist
Editorial summary, takeaway, and curation by AIssential. Original article published by Machine learning : nature.com subject feeds.