Diffusion model generating regulatory DNAs

· Source: Machine learning : nature.com subject feeds · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Life Sciences & Biology · Depth: Advanced, quick

Summary

DNA-Diffusion, a generative AI model developed by Luca Pinello's team at Harvard Medical School, generates 200-bp DNA sequences with regulatory potential. Reported on February 11, 2026, the diffusion model was trained on DNase I hypersensitive site index data from three distinct cell lines. It learns to reverse a gradual noise-addition process, which lets it generate functional DNA sequences conditioned on a specified cell type. A classifier-free guidance parameter controls generation by balancing sequence diversity against activity. In silico evaluations indicate that the generated sequences have high regulatory activity, including activation of gene expression, and are not merely memorized training data. Further analysis of the sequences reveals enriched regulatory motifs, such as transcription factor binding sites, which offer mechanistic insight.
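
To make the role of the guidance parameter concrete, here is a minimal sketch of classifier-free guidance in its common "unconditional plus scaled difference" form. It is not taken from the DNA-Diffusion code: the function name `guided_noise`, the (4, 200) array shape, and the random arrays standing in for a trained network's noise predictions are all assumptions made for illustration, and the summary does not say which guidance convention the authors use.

```python
import numpy as np

def guided_noise(eps_uncond, eps_cond, guidance_scale):
    """Combine unconditional and cell-type-conditional noise predictions.

    guidance_scale = 0 ignores the condition, 1 uses the conditional
    prediction as is, and values above 1 extrapolate past it.
    """
    return eps_uncond + guidance_scale * (eps_cond - eps_uncond)

rng = np.random.default_rng(0)

# Hypothetical stand-ins for a trained network's two noise predictions
# on one 200-bp sequence with 4 channels (A, C, G, T).
eps_uncond = rng.normal(size=(4, 200))
eps_cond = rng.normal(size=(4, 200))

for scale in (0.0, 1.0, 3.0):
    eps = guided_noise(eps_uncond, eps_cond, scale)
    # Distance from the unconditional prediction grows linearly with the scale.
    print(scale, float(np.linalg.norm(eps - eps_uncond)))
```

In this form, a larger scale pulls each denoising step harder toward the conditioned cell type, which is the usual mechanism behind the trade-off between diversity and activity described above.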

Key takeaway

For AI researchers and synthetic biologists exploring novel genetic constructs, DNA-Diffusion offers a way to generate functional regulatory DNA sequences. Your team can use this approach to design regulatory elements targeted to specific cell types, potentially accelerating gene therapy development or synthetic biology applications. Consider integrating diffusion models into your sequence-design workflow to explore diverse, active regulatory elements.

Key insights

DNA-Diffusion is a diffusion model that generates functional regulatory DNA sequences conditioned on cell type.

Principles

Method

After training on DNase I hypersensitive site index data, DNA-Diffusion learns to reverse a noise-addition process, which lets it generate 200-bp regulatory DNA sequences conditioned on cell type.
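
The noise-addition idea can be sketched with the standard Gaussian diffusion forward step on a one-hot encoded sequence. This is an illustrative sketch, not the authors' implementation: the one-hot encoding, the helper names (`one_hot`, `decode`, `add_noise`, `estimate_x0`), and the value `alpha_bar_t = 0.5` are assumptions, and the true noise is used in place of a trained network's prediction.

```python
import numpy as np

ALPHABET = "ACGT"

def one_hot(seq):
    """Encode a DNA string as a (4, length) array with one 1 per column."""
    idx = np.array([ALPHABET.index(base) for base in seq])
    x = np.zeros((4, len(seq)))
    x[idx, np.arange(len(seq))] = 1.0
    return x

def decode(x):
    """Map a (4, length) array back to a DNA string via per-column argmax."""
    return "".join(ALPHABET[i] for i in x.argmax(axis=0))

def add_noise(x0, alpha_bar_t, rng):
    """Forward step: x_t = sqrt(a) * x0 + sqrt(1 - a) * eps, eps ~ N(0, I)."""
    eps = rng.normal(size=x0.shape)
    x_t = np.sqrt(alpha_bar_t) * x0 + np.sqrt(1.0 - alpha_bar_t) * eps
    return x_t, eps

def estimate_x0(x_t, eps_pred, alpha_bar_t):
    """Invert the forward step given a prediction of the added noise."""
    return (x_t - np.sqrt(1.0 - alpha_bar_t) * eps_pred) / np.sqrt(alpha_bar_t)

rng = np.random.default_rng(0)
seq = "".join(rng.choice(list(ALPHABET), size=200))  # random 200-bp example
x0 = one_hot(seq)

x_t, eps = add_noise(x0, alpha_bar_t=0.5, rng=rng)

# A trained model would predict eps from x_t (and the cell-type label);
# here the true noise stands in for that prediction.
x0_hat = estimate_x0(x_t, eps, alpha_bar_t=0.5)
print(decode(x0_hat) == seq)
```

With a perfect noise prediction the original sequence is recovered exactly, which is why training a network to predict the added noise is enough to run the process in reverse and sample new sequences.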

Best for: AI Researcher, AI Scientist, Research Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by Machine learning : nature.com subject feeds.