De novo design of functional nucleic acids of aptamers

Source: Machine learning : nature.com subject feeds · Field: Science & Research (Life Sciences & Biology, Mathematics & Computational Sciences, Research Methodology & Innovation) · Depth: Expert, extended

Summary

InstructNA is a framework for the de novo generation of functional nucleic acids (FNAs), such as transcription factor-binding DNA and protein-binding aptamers, that does not require structural information. It integrates nucleic acid large language models (NA-LLMs) with data from high-throughput systematic evolution of ligands by exponential enrichment (HT-SELEX). An existing NA-LLM is first continually pretrained on HT-SELEX data to produce a domain-adapted FNA-LLM, and a lightweight decoder is then trained. The HC-HEBO algorithm (hill climbing–heteroscedastic and evolutionary Bayesian optimization) refines FNA designs in a continuous latent space. For the protein targets LOX1 and CXCL5, InstructNA generated 100% and 200% more strong aptamer binders, respectively, than traditional HT-SELEX (that is, two and three times as many), with sequence similarities to the original aptamers as low as 38%.

Key takeaway

For AI researchers and computational biologists working on molecular design, InstructNA offers a way around the limitations of traditional FNA discovery. Teams should consider combining NA-LLMs with HT-SELEX data and Bayesian optimization to accelerate the development of novel aptamers and other functional nucleic acids, potentially yielding higher-affinity binders with greater sequence diversity than conventional methods.

Key insights

InstructNA combines NA-LLMs and HT-SELEX with Bayesian optimization for efficient de novo functional nucleic acid design.

Principles

Method

InstructNA continually pretrains NA-LLMs with HT-SELEX data, trains a decoder, and uses the HC-HEBO algorithm for iterative, function-guided optimization of FNA sequences in a continuous latent space.
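
The optimization step can be illustrated with a minimal sketch. It is not the authors' HC-HEBO implementation: the Bayesian optimization component is omitted, and only the hill-climbing idea of perturbing a point in a continuous latent space, decoding it to a sequence, and keeping improvements is shown. The functions `encode`, `decode`, `toy_score`, and `hill_climb`, the motif, and all parameter values are hypothetical stand-ins; in the framework described above, the latent space would come from the FNA-LLM and its decoder, and the score from function-guided feedback.

```python
import random

ALPHABET = "ACGT"

def encode(seq):
    """Toy encoder (assumption): one-hot embed each base into a flat vector."""
    return [1.0 if base == letter else 0.0 for base in seq for letter in ALPHABET]

def decode(z):
    """Toy decoder (assumption): pick the highest-valued base at each position."""
    bases = []
    for i in range(0, len(z), len(ALPHABET)):
        chunk = z[i:i + len(ALPHABET)]
        bases.append(ALPHABET[chunk.index(max(chunk))])
    return "".join(bases)

def toy_score(seq, motif="GGTTGG"):
    """Hypothetical fitness: best position-wise match count to a motif
    over all windows of the sequence. Stands in for a binding signal."""
    best = 0
    for start in range(len(seq) - len(motif) + 1):
        window = seq[start:start + len(motif)]
        best = max(best, sum(a == b for a, b in zip(window, motif)))
    return best

def hill_climb(seed, score_fn, steps=500, sigma=0.6, rng=None):
    """Greedy hill climbing in the toy latent space.

    Each step adds Gaussian noise to the current latent vector, decodes
    the result, and accepts the move only if the score strictly improves.
    """
    rng = rng or random.Random(0)
    z = encode(seed)
    best_seq, best_score = seed, score_fn(seed)
    for _ in range(steps):
        candidate = [v + rng.gauss(0.0, sigma) for v in z]
        seq = decode(candidate)
        score = score_fn(seq)
        if score > best_score:
            # Re-encode the accepted sequence so noise does not accumulate.
            z, best_seq, best_score = encode(seq), seq, score
    return best_seq, best_score

if __name__ == "__main__":
    start = "ACACACACACAC"
    designed, score = hill_climb(start, toy_score)
    print(start, toy_score(start), "->", designed, score)
```

Because a move is accepted only on strict improvement, the returned score is never lower than the seed's. A Bayesian optimizer would replace the blind Gaussian proposals with candidates chosen by a surrogate model of the score, which matters when each evaluation is an expensive experiment.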

Best for: AI Researcher, AI Scientist, Research Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by Machine learning : nature.com subject feeds.