Customized Amazon Nova models improve molecular-property prediction in drug discovery

2026-04-15 · Source: Amazon Science homepage · Field: Science & Research — Health & Medical Research, Life Sciences & Biology, Research Methodology & Innovation · Depth: Advanced, medium

Summary

Amazon's Generative AI Innovation Center, in collaboration with Nimbus Therapeutics, has developed a customized Amazon Nova large language model (LLM) that significantly improves molecular-property prediction in drug discovery. This single fine-tuned LLM unifies the prediction of 11 critical molecular properties across lipophilicity, permeability, and clearance, a task that traditionally requires multiple specialized graph neural networks (GNNs). The approach applies supervised fine tuning (SFT) and then reinforcement fine tuning (RFT) to Nova 2 Lite. The resulting model matches or exceeds GNN accuracy on 7 of 11 properties, and its average RMSE is only 5% higher than that of the baseline GNNs. The solution streamlines the drug discovery workflow, reduces operational complexity, and gives medicinal chemists a conversational interface. It could help accelerate drug development, which currently takes 10-15 years and costs over $2 billion per drug.

Key takeaway

For medicinal chemists and biotech teams evaluating molecular properties, this advancement means you can consolidate multiple GNN-based prediction tasks into a single, interactive LLM. This not only simplifies your workflow and reduces infrastructure overhead but also enables conversational interaction for reasoning and molecular modification suggestions, accelerating early-stage drug design and increasing viable candidate throughput.

Key insights

A single fine-tuned LLM can unify complex molecular-property prediction, matching or outperforming specialized GNNs on 7 of 11 properties.
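
The kind of comparison behind this insight can be sketched in a few lines of Python. This is a minimal illustration, not the evaluation code from the original article: the function names are hypothetical, and averaging the per-property relative gaps is only one plausible reading of "average RMSE 5% higher".

```python
import math


def rmse(y_true, y_pred):
    """Root-mean-square error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and the same length")
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def relative_gap(llm_rmse, gnn_rmse):
    """Fractional RMSE difference; positive means the LLM is worse."""
    return (llm_rmse - gnn_rmse) / gnn_rmse


def mean_relative_gap(per_property):
    """Average the relative gap over properties.

    per_property maps a property name to (llm_rmse, gnn_rmse).
    Assumption: gaps are averaged per property, not pooled.
    """
    gaps = [relative_gap(llm, gnn) for llm, gnn in per_property.values()]
    return sum(gaps) / len(gaps)
```

Under this reading, an LLM RMSE of 1.05 against a GNN RMSE of 1.00 on one property is a 5% gap, and gaps of opposite sign on different properties can offset each other in the average.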

Principles

Foundational knowledge precedes performance optimization.
Huber loss provides stable, effective RFT rewards.

Method

Customize a general-purpose LLM (Nova 2 Lite) using supervised fine tuning (SFT) on 55,000+ molecules, followed by reinforcement fine tuning (RFT) with Huber loss-based rewards.
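
To make the reward idea concrete, here is a minimal sketch of a Huber loss-based reward for a numeric prediction. The Huber loss itself is the standard definition; the mapping from loss to reward (1 / (1 + loss)) and the default delta of 1.0 are assumptions for illustration, since the summary does not say how the loss was turned into a reward.

```python
def huber_loss(error, delta=1.0):
    """Standard Huber loss: quadratic near zero, linear for large errors."""
    abs_err = abs(error)
    if abs_err <= delta:
        return 0.5 * abs_err ** 2
    return delta * (abs_err - 0.5 * delta)


def huber_reward(predicted, target, delta=1.0):
    """Hypothetical reward in (0, 1]; 1.0 means an exact prediction.

    Assumption: reward = 1 / (1 + Huber loss). Other monotone
    mappings (for example, the negated loss) would work as well.
    """
    return 1.0 / (1.0 + huber_loss(predicted - target, delta))
```

Because the loss grows only linearly once the error exceeds delta, a single badly wrong prediction does not dominate the reward signal the way it would under squared error, which is one common reason Huber loss is described as stable.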

In practice

Use Nova Forge for LLM customization on SageMaker.
Employ SFT for domain-specific knowledge acquisition.
Apply RFT for predictive judgment and error minimization.
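
As a rough illustration of the SFT step, the sketch below turns labeled molecules into prompt/completion records. Everything about the layout is an assumption: the summary does not describe the training data format, so the SMILES input, the prompt wording, the "prompt"/"completion" keys, and the JSONL output are hypothetical and are not the Nova Forge or SageMaker schema.

```python
import json


def make_sft_record(smiles, property_name, value):
    """Build one hypothetical prompt/completion training record.

    Assumption: molecules are given as SMILES strings and the target
    is written as plain text with two decimal places.
    """
    prompt = f"Predict the {property_name} of the molecule with SMILES {smiles}."
    completion = f"{value:.2f}"
    return {"prompt": prompt, "completion": completion}


def to_jsonl(records):
    """Serialize records as JSON Lines, one record per line."""
    return "\n".join(json.dumps(record) for record in records)


# Placeholder molecule and value for illustration, not measured data.
example = make_sft_record("CCO", "lipophilicity", -0.31)
```

The same records could later supply the targets for an RFT reward, since each completion carries the numeric label that a prediction would be scored against.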

Topics

Customized LLMs
Drug Discovery
Molecular Property Prediction
Graph Neural Networks
Supervised Fine Tuning

Best for: AI Scientist, Machine Learning Engineer, Research Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by Amazon Science homepage.