RxnNano: Training Compact LLMs for Chemical Reaction and Retrosynthesis Prediction via Hierarchical Curriculum Learning
Summary
RxnNano is a compact 0.5-billion-parameter Large Language Model (LLM) designed for chemical reaction and retrosynthesis prediction, significantly outperforming larger LLMs (over 7 billion parameters) and domain-specific baselines. The model achieves a 23.5% Top-1 accuracy improvement on rigorous benchmarks without test-time augmentation. RxnNano prioritizes deep chemical understanding over parameter scaling through three key innovations: a Latent Chemical Consistency objective that models reactions as movements on a continuous chemical manifold, a Hierarchical Cognitive Curriculum progressing from syntax to semantic reasoning, and Atom-Map Permutation Invariance (AMPI) to learn invariant relational topology. The framework also incorporates structured plan-based reasoning to enhance performance. The work addresses issues in existing literature such as inefficient scaling, misleading evaluation on augmented data, and misuse of Atom-Atom Mapping (AAM).
Key takeaway
For research scientists developing AI models for chemical synthesis, this work demonstrates that focusing on deep chemical understanding and structured training paradigms can yield superior performance with significantly smaller models. You should consider integrating hierarchical curriculum learning, latent cycle consistency, and atom-map permutation invariance into your model architectures to improve accuracy and computational efficiency, rather than solely pursuing larger parameter counts or relying on test-time augmentation.
Key insights
Deep chemical understanding and structured training enable compact LLMs to outperform larger models in reaction prediction.
Principles
- Chemical understanding trumps parameter scale.
- Reactions are movements on a chemical manifold.
- Curriculum learning builds robust chemical intuition.
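The "movements on a chemical manifold" idea can be illustrated with a toy cycle-consistency check: move a latent vector forward (reactant to product), move it back, and penalize the distance from where it started. This is a minimal sketch under stated assumptions, not the paper's objective. The function name `cycle_consistency_loss`, the mean-squared-error form, and the fixed-shift toy maps are all hypothetical choices made for illustration.

```python
from typing import Callable, Sequence

Vector = Sequence[float]


def cycle_consistency_loss(
    z_reactant: Vector,
    forward: Callable[[Vector], Vector],
    backward: Callable[[Vector], Vector],
) -> float:
    """Mean squared error between a latent vector and its round trip
    (reactant -> product -> reactant). Hypothetical loss form."""
    z_round_trip = backward(forward(z_reactant))
    squared = [(a - b) ** 2 for a, b in zip(z_reactant, z_round_trip)]
    return sum(squared) / len(squared)


# Toy stand-ins for learned maps: the forward "reaction" shifts the latent
# by a fixed displacement; the inverses undo all or only half of that shift.
shift = [0.5, -1.0, 2.0]


def forward(z: Vector) -> list:
    return [a + d for a, d in zip(z, shift)]


def exact_inverse(z: Vector) -> list:
    return [a - d for a, d in zip(z, shift)]


def sloppy_inverse(z: Vector) -> list:
    return [a - 0.5 * d for a, d in zip(z, shift)]


z = [1.0, 2.0, 3.0]
loss_exact = cycle_consistency_loss(z, forward, exact_inverse)
loss_sloppy = cycle_consistency_loss(z, forward, sloppy_inverse)
```

With the exact inverse the round trip returns to the starting point and the loss vanishes; the sloppy inverse leaves a residual, which is the signal such an objective would train against.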
Method
RxnNano employs a three-stage Hierarchical Cognitive Curriculum (Syntactic, Denoising, Semantic), Latent Chemical Consistency for reversible transformations, and Atom-Map Permutation Invariance (AMPI) to learn relational topology, alongside structured plan-based reasoning.
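The three-stage ordering can be sketched as a simple step-based schedule. The stage names follow the summary above; the step budgets, the `Stage` class, and the `stage_at` helper are hypothetical and only illustrate how a training loop might decide which stage a given step belongs to.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Stage:
    name: str
    steps: int  # hypothetical training-step budget, not taken from the paper


# Stage order follows the summary; the budgets are made up for illustration.
CURRICULUM: Tuple[Stage, ...] = (
    Stage("syntactic", 1000),
    Stage("denoising", 2000),
    Stage("semantic", 3000),
)


def stage_at(step: int, curriculum: Tuple[Stage, ...] = CURRICULUM) -> Stage:
    """Return the stage that owns a given global training step."""
    if step < 0:
        raise ValueError("step must be non-negative")
    boundary = 0
    for stage in curriculum:
        boundary += stage.steps
        if step < boundary:
            return stage
    # Once every budget is spent, remain in the final stage.
    return curriculum[-1]
```

A training loop would call `stage_at(global_step)` and select that stage's data or objective; how each stage is actually constructed is not described in this summary.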
In practice
- Implement Latent Chemical Consistency for reversible transformations.
- Use a multi-stage curriculum for chemical reasoning.
- Apply Atom-Map Permutation Invariance for generalization.
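One plausible way to apply atom-map permutation invariance is as a data transformation: relabel the atom-map numbers of a mapped reaction SMILES with a random bijection, applied identically to both sides, so that the atom correspondence is preserved while the specific numbers carry no information. The sketch below assumes that reading. The helper names and the regex-based parsing are illustrative assumptions, not the paper's implementation, and the regex only handles map numbers written as `:N]` inside bracket atoms.

```python
import random
import re
from typing import List

# Atom-map suffix inside a bracket atom, e.g. the ":1" in "[CH3:1]".
MAP_PATTERN = re.compile(r":(\d+)\]")


def permute_atom_maps(mapped_rxn: str, rng: random.Random) -> str:
    """Relabel atom-map numbers with a random bijection, used consistently
    on reactants and products so the correspondence is unchanged."""
    labels = sorted({int(m) for m in MAP_PATTERN.findall(mapped_rxn)})
    shuffled = labels[:]
    rng.shuffle(shuffled)
    relabel = dict(zip(labels, shuffled))
    return MAP_PATTERN.sub(
        lambda m: f":{relabel[int(m.group(1))]}]", mapped_rxn
    )


def strip_atom_maps(mapped_rxn: str) -> str:
    """Drop all atom-map numbers, leaving the bracket atoms in place."""
    return MAP_PATTERN.sub("]", mapped_rxn)


def atom_pairing(mapped_rxn: str) -> List[int]:
    """For each mapped atom on the left, the position of its partner
    among the mapped atoms on the right."""
    left, right = mapped_rxn.split(">>")
    right_labels = MAP_PATTERN.findall(right)
    return [right_labels.index(label) for label in MAP_PATTERN.findall(left)]


# Illustrative mapped esterification (acetic acid + methanol).
rxn = (
    "[CH3:1][C:2](=[O:3])[OH:4].[CH3:5][OH:6]"
    ">>[CH3:1][C:2](=[O:3])[O:6][CH3:5].[OH2:4]"
)
variant = permute_atom_maps(rxn, random.Random(0))
```

The relabeled `variant` describes the same reaction: stripping the map numbers yields the same string, and `atom_pairing` returns the same reactant-to-product correspondence for both. A model trained on many such variants cannot rely on particular map numbers, only on which atoms correspond.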
Topics
- Chemical Reaction Prediction
- Retrosynthesis
- Large Language Models
- Hierarchical Curriculum Learning
- Atom-Map Permutation Invariance
Best for: Research Scientist, AI Researcher, AI Scientist, Machine Learning Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by cs.LG updates on arXiv.org.