RxnNano: Training Compact LLMs for Chemical Reaction and Retrosynthesis Prediction via Hierarchical Curriculum Learning

Source: cs.LG updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Physical Sciences & Chemistry · Depth: Expert, extended

Summary

RxnNano is a compact 0.5-billion-parameter large language model (LLM) for chemical reaction and retrosynthesis prediction. It significantly outperforms both larger LLMs (over 7 billion parameters) and domain-specific baselines, achieving a 23.5% improvement in Top-1 accuracy on rigorous benchmarks without test-time augmentation. Rather than scaling parameters, RxnNano prioritizes deep chemical understanding through three key innovations: a Latent Chemical Consistency objective that models reactions as movements on a continuous chemical manifold, a Hierarchical Cognitive Curriculum that progresses from syntax to semantic reasoning, and Atom-Map Permutation Invariance (AMPI) to learn invariant relational topology. The framework also incorporates structured plan-based reasoning to enhance performance. Together, these choices target three problems the work identifies in the existing literature: inefficient scaling, misleading evaluation on augmented data, and misuse of Atom-Atom Mapping (AAM).

Key takeaway

For research scientists developing AI models for chemical synthesis, this work demonstrates that focusing on deep chemical understanding and structured training paradigms can yield superior performance with significantly smaller models. Consider integrating hierarchical curriculum learning, Latent Chemical Consistency, and atom-map permutation invariance into your training pipeline to improve accuracy and computational efficiency, rather than solely pursuing larger parameter counts or relying on test-time augmentation.

Key insights

Deep chemical understanding and structured training enable compact LLMs to outperform larger models in reaction prediction.

Method

RxnNano employs a three-stage Hierarchical Cognitive Curriculum (Syntactic, Denoising, Semantic), Latent Chemical Consistency for reversible transformations, and Atom-Map Permutation Invariance (AMPI) to learn relational topology, alongside structured plan-based reasoning.
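
The idea behind atom-map permutation invariance can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function names `permute_atom_maps` and `atom_correspondence`, the regex-based parsing, and the example esterification string are all assumptions made for illustration. The sketch relabels the atom-map numbers of a mapped reaction SMILES with a random bijection, then checks that the reactant-to-product atom correspondence (the relational structure) is unchanged even though the label values differ.

```python
import random
import re

# Atom-map labels appear inside bracket atoms as ":<digits>]", e.g. "[CH3:1]".
ATOM_MAP = re.compile(r":(\d+)\]")


def permute_atom_maps(rxn_smiles, rng):
    """Relabel atom-map numbers with a random bijection (hypothetical helper).

    The same mapping is applied on both sides of the reaction, so every
    reactant atom still points at the same product atom afterwards.
    """
    labels = sorted({int(m) for m in ATOM_MAP.findall(rxn_smiles)})
    shuffled = labels[:]
    rng.shuffle(shuffled)
    mapping = dict(zip(labels, shuffled))
    return ATOM_MAP.sub(lambda m: f":{mapping[int(m.group(1))]}]", rxn_smiles)


def atom_correspondence(rxn_smiles):
    """Return label-free (reactant position, product position) pairs.

    Positions count mapped atoms in order of appearance on each side, so the
    result depends only on which atoms correspond, not on the label values.
    """
    reactants, _agents, products = rxn_smiles.split(">")
    r_labels = [int(x) for x in ATOM_MAP.findall(reactants)]
    p_labels = [int(x) for x in ATOM_MAP.findall(products)]
    p_pos = {label: j for j, label in enumerate(p_labels)}
    return {(i, p_pos[lab]) for i, lab in enumerate(r_labels) if lab in p_pos}


# Illustrative mapped reaction (assumed example): acetic acid + methanol.
rxn = (
    "[CH3:1][C:2](=[O:3])[OH:4].[CH3:5][OH:6]"
    ">>[CH3:1][C:2](=[O:3])[O:6][CH3:5].[OH2:4]"
)
rng = random.Random(0)
permuted = permute_atom_maps(rxn, rng)

print(permuted)
print(atom_correspondence(permuted) == atom_correspondence(rxn))
```

Under this reading, a model trained on several such relabelings of the same reaction cannot rely on the specific label values and has to use the correspondence itself. How RxnNano applies the permutation during training is not described in this summary.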

Best for: Research Scientist, AI Researcher, AI Scientist, Machine Learning Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.LG updates on arXiv.org.