Neuro-Symbolic ODE Discovery with Latent Grammar Flow

2026-04-21 · Source: cs.AI updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Mathematics & Computational Sciences, Engineering & Applied Sciences · Depth: Expert, extended

Summary

Latent Grammar Flow (LGF) is a neuro-symbolic generative framework for discovering ordinary differential equations (ODEs) from data. Because its output is a symbolic equation rather than a black-box model, the result is interpretable and transferable. Developed by Karin Yu, Eleni Chatzi, and Georgios Kissas, LGF embeds grammar-based equation representations into a discrete latent space and applies a behavioral loss so that semantically similar equations lie close together. A discrete flow model then guides the recursive sampling of candidate equations toward those that best fit the observed data. Domain knowledge and constraints, such as stability, can be incorporated either through grammar rules or as conditional predictors. The framework introduces a grammar quantisation autoencoder (GQAE) for the discrete latent embedding and uses a two-stage nested optimization process in which the Nelder-Mead algorithm optimizes scalar values. Across three benchmarks, LGF shows higher sample efficiency and accuracy than ODEFormer, PySR, ProGED, and GODE, particularly for partially observed systems and noisy data.

Key takeaway

For AI scientists and machine learning engineers working on interpretable dynamical systems, LGF offers a way to infer ODEs from noisy data while building domain knowledge, such as system order and stability, directly into the discovery process, so that physically meaningful equations are found more efficiently. Consider LGF where model transparency and transferability are critical, especially for implicit ODEs or partially observed systems; on the reported benchmarks it is more accurate and sample-efficient than ODEFormer, PySR, ProGED, and GODE.

Key insights

LGF discovers ODEs from data by embedding grammar-based representations into a guided discrete latent space.

Principles

Symbolic formulations offer interpretability and transferability.
Grammars enforce syntactic constraints in equation discovery.
Behavioral similarity should guide latent space organization.
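
To make the grammar principle concrete, here is a minimal Python sketch; it is an illustration, not the authors' implementation. The toy grammar, its symbols (E, T, F, x, c), and the helpers `sample` and `decode` are all hypothetical. The sketch stores an equation as a sequence of production-rule choices, so every sequence that decodes at all is syntactically valid by construction.

```python
import random

# Hypothetical toy grammar for a scalar ODE right-hand side dx/dt = E.
# "x" is the state, "c" a placeholder constant to be fitted later.
GRAMMAR = {
    "E": [["E", "+", "T"], ["E", "-", "T"], ["T"]],
    "T": [["T", "*", "F"], ["F"]],
    "F": [["x"], ["c"], ["sin", "(", "E", ")"]],
}

def sample(symbol, rng, depth=0, max_depth=4):
    """Expand `symbol` leftmost-first; return (tokens, rule_sequence)."""
    if symbol not in GRAMMAR:  # terminal token
        return [symbol], []
    options = GRAMMAR[symbol]
    if depth >= max_depth:
        # Force termination: take the production with the fewest nonterminals.
        idx = min(range(len(options)),
                  key=lambda i: sum(s in GRAMMAR for s in options[i]))
    else:
        idx = rng.randrange(len(options))
    tokens, rules = [], [(symbol, idx)]
    for child in options[idx]:
        child_tokens, child_rules = sample(child, rng, depth + 1, max_depth)
        tokens += child_tokens
        rules += child_rules
    return tokens, rules

def decode(rules):
    """Rebuild the token list from a rule sequence (leftmost derivation)."""
    it = iter(rules)

    def expand(symbol):
        if symbol not in GRAMMAR:
            return [symbol]
        head, idx = next(it)
        assert head == symbol, "rule sequence does not match the grammar"
        return [tok for child in GRAMMAR[symbol][idx] for tok in expand(child)]

    return expand("E")

rng = random.Random(0)
tokens, rules = sample("E", rng)
print("dx/dt =", "".join(tokens))
```

Because candidates are built only from productions, a malformed string such as `x+*c` cannot occur. Under the same assumption, domain knowledge could be expressed by removing or adding productions, for example dropping the `sin` rule when only polynomial dynamics are plausible.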

Method

LGF uses a grammar quantisation autoencoder (GQAE) for discrete latent embedding, applies a behavioral loss for semantic clustering, and employs a discrete flow model with conditional guidance for sampling, followed by Nelder-Mead optimization for scalar values.
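
The summary does not define the behavioral loss, so the following Python sketch only illustrates the underlying idea of comparing equations by what they compute rather than by how they are written. The function `behavioral_distance`, the probe states, and the three example right-hand sides are assumptions made for illustration.

```python
import math

def behavioral_distance(f, g, probes):
    """Root-mean-square difference of two right-hand sides on shared probe states."""
    total = sum((f(x) - g(x)) * (f(x) - g(x)) for x in probes)
    return math.sqrt(total / len(probes))

# Hypothetical probe states covering the interval [-2, 2].
probes = [-2.0 + 0.5 * i for i in range(9)]

# Two syntactically different but equivalent candidates, and one that
# looks similar on paper but behaves differently.
double_a = lambda x: x + x
double_b = lambda x: 2.0 * x
quadratic = lambda x: 2.0 * x * x

d_same = behavioral_distance(double_a, double_b, probes)
d_diff = behavioral_distance(double_b, quadratic, probes)
print(d_same, d_diff)
```

A loss built on such a distance would pull `x + x` and `2.0 * x` toward the same region of the latent space while pushing `2.0 * x * x` away, which is one way to read "semantic clustering" in the method description.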

In practice

Integrate domain knowledge like stability into ODE discovery.
Use discrete latent spaces to reduce rejection sampling.
Employ two-stage optimization for scalar identification.
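
The two-stage idea in the last item can be sketched as follows, under one plausible reading of "nested": an outer loop over candidate structures and an inner Nelder-Mead search over each structure's scalar. Everything here is an assumption for illustration, including the candidate set, the helpers `rk4`, `nelder_mead`, and `fit_constant`, and the synthetic noise-free data. The simplex routine is a compact, fixed-budget variant written to keep the example dependency-free.

```python
import math

def rk4(rhs, x0, ts):
    """Integrate the scalar ODE dx/dt = rhs(x) with classical RK4 on the grid ts."""
    xs = [x0]
    for t0, t1 in zip(ts, ts[1:]):
        h, x = t1 - t0, xs[-1]
        k1 = rhs(x)
        k2 = rhs(x + 0.5 * h * k1)
        k3 = rhs(x + 0.5 * h * k2)
        k4 = rhs(x + h * k3)
        xs.append(x + h * (k1 + 2 * k2 + 2 * k3 + k4) / 6)
    return xs

def nelder_mead(f, x0, step=0.5, iters=200):
    """Compact Nelder-Mead: reflect, expand, contract, shrink (fixed budget)."""
    n = len(x0)
    simplex = [list(x0)] + [
        [v + (step if j == i else 0.0) for j, v in enumerate(x0)] for i in range(n)
    ]
    for _ in range(iters):
        simplex.sort(key=f)
        best, worst = simplex[0], simplex[-1]
        centroid = [sum(p[j] for p in simplex[:-1]) / n for j in range(n)]

        def along(t):
            # Point on the line through the centroid, away from the worst vertex.
            return [c + t * (c - w) for c, w in zip(centroid, worst)]

        refl = along(1.0)
        if f(refl) < f(best):
            expd = along(2.0)
            simplex[-1] = expd if f(expd) < f(refl) else refl
        elif f(refl) < f(simplex[-2]):
            simplex[-1] = refl
        else:
            # Outside contraction if the reflection beat the worst vertex, else inside.
            contr = along(0.5 if f(refl) < f(worst) else -0.5)
            if f(contr) < min(f(refl), f(worst)):
                simplex[-1] = contr
            else:
                # Shrink every vertex halfway toward the best one.
                simplex = [[(b + v) / 2 for b, v in zip(best, p)] for p in simplex]
    return min(simplex, key=f)

# Hypothetical candidate structures; in LGF these would come from the flow model.
CANDIDATES = {
    "-a*x": lambda a, x: -a * x,
    "-a*x*x": lambda a, x: -a * x * x,
    "-a": lambda a, x: -a,
}

def fit_constant(form, ts, data):
    """Inner stage: fit the scalar `a` of one structure to the trajectory."""
    def loss(params):
        sim = rk4(lambda x: form(params[0], x), data[0], ts)
        mse = sum((s - d) * (s - d) for s, d in zip(sim, data)) / len(data)
        return mse if math.isfinite(mse) else math.inf  # guard against blow-up
    best = nelder_mead(loss, [1.0])
    return best[0], loss(best)

# Synthetic, noise-free data from dx/dt = -1.3 * x with x(0) = 2.
ts = [0.1 * i for i in range(21)]
data = [2.0 * math.exp(-1.3 * t) for t in ts]

# Outer stage: score every structure by its best achievable fit.
results = {name: fit_constant(form, ts, data) for name, form in CANDIDATES.items()}
best_name = min(results, key=lambda name: results[name][1])
best_a = results[best_name][0]
print(best_name, round(best_a, 3))
```

Separating structure from scalars keeps the inner problem low-dimensional, which suits a derivative-free method like Nelder-Mead. In a real pipeline a library routine such as SciPy's `scipy.optimize.minimize(..., method="Nelder-Mead")` would normally replace the hand-written simplex search.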

Topics

Neuro-Symbolic AI
Ordinary Differential Equations
Symbolic Regression
Latent Grammar Flow
Grammar Quantisation Autoencoder

Best for: AI Scientist, Machine Learning Engineer, Research Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.AI updates on arXiv.org.