FunctionEvolve: Structure-Guided Symbolic Regression with LLMs
Summary
FunctionEvolve is an evolutionary framework for symbolic regression, the task of recovering explicit scientific laws from data. It targets a weakness of current LLM-driven systems, which are often structure-blind: they lack explicit mechanisms for local mutation and rely on brittle coefficient fitting. FunctionEvolve organizes its search around expression trees. Structural summaries drive diverse parent selection, local tree edits preserve useful subexpressions, and structure-aware fitting decomposes, constrains, and simplifies coefficients so that candidates are scored reliably. It uses only elementary function families and no additional domain-specific rules. On the 129-task synthetic subset of LLM-SRBench, FunctionEvolve with Claude Opus 4.6 recovered 107 exact forms (82.9% SA@50), 4.5x above same-backbone baselines, and reached 55.8% SA@1, 3.6x above the strongest previously published top-1 result. Ablations confirm that structure-visible search is crucial for reliable recovery.
Key takeaway
Machine learning engineers building symbolic regression models should integrate explicit structural guidance into their search frameworks. FunctionEvolve shows that expression trees and structure-aware coefficient fitting substantially improve the exact recovery of scientific laws from data. Consider adopting similar structure-visible search mechanisms and LLM-guided refinements for higher accuracy and more reliable results, especially on complex datasets.
Key insights
Structure-guided symbolic regression with LLMs significantly improves exact form recovery by using expression trees.
Principles
- Explicit structural guidance enhances symbolic regression.
- Decomposed coefficient fitting improves scoring reliability.
- Local tree edits preserve valuable subexpressions.
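To make the last principle concrete, here is a minimal sketch of a local tree edit. It is an illustration only, not FunctionEvolve's implementation: the nested-tuple tree encoding and the helper names `replace_at` and `to_str` are assumptions made for this example. The edit swaps one subtree and reuses every other subexpression unchanged.

```python
from typing import Union

# Assumed encoding: a tree is a variable name, a numeric constant,
# or a tuple (operator, child, child, ...).
Tree = Union[str, float, tuple]


def replace_at(tree: Tree, path: tuple, new_subtree: Tree) -> Tree:
    """Return a copy of `tree` with the node at `path` swapped for `new_subtree`.

    `path` is a sequence of 0-based child indices (counted after the operator).
    Subtrees that are not on the path are reused as-is, not rebuilt.
    """
    if not path:
        return new_subtree
    op, *children = tree
    i = path[0]
    children[i] = replace_at(children[i], path[1:], new_subtree)
    return (op, *children)


def to_str(tree: Tree) -> str:
    """Render a tree as a readable infix/prefix string."""
    if isinstance(tree, tuple):
        op, *children = tree
        if len(children) == 1:
            return f"{op}({to_str(children[0])})"
        return "(" + f" {op} ".join(to_str(c) for c in children) + ")"
    return str(tree)


parent = ("+", ("*", 2.0, ("sin", "x")), ("*", 0.5, "x"))
# Local edit: replace only the second term; the sin term is left untouched.
child = replace_at(parent, (1,), ("exp", "x"))

print(to_str(parent))  # ((2.0 * sin(x)) + (0.5 * x))
print(to_str(child))   # ((2.0 * sin(x)) + exp(x))
```

Because the untouched term is the same object in both trees, a subexpression that already fits the data well survives the mutation exactly, which is the property the principle describes.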
Method
FunctionEvolve uses expression trees for search organization, structural summaries for parent selection, local tree edits, and structure-aware coefficient fitting for reliable scoring.
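One way to picture structural summaries for parent selection is the sketch below. It is a hypothetical illustration rather than the paper's procedure: trees are assumed to be nested tuples of the form `(operator, child, ...)`, the summary is assumed to be a "skeleton" with constants masked out, and `skeleton` and `select_diverse_parents` are names invented for this example.

```python
from collections import defaultdict


def skeleton(tree):
    """Structural summary: the same tree with every numeric constant replaced by 'C'."""
    if isinstance(tree, tuple):
        op, *children = tree
        return (op, *(skeleton(c) for c in children))
    if isinstance(tree, (int, float)):
        return "C"
    return tree  # variable name


def select_diverse_parents(population, k):
    """Pick up to k parents, at most one per distinct skeleton, best first.

    `population` is a list of (tree, error) pairs; lower error is better.
    """
    groups = defaultdict(list)
    for tree, error in sorted(population, key=lambda p: p[1]):
        groups[skeleton(tree)].append((tree, error))
    # Keep the best candidate of each structure, then order those by error.
    leaders = sorted((g[0] for g in groups.values()), key=lambda p: p[1])
    return leaders[:k]


population = [
    (("*", 2.0, ("sin", "x")), 0.10),
    (("*", 2.1, ("sin", "x")), 0.12),   # same structure as the first entry
    (("+", ("*", 0.5, "x"), 1.0), 0.30),
    (("exp", "x"), 0.90),
]
parents = select_diverse_parents(population, 2)
for tree, error in parents:
    print(skeleton(tree), error)
```

A plain top-2 selection would return both `C * sin(x)` variants here. Grouping by skeleton returns one `sin` candidate and one linear candidate, so the next round of edits starts from two different structures.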
In practice
- Implement expression trees for symbolic regression search.
- Decompose and constrain coefficients during model fitting.
- Use LLMs for guided refinements in evolutionary search.
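The second item can be sketched as follows. This is an assumption-laden illustration, not the fitting routine from the paper: it uses a simple variable-projection style split for the form y ≈ a * exp(-k * x) + b, where the linear coefficients `a` and `b` are solved in closed form, the nonlinear rate `k` is constrained to a positive grid, and near-integer results are snapped to integers. All function names and the tolerance are hypothetical.

```python
import math


def fit_linear(features, ys):
    """Closed-form least squares for y ≈ a * f + b."""
    n = len(ys)
    mean_f = sum(features) / n
    mean_y = sum(ys) / n
    var_f = sum((f - mean_f) ** 2 for f in features)
    if var_f == 0.0:
        return 0.0, mean_y  # constant feature: only the offset is identifiable
    cov = sum((f - mean_f) * (y - mean_y) for f, y in zip(features, ys))
    a = cov / var_f
    return a, mean_y - a * mean_f


def mse(a, b, features, ys):
    """Mean squared error of a * f + b against ys."""
    return sum((a * f + b - y) ** 2 for f, y in zip(features, ys)) / len(ys)


def fit_exp_decay(xs, ys, k_grid):
    """Fit y ≈ a * exp(-k * x) + b with k restricted to the values in k_grid."""
    best = None
    for k in k_grid:
        feats = [math.exp(-k * x) for x in xs]
        a, b = fit_linear(feats, ys)  # decomposed step: linear part solved exactly
        err = mse(a, b, feats, ys)
        if best is None or err < best[0]:
            best = (err, a, b, k)
    return best


def snap(value, tol=1e-6):
    """Simplify a coefficient by snapping it to a nearby integer, if one is within tol."""
    nearest = round(value)
    return float(nearest) if abs(value - nearest) < tol else value


# Synthetic data generated from y = 3 * exp(-1.5 * x) + 2.
xs = [i * 0.1 for i in range(50)]
ys = [3.0 * math.exp(-1.5 * x) + 2.0 for x in xs]
k_grid = [i * 0.25 for i in range(1, 21)]  # constraint: k in {0.25, ..., 5.0}

err, a, b, k = fit_exp_decay(xs, ys, k_grid)
a, b = snap(a), snap(b)
print(f"a={a}, b={b}, k={k}")
```

Splitting the problem this way means a general nonlinear optimizer never has to search over `a` and `b`, which removes one common source of unstable scores when a structurally correct candidate is paired with poorly fitted constants.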
Topics
- Symbolic Regression
- Large Language Models
- Evolutionary Algorithms
- Expression Trees
- FunctionEvolve
- Coefficient Optimization
Best for: AI Scientist, Machine Learning Engineer, Research Scientist
Editorial summary, takeaway, and curation by AIssential. Original article published by Machine Learning.