FunctionEvolve: Structure-Guided Symbolic Regression with LLMs

Source: Machine Learning · Field: Technology & Digital (Artificial Intelligence & Machine Learning, Data Science & Analytics, Mathematics & Computational Sciences) · Depth: Expert, quick

Summary

FunctionEvolve is an evolutionary framework for symbolic regression, the task of recovering explicit scientific laws from data. It targets two weaknesses of current LLM-driven systems: they lack explicit mechanisms for local mutation, and they rely on brittle coefficient fitting, which leaves their search structure-blind. FunctionEvolve organizes its search around expression trees. Structural summaries drive diverse parent selection, local tree edits preserve useful subexpressions, and structure-aware fitting decomposes, constrains, and simplifies coefficients so that candidates are scored reliably. It uses only elementary function families and no additional domain-specific rules. On the 129-task synthetic subset of LLM-SRBench, FunctionEvolve with Claude Opus 4.6 recovered 107 exact forms, for 82.9% SA@50, 4.5x above same-backbone baselines. Its 55.8% SA@1 is 3.6x above the strongest previously published top-1 result. Ablations confirm that structure-visible search is crucial for reliable recovery.
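
The headline percentages can be checked against the task counts. The short calculation below uses only the figures quoted in the summary; the top-1 task count is not stated in the source and is inferred here from the 55.8% figure, so treat it as an assumption.

```python
# Figures quoted in the summary above.
tasks = 129
exact_at_50 = 107

# SA@50 as a share of the 129 tasks.
sa_at_50 = exact_at_50 / tasks
print(f"SA@50 = {sa_at_50:.1%}")

# Assumption: the top-1 count is not given, so it is inferred from 55.8%.
implied_top1 = round(0.558 * tasks)
print(f"implied top-1 recoveries = {implied_top1} ({implied_top1 / tasks:.1%})")
```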

Key takeaway

Machine learning engineers building symbolic regression models should integrate explicit structural guidance into their search frameworks. FunctionEvolve shows that expression trees combined with structure-aware coefficient fitting markedly improve exact recovery of scientific laws from data. Consider adopting similar structure-visible search mechanisms, alongside LLM-guided refinement, in your own projects.

Key insights

Organizing LLM-driven symbolic regression around expression trees significantly improves exact-form recovery.
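
To make "structure-visible" concrete, here is a minimal sketch of an expression tree and a structural summary. The nested-tuple encoding and the names skeleton and operator_counts are illustrative assumptions, not FunctionEvolve's actual representation. The idea shown is that two candidates differing only in their coefficients share one structure, which a search can use to keep its parent pool diverse.

```python
from collections import Counter

# Hypothetical encoding (an assumption for illustration): a tree is a nested
# tuple (op, child, ...), a variable is a str, and a constant is a float.

def skeleton(node):
    """Return the tree's shape as a string, with every constant replaced by 'C'."""
    if isinstance(node, float):
        return "C"
    if isinstance(node, str):
        return node
    op, *children = node
    return f"{op}({', '.join(skeleton(c) for c in children)})"

def operator_counts(node):
    """Count how often each operator appears in the tree."""
    counts = Counter()
    if isinstance(node, tuple):
        op, *children = node
        counts[op] += 1
        for child in children:
            counts.update(operator_counts(child))
    return counts

a = ("add", ("mul", 2.0, ("sin", "x")), 0.5)   # 2.0*sin(x) + 0.5
b = ("add", ("mul", -3.1, ("sin", "x")), 7.0)  # same shape, other coefficients
c = ("mul", 2.0, ("exp", "x"))                 # a different shape

print(skeleton(a))                 # add(mul(C, sin(x)), C)
print(skeleton(a) == skeleton(b))  # True
print(skeleton(a) == skeleton(c))  # False
print(dict(operator_counts(a)))
```

Grouping candidates by such a summary is one simple way to avoid selecting several parents that are the same formula with different constants.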

Principles

Method

FunctionEvolve combines four components: expression trees to organize the search, structural summaries to select diverse parents, local tree edits to preserve useful subexpressions, and structure-aware coefficient fitting to score candidates reliably.
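
The sketch below illustrates two of these components under stated assumptions: a local tree edit that swaps one subtree while leaving the rest of the parent intact, followed by a coefficient fit that exploits the tree's structure. The summary does not describe the fitting procedure, so ordinary least squares over the two top-level terms stands in for it here. The tuple encoding and the helper names evaluate, replace_at, and fit_linear are hypothetical.

```python
import math

# Hypothetical encoding: nested tuples (op, child, ...), variable "x", float constants.
OPS = {
    "add": lambda a, b: a + b,
    "mul": lambda a, b: a * b,
    "sin": math.sin,
    "exp": math.exp,
}

def evaluate(node, x):
    """Evaluate a tree at the point x."""
    if isinstance(node, float):
        return node
    if node == "x":
        return x
    op, *children = node
    return OPS[op](*(evaluate(c, x) for c in children))

def replace_at(node, path, new_subtree):
    """Local edit: swap the subtree at `path` (child indices), keeping everything else."""
    if not path:
        return new_subtree
    op, *children = node
    i = path[0]
    children[i] = replace_at(children[i], path[1:], new_subtree)
    return (op, *children)

def fit_linear(basis, xs, ys):
    """Least-squares fit of y ~ c0*basis[0](x) + c1*basis[1](x) via 2x2 normal equations."""
    f0 = [evaluate(basis[0], x) for x in xs]
    f1 = [evaluate(basis[1], x) for x in xs]
    s00 = sum(u * u for u in f0)
    s01 = sum(u * v for u, v in zip(f0, f1))
    s11 = sum(v * v for v in f1)
    t0 = sum(u * y for u, y in zip(f0, ys))
    t1 = sum(v * y for v, y in zip(f1, ys))
    det = s00 * s11 - s01 * s01
    return (t0 * s11 - t1 * s01) / det, (s00 * t1 - s01 * t0) / det

# Synthetic data from a made-up target law: y = 2*sin(x) + 0.5*x.
xs = [0.1 * i for i in range(1, 21)]
ys = [2.0 * math.sin(x) + 0.5 * x for x in xs]

# The parent has a wrong subexpression, exp(x); the x term is already useful.
parent = ("add", ("mul", 1.0, ("exp", "x")), ("mul", 1.0, "x"))
edited = replace_at(parent, (0, 1), ("sin", "x"))  # only exp(x) is replaced

# Because the edited tree is a sum of scaled terms, its coefficients are linear
# and can be fitted in closed form, then rounded to a simpler value.
c0, c1 = fit_linear([edited[1][2], edited[2][2]], xs, ys)
child = ("add", ("mul", round(c0, 6), ("sin", "x")), ("mul", round(c1, 6), "x"))
print(child)
```

Fitting the linear coefficients in closed form avoids a general nonlinear optimizer for this case, which is one plausible reading of decomposing coefficients by structure; the rounding step is a crude stand-in for coefficient simplification.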

Best for: AI Scientist, Machine Learning Engineer, Research Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by Machine Learning.