Formalize Once, Edit the Rest: Efficient Lean-Based Answer Selection for Math Reasoning
Summary
BASE is a base-and-edit pipeline that improves both the efficiency and the accuracy of Lean-based answer selection for mathematical reasoning with large language models (LLMs). Current methods autoformalize each of the K sampled candidate answers independently, so autoformalization cost grows linearly with K. BASE instead formalizes a single base candidate per problem and derives the remaining K-1 statements by editing the answer expression in place. A trained rewriter model, LEANSCRIBE, localizes the answer in the base formalization and generates a reusable edit function. The approach achieves a Pareto improvement across 12 dataset/solver configurations and cuts autoformalizer calls by approximately 5x at K=8, with greater reductions expected as K increases.
Key takeaway
For research scientists and ML engineers building LLM-based mathematical reasoning systems, the BASE pipeline reduces the computational overhead of formal verification. If you use Lean for answer selection, a "formalize once, edit the rest" strategy with a rewriter model such as LEANSCRIBE improves selection accuracy while reducing autoformalizer calls by approximately 5x at K=8, with larger savings as K grows.
Key insights
Efficient Lean-based answer selection for LLM math reasoning is achieved by formalizing once and editing the rest.
Principles
- Editing a single base formalization costs less than autoformalizing every candidate from scratch.
- A specialized rewriter model can localize the answer expression in a formalization and generate a reusable edit function.
Method
The BASE pipeline formalizes one candidate, then uses a rewriter model (LEANSCRIBE) to localize the answer in the base formalization and generate an edit function to derive K-1 other formal statements.
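The base-and-edit idea can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `locate_answer`, `make_edit_function`, and `derive_statements` are hypothetical, the Lean statement is a toy example, and a plain string search stands in for the trained LEANSCRIBE rewriter that performs answer localization in the actual pipeline.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class AnswerSpan:
    """Character range of the answer expression inside the base statement."""
    start: int
    end: int


def locate_answer(base_statement: str, base_answer: str) -> AnswerSpan:
    # Assumption: a literal search for the last occurrence of the answer.
    # In the described pipeline a trained rewriter model does this step.
    start = base_statement.rfind(base_answer)
    if start < 0:
        raise ValueError("answer expression not found in base statement")
    return AnswerSpan(start, start + len(base_answer))


def make_edit_function(base_statement: str, span: AnswerSpan) -> Callable[[str], str]:
    # Split the base statement once; the returned closure is reusable
    # for every remaining candidate answer.
    prefix = base_statement[:span.start]
    suffix = base_statement[span.end:]

    def edit(new_answer: str) -> str:
        return f"{prefix}{new_answer}{suffix}"

    return edit


def derive_statements(base_statement: str, base_answer: str,
                      candidates: List[str]) -> List[str]:
    # One formalization (the base) yields one statement per candidate.
    edit = make_edit_function(base_statement, locate_answer(base_statement, base_answer))
    return [edit(candidate) for candidate in candidates]


# Toy base formalization (hypothetical); only the answer "5" is swapped out.
base = "theorem candidate : (2 : Nat) + 3 = 5 := by sorry"
statements = derive_statements(base, "5", ["5", "6", "7"])
for statement in statements:
    print(statement)
```

Each derived statement would then go to the Lean-based verification step. The point of the sketch is that the autoformalizer is called only to produce `base`, while the other candidates reuse the same edit function.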
In practice
- Verify LLM math reasoning outputs with Lean.
- Reduce autoformalization costs in answer selection.
Topics
- Large Language Models
- Mathematical Reasoning
- Formal Proof Assistants
- Lean
- Answer Selection
- Autoformalization
- LEANSCRIBE
Best for: AI Scientist, Machine Learning Engineer, Research Scientist
Editorial summary, takeaway, and curation by AIssential. Original article published by Artificial Intelligence.