Awakening the Sleeping Agent: Lean-Specific Agentic Data Reactivates General Tool Use in Goedel Prover
Summary
Goedel-Prover-V2, an open-source model extensively trained on 1.8 million formal-math examples, exhibits a significant loss of general tool-use capabilities after domain specialization, with function-calling accuracy plummeting from 89.4% to nearly 0%. Researchers investigated whether this "agentic collapse" is reversible. They found that fine-tuning the specialized model with as few as 100 Lean-specific tool-use traces was sufficient to restore robust tool-calling behavior. This recovery was not domain-specific; the regained capability transferred effectively, improving performance on the Berkeley Function Calling Leaderboard from near zero to 83.8%, close to the base model's 89.4%. Additionally, on ProofNet, pass@32 improved from 21.51% to 25.81%, demonstrating practical utility within the domain.
Key takeaway
For AI Scientists and Machine Learning Engineers specializing models, be aware that heavy supervised fine-tuning can suppress general capabilities like tool use. If your specialized model shows reduced function-calling accuracy, consider fine-tuning with a small, domain-specific agentic dataset. This approach can reactivate dormant general abilities, potentially improving performance across diverse tasks without extensive retraining.
Key insights
Domain specialization can suppress general tool-use in models, but small amounts of agentic data can reactivate it.
Principles
- Domain specialization can suppress general capabilities.
- Tool-use capabilities are not permanently erased by fine-tuning.
Method
Fine-tuning a specialized model with a small dataset of domain-specific agentic traces (e.g., 100 Lean-specific traces) can restore general tool-use abilities.
In practice
- Use small agentic datasets for capability restoration.
- Apply Lean-specific traces to reactivate tool use.
Topics
- Goedel-Prover-V2
- Tool Use
- Supervised Fine-tuning
- Agentic Collapse
- Lean Language
Best for: AI Engineer, AI Scientist, Machine Learning Engineer, Research Scientist
Related on AIssential
Editorial summary, takeaway, and curation by AIssential. Original article published by Artificial Intelligence.