Prospective multi-pathogen disease forecasting using autonomous LLM-guided tree search
Summary
A new autonomous system uses Large Language Model (LLM)-guided tree search to generate, evaluate, and optimize executable software for multi-pathogen disease forecasting. The system was evaluated prospectively during the 2025-2026 US respiratory season, where it autonomously discovered diverse models for influenza, COVID-19, and respiratory syncytial virus (RSV). Out-of-sample, the aggregated ensemble of these machine-generated models consistently matched or outperformed the human-curated Centers for Disease Control and Prevention (CDC) hub ensembles. The system also succeeded in data-scarce "cold start" scenarios for RSV. Retrospective ablations showed that optimizing log-scale distance metrics prevents reward hacking and that an automated judge-in-the-loop ensures structural fidelity to scientific theories. Taken together, the results position the approach as a way to ease the labor bottleneck in epidemiological modeling.
Key takeaway
For public health agencies and epidemiological modeling teams facing resource constraints, this autonomous LLM-guided forecasting system offers a path to deploying expert-level disease predictions at scale. Consider integrating automated model generation frameworks of this kind to overcome manual curation bottlenecks, especially for emerging pathogens or granular geographic resolutions, where they could support more timely public health responses.
Key insights
LLM-guided tree search autonomously generates and optimizes disease forecasting models whose ensemble matches or outperforms the human-curated CDC hub ensembles.
Principles
- Automated model generation reduces labor bottlenecks.
- Log-scale distance metrics prevent reward hacking.
- Judge-in-the-loop ensures theoretical fidelity.
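To make the log-scale principle concrete, the sketch below compares a raw mean absolute error with one computed on log-transformed counts. This summary does not specify the exact metric the system optimizes, so `log_scale_mae`, the `log1p` transform, and the admission counts are illustrative assumptions. The point shown is only that a log-scale distance penalizes a twofold miss about equally in a small and a large location, whereas a raw-scale distance is dominated by the large one.

```python
import math


def raw_mae(predicted, observed):
    """Mean absolute error on the original count scale."""
    diffs = [abs(p - o) for p, o in zip(predicted, observed)]
    return sum(diffs) / len(diffs)


def log_scale_mae(predicted, observed):
    """Mean absolute error between log1p-transformed counts.

    Assumes non-negative counts; log1p keeps zero counts finite.
    This is an illustrative metric, not the one used by the system.
    """
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    diffs = [abs(math.log1p(p) - math.log1p(o)) for p, o in zip(predicted, observed)]
    return sum(diffs) / len(diffs)


# Hypothetical weekly admissions for a small and a large location.
observed = [10, 1000]
miss_small = [20, 1000]  # doubles the small location only
miss_large = [10, 2000]  # doubles the large location only

print(raw_mae(miss_small, observed), raw_mae(miss_large, observed))
print(log_scale_mae(miss_small, observed), log_scale_mae(miss_large, observed))
```

On the raw scale the two misses differ by a factor of 100; on the log scale they are nearly equal, so an optimizer gains little by fitting only the largest locations.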
Method
The system uses LLM-guided tree search to iteratively generate, evaluate, and optimize executable forecasting software, aggregating diverse models into an ensemble for improved accuracy.
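The generate, evaluate, optimize, and aggregate loop described above can be sketched in a few lines. This is a toy under stated assumptions, not the system's implementation: `propose` is a hypothetical stand-in for the LLM that merely perturbs one parameter, each "program" is an exponential-smoothing model, the selection rule (expand the best node most of the time) is an arbitrary choice, and the series is made-up data.

```python
import math
import random
import statistics
from dataclasses import dataclass, field


@dataclass
class Node:
    params: dict   # stands in for a generated forecasting program
    score: float   # lower is better
    children: list = field(default_factory=list)


def propose(parent_params, rng):
    """Hypothetical stand-in for the LLM: perturb the parent's parameter."""
    alpha = parent_params["alpha"] + rng.uniform(-0.2, 0.2)
    return {"alpha": min(1.0, max(0.0, alpha))}


def forecast(params, history):
    """Toy candidate model: one-step-ahead exponential smoothing."""
    level = history[0]
    for y in history[1:]:
        level = params["alpha"] * y + (1 - params["alpha"]) * level
    return level


def evaluate(params, series):
    """Mean one-step-ahead error, measured on the log1p scale."""
    errors = [
        abs(math.log1p(forecast(params, series[:t])) - math.log1p(series[t]))
        for t in range(1, len(series))
    ]
    return sum(errors) / len(errors)


def tree_search(series, iterations=30, seed=0):
    rng = random.Random(seed)
    root_params = {"alpha": 0.5}
    nodes = [Node(root_params, evaluate(root_params, series))]
    for _ in range(iterations):
        # Mostly expand the best node so far; sometimes explore a random one.
        if rng.random() < 0.8:
            parent = min(nodes, key=lambda n: n.score)
        else:
            parent = rng.choice(nodes)
        child_params = propose(parent.params, rng)
        child = Node(child_params, evaluate(child_params, series))
        parent.children.append(child)
        nodes.append(child)
    return nodes


def ensemble_forecast(nodes, history, top_k=5):
    """Median forecast of the top_k best-scoring candidates."""
    best = sorted(nodes, key=lambda n: n.score)[:top_k]
    return statistics.median([forecast(n.params, history) for n in best])


# Hypothetical weekly counts.
series = [12, 15, 21, 30, 41, 39, 33, 25, 18, 14]
nodes = tree_search(series)
print(len(nodes), min(n.score for n in nodes), ensemble_forecast(nodes, series))
```

Keeping every evaluated node, rather than only the current best, is what allows an ensemble to be built from several distinct candidates at the end of the search.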
In practice
- Apply LLM-guided search to code generation.
- Use log-scale metrics for optimization.
- Implement automated judges for structural validation.
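As one illustration of the last item, the sketch below gates candidate programs behind a structural check before they are scored. The summary does not describe how the system's judge works, so everything here is an assumption: the `judge` function is a simple static check built on Python's standard `ast` module, and `SIR_SPEC` is a hypothetical list of identifiers that a model claiming to follow a susceptible-infected-recovered structure would be expected to use.

```python
import ast


def identifiers_in(source):
    """Collect variable, argument, and function names used in the source."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Name):
            names.add(node.id)
        elif isinstance(node, ast.arg):
            names.add(node.arg)
        elif isinstance(node, ast.FunctionDef):
            names.add(node.name)
    return names


def judge(source, required):
    """Return (passed, missing): does the code mention every required name?

    A rule-based stand-in for an automated judge; code that does not
    parse fails outright.
    """
    try:
        found = identifiers_in(source)
    except SyntaxError:
        return False, sorted(required)
    missing = sorted(set(required) - found)
    return not missing, missing


# Hypothetical structural spec for a compartmental model.
SIR_SPEC = ["beta", "gamma", "susceptible", "infected"]

faithful = '''
def step(susceptible, infected, beta, gamma):
    new_cases = beta * susceptible * infected
    return susceptible - new_cases, infected + new_cases - gamma * infected
'''

shortcut = '''
def step(last_value):
    return last_value
'''

print(judge(faithful, SIR_SPEC))
print(judge(shortcut, SIR_SPEC))
```

In a search loop, a candidate that fails the judge would be rejected regardless of its forecast score, so a persistence model cannot pass as a mechanistic one simply because it scores well.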
Topics
- LLM-guided Tree Search
- Multi-pathogen Forecasting
- Autonomous System
- Public Health Surveillance
- Epidemiological Modeling
Best for: AI Scientist, Machine Learning Engineer, Research Scientist
Editorial summary, takeaway, and curation by AIssential. Original article published by Artificial Intelligence.