EpiEvolve: Self-Evolving Agents for Streaming Pandemic Forecasting under Regime Shifts
Summary
EpiEvolve is a self-evolving agent designed for streaming pandemic forecasting, specifically addressing the challenges of real-world operational environments with delayed labels and disease regime shifts. It wraps a pre-trained LLM forecaster, such as Qwen3-14B-Base, keeping its weights fixed while adapting through a hierarchical episodic memory. This memory stores past forecast outcomes, reflects on delayed ground truth, retrieves cases relevant to the current epidemiological regime, and distills recurring errors into strategic rules. Evaluated on weekly COVID-19 hospitalization trend forecasting across five variant regimes, EpiEvolve achieved 0.629 average accuracy, significantly surpassing its static backbone (0.561) and the external CDC ensemble (0.325). Crucially, it reduced recovery lag after regime shifts from 5 weeks to 2 weeks, demonstrating robust adaptation without costly gradient updates.
Key takeaway
Machine learning engineers deploying LLM-based forecasters in dynamic environments should consider memory-based adaptation before resorting to continuous parameter fine-tuning. EpiEvolve shows that a frozen LLM backbone, augmented with hierarchical episodic memory and strategic rule distillation, improves average accuracy (0.629 versus 0.561 for the static backbone) and shortens recovery lag after regime shifts (from 5 weeks to 2). Because no gradient updates are involved, this is a lower-cost alternative to retraining for keeping models performant as data distributions evolve.
Key insights
LLM forecasters can adapt to streaming data and regime shifts via memory-based self-evolution, not parameter updates.
Principles
- Memory-based adaptation outperforms static models in streaming contexts.
- Hierarchical episodic memory enhances regime-aware retrieval.
- Distilling errors into rules improves post-shift recovery.
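
To make the second and third principles concrete, here is a minimal Python sketch of a two-level memory that groups episodes by regime, retrieves same-regime cases first, and turns recurring errors into short rules. It is an illustration under stated assumptions, not the paper's implementation: the `Episode` fields, the `HierarchicalMemory` class, the recency-based ranking, and the `min_count` threshold are all hypothetical.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass


@dataclass
class Episode:
    week: int
    regime: str       # e.g. a variant label (hypothetical field)
    predicted: str    # forecast trend, e.g. "up", "down", "stable"
    actual: str       # ground truth, known only after a delay
    reflection: str   # short note written once the truth arrived


class HierarchicalMemory:
    """Two-level store: regime -> episodes, plus rules distilled per regime."""

    def __init__(self) -> None:
        self.episodes: dict[str, list[Episode]] = defaultdict(list)
        self.rules: dict[str, list[str]] = defaultdict(list)

    def add(self, ep: Episode) -> None:
        self.episodes[ep.regime].append(ep)

    def retrieve(self, regime: str, k: int = 3) -> list[Episode]:
        """Return up to k episodes, most recent first, preferring the current regime."""
        same = sorted(self.episodes.get(regime, []), key=lambda e: e.week, reverse=True)
        if len(same) >= k:
            return same[:k]
        # Not enough same-regime cases yet: fall back to other regimes.
        others = [e for r, eps in self.episodes.items() if r != regime for e in eps]
        others.sort(key=lambda e: e.week, reverse=True)
        return same + others[: k - len(same)]

    def distill(self, regime: str, min_count: int = 2) -> list[str]:
        """Turn error patterns seen at least min_count times into rules."""
        errors = Counter(
            (e.predicted, e.actual)
            for e in self.episodes.get(regime, [])
            if e.predicted != e.actual
        )
        self.rules[regime] = [
            f"In regime {regime}, forecasts of '{p}' were often actually '{a}'."
            for (p, a), n in errors.items()
            if n >= min_count
        ]
        return self.rules[regime]
```

The fallback to other regimes is a design choice for the first weeks after a shift, when the new regime has few or no stored episodes. A real system would likely rank by similarity rather than recency alone.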
Method
EpiEvolve keeps the LLM backbone frozen. As delayed labels arrive, it writes reflections on past forecasts into a hierarchical episodic memory, retrieves cases relevant to the current regime, and distills recurring errors into strategic rules that are supplied to the forecaster as context.
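
The control flow of such a loop can be sketched in Python as follows. This is a simplified illustration, assuming a flat list of reflection notes in place of the hierarchical memory. The `Forecaster` signature, the `run_stream` function, the dictionary keys, and the two-week `label_delay` are hypothetical choices, not details taken from the paper.

```python
from collections import deque
from typing import Callable

# Hypothetical signature: (features, context notes) -> trend label.
# In practice this would wrap a frozen LLM call; no weights are updated.
Forecaster = Callable[[dict, list[str]], str]


def run_stream(weeks: list[dict], forecaster: Forecaster, label_delay: int = 2) -> list[str]:
    """Forecast each week, reflecting on past forecasts once their labels arrive.

    Each item in `weeks` is a dict with assumed keys
    'regime', 'features', and 'truth' (the trend label for that week).
    """
    memory: list[str] = []                      # reflection notes used as context
    pending: deque[tuple[int, str]] = deque()   # (week index, prediction) awaiting labels
    predictions: list[str] = []

    for t, week in enumerate(weeks):
        # 1. Reflect on forecasts whose ground truth has now arrived.
        while pending and pending[0][0] + label_delay <= t:
            past_t, past_pred = pending.popleft()
            truth = weeks[past_t]["truth"]
            outcome = "correct" if past_pred == truth else "wrong"
            memory.append(
                f"[{weeks[past_t]['regime']}] week {past_t}: predicted {past_pred}, "
                f"observed {truth} ({outcome})"
            )

        # 2. Retrieve notes for the current regime (a simple tag match here).
        tag = f"[{week['regime']}]"
        context = [m for m in memory if m.startswith(tag)][-5:]

        # 3. Forecast with the frozen model; adaptation happens only via context.
        pred = forecaster(week["features"], context)
        predictions.append(pred)
        pending.append((t, pred))

    return predictions
```

The key property is that a forecast for week t never sees the truth for week t or for any week whose label has not yet arrived, which mirrors the delayed-label setting the summary describes.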
In practice
- Implement hierarchical memory for LLM agents in streaming tasks.
- Use reflection and rule distillation to adapt to concept drift.
- Prioritize memory updates over parameter fine-tuning for deployment.
Topics
- Pandemic Forecasting
- LLM Agents
- Concept Drift Adaptation
- Episodic Memory
- Streaming Machine Learning
- COVID-19
Best for: AI Scientist, Machine Learning Engineer, Research Scientist
Editorial summary, takeaway, and curation by AIssential. Original article published by cs.AI updates on arXiv.org.