Understanding Key Features of Time Series Foundation Models from Epidemic Forecasting

2026-06-17 · Source: Machine Learning · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Data Science & Analytics, Epidemic Forecasting · Depth: Advanced, quick

Summary

A systematic evaluation of regional influenza forecasting characterizes the comparative behavior of modern forecasting architectures using influenza-like illness surveillance and influenza-associated hospitalization time series. The study compared classical neural networks, numerical transformer-based models, pretrained time series foundation models, and LLM-based forecasting approaches for 1-4-week-ahead prediction. Findings indicate that a mixture-of-experts model, fusing multiple pretrained forecasters, achieves the strongest overall performance due to complementary predictive information from heterogeneous representations. Numerical transformer-based models produce reliable forecasts, with pretraining offering the largest gains at longer horizons, especially when the pretraining domain aligns mechanistically with influenza dynamics. LLM-based methods underperform numerical forecasters in this context. Hospitalization signals provide complementary improvements in selected settings, clarifying when additional surveillance streams enhance multi-horizon forecasting robustness.

Key takeaway

For public health analysts developing epidemic forecasting systems, prioritize a mixture-of-experts approach that combines diverse pretrained models. Favor numerical transformer-based architectures, since LLM-based methods underperform in this context. Leverage pretraining, particularly from mechanistically aligned domains, to improve accuracy at the longer end of the 1-4-week-ahead range, and consider integrating hospitalization data as an auxiliary signal, which helped in selected settings.

Key insights

Heterogeneous pretrained representations improve epidemic forecasting, especially with aligned pretraining domains.

Principles

Mixture-of-experts enhances forecast accuracy.
Pretraining benefits longer forecasting horizons.
Domain alignment is crucial for pretraining gains.
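
A minimal sketch of the fusion idea behind the first principle, assuming each expert has already produced 1-4-week-ahead forecasts. The summary does not describe how the study's mixture-of-experts model gates its experts, so this uses a simple stand-in: fixed softmax weights derived from validation error rather than a learned gating network. The function names, forecast values, validation errors, and temperature are all hypothetical.

```python
import math

def softmax_weights(errors, temperature=1.0):
    """Turn per-expert validation errors into weights summing to 1 (lower error, higher weight)."""
    scores = [-e / temperature for e in errors]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [x / total for x in exps]

def fuse_forecasts(expert_forecasts, weights):
    """Weighted average of the expert forecasts, computed horizon by horizon."""
    n_horizons = len(expert_forecasts[0])
    return [
        sum(w * f[h] for w, f in zip(weights, expert_forecasts))
        for h in range(n_horizons)
    ]

# Hypothetical 1-4-week-ahead forecasts from three pretrained experts (illustrative values only).
expert_forecasts = [
    [2.0, 2.2, 2.4, 2.6],
    [1.8, 2.0, 2.1, 2.3],
    [2.4, 2.8, 3.0, 3.3],
]
validation_mae = [0.30, 0.25, 0.60]  # assumed held-out error per expert

weights = softmax_weights(validation_mae, temperature=0.2)
fused = fuse_forecasts(expert_forecasts, weights)
print(weights, fused)
```

A lower temperature concentrates weight on the best-scoring expert, while a higher one approaches a plain average. A learned gate could additionally vary the weights by horizon or region.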

Method

Systematic evaluation of diverse time series models (classical neural networks, numerical transformers, pretrained time series foundation models, and LLM-based forecasters) on influenza data, under temporal and spatial generalization, for 1-4-week-ahead prediction.
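
The temporal side of such an evaluation can be sketched as a rolling-origin loop that scores each horizon separately. This is an illustration under stated assumptions, not the study's protocol: the error metric (mean absolute error), the minimum training length, the persistence baseline, and the toy series are all hypothetical, and spatial generalization (holding out regions) is not covered here.

```python
def persistence_forecast(history, horizons):
    """Naive baseline: repeat the last observed value for every horizon."""
    return [history[-1]] * horizons

def rolling_origin_mae(series, forecaster, horizons=4, min_train=8):
    """Mean absolute error per horizon, forecasting from each origin with past data only."""
    abs_errors = [[] for _ in range(horizons)]
    for origin in range(min_train, len(series) - horizons + 1):
        history = series[:origin]  # nothing at or after the origin is visible
        preds = forecaster(history, horizons)
        for h in range(horizons):
            abs_errors[h].append(abs(preds[h] - series[origin + h]))
    return [sum(errs) / len(errs) for errs in abs_errors]

# Toy weekly ILI-like series (illustrative values only).
series = [1.0, 1.2, 1.5, 1.9, 2.4, 3.0, 3.7, 4.1,
          4.0, 3.6, 3.0, 2.5, 2.1, 1.8, 1.5, 1.3]
mae_by_horizon = rolling_origin_mae(series, persistence_forecast)
print(mae_by_horizon)  # one MAE value for each of the 1-4-week-ahead horizons
```

Any forecaster with the same `(history, horizons)` signature can be dropped in, which makes it straightforward to compare models horizon by horizon.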

In practice

Fuse multiple pretrained forecasters for robustness.
Prioritize numerical transformers over LLMs for epidemic data.
Utilize hospitalization data as an auxiliary covariate.
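
One way to make hospitalization data available as an auxiliary covariate is to pair its recent lags with the lags of the target series when building training rows. The sketch below assumes two weekly series aligned by week; the function name, lag count, and values are hypothetical, and the summary does not say how the evaluated models actually consume the covariate.

```python
def build_lagged_rows(ili, hosp, n_lags=3, horizon=1):
    """Pair lagged ILI and hospitalization values with the ILI value `horizon` weeks ahead."""
    if len(ili) != len(hosp):
        raise ValueError("series must be aligned week by week")
    rows = []
    for t in range(n_lags - 1, len(ili) - horizon):
        window = slice(t - n_lags + 1, t + 1)  # the n_lags most recent weeks up to t
        features = ili[window] + hosp[window]  # ILI lags first, then hospitalization lags
        rows.append((features, ili[t + horizon]))
    return rows

# Illustrative values only.
ili = [1.0, 2.0, 3.0, 4.0, 5.0]
hosp = [10.0, 20.0, 30.0, 40.0, 50.0]
rows = build_lagged_rows(ili, hosp, n_lags=3, horizon=1)
print(rows)
```

Because the summary reports gains from hospitalization signals only in selected settings, it is worth comparing forecasts with and without the extra columns before adopting them.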

Topics

Time Series Forecasting
Epidemic Forecasting
Influenza Surveillance
Foundation Models
Transformer Models
Mixture-of-Experts

Best for: AI Scientist, Research Scientist, Machine Learning Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Machine Learning.