How Good Can Linear Models Be for Time-Series Forecasting?
Summary
Lang Huang, Jinglue Xu, and Luke Darlow demonstrate that significant accuracy gains in time-series forecasting can be achieved by optimizing preprocessing rather than solely scaling model architectures. Using Ridge regression as a testbed, they explored context length, local normalization, regularization, and augmentation across eight standard benchmarks. Their findings reveal that optimal lookback is strongly series-specific, with fitted power-law exponents ranging from +0.46 on ETTm2 to -0.19 on Exchange and Traffic, challenging the notion that longer horizons always require more history. They also found that normalizing over a learned trailing fraction of the context is almost universally preferred, and the optimal degree of cross-series hyperparameter sharing varies. These optimized linear models outperform prior linear forecasters on most dataset-horizon entries and surpass Transformer, MLP, and CNN baselines on six of eight benchmarks.
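The fitted power-law exponents can be illustrated with a small log-log regression. This is a sketch only: it assumes the exponents describe how the best lookback L scales with the forecast horizon H, roughly L ≈ c · H^α, which the summary implies but does not state. The function name `fit_power_law` and the numbers are hypothetical and synthetic, not results from the paper.

```python
import math

def fit_power_law(horizons, best_lookbacks):
    """Least-squares fit of log(L) = log(c) + alpha * log(H); returns (c, alpha)."""
    xs = [math.log(h) for h in horizons]
    ys = [math.log(l) for l in best_lookbacks]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    alpha = sxy / sxx                      # slope in log-log space
    c = math.exp(y_mean - alpha * x_mean)  # intercept mapped back to linear scale
    return c, alpha

# Synthetic, illustrative values only (NOT taken from the paper).
horizons = [96, 192, 336, 720]
growing = [10.0 * h ** 0.5 for h in horizons]       # lookback grows with horizon
shrinking = [2000.0 * h ** -0.2 for h in horizons]  # lookback shrinks with horizon

c_up, alpha_up = fit_power_law(horizons, growing)
c_down, alpha_down = fit_power_law(horizons, shrinking)
print(f"growing:   alpha = {alpha_up:+.2f}")
print(f"shrinking: alpha = {alpha_down:+.2f}")
```

Under this reading, a positive exponent means longer horizons favor more history, while a negative exponent means the best lookback shrinks as the horizon grows.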
Key takeaway
Machine learning engineers optimizing time-series forecasting models should prioritize rigorous preprocessing tuning before scaling to larger, more complex architectures. Focus on experimenting with series-specific lookback periods and on normalizing over a learned trailing fraction of the context. This approach can yield superior performance, often exceeding Transformer or MLP baselines, while offering greater interpretability and lower computational cost.
Key insights
Optimizing preprocessing for linear models can achieve competitive time-series forecasting accuracy, often surpassing complex architectures.
Principles
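- Optimal lookback is series-specific: depending on the dataset, it can grow or shrink as the horizon lengthens.
- Normalizing over a learned trailing fraction of the context is almost universally preferred.
- The optimal degree of cross-series hyperparameter sharing varies.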
Method
The study used Ridge regression to search over context length, local normalization, regularization, and augmentation on eight benchmarks, relying on Ridge's closed-form solution to identify the best hyperparameters.
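A search of this kind could be sketched as follows, assuming NumPy is available. The helpers `make_windows` and `fit_ridge`, the toy sine series, and the candidate grids are all hypothetical choices for illustration; the authors' actual pipeline (including local normalization and augmentation) is not reproduced here.

```python
import numpy as np

def make_windows(series, lookback, horizon):
    """Slice a 1-D series into (context, target) pairs."""
    n = len(series) - lookback - horizon + 1
    X = np.stack([series[i:i + lookback] for i in range(n)])
    Y = np.stack([series[i + lookback:i + lookback + horizon] for i in range(n)])
    return X, Y

def fit_ridge(X, Y, lam):
    """Closed-form ridge without intercept: W = (X^T X + lam * I)^{-1} X^T Y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ Y)

# Toy data (assumption): a noiseless sine wave with period 24.
t = np.arange(400, dtype=float)
series = np.sin(2 * np.pi * t / 24)
train, val = series[:300], series[300:]

horizon = 12
results = {}
for lookback in (12, 24, 48):          # candidate context lengths
    X_tr, Y_tr = make_windows(train, lookback, horizon)
    X_va, Y_va = make_windows(val, lookback, horizon)
    for lam in (1e-3, 1e-1, 10.0):     # candidate regularization strengths
        W = fit_ridge(X_tr, Y_tr, lam)
        results[(lookback, lam)] = float(np.mean((X_va @ W - Y_va) ** 2))

best = min(results, key=results.get)
print("best (lookback, lambda):", best, "validation MSE:", results[best])
```

Because each fit is a single linear solve, every grid point is evaluated exactly rather than by iterative training, so differences in validation error come from the hyperparameters and not from optimization noise.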
In practice
- Experiment with learned trailing fraction normalization.
- Tailor lookback periods per series, not just horizon.
- Evaluate per-series vs. shared hyperparameters.
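The first item above, trailing-fraction normalization, might look like the sketch below. It assumes the statistics are a mean and standard deviation taken over the last `fraction` of the context window, and that `fraction` is chosen per series by validation; both are assumptions, and the function names are hypothetical. The paper's exact formulation may differ.

```python
import math
import statistics

def trailing_stats(context, fraction, eps=1e-8):
    """Mean and std over the last `fraction` of the context window."""
    k = max(1, math.ceil(fraction * len(context)))
    tail = context[-k:]
    mu = statistics.fmean(tail)
    sigma = statistics.pstdev(tail) + eps  # eps guards against a constant tail
    return mu, sigma

def normalize(values, mu, sigma):
    return [(v - mu) / sigma for v in values]

def denormalize(values, mu, sigma):
    """Invert normalize(); apply to model outputs to return to the original scale."""
    return [v * sigma + mu for v in values]

# Toy context with a level shift halfway through (illustrative values only).
context = [10.0, 10.0, 10.0, 10.0, 20.0, 22.0, 21.0, 23.0]

mu_full, sd_full = trailing_stats(context, 1.0)  # statistics over the whole window
mu_tail, sd_tail = trailing_stats(context, 0.5)  # statistics over the last half

print("full-window mean:", mu_full)
print("trailing-half mean:", mu_tail)

normed = normalize(context, mu_tail, sd_tail)
restored = denormalize(normed, mu_tail, sd_tail)
```

In this toy case the trailing-half mean tracks the recent level, while the full-window mean is pulled toward the older regime, which is one plausible reason a shorter normalization span can help.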
Topics
- Time-Series Forecasting
- Linear Models
- Ridge Regression
- Hyperparameter Tuning
- Data Preprocessing
- Model Performance Benchmarks
Best for: AI Engineer, Research Scientist, AI Scientist, Machine Learning Engineer, Data Scientist
Editorial summary, takeaway, and curation by AIssential. Original article published by Takara TLDR - Daily AI Papers.