Forecast collapse of transformer-based models under squared loss in financial time series
Summary
A study by Pierre Andreoletti analyzes the forecast collapse of Transformer-based models applied to financial time series under squared loss. In regimes where the conditional expectation of future trajectories is effectively degenerate, as in standard financial settings, the Bayes-optimal predictor is trivial: flat for prices, zero for returns. There, increasing model expressivity, as with Transformers, does not improve predictive accuracy. Instead, noise reuse introduces spurious fluctuations around the optimal predictor, which raises prediction variance without reducing bias. Numerical experiments on high-frequency EUR/USD exchange rate data support the theory: Transformer models produce larger forecasting errors than a simple linear benchmark across most forecasting windows, consistent with a variance-driven degradation mechanism.
Key takeaway
AI engineers developing financial forecasting models should reconsider Transformer-based architectures for time series with weak conditional structure, especially under squared loss. Such models may show higher prediction variance and larger errors than simpler linear benchmarks, because high expressivity can amplify noise rather than capture signal. Favor model parsimony and robust error analysis to avoid performance degradation.
Key insights
Highly expressive models like Transformers degrade on financial time series due to increased variance from noise reuse.
Principles
- Increased model expressivity does not improve accuracy in degenerate conditional expectation regimes.
- Noise reuse introduces spurious fluctuations and increased prediction variance.
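The two principles above follow from the standard squared-loss risk decomposition. The block below is a textbook sketch, not the paper's own notation: the symbols $Y$ (future return), $X$ (past information), $m$ (conditional expectation), $\hat f$ (fitted predictor), and $D$ (training sample) are assumptions introduced here, and the identity assumes $\hat f$ is fitted on a sample independent of the test pair $(X, Y)$.

```latex
% Bayes-optimal predictor under squared loss: the conditional expectation
m(X) = \mathbb{E}[\,Y \mid X\,]

% Risk of any fitted predictor \hat f = irreducible noise + excess risk
\mathbb{E}\big[(Y - \hat f(X))^2\big]
  = \mathbb{E}\big[(Y - m(X))^2\big]
  + \mathbb{E}\big[(\hat f(X) - m(X))^2\big]

% Degenerate regime for returns: m(X) = 0, so the excess risk is
% the second moment of the prediction itself
\mathbb{E}\big[(Y - \hat f(X))^2\big]
  = \mathbb{E}[Y^2] + \mathbb{E}\big[\hat f(X)^2\big],
\qquad
\mathbb{E}\big[\hat f(X)^2\big]
  = \underbrace{\mathbb{E}\big[(\mathbb{E}_D \hat f(X))^2\big]}_{\text{squared bias}}
  + \underbrace{\mathbb{E}\big[\operatorname{Var}_D(\hat f(X))\big]}_{\text{variance}}
```

When $m(X) = 0$, the zero predictor already has no bias, so any fluctuation a more expressive model produces around zero can only add to the error.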
Method
The study combines classical characterization of squared-loss risk minimization with numerical experiments on high-frequency EUR/USD exchange rate data to analyze trajectory-level forecasting errors.
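The mechanism can be reproduced on synthetic data. The following is a minimal sketch under strong assumptions, not the paper's experiment: returns are i.i.d. Gaussian noise rather than EUR/USD data, and a least-squares fit on random tanh features stands in for an expressive model (it is not a Transformer). All function names and parameters are hypothetical.

```python
import numpy as np


def make_lagged(returns, n_lags):
    """Build rows of n_lags consecutive past returns and the next return as target."""
    n_rows = len(returns) - n_lags
    X = np.column_stack([returns[i:i + n_rows] for i in range(n_lags)])
    y = returns[n_lags:]
    return X, y


def fit_least_squares(features, targets):
    """Ordinary least squares without intercept (minimum-norm solution)."""
    coef, *_ = np.linalg.lstsq(features, targets, rcond=None)
    return coef


def run_experiment(seed=0, n_train=400, n_test=5000, n_lags=10, n_hidden=100):
    rng = np.random.default_rng(seed)

    # Assumption: returns are pure noise, so E[return | past] = 0.
    returns = rng.standard_normal(n_train + n_test + n_lags)
    X, y = make_lagged(returns, n_lags)
    X_tr, y_tr = X[:n_train], y[:n_train]
    X_te, y_te = X[n_train:], y[n_train:]

    # Bayes-optimal predictor in this setting: always predict zero.
    pred_zero = np.zeros_like(y_te)

    # Simple linear benchmark: regress on the most recent return only.
    coef_lin = fit_least_squares(X_tr[:, -1:], y_tr)
    pred_lin = X_te[:, -1:] @ coef_lin

    # Expressive stand-in: least squares on many random tanh features.
    W = rng.standard_normal((n_lags, n_hidden)) / np.sqrt(n_lags)
    b = rng.standard_normal(n_hidden)
    coef_exp = fit_least_squares(np.tanh(X_tr @ W + b), y_tr)
    pred_exp = np.tanh(X_te @ W + b) @ coef_exp

    def mse(pred):
        return float(np.mean((y_te - pred) ** 2))

    return {
        "mse_zero": mse(pred_zero),
        "mse_linear": mse(pred_lin),
        "mse_expressive": mse(pred_exp),
        "var_linear": float(np.var(pred_lin)),
        "var_expressive": float(np.var(pred_exp)),
    }


if __name__ == "__main__":
    for name, value in run_experiment().items():
        print(f"{name}: {value:.4f}")
```

Because the target has no conditional structure, the extra capacity of the tanh model can only fit training noise, so its out-of-sample predictions vary more and its error exceeds that of both the zero predictor and the one-lag linear fit.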
In practice
- Avoid complex models for financial time series with weak conditional structure.
- Prioritize simple linear models for financial forecasting under squared loss.
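One way to act on the points above is to score any candidate model against the flat price forecast at every horizon before trusting it. The helper below is an illustrative sketch; `horizon_skill` and its argument layout are hypothetical and not taken from the study. It assumes the flat baseline has non-zero error at every horizon.

```python
import numpy as np


def horizon_skill(prices_future, forecasts, last_price):
    """Per-horizon skill versus a flat forecast: 1 - MSE_model / MSE_flat.

    prices_future, forecasts: arrays of shape (n_windows, horizon)
    last_price: array of shape (n_windows,), last observed price per window
    Positive values mean the model beats the flat forecast at that horizon.
    """
    prices_future = np.asarray(prices_future, dtype=float)
    forecasts = np.asarray(forecasts, dtype=float)
    last_price = np.asarray(last_price, dtype=float)

    # Flat baseline: repeat the last observed price over the whole horizon.
    flat = np.repeat(last_price[:, None], prices_future.shape[1], axis=1)

    mse_model = np.mean((forecasts - prices_future) ** 2, axis=0)
    mse_flat = np.mean((flat - prices_future) ** 2, axis=0)
    return 1.0 - mse_model / mse_flat
```

A skill at or below zero across most horizons suggests the model adds variance without capturing signal, which is the pattern the study reports for Transformers against the linear benchmark.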
Topics
- Transformer Models
- Financial Time Series
- Squared Loss
- Trajectory Forecasting
- Prediction Variance
Best for: AI Engineer, Machine Learning Engineer, AI Scientist, Research Scientist
Editorial summary, takeaway, and curation by AIssential. Original article published by stat.ML updates on arXiv.org.