Machine Learning and the Random Walk Puzzle: Forecasting the CAD/USD Exchange Rate with Expanding Window Evaluation and SHAP Interpretability
Summary
A study investigated whether machine learning (ML) models could surpass the naive random walk benchmark in forecasting the monthly USD/CAD exchange rate. Using daily data from the Bank of Canada, spanning January 2017 to May 2026 and resampled into 113 monthly observations, it evaluated five ML models: linear regression, random forest, gradient boosting, XGBoost, and AdaBoost. These models were benchmarked against the naive random walk and exponential smoothing with Holt-Winters seasonality (ETS), using an expanding-window framework to preserve out-of-sample integrity. Forecast-accuracy differences were assessed with the Diebold-Mariano (DM) test. Structural break detection identified four breakpoints, in 2018, 2020, 2022, and 2024. Only linear regression statistically outperformed the random walk, with a DM statistic of 3.0585 and a p-value of 0.0071. Random forest achieved the lowest mean absolute percentage error (MAPE) among the ML models, at 1.17 percent. SHAP analysis confirmed that short-term lags (lag1 and lag2) and recent rolling means were the dominant predictors, consistent with the near-random-walk behavior of exchange rates.
Key takeaway
For data scientists developing exchange rate forecasting models, the naive random walk remains a hard benchmark to beat. For USD/CAD, linear regression was the only model that outperformed it with statistical significance, and the more complex ensemble models offered little advantage. Prioritize robust evaluation methods such as expanding-window frameworks, and use SHAP to understand model drivers, especially the influence of short-term lags, before deploying sophisticated ML solutions.
Key insights
Of the five machine learning models tested, only linear regression statistically outperformed the naive random walk in USD/CAD exchange rate forecasting.
Principles
- The naive random walk is a formidable benchmark.
- Expanding-window evaluation ensures out-of-sample integrity.
- Short-term lags dominate exchange rate predictions.
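The lag and rolling-mean predictors named in the summary can be sketched as follows. This is a minimal illustration rather than the study's feature pipeline: the helper `make_lag_features`, the three-month rolling window, and the sample rates are assumptions made for the example. Each feature uses only values strictly before the target month, so no future information leaks into the predictors.

```python
from statistics import mean

def make_lag_features(series, n_lags=2, roll_window=3):
    """Return (features, target) pairs built only from values before month t."""
    start = max(n_lags, roll_window)
    rows = []
    for t in range(start, len(series)):
        # lag1 is the previous month, lag2 the month before that, and so on.
        feats = {f"lag{k}": series[t - k] for k in range(1, n_lags + 1)}
        # Rolling mean over the previous `roll_window` months, excluding month t.
        feats[f"roll_mean{roll_window}"] = mean(series[t - roll_window:t])
        rows.append((feats, series[t]))
    return rows

# Hypothetical monthly USD/CAD levels, for illustration only
# (not Bank of Canada data).
rates = [1.30, 1.32, 1.31, 1.35, 1.34, 1.36]
rows = make_lag_features(rates)
first_feats, first_target = rows[0]
print(first_feats, first_target)
```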
Method
Evaluate ML models against random walk and ETS using an expanding-window framework, assess with Diebold-Mariano test, and interpret with SHAP analysis.
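A rough sketch of this evaluation loop, under stated assumptions: the series is a synthetic random walk (not the Bank of Canada data), the regression uses a single lag fitted in closed form, the minimum training size of 24 months is arbitrary, and the DM test uses squared-error loss at a one-step horizon with a normal approximation. The summary does not say which DM variant the study used, so the numbers here are not comparable to the reported statistic.

```python
import math
import random
from statistics import mean

def fit_ols_1d(x, y):
    """Closed-form least squares for y = a + b * x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    return my - b * mx, b

def expanding_window_forecasts(series, min_train=24):
    """One-step-ahead forecasts; each refit sees only data before month t."""
    actual, rw, lr = [], [], []
    for t in range(min_train, len(series)):
        history = series[:t]  # the training window grows by one month per step
        a, b = fit_ols_1d(history[:-1], history[1:])  # regress y_t on y_{t-1}
        actual.append(series[t])
        rw.append(history[-1])           # naive random walk: last observed value
        lr.append(a + b * history[-1])   # one-lag linear regression forecast
    return actual, rw, lr

def mape(actual, pred):
    """Mean absolute percentage error, in percent."""
    return 100 * mean(abs((a - p) / a) for a, p in zip(actual, pred))

def diebold_mariano(actual, pred_a, pred_b):
    """DM statistic on squared-error loss (h = 1, normal approximation).

    A positive statistic means pred_a has the larger average loss.
    """
    d = [(a - pa) ** 2 - (a - pb) ** 2
         for a, pa, pb in zip(actual, pred_a, pred_b)]
    n = len(d)
    d_bar = mean(d)
    var = sum((di - d_bar) ** 2 for di in d) / (n - 1)
    stat = d_bar / math.sqrt(var / n)
    p_value = math.erfc(abs(stat) / math.sqrt(2))  # two-sided normal p-value
    return stat, p_value

# Synthetic 113-month random walk, for illustration only.
random.seed(0)
series = [1.30]
for _ in range(112):
    series.append(series[-1] + random.gauss(0, 0.015))

actual, rw, lr = expanding_window_forecasts(series)
dm_stat, p_value = diebold_mariano(actual, rw, lr)
print(f"MAPE random walk:       {mape(actual, rw):.2f}%")
print(f"MAPE linear regression: {mape(actual, lr):.2f}%")
print(f"DM statistic: {dm_stat:.4f}, p-value: {p_value:.4f}")
```

Refitting inside the loop is what keeps the evaluation out-of-sample: the forecast for month t never depends on month t or anything after it.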
In practice
- Try linear regression first; it was the only model to statistically outperform the random walk.
- Use SHAP to interpret ML model drivers.
- Check for structural breaks; the study detected four (2018, 2020, 2022, and 2024).
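To make the SHAP item concrete, here is a small sketch for the simplest case. For a linear model, and assuming the features are treated as independent, the SHAP value of feature j for one prediction reduces to coef_j * (x_j - mean(x_j)). The coefficients, intercept, and background rows below are hypothetical, not values from the study; for tree ensembles such as random forest or XGBoost, a dedicated package such as `shap` would normally be used instead.

```python
from statistics import mean

def predict(coefs, intercept, row):
    """Linear model prediction for one feature dictionary."""
    return intercept + sum(coefs[name] * row[name] for name in coefs)

def linear_shap(coefs, background, x):
    """Per-feature attributions for a linear model, assuming independent features.

    phi_j = coef_j * (x_j - mean of feature j over the background rows)
    """
    return {
        name: coefs[name] * (x[name] - mean(row[name] for row in background))
        for name in coefs
    }

# Hypothetical fitted coefficients and background data, for illustration only.
coefs = {"lag1": 0.90, "lag2": 0.06, "roll_mean3": 0.04}
intercept = 0.0
background = [
    {"lag1": 1.31, "lag2": 1.32, "roll_mean3": 1.310},
    {"lag1": 1.35, "lag2": 1.31, "roll_mean3": 1.327},
    {"lag1": 1.34, "lag2": 1.35, "roll_mean3": 1.333},
]
x = {"lag1": 1.36, "lag2": 1.34, "roll_mean3": 1.350}

phi = linear_shap(coefs, background, x)
baseline = mean(predict(coefs, intercept, row) for row in background)

# The attributions sum to the gap between this prediction and the average one.
gap = predict(coefs, intercept, x) - baseline
print(phi, sum(phi.values()), gap)
```

With coefficients like these, nearly all of the attribution falls on lag1, which is the pattern one would expect from a near-random-walk series.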
Topics
- Machine Learning
- Exchange Rate Forecasting
- Random Walk Model
- SHAP Interpretability
- Time Series Analysis
- Financial Econometrics
Best for: Research Scientist, AI Scientist, Data Scientist
Editorial summary, takeaway, and curation by AIssential. Original article published by Machine Learning.