Choosing ER Time Series Models (part 2): How to Fairly Compare ARIMA and XGBoost?

2026-03-13 · Source: Towards AI - Medium · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Data Science & Analytics · Depth: Intermediate, medium

Summary

This article compares the performance of ARIMA and XGBoost models for forecasting emergency department patient arrivals, building on a previous analysis of ARIMA. The comparison focuses on ensuring fairness by engineering features for XGBoost to account for temporal dependencies, including lag features (lag 1 and lag 7), a 7-day rolling mean, and various calendar features (day of week, month, holidays, week of year). Both models were trained using an 80/20 temporal split of the data and evaluated on a hold-out test set using Mean Absolute Error (MAE), Root Mean Square Error (RMSE), and Mean Absolute Percentage Error (MAPE). XGBoost showed a marginal improvement over ARIMA across all metrics, with MAE differing by 1 patient and MAPE by 1% on average.

Key takeaway

For data scientists comparing traditional time series models like ARIMA with machine learning models such as XGBoost, ensure a fair comparison by explicitly engineering temporal features for the ML model. Include lag features, rolling means, and calendar variables, and use a temporal train/test split. XGBoost may offer marginal gains, but consider whether a 1-patient difference in daily predictions provides enough operational value for hospital decision-makers to justify the added model complexity.

Key insights

Fair comparison of time series models requires careful feature engineering and consistent evaluation metrics.

Principles

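Temporal splitting keeps observations in chronological order, so the model is tested only on dates after its training period and the series' autocorrelation is not broken up by shuffling.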
Feature engineering can adapt ML models for time series.
Multiple metrics offer comprehensive model performance insight.
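
As a minimal sketch of the split and the three metrics, the following uses only the Python standard library. The function names, the made-up arrival counts, and the naive "repeat the last training value" forecast are illustrative assumptions, not the article's code or data.

```python
import math

def temporal_split(series, train_frac=0.8):
    """Split a chronologically ordered sequence without shuffling."""
    cut = int(len(series) * train_frac)
    return series[:cut], series[cut:]

def mae(actual, predicted):
    """Mean Absolute Error, in the same units as the data (patients)."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root Mean Square Error; penalizes large misses more than MAE."""
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )

def mape(actual, predicted):
    """Mean Absolute Percentage Error; undefined if any actual value is 0."""
    return 100 * sum(
        abs(a - p) / abs(a) for a, p in zip(actual, predicted)
    ) / len(actual)

# Made-up daily arrival counts, oldest first (illustration only).
arrivals = [100, 110, 90, 120, 105, 95, 130, 115, 100, 125]
train, test = temporal_split(arrivals)

# Stand-in forecast: repeat the last training value over the test period.
forecast = [train[-1]] * len(test)

print("MAE: ", mae(test, forecast))
print("RMSE:", rmse(test, forecast))
print("MAPE:", mape(test, forecast))
```

Scoring every candidate model with the same functions on the same hold-out slice is what keeps the comparison consistent.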

Method

The method involves adding lag and rolling mean features to XGBoost, incorporating calendar features into both models, using an 80/20 temporal train/test split, and evaluating with MAE, RMSE, and MAPE.

In practice

Add lag features to capture autoregressive components.
Include rolling averages to smooth short-term fluctuations.
Incorporate calendar features for seasonal patterns.
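
The three steps above can be sketched with pandas roughly as follows. The column names, the `add_features` helper, the synthetic data, and the holiday date are hypothetical. Shifting the series by one day before taking the rolling mean is an assumption made here so that the feature for a given day uses only earlier days; the summary does not say how the original article handles this.

```python
import pandas as pd

def add_features(df, target="arrivals", holidays=()):
    """Add lag, rolling-mean, and calendar features to a daily series.

    df is assumed to have a DatetimeIndex sorted in ascending order.
    """
    out = df.copy()

    # Lag features: yesterday's value and the value one week earlier.
    out["lag_1"] = out[target].shift(1)
    out["lag_7"] = out[target].shift(7)

    # 7-day rolling mean over the previous 7 days only (assumption:
    # shift first so the current day's value does not leak in).
    out["roll_mean_7"] = out[target].shift(1).rolling(window=7).mean()

    # Calendar features.
    out["day_of_week"] = out.index.dayofweek  # Monday = 0
    out["month"] = out.index.month
    out["week_of_year"] = out.index.isocalendar().week.astype(int)
    holiday_dates = {pd.Timestamp(h).date() for h in holidays}
    out["is_holiday"] = [int(d in holiday_dates) for d in out.index.date]

    # The first 7 rows have no complete lag history, so drop them.
    return out.dropna()

# Synthetic daily counts for illustration only.
idx = pd.date_range("2024-01-01", periods=16, freq="D")
df = pd.DataFrame({"arrivals": range(100, 116)}, index=idx)

# Hypothetical holiday list; a real one depends on the hospital's region.
feats = add_features(df, holidays=["2024-01-15"])
print(feats.head())
```

The resulting frame can be split chronologically and passed to a tree-based regressor such as XGBoost, with `arrivals` as the target and the remaining columns as inputs.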

Topics

Emergency Department Forecasting
Time Series Models
ARIMA
XGBoost
Feature Engineering

Code references

datawithclarity/Medium

Best for: Machine Learning Engineer, Data Scientist, AI Student

Editorial summary, takeaway, and curation by AIssential. Original article published by Towards AI - Medium.