Customer Churn Prediction on Structured Data Using FT-Transformer and Stacking Ensembles

Source: cs.LG updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Data Science & Analytics · Depth: Expert, extended

Summary

A hybrid architecture for customer churn prediction on structured tabular data combines a feature-tokenized transformer (FT-Transformer) with gradient-boosted trees (XGBoost) through calibration-aware stacking. The framework handles class imbalance with class-weighted loss functions and uses out-of-fold stacking with a logistic regression meta-learner to recalibrate the base models' outputs. On a public bank churn dataset of 10,000 customers with a 20% churn rate, the model reached an F1 score of 62.10%, an AUC-ROC of 0.861, and a PR-AUC of 0.647. It outperformed a Multi-Layer Perceptron (MLP) baseline by 3.37 F1 points (p < 0.001) and 0.027 AUC. Ablation studies indicated that both the transformer component and the stacking strategy contribute materially to performance.
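To make the class-weighting idea concrete, here is a minimal sketch of a weighted binary cross-entropy in plain Python. The weighting rule (ratio of negatives to positives) and the function names are assumptions for illustration; the summary does not state which weighting scheme the paper uses.

```python
import math

def pos_weight_from_rate(churn_rate):
    """Negatives-to-positives ratio, a common (assumed) choice of positive-class weight."""
    return (1.0 - churn_rate) / churn_rate

def weighted_bce(y_true, p_pred, pos_weight, eps=1e-12):
    """Mean binary cross-entropy where errors on churners (y=1) count pos_weight times."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(pos_weight * y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)

# With a 20% churn rate the positive class is weighted roughly 4x.
w = pos_weight_from_rate(0.20)

# A missed churner (true 1, predicted 0.1) versus an equally confident false alarm.
loss_missed_churner = weighted_bce([1], [0.1], w)
loss_false_alarm = weighted_bce([0], [0.9], w)
```

Under this weighting, a confidently missed churner is penalized about four times as heavily as an equally confident false alarm, which pushes a model away from simply predicting the majority class.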

Key takeaway

Data scientists building churn prediction models should consider this hybrid FT-Transformer and XGBoost stacking ensemble. In the reported experiments it improved F1 score and AUC over an MLP baseline while producing calibrated probabilities, which matter for cost-sensitive retention interventions. Class-weighted loss and out-of-fold stacking are the two techniques to implement for robust, reproducible results on imbalanced tabular datasets. The approach aims to balance accuracy with interpretability in retention strategies.
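Why calibration matters for cost-sensitive interventions can be seen in a simple expected-value rule. The sketch below is an illustration and does not come from the paper: it assumes a retention offer with a fixed cost that always succeeds when made to a true churner, and the function names and monetary figures are hypothetical.

```python
def intervention_threshold(offer_cost, retained_value):
    """Churn probability above which the offer pays for itself in expectation."""
    if retained_value <= 0:
        raise ValueError("retained_value must be positive")
    return offer_cost / retained_value

def should_intervene(p_churn, offer_cost, retained_value):
    """Intervene when the expected value saved exceeds the cost of the offer."""
    return p_churn * retained_value > offer_cost

# Hypothetical numbers: a 20-unit offer protecting 200 units of customer value.
threshold = intervention_threshold(offer_cost=20, retained_value=200)
decisions = [should_intervene(p, 20, 200) for p in (0.05, 0.15, 0.60)]
```

The rule compares a predicted probability directly against a cost ratio, so it is only meaningful when the probabilities are calibrated. A model that ranks customers well but systematically overstates churn risk would trigger too many offers.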

Key insights

Hybrid models that combine transformers with tree ensembles can deliver stronger, better-calibrated churn predictions on tabular data than a single neural baseline.

Principles

Method

The method has four steps: preprocess the data, train the FT-Transformer and XGBoost base models with class-weighted loss, generate out-of-fold predictions from each base model, and train a logistic regression meta-learner on those predictions.
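The out-of-fold stacking step can be sketched with scikit-learn as follows. This is not the paper's implementation: `GradientBoostingClassifier` stands in for XGBoost, a plain `LogisticRegression` is only a placeholder for the FT-Transformer, and the synthetic dataset, the balanced weighting rule, the fold count, and the helper names are all assumptions made so the example stays self-contained.

```python
import numpy as np
from sklearn.base import clone
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, train_test_split

def class_weights(y):
    """Per-sample weights inversely proportional to class frequency (assumed scheme)."""
    counts = np.bincount(y, minlength=2)
    per_class = len(y) / (2.0 * counts)
    return per_class[y]

def fit_stack(base_models, X, y, n_splits=5, seed=0):
    """Build out-of-fold base predictions, then fit the meta-learner on them."""
    skf = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    oof = np.zeros((len(y), len(base_models)))
    for train_idx, val_idx in skf.split(X, y):
        w = class_weights(y[train_idx])
        for j, model in enumerate(base_models):
            fitted = clone(model).fit(X[train_idx], y[train_idx], sample_weight=w)
            # Each row is scored only by models that never saw it during training.
            oof[val_idx, j] = fitted.predict_proba(X[val_idx])[:, 1]
    # The meta-learner is fit unweighted so its output tracks the observed churn rate.
    meta = LogisticRegression().fit(oof, y)
    # Refit the base models on all training data for use at prediction time.
    w_all = class_weights(y)
    final_bases = [clone(m).fit(X, y, sample_weight=w_all) for m in base_models]
    return final_bases, meta, oof

def predict_stack(final_bases, meta, X):
    """Stack base-model probabilities and pass them through the meta-learner."""
    feats = np.column_stack([m.predict_proba(X)[:, 1] for m in final_bases])
    return meta.predict_proba(feats)[:, 1]

# Synthetic stand-in data with roughly 20% positives (not the bank churn dataset).
X, y = make_classification(n_samples=2000, n_features=10, weights=[0.8, 0.2],
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

base_models = [
    GradientBoostingClassifier(n_estimators=50, random_state=0),  # XGBoost stand-in
    LogisticRegression(max_iter=1000),  # placeholder for the FT-Transformer
]
final_bases, meta, oof = fit_stack(base_models, X_train, y_train)
p_test = predict_stack(final_bases, meta, X_test)
```

Training the meta-learner on out-of-fold rather than in-sample predictions keeps it from learning to trust base models that have memorized their training rows. Any estimator exposing `fit` with `sample_weight` and `predict_proba` can be dropped into `base_models`.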

In practice

Topics

Best for: AI Engineer, Research Scientist, Machine Learning Engineer, Data Scientist, AI Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.LG updates on arXiv.org.