RW #10 - Predicting Customer Lifetime Value for a Telecom Company
Summary
A telecom company ran a Customer Lifetime Value (LTV) prediction project to forecast 24-month LTV for new customers within their first 60 days, with the aim of optimizing acquisition spending and retention efforts. LTV is defined as total plan and device-upgrade revenue minus costs over two years. The project emphasizes rigorous data preparation, including leakage removal and cohort-based temporal splits: evaluated on temporal splits, the model reached an R² of 0.54, a more realistic figure than the 0.71 obtained with random splits. Key features identified include multi-SIM status (£680 vs. £240 LTV) and acquisition channel (£340 LTV for own-website customers vs. £95 for price comparison sites). A LightGBM model, often combined with a two-stage approach for high-value segments, achieved a Spearman rank correlation of 0.65, outperforming a BG/NBD baseline at 0.55. Walk-forward validation was crucial for monitoring model stability and detecting the impact of external changes such as new pricing tiers.
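The Spearman rank correlation quoted above measures how well predicted LTV orders customers, regardless of the exact pound values. A minimal sketch in plain Python follows; the function names `average_ranks` and `spearman` are illustrative and not taken from the original article.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of tied values starting at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(predicted, actual):
    """Pearson correlation computed on the ranks of the two lists."""
    rp, ra = average_ranks(predicted), average_ranks(actual)
    mp, ma = mean(rp), mean(ra)
    cov = sum((p - mp) * (a - ma) for p, a in zip(rp, ra))
    var_p = sum((p - mp) ** 2 for p in rp)
    var_a = sum((a - ma) ** 2 for a in ra)
    return cov / (var_p * var_a) ** 0.5
```

Because only the ordering matters, a model that ranks customers perfectly scores 1.0 even if its absolute LTV estimates are off, which suits use cases such as prioritizing acquisition or retention spend.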
Key takeaway
For data scientists building Customer Lifetime Value models: prioritize rigorous data preparation, especially cohort-based temporal splits and strict leakage prevention, so that performance metrics are realistic. Engineer features from early customer behavior, such as multi-SIM status or engagement velocity, which are highly predictive within 60 days. Use walk-forward validation to monitor model stability and adapt to real-world changes like pricing shifts, so that predictions remain actionable for acquisition and retention decisions.
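Walk-forward validation can be sketched as a sequence of folds in which the model is trained on all earlier signup cohorts and tested on the next one. The version below uses an expanding training window, which is one common variant and an assumption here; the helper name `walk_forward_folds` and the `min_train` parameter are hypothetical.

```python
def walk_forward_folds(cohorts, min_train=3):
    """Yield (train_cohorts, test_cohort) pairs with an expanding training window.

    `cohorts` is any collection of sortable cohort labels, e.g. "2022-01".
    """
    ordered = sorted(cohorts)
    for i in range(min_train, len(ordered)):
        # Train on every cohort before position i, test on cohort i.
        yield ordered[:i], ordered[i]


months = ["2022-01", "2022-02", "2022-03", "2022-04", "2022-05"]
for train, test in walk_forward_folds(months, min_train=3):
    print(f"train on {train[0]}..{train[-1]}, test on {test}")
```

Tracking the evaluation metric fold by fold is what makes a sudden drop visible, for example after a pricing change that affects only the newest cohorts.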
Key insights
Early customer behavior signals predict long-term value, enabling smarter acquisition and retention investments.
Principles
- Define LTV with finance before coding.
- Use cohort-based temporal data splits.
- Prevent data leakage by enforcing a strict calendar cutoff.
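The last two principles can be illustrated with a small sketch: customers are split by signup cohort rather than at random, and each customer's features may only use events from the first 60 days after signup. The record layout (dicts with `signup` and `date` keys) and the function names are assumptions made for illustration.

```python
from datetime import date, timedelta

OBSERVATION_DAYS = 60  # feature window stated in the summary


def split_by_cohort(customers, last_train_signup):
    """Train on cohorts that signed up on or before the boundary, test on later ones."""
    train = [c for c in customers if c["signup"] <= last_train_signup]
    test = [c for c in customers if c["signup"] > last_train_signup]
    return train, test


def events_before_cutoff(events, signup):
    """Keep only events inside the customer's first 60 days (cutoff is exclusive)."""
    cutoff = signup + timedelta(days=OBSERVATION_DAYS)
    return [e for e in events if signup <= e["date"] < cutoff]
```

Anything recorded after the cutoff, such as a later upgrade or cancellation, is excluded from features because it would not be known at prediction time.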
Method
The project follows a full data science lifecycle: problem definition, data gathering, cleaning (leakage removal, winsorizing, cohort splits), EDA, feature engineering (RFM, engagement velocity), modeling (LightGBM, two-stage), training, evaluation, deployment, and monitoring.
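Winsorizing, one of the cleaning steps listed above, caps extreme values at chosen percentiles so that a handful of outliers does not dominate training. The sketch below is a minimal plain-Python version; the 1st and 99th percentile defaults and the floor-index quantile rule are assumptions, not values from the original article.

```python
def winsorize(values, lower_pct=1, upper_pct=99):
    """Clip values to the given empirical percentiles (integer percents, floor index)."""
    ordered = sorted(values)
    n = len(ordered)
    lo = ordered[(lower_pct * (n - 1)) // 100]
    hi = ordered[(upper_pct * (n - 1)) // 100]
    # Values below lo are raised to lo; values above hi are lowered to hi.
    return [min(max(v, lo), hi) for v in values]
```

Unlike dropping outliers, this keeps every customer in the dataset and only limits how far a single extreme LTV can pull the fit.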
In practice
- Engineer a "Multi-SIM Flag" feature.
- Track engagement velocity (data usage change).
- Incorporate network quality by postcode.
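The first two items above might look like the following sketch. The exact definitions are assumptions: the flag is set when an account holds more than one SIM, and engagement velocity is taken as the relative change in mean daily data usage between the first and second half of the observation window. The function names and the gigabyte unit are hypothetical.

```python
def multi_sim_flag(sim_count):
    """Return 1 when the account holds more than one SIM, else 0."""
    return 1 if sim_count > 1 else 0


def engagement_velocity(daily_usage_gb):
    """Relative change in mean daily usage, second half of the window vs. first half."""
    half = len(daily_usage_gb) // 2
    if half == 0:
        return 0.0  # assumption: too little history counts as no change
    first, second = daily_usage_gb[:half], daily_usage_gb[half:]
    first_mean = sum(first) / len(first)
    second_mean = sum(second) / len(second)
    if first_mean == 0:
        return 0.0  # assumption: avoid division by zero for inactive starts
    return (second_mean - first_mean) / first_mean
```

A positive velocity indicates usage growing during the first 60 days, and a negative one indicates it falling away.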
Topics
- Customer Lifetime Value
- Telecom Analytics
- Machine Learning Lifecycle
- Feature Engineering
- Data Leakage
- Walk-Forward Validation
Best for: Data Scientist, Machine Learning Engineer, Director of AI/ML
Editorial summary, takeaway, and curation by AIssential. Original article published by Machine Learning Pills.