The subscription business was losing 6% of its customers every month. Compounded over a year, that’s a 52% churn rate: for every 100 customers acquired in January, only about 48 would still be paying the following January.
The marketing team was running retention campaigns, but they were blasting everyone with the same discount offer. High-value loyal customers got annoyed. At-risk customers got the same generic email as everyone else. The campaigns were expensive and barely moved the needle.
They needed to know which customers were about to leave and why — before it happened.
The Data
We had three years of transaction history for 180,000 customers. The raw data included:
- Subscription events — Sign-up date, plan tier, upgrades, downgrades, cancellations
- Transaction history — Purchase frequency, average order value, recency of last purchase
- Engagement signals — Login frequency, feature usage, support ticket count, email open rates
- Demographics — Company size (B2B), industry, geographic region
The target variable was simple: did the customer cancel within the next 30 days? Binary classification.
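A minimal sketch of how that label can be built, assuming a `cancel_date` column and a scoring cutoff date (the column name and schema here are illustrative, not the actual pipeline's):

```python
import pandas as pd

def build_labels(customers: pd.DataFrame, as_of: pd.Timestamp) -> pd.Series:
    """1 if the customer cancelled within 30 days after `as_of`, else 0."""
    cancel = pd.to_datetime(customers["cancel_date"], errors="coerce")
    # NaT (no cancellation on record) compares False, so active customers get 0
    window = (cancel > as_of) & (cancel <= as_of + pd.Timedelta(days=30))
    return window.astype(int)

customers = pd.DataFrame({
    "customer_id": [1, 2, 3],
    "cancel_date": ["2024-01-10", None, "2024-03-01"],
})
labels = build_labels(customers, pd.Timestamp("2024-01-01"))
# customer 1 cancelled inside the 30-day window; 2 never cancelled; 3 cancelled later
```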
Feature Engineering
Raw data doesn’t predict churn. Features do. This phase took longer than model training — and had more impact on accuracy.
RFM Features (Recency, Frequency, Monetary)
The classic customer segmentation framework, computed per customer:
- Recency — Days since last purchase (higher = more at risk)
- Frequency — Number of transactions in the last 90 days
- Monetary — Total spend in the last 90 days
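The three RFM columns can be computed in one pass over the transaction log. A sketch, assuming a `txns` table with `customer_id`, `date`, and `amount` columns (names are assumptions):

```python
import pandas as pd

def rfm(txns: pd.DataFrame, as_of: pd.Timestamp) -> pd.DataFrame:
    txns = txns.assign(date=pd.to_datetime(txns["date"]))
    recent = txns[txns["date"] > as_of - pd.Timedelta(days=90)]
    out = pd.DataFrame({
        # days since last purchase, over the full history
        "recency": (as_of - txns.groupby("customer_id")["date"].max()).dt.days,
        # transaction count and total spend in the trailing 90 days
        "frequency": recent.groupby("customer_id").size(),
        "monetary": recent.groupby("customer_id")["amount"].sum(),
    })
    # customers with no recent activity get 0 frequency/monetary, not NaN
    return out.fillna(0)

txns = pd.DataFrame({
    "customer_id": [1, 1, 2],
    "date": ["2024-03-01", "2024-03-20", "2023-06-01"],
    "amount": [50.0, 30.0, 20.0],
})
features = rfm(txns, pd.Timestamp("2024-04-01"))
```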
Behavioral Trends
Static snapshots miss the story. A customer who logged in 20 times last month but only 3 times this month is exhibiting a declining trend — even though their current activity looks okay in isolation.
We computed rolling trends:
- Login velocity — 7-day login count vs. 30-day average (ratio < 0.5 = declining engagement)
- Feature adoption drop — Did they stop using a feature they previously used weekly?
- Support escalation — Increasing ticket frequency or severity over time
- Payment friction — Failed payment attempts, late payments, downgrade requests
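The login-velocity ratio from the first bullet can be sketched as the actual 7-day login count divided by the count the trailing 30-day rate would predict for 7 days (column names and the exact windowing are assumptions):

```python
import pandas as pd

def login_velocity(logins: pd.DataFrame, as_of: pd.Timestamp) -> pd.Series:
    logins = logins.assign(date=pd.to_datetime(logins["date"]))
    weekly = (logins[logins["date"] > as_of - pd.Timedelta(days=7)]
              .groupby("customer_id").size())
    monthly = (logins[logins["date"] > as_of - pd.Timedelta(days=30)]
               .groupby("customer_id").size())
    expected = monthly * (7 / 30)  # 7-day count implied by the 30-day rate
    # customers with 30-day activity but no 7-day activity get ratio 0
    return (weekly / expected).reindex(expected.index).fillna(0.0)

# One login in the last week vs. ten in the last month: ratio 3/7, which
# falls below the 0.5 "declining engagement" threshold
logins = pd.DataFrame({
    "customer_id": 1,
    "date": ["2024-03-28"] + [f"2024-03-{d:02d}" for d in range(5, 14)],
})
velocity = login_velocity(logins, pd.Timestamp("2024-04-01"))
```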
Interaction Features
Some combinations are more predictive than individual features:
- High spend + declining engagement — The “silent churner” who’s still paying but has mentally checked out
- Recent support ticket + low satisfaction score — Unresolved frustration
- Long tenure + first-ever downgrade — A loyal customer sending a warning signal
We created ~60 features total. About 20 ended up mattering.
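The three interaction features above reduce to boolean flags over already-computed columns. A sketch with illustrative thresholds and column names (none of these cutoffs are the production values):

```python
import pandas as pd

def interaction_flags(df: pd.DataFrame) -> pd.DataFrame:
    out = pd.DataFrame(index=df.index)
    # still spending, but engagement has collapsed
    out["silent_churner"] = (df["monetary_90d"] > 500) & (df["login_velocity"] < 0.5)
    # recent tickets combined with a low satisfaction score
    out["unresolved_frustration"] = (df["tickets_30d"] > 0) & (df["csat"] < 3)
    # long-tenured customer downgrading for the first time
    out["loyal_downgrade"] = (df["tenure_months"] >= 12) & df["first_downgrade"]
    return out.astype(int)

df = pd.DataFrame({
    "monetary_90d":   [600.0, 100.0],
    "login_velocity": [0.3, 0.9],
    "tickets_30d":    [0, 2],
    "csat":           [4, 2],
    "tenure_months":  [18, 6],
    "first_downgrade": [True, False],
})
flags = interaction_flags(df)
```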
Model Selection: Why XGBoost
We tested four algorithms:
| Model | AUC-ROC | Precision@10% | Training Time |
|---|---|---|---|
| Logistic Regression | 0.78 | 0.52 | 2 sec |
| Random Forest | 0.83 | 0.61 | 45 sec |
| XGBoost | 0.87 | 0.68 | 12 sec |
| Neural Network (MLP) | 0.85 | 0.64 | 3 min |
XGBoost won on accuracy and training speed. The neural network was slightly worse despite being 15x slower to train. For tabular data with engineered features, gradient boosting consistently outperforms deep learning — this project was no exception.
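The bake-off loop is straightforward to reproduce. A self-contained sketch on synthetic data, with scikit-learn's `GradientBoostingClassifier` standing in for XGBoost so the example has no extra dependencies (the class imbalance roughly mirrors the 6% monthly churn base rate):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Imbalanced synthetic data: ~94% non-churners, ~6% churners
X, y = make_classification(n_samples=4000, n_features=20,
                           weights=[0.94], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

models = {
    "logreg": LogisticRegression(max_iter=1000),
    "gbdt": GradientBoostingClassifier(n_estimators=50, random_state=0),
}
scores = {}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    proba = model.predict_proba(X_te)[:, 1]  # churn probability
    scores[name] = roc_auc_score(y_te, proba)
```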
Why Precision@10% Matters More Than AUC
AUC-ROC tells you how well the model separates churners from non-churners overall. But the marketing team doesn’t target everyone — they target the top 10% most at-risk customers.
Precision@10% answers the real business question: “If we send retention offers to our top 10% most at-risk customers, what percentage of them are actually going to churn?”
At 68% precision, 7 out of 10 customers who received retention offers were genuinely at risk. Good enough to justify the campaign spend.
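Precision@k% isn't a built-in scikit-learn metric, but it's a few lines: rank customers by predicted risk, take the top k%, and measure what fraction actually churned. A minimal sketch:

```python
import numpy as np

def precision_at_pct(y_true, scores, pct=0.10):
    """Fraction of actual churners among the top-pct% highest-risk customers."""
    y_true = np.asarray(y_true)
    scores = np.asarray(scores)
    k = max(1, int(len(scores) * pct))
    top = np.argsort(scores)[::-1][:k]  # indices of the k highest scores
    return y_true[top].mean()

y_true = [1, 0, 1, 0, 0, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.2, 0.1, 0.3, 0.2, 0.1, 0.1, 0.4, 0.2]
# top 20% = two customers (scores 0.9 and 0.8); one of them churned
p = precision_at_pct(y_true, scores, pct=0.20)
```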
Model Interpretation
A prediction without explanation is useless to the business. “Customer X has a 73% churn probability” doesn’t tell the marketing team what to do about it.
We used SHAP (SHapley Additive exPlanations) to explain every prediction:
- Top churn driver for Customer A: Login frequency dropped 80% in the last 2 weeks
- Top churn driver for Customer B: Filed 3 support tickets in 7 days, none resolved
- Top churn driver for Customer C: Downgraded plan tier after 18 months of premium subscription
This let the retention team personalize their outreach. Customer A got a “we miss you” re-engagement email. Customer B got a priority support escalation. Customer C got a call from their account manager.
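For tree models the attributions come from SHAP's `TreeExplainer`, but the "top 3 drivers" logic is the same regardless of explainer. A dependency-free sketch using the linear-SHAP special case (for a linear model with independent features, the SHAP value of feature j is `coef_j * (x_j - mean_j)`); the feature names are illustrative:

```python
import numpy as np

def top_drivers(coef, x, x_mean, feature_names, k=3):
    """Return the k features pushing this customer's churn risk up the most."""
    shap_vals = coef * (x - x_mean)     # per-feature attribution (linear SHAP)
    order = np.argsort(-shap_vals)[:k]  # largest positive contributions first
    return [(feature_names[i], float(shap_vals[i])) for i in order]

coef = np.array([1.2, -0.8, 0.5])
x = np.array([2.0, 1.0, 3.0])          # one customer's feature vector
x_mean = np.array([1.0, 1.0, 1.0])     # population averages
names = ["login_drop", "tickets", "downgrade"]
drivers = top_drivers(coef, x, x_mean, names)
```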
The Retention Dashboard
The model ran weekly in a Python batch job. Results were pushed to a Tableau dashboard that the marketing team used every Monday:
Risk Tier View — All customers segmented into High Risk (>70% churn probability), Medium (40-70%), and Low (<40%). The High Risk segment was the action list.
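The tiering itself is a one-liner with `pandas.cut` over the thresholds above (boundary handling at exactly 40% and 70% is an assumption here):

```python
import pandas as pd

probs = pd.Series([0.85, 0.55, 0.10], index=["A", "B", "C"])
# bins: [0, 0.4] -> Low, (0.4, 0.7] -> Medium, (0.7, 1.0] -> High
tiers = pd.cut(probs, bins=[0, 0.4, 0.7, 1.0],
               labels=["Low", "Medium", "High"], include_lowest=True)
```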
Churn Driver Breakdown — For each at-risk customer, the top 3 SHAP features explaining their score. This powered personalized outreach.
Cohort Trends — Monthly churn rate by signup cohort, plan tier, and acquisition channel. This identified systemic issues — e.g., customers acquired through a specific ad campaign had 2x higher churn because the campaign over-promised features that didn’t exist.
Campaign Tracker — After retention campaigns went out, the dashboard tracked whether targeted customers actually retained vs. a control group. This closed the feedback loop.
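The weekly batch job behind the dashboard reduces to: score every customer, bin into tiers, and write a CSV where Tableau picks it up. A sketch under assumed names; `_StubModel` stands in for the trained XGBoost model so the example runs on its own:

```python
import os
import tempfile

import numpy as np
import pandas as pd

def weekly_scoring(model, features: pd.DataFrame, out_path: str) -> pd.DataFrame:
    scored = features.copy()
    scored["churn_prob"] = model.predict_proba(features)[:, 1]
    scored["risk_tier"] = pd.cut(scored["churn_prob"], [0, 0.4, 0.7, 1.0],
                                 labels=["Low", "Medium", "High"],
                                 include_lowest=True)
    # highest-risk customers first: the marketing team's Monday action list
    scored = scored.sort_values("churn_prob", ascending=False)
    scored.to_csv(out_path, index_label="customer_id")
    return scored

class _StubModel:
    def predict_proba(self, X):
        p = X["risk_score"].to_numpy()
        return np.column_stack([1 - p, p])

features = pd.DataFrame({"risk_score": [0.9, 0.2]}, index=["a", "b"])
out_path = os.path.join(tempfile.gettempdir(), "weekly_scores.csv")
scored = weekly_scoring(_StubModel(), features, out_path)
```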
Results
After 3 months of model-driven retention:
- Monthly churn rate: 6.0% → 4.2% (30% reduction)
- Revenue saved: ~$340K in annual recurring revenue retained
- Campaign efficiency: 3.5x higher retention rate among model-targeted customers vs. untargeted campaigns
- False positive rate: Only 32% of “at-risk” customers were false alarms — acceptable for the campaign cost
What Didn’t Work
Predicting churn timing. We tried a survival analysis model (Cox proportional hazards) to predict when a customer would churn, not just whether. The predictions were too noisy to be actionable — “this customer will churn sometime in the next 2-8 weeks” isn’t helpful for campaign timing. The binary 30-day model was simpler and more useful.
Deep learning on tabular data. The neural network added complexity without adding accuracy. For structured, feature-engineered data, XGBoost remains king.
The Stack
| Component | Tool | Why |
|---|---|---|
| Feature Engineering | Python, Pandas | Flexible, fast iteration on 60+ features |
| Model Training | XGBoost, scikit-learn | Best accuracy on tabular data, fast training |
| Model Interpretation | SHAP | Per-customer churn explanations |
| Dashboard | Tableau | Marketing team’s preferred BI tool |
| Batch Pipeline | Python + cron | Weekly model scoring, CSV export to Tableau |
Simba Hu helps companies make better decisions with data and AI — from strategy to implementation. Based in Tokyo, serving clients globally. Book a strategy call or visit simbahu.com.