The subscription business was losing 6% of its customers every month. Compounded over a year, that’s a 52% churn rate: for every 100 customers acquired in January, only about 48 would still be paying the following January.
The marketing team was running retention campaigns, but they were blasting everyone with the same discount offer. High-value loyal customers got annoyed. At-risk customers got the same generic email as everyone else. The campaigns were expensive and barely moved the needle.
They needed to know which customers were about to leave and why — before it happened.
The Data
We had three years of transaction history for 180,000 customers. The raw data included:
- Subscription events — Sign-up date, plan tier, upgrades, downgrades, cancellations
- Transaction history — Purchase frequency, average order value, recency of last purchase
- Engagement signals — Login frequency, feature usage, support ticket count, email open rates
- Demographics — Company size (B2B), industry, geographic region
The target variable was simple: did the customer cancel within the next 30 days? Binary classification.
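A minimal sketch of how that label can be built, assuming a `cancel_date` column and a scoring cutoff date (the column name and schema here are illustrative, not the actual pipeline's):

```python
import pandas as pd

def build_labels(customers: pd.DataFrame, as_of: pd.Timestamp) -> pd.Series:
    """1 if the customer cancelled within 30 days after `as_of`, else 0."""
    cancel = pd.to_datetime(customers["cancel_date"], errors="coerce")
    # NaT (no cancellation on record) compares False, so active customers get 0
    window = (cancel > as_of) & (cancel <= as_of + pd.Timedelta(days=30))
    return window.astype(int)

customers = pd.DataFrame({
    "customer_id": [1, 2, 3],
    "cancel_date": ["2024-01-10", None, "2024-03-01"],
})
labels = build_labels(customers, pd.Timestamp("2024-01-01"))
# customer 1 cancelled inside the 30-day window; 2 never cancelled; 3 cancelled later
```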
Feature Engineering
Raw data doesn’t predict churn. Features do. This phase took longer than model training — and had more impact on accuracy.
RFM Features (Recency, Frequency, Monetary)
The classic customer segmentation framework, computed per customer:
- Recency — Days since last purchase (higher = more at risk)
- Frequency — Number of transactions in the last 90 days
- Monetary — Total spend in the last 90 days
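The three RFM columns can be computed in one pass over the transaction log. A sketch, assuming a `txns` table with `customer_id`, `date`, and `amount` columns (names are assumptions):

```python
import pandas as pd

def rfm(txns: pd.DataFrame, as_of: pd.Timestamp) -> pd.DataFrame:
    txns = txns.assign(date=pd.to_datetime(txns["date"]))
    recent = txns[txns["date"] > as_of - pd.Timedelta(days=90)]
    out = pd.DataFrame({
        # days since last purchase, over the full history
        "recency": (as_of - txns.groupby("customer_id")["date"].max()).dt.days,
        # transaction count and total spend in the trailing 90 days
        "frequency": recent.groupby("customer_id").size(),
        "monetary": recent.groupby("customer_id")["amount"].sum(),
    })
    # customers with no recent activity get 0 frequency/monetary, not NaN
    return out.fillna(0)

txns = pd.DataFrame({
    "customer_id": [1, 1, 2],
    "date": ["2024-03-01", "2024-03-20", "2023-06-01"],
    "amount": [50.0, 30.0, 20.0],
})
features = rfm(txns, pd.Timestamp("2024-04-01"))
```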
Behavioral Trends
Static snapshots miss the story. A customer who logged in 20 times last month but only 3 times this month is exhibiting a declining trend — even though their current activity looks okay in isolation.
We computed rolling trends:
- Login velocity — 7-day login count vs. 30-day average (ratio < 0.5 = declining engagement)
- Feature adoption drop — Did they stop using a feature they previously used weekly?
- Support escalation — Increasing ticket frequency or severity over time
- Payment friction — Failed payment attempts, late payments, downgrade requests
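The login-velocity ratio from the first bullet can be sketched as the actual 7-day login count divided by the count the trailing 30-day rate would predict for 7 days (column names and the exact windowing are assumptions):

```python
import pandas as pd

def login_velocity(logins: pd.DataFrame, as_of: pd.Timestamp) -> pd.Series:
    logins = logins.assign(date=pd.to_datetime(logins["date"]))
    weekly = (logins[logins["date"] > as_of - pd.Timedelta(days=7)]
              .groupby("customer_id").size())
    monthly = (logins[logins["date"] > as_of - pd.Timedelta(days=30)]
               .groupby("customer_id").size())
    expected = monthly * (7 / 30)  # 7-day count implied by the 30-day rate
    # customers with 30-day activity but no 7-day activity get ratio 0
    return (weekly / expected).reindex(expected.index).fillna(0.0)

# One login in the last week vs. ten in the last month: ratio 3/7, which
# falls below the 0.5 "declining engagement" threshold
logins = pd.DataFrame({
    "customer_id": 1,
    "date": ["2024-03-28"] + [f"2024-03-{d:02d}" for d in range(5, 14)],
})
velocity = login_velocity(logins, pd.Timestamp("2024-04-01"))
```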
Interaction Features
Some combinations are more predictive than individual features:
- High spend + declining engagement — The “silent churner” who’s still paying but has mentally checked out
- Recent support ticket + low satisfaction score — Unresolved frustration
- Long tenure + first-ever downgrade — A loyal customer sending a warning signal
We created ~60 features total. About 20 ended up mattering.
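The three interaction features above reduce to boolean flags over already-computed columns. A sketch with illustrative thresholds and column names (none of these cutoffs are the production values):

```python
import pandas as pd

def interaction_flags(df: pd.DataFrame) -> pd.DataFrame:
    out = pd.DataFrame(index=df.index)
    # still spending, but engagement has collapsed
    out["silent_churner"] = (df["monetary_90d"] > 500) & (df["login_velocity"] < 0.5)
    # recent tickets combined with a low satisfaction score
    out["unresolved_frustration"] = (df["tickets_30d"] > 0) & (df["csat"] < 3)
    # long-tenured customer downgrading for the first time
    out["loyal_downgrade"] = (df["tenure_months"] >= 12) & df["first_downgrade"]
    return out.astype(int)

df = pd.DataFrame({
    "monetary_90d":   [600.0, 100.0],
    "login_velocity": [0.3, 0.9],
    "tickets_30d":    [0, 2],
    "csat":           [4, 2],
    "tenure_months":  [18, 6],
    "first_downgrade": [True, False],
})
flags = interaction_flags(df)
```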
Model Selection: Why XGBoost
We tested four algorithms:
| Model | AUC-ROC | Precision@10% | Training Time |
|---|---|---|---|
| Logistic Regression | 0.78 | 0.52 | 2 sec |
| Random Forest | 0.83 | 0.61 | 45 sec |
| XGBoost | 0.87 | 0.68 | 12 sec |
| Neural Network (MLP) | 0.85 | 0.64 | 3 min |
XGBoost won on accuracy and training speed. The neural network was slightly worse despite being 15x slower to train. For tabular data with engineered features, gradient boosting consistently outperforms deep learning — this project was no exception.
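The bake-off loop is straightforward to reproduce. A self-contained sketch on synthetic data, with scikit-learn's `GradientBoostingClassifier` standing in for XGBoost so the example has no extra dependencies (the class imbalance roughly mirrors the 6% monthly churn base rate):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Imbalanced synthetic data: ~94% non-churners, ~6% churners
X, y = make_classification(n_samples=4000, n_features=20,
                           weights=[0.94], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

models = {
    "logreg": LogisticRegression(max_iter=1000),
    "gbdt": GradientBoostingClassifier(n_estimators=50, random_state=0),
}
scores = {}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    proba = model.predict_proba(X_te)[:, 1]  # churn probability
    scores[name] = roc_auc_score(y_te, proba)
```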
Why Precision@10% Matters More Than AUC
AUC-ROC tells you how well the model separates churners from non-churners overall. But the marketing team doesn’t target everyone — they target the top 10% most at-risk customers.
Precision@10% answers the real business question: “If we send retention offers to our top 10% most at-risk customers, what percentage of them are actually going to churn?”
At 68% precision, 7 out of 10 customers who received retention offers were genuinely at risk. Good enough to justify the campaign spend.
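Precision@k% isn't a built-in scikit-learn metric, but it's a few lines: rank customers by predicted risk, take the top k%, and measure what fraction actually churned. A minimal sketch:

```python
import numpy as np

def precision_at_pct(y_true, scores, pct=0.10):
    """Fraction of actual churners among the top-pct% highest-risk customers."""
    y_true = np.asarray(y_true)
    scores = np.asarray(scores)
    k = max(1, int(len(scores) * pct))
    top = np.argsort(scores)[::-1][:k]  # indices of the k highest scores
    return y_true[top].mean()

y_true = [1, 0, 1, 0, 0, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.2, 0.1, 0.3, 0.2, 0.1, 0.1, 0.4, 0.2]
# top 20% = two customers (scores 0.9 and 0.8); one of them churned
p = precision_at_pct(y_true, scores, pct=0.20)
```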
Model Interpretation
A prediction without explanation is useless to the business. “Customer X has a 73% churn probability” doesn’t tell the marketing team what to do about it.
We used SHAP (SHapley Additive exPlanations) to explain every prediction:
- Top churn driver for Customer A: Login frequency dropped 80% in the last 2 weeks
- Top churn driver for Customer B: Filed 3 support tickets in 7 days, none resolved
- Top churn driver for Customer C: Downgraded plan tier after 18 months of premium subscription
This let the retention team personalize their outreach. Customer A got a “we miss you” re-engagement email. Customer B got a priority support escalation. Customer C got a call from their account manager.
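For tree models the attributions come from SHAP's `TreeExplainer`, but the "top 3 drivers" logic is the same regardless of explainer. A dependency-free sketch using the linear-SHAP special case (for a linear model with independent features, the SHAP value of feature j is `coef_j * (x_j - mean_j)`); the feature names are illustrative:

```python
import numpy as np

def top_drivers(coef, x, x_mean, feature_names, k=3):
    """Return the k features pushing this customer's churn risk up the most."""
    shap_vals = coef * (x - x_mean)     # per-feature attribution (linear SHAP)
    order = np.argsort(-shap_vals)[:k]  # largest positive contributions first
    return [(feature_names[i], float(shap_vals[i])) for i in order]

coef = np.array([1.2, -0.8, 0.5])
x = np.array([2.0, 1.0, 3.0])          # one customer's feature vector
x_mean = np.array([1.0, 1.0, 1.0])     # population averages
names = ["login_drop", "tickets", "downgrade"]
drivers = top_drivers(coef, x, x_mean, names)
```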
The Retention Dashboard
The model ran weekly in a Python batch job. Results were pushed to a Tableau dashboard that the marketing team used every Monday:
Risk Tier View — All customers segmented into High Risk (>70% churn probability), Medium (40-70%), and Low (<40%). The High Risk segment was the action list.
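The tiering itself is a one-liner with `pandas.cut` over the thresholds above (boundary handling at exactly 40% and 70% is an assumption here):

```python
import pandas as pd

probs = pd.Series([0.85, 0.55, 0.10], index=["A", "B", "C"])
# bins: [0, 0.4] -> Low, (0.4, 0.7] -> Medium, (0.7, 1.0] -> High
tiers = pd.cut(probs, bins=[0, 0.4, 0.7, 1.0],
               labels=["Low", "Medium", "High"], include_lowest=True)
```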
Churn Driver Breakdown — For each at-risk customer, the top 3 SHAP features explaining their score. This powered personalized outreach.
Cohort Trends — Monthly churn rate by signup cohort, plan tier, and acquisition channel. This identified systemic issues — e.g., customers acquired through a specific ad campaign had 2x higher churn because the campaign over-promised features that didn’t exist.
Campaign Tracker — After retention campaigns went out, the dashboard tracked whether targeted customers actually retained vs. a control group. This closed the feedback loop.
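The weekly batch job behind the dashboard reduces to: score every customer, bin into tiers, and write a CSV where Tableau picks it up. A sketch under assumed names; `_StubModel` stands in for the trained XGBoost model so the example runs on its own:

```python
import os
import tempfile

import numpy as np
import pandas as pd

def weekly_scoring(model, features: pd.DataFrame, out_path: str) -> pd.DataFrame:
    scored = features.copy()
    scored["churn_prob"] = model.predict_proba(features)[:, 1]
    scored["risk_tier"] = pd.cut(scored["churn_prob"], [0, 0.4, 0.7, 1.0],
                                 labels=["Low", "Medium", "High"],
                                 include_lowest=True)
    # highest-risk customers first: the marketing team's Monday action list
    scored = scored.sort_values("churn_prob", ascending=False)
    scored.to_csv(out_path, index_label="customer_id")
    return scored

class _StubModel:
    def predict_proba(self, X):
        p = X["risk_score"].to_numpy()
        return np.column_stack([1 - p, p])

features = pd.DataFrame({"risk_score": [0.9, 0.2]}, index=["a", "b"])
out_path = os.path.join(tempfile.gettempdir(), "weekly_scores.csv")
scored = weekly_scoring(_StubModel(), features, out_path)
```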
Results
After 3 months of model-driven retention:
- Monthly churn rate: 6.0% → 4.2% (30% reduction)
- Revenue saved: ~$340K in annual recurring revenue retained
- Campaign efficiency: 3.5x higher retention rate among model-targeted customers vs. untargeted campaigns
- False positive rate: Only 32% of “at-risk” customers were false alarms — acceptable for the campaign cost
What Didn’t Work
Predicting churn timing. We tried a survival analysis model (Cox proportional hazards) to predict when a customer would churn, not just whether. The predictions were too noisy to be actionable — “this customer will churn sometime in the next 2-8 weeks” isn’t helpful for campaign timing. The binary 30-day model was simpler and more useful.
Deep learning on tabular data. The neural network added complexity without adding accuracy. For structured, feature-engineered data, XGBoost remains king.
The Stack
| Component | Tool | Why |
|---|---|---|
| Feature Engineering | Python, Pandas | Flexible, fast iteration on 60+ features |
| Model Training | XGBoost, scikit-learn | Best accuracy on tabular data, fast training |
| Model Interpretation | SHAP | Per-customer churn explanations |
| Dashboard | Tableau | Marketing team’s preferred BI tool |
| Batch Pipeline | Python + cron | Weekly model scoring, CSV export to Tableau |
Simba Hu helps companies make better decisions with data and AI — from strategy to implementation. Based in Tokyo, serving clients globally. Book a strategy call or visit simbahu.com.