Amogh-2005/Customer_churn_project


📉 Customer Churn Prediction App

Overview

This application predicts the likelihood of customer churn (a customer leaving the service) from contract details, service usage, and support features. It also groups customers into churn risk bands to support business decision-making.

Features

  • Interactive Interface: User-friendly dashboard built with Streamlit.
  • Data Input: Easy-to-use sliders and dropdowns for inputting customer data (Tenure, Contract, Monthly Charges, etc.).
  • Real-time Prediction: Instantly calculates the probability of churn.
  • Risk Categorization: Classifies customers into Low, Medium, or High risk levels.
  • Actionable Recommendations: Suggests targeted retention strategies based on the risk profile.
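The risk categorization and recommendation features can be sketched as a simple mapping from churn probability to a band. This is a minimal illustration, not the app's actual logic: the cutoffs `LOW_MAX` and `HIGH_MIN`, the function name `risk_band`, and the recommendation texts are all assumptions, since the README does not state them.

```python
# Hypothetical band boundaries; the real app may use different cutoffs.
LOW_MAX = 0.30
HIGH_MIN = 0.60

# Illustrative retention suggestions per band (not taken from the project).
RECOMMENDATIONS = {
    "Low": "No action needed; keep monitoring.",
    "Medium": "Offer a loyalty discount or a service check-in.",
    "High": "Contact proactively with a targeted retention offer.",
}


def risk_band(probability: float) -> str:
    """Map a churn probability in [0, 1] to a Low/Medium/High band."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if probability < LOW_MAX:
        return "Low"
    if probability < HIGH_MIN:
        return "Medium"
    return "High"


band = risk_band(0.45)
print(band, "-", RECOMMENDATIONS[band])
```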

Dataset

  • Source: Kaggle – Telco Customer Churn Dataset
  • Rows: ~7,000 customers
  • Target variable: Churn

Features include:

  • Contract type
  • Tenure
  • Monthly charges
  • Internet service
  • Online security
  • Tech support
  • Payment method
  • Billing preferences
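Several of the features listed above are categorical, so they usually need encoding before a scikit-learn model can use them. The sketch below shows one common approach, one-hot encoding with pandas, on three made-up customers. The column names are hypothetical stand-ins, not the dataset's real headers, and the project's notebook may encode features differently.

```python
import pandas as pd

# Three made-up customers; column names are hypothetical stand-ins
# for the dataset's real headers.
customers = pd.DataFrame(
    {
        "contract_type": ["Month-to-month", "One year", "Two year"],
        "tenure": [2, 14, 60],
        "monthly_charges": [70.5, 55.0, 20.25],
        "tech_support": ["No", "Yes", "No"],
    }
)

# One-hot encode the categorical columns; numeric columns pass through unchanged.
encoded = pd.get_dummies(customers, columns=["contract_type", "tech_support"])

print(list(encoded.columns))
```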

Machine Learning Models Used

The following models were trained and compared:

  • Logistic Regression
  • Decision Tree Classifier
  • Random Forest Classifier (selected)

Why Random Forest?

  • Handles both numerical and categorical features well.
  • Captures non-linear relationships.
  • Provided the best recall for churn customers.
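A comparison of the three models by churn recall could look roughly like the following. This is a sketch on synthetic data generated with `make_classification`, not the project's notebook: the sample size, class ratio, and hyperparameters are assumptions, so the printed scores say nothing about the real dataset.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import recall_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic, imbalanced stand-in for the churn data (class ratio is an assumption).
X, y = make_classification(
    n_samples=2000, n_features=10, weights=[0.73, 0.27], random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# Fit each model and record recall on the positive (churn) class.
recalls = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    recalls[name] = recall_score(y_test, model.predict(X_test))

for name, score in recalls.items():
    print(f"{name}: recall = {score:.2f}")
```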

Model Evaluation

Since churn prediction is a business-critical problem, the focus was on:

  • Recall (Churn = Yes): the share of actual churners that the model flags.
    • Missing a churner is more costly than contacting a non-churner.
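To make the metric concrete, recall is TP / (TP + FN): true positives divided by all actual churners. The sketch below computes it from made-up labels; the helper names `confusion_counts` and `recall` are illustrative, and the data is not from the project.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels where 1 means churn."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def recall(tp, fn):
    """Share of actual churners that were correctly flagged."""
    if tp + fn == 0:
        raise ValueError("no positive examples, recall is undefined")
    return tp / (tp + fn)


# Made-up labels: 4 actual churners, of which the model catches 3.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
print(f"recall = {recall(tp, fn):.2f}")  # prints "recall = 0.75"
```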

Final Random Forest performance (after tuning):

  • Recall (Churn): ~66%
  • Balanced trade-off between false positives and false negatives.

Custom Decision Threshold

  • The default classification threshold (0.5) was not suitable because the classes are imbalanced.
  • A custom threshold of 0.37 was chosen to:
    • Improve recall.
    • Reduce false negatives.
    • Align with business priorities.
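The effect of lowering the threshold can be shown with a few made-up probabilities. The function name `classify` and the example values are assumptions for illustration; only the thresholds 0.5 and 0.37 come from the text above.

```python
def classify(probabilities, threshold=0.5):
    """Label a customer as churn (1) when the probability meets the threshold."""
    return [1 if p >= threshold else 0 for p in probabilities]


# Made-up churn probabilities and true outcomes (1 = churned).
probs = [0.10, 0.36, 0.40, 0.45, 0.55, 0.80]
actual = [0, 0, 1, 1, 1, 1]

for threshold in (0.5, 0.37):
    preds = classify(probs, threshold)
    # False negatives: actual churners that were predicted as staying.
    missed = sum(1 for a, p in zip(actual, preds) if a == 1 and p == 0)
    print(f"threshold={threshold}: predictions={preds}, missed churners={missed}")
```

With a fitted scikit-learn classifier, the probabilities would typically come from `predict_proba(X)[:, 1]` rather than a hard-coded list. A lower threshold trades extra false positives for fewer missed churners, which matches the recall-first priority described above.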

Tech Stack

  • Python: Core programming language for the project.
  • Pandas, NumPy: Used for data manipulation, cleaning, and numerical computations.
  • Matplotlib, Seaborn: Libraries for data visualization and exploratory analysis.
  • Scikit-learn: Used for building, training, and evaluating the machine learning model.
  • Streamlit: Framework for building the interactive web dashboard.
  • Jupyter Notebook: Environment for data exploration, analysis, and model prototyping.

Installation & Setup

  1. Prerequisites: Ensure you have Python installed.

  2. Set up the environment: It is recommended to use a virtual environment.

    # Create virtual environment
    python -m venv venv
    
    # Activate virtual environment
    # On Windows:
    .\venv\Scripts\activate
    # On macOS/Linux:
    source venv/bin/activate
  3. Install dependencies:

    pip install -r requirements.txt

Usage

  1. Run the application:

    streamlit run app.py
  2. Interact with the app:

    • Enter customer information such as tenure, monthly charges, and services.
    • Click the "Predict Churn Risk" button.
    • View the calculated churn probability and recommended actions.

Project Structure

churn_project/
│
├── data/
│   └── Telco-Customer-Churn.csv
│
├── notebooks/
│   └── analysis.ipynb
│   
├── model/
│   └── churn_model.pkl
│
├── app.py
│
├── requirements.txt
│
└── README.md

About

End-to-end customer churn prediction using machine learning, featuring EDA, model comparison, threshold tuning, risk band classification, and Streamlit deployment.
