ibtesaamaslam/Fraud-Detection-Model

💳 Fraud Detection AI: End-to-End ML System with SHAP + Hugging Face Explainability


A production-style, end-to-end machine learning pipeline for detecting fraudulent credit card transactions, featuring a 4-model benchmark, SMOTE class balancing, SHAP per-prediction explainability, DistilGPT-2 plain-English summaries, and an interactive Streamlit dashboard.

🔗 View Repository · 🐛 Report Bug · ✨ Request Feature



📌 Project Overview

Credit card fraud costs the global financial system billions of dollars annually. Fraudulent transactions are rare, typically less than 0.2% of all activity, which makes detection extremely difficult using standard machine learning approaches. A naive model that simply predicts "legitimate" for every transaction would achieve over 99% accuracy while catching zero fraud cases.

This project builds a complete, production-style AI pipeline that:

  • Ingests raw transaction data from the Kaggle Credit Card Fraud dataset
  • Engineers meaningful features from raw inputs
  • Handles the severe class imbalance using SMOTE (Synthetic Minority Oversampling Technique)
  • Trains and benchmarks four machine learning models side by side
  • Evaluates them using metrics designed for imbalanced classification
  • Explains every prediction using SHAP (SHapley Additive exPlanations)
  • Converts technical SHAP output into plain English using a Hugging Face DistilGPT-2 language model
  • Presents everything through an interactive Streamlit dashboard with three input modes
  • Auto-generates a full Markdown training report after every run

✨ Key Features

| Feature | Description |
|---|---|
| Multi-model training | Logistic Regression · Random Forest · XGBoost · LightGBM |
| Class imbalance handling | SMOTE resampling before training (configurable ratio) |
| Rigorous evaluation | ROC-AUC · PR-AUC · Precision · Recall · F1 |
| SHAP explainability | Per-prediction feature contribution breakdown |
| Human-readable explanations | DistilGPT-2 narrates the SHAP output in plain English |
| Interactive UI | Streamlit dashboard with Manual · Random · CSV input modes |
| Reproducible pipeline | Seeded RNGs · saved scaler · saved feature name order |
| Centralized config | All paths and hyperparameters in one config.py |
| Automated reporting | Markdown report + metrics CSV generated after every run |

🔄 8-Stage Pipeline

Run the entire pipeline with a single command: python main.py

Raw CSV
│
▼
Stage 1: Load Data
  Reads data/raw/transactions.csv · validates shape · logs fraud rate
│
▼
Stage 2: EDA
  Class balance chart · amount histogram · time histogram · correlation heatmap
  → All charts saved to reports/figures/
│
▼
Stage 3: Preprocessing
  Feature engineering (Amount_Log, Hour, Is_Large_Amount)
  → train/test split (80/20, stratified) → StandardScaler
  → saves scaler.pkl + feature_names.json to data/processed/
│
▼
Stage 4: SMOTE Resampling
  Resamples fraud class to SMOTE_RATIO (default 0.2) of majority
  → prevents dominant legitimate class from biasing all models
│
▼
Stage 5: Model Training
  Trains all 4 models on SMOTE-resampled training data
  → saves each model to models/
│
▼
Stage 6: Evaluation
  Scores all models: ROC-AUC · PR-AUC · Precision · Recall · F1
  → ranks by ROC-AUC · saves best model to models/fraud_model.pkl
  → saves confusion matrix, ROC, PR, feature importance charts per model
│
▼
Stage 7: Explainability
  SHAP values computed for best model
  → beeswarm summary plot saved → DistilGPT-2 plain-English narration
│
▼
Stage 8: Report
  Writes reports/report.md with full results table
  Writes reports/metrics.csv for downstream analysis
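
Stage 3 is the step where data leakage is easiest to introduce, so a rough sketch may help. This is an illustration under assumptions, not the repository's preprocess.py: the function name `split_and_scale` and its arguments are hypothetical, while the 80/20 stratified split and the two saved file names follow the diagram above.

```python
import json
from pathlib import Path

import joblib
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler


def split_and_scale(df, target="Class", test_size=0.2, seed=42, out_dir=None):
    """Stratified split, then scale with statistics from the training split only."""
    X = df.drop(columns=[target])
    y = df[target]
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, stratify=y, random_state=seed
    )
    # Fit on the training split only so test statistics never leak into training.
    scaler = StandardScaler().fit(X_train)
    cols = list(X.columns)
    X_train = pd.DataFrame(scaler.transform(X_train), columns=cols, index=X_train.index)
    X_test = pd.DataFrame(scaler.transform(X_test), columns=cols, index=X_test.index)
    if out_dir is not None:
        out = Path(out_dir)
        out.mkdir(parents=True, exist_ok=True)
        joblib.dump(scaler, out / "scaler.pkl")  # reused at inference time
        (out / "feature_names.json").write_text(json.dumps(cols))  # column order
    return X_train, X_test, y_train, y_test, scaler
```

Persisting the column order next to the scaler lets inference code rebuild the feature matrix in the same layout the model was trained on.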

📂 Project Structure

fraud-detection-ai/
│
├── app/
│   └── streamlit_app.py            # Interactive Streamlit dashboard
│
├── data/
│   ├── raw/
│   │   └── transactions.csv        # ← Place Kaggle dataset here
│   ├── processed/
│   │   ├── X_train.csv             # Scaled training features
│   │   ├── X_test.csv              # Scaled test features
│   │   ├── y_train.csv             # Training labels
│   │   ├── y_test.csv              # Test labels
│   │   ├── scaler.pkl              # Fitted StandardScaler
│   │   └── feature_names.json      # Ordered feature column names
│   └── external/
│       └── huggingface_cache/      # Cached HF model weights
│
├── models/
│   ├── fraud_model.pkl             # Best model (selected by ROC-AUC)
│   ├── random_forest.pkl
│   ├── xgboost_model.pkl
│   └── lightgbm_model.pkl
│
├── notebooks/                      # Jupyter notebooks for exploration
│
├── reports/
│   ├── figures/                    # All generated charts
│   ├── report.md                   # Auto-generated training report
│   └── metrics.csv                 # Model comparison table
│
├── src/
│   ├── data/
│   │   ├── load_data.py            # Loads raw CSV from disk
│   │   ├── preprocess.py           # Cleaning, splitting, and scaling
│   │   └── feature_engineering.py  # Derives engineered features from raw columns
│   │
│   ├── models/
│   │   ├── train_model.py          # Trains all candidate models
│   │   ├── evaluate_model.py       # Scores and ranks all models
│   │   ├── predict.py              # Single-transaction prediction pipeline
│   │   └── huggingface_model.py    # Hugging Face plain-English explanation
│   │
│   ├── explainability/
│   │   └── shap_explainer.py       # SHAP values and summary plots
│   │
│   ├── visualization/
│   │   ├── eda.py                  # Exploratory data analysis charts
│   │   └── plots.py                # Confusion matrix, ROC, PR, importance
│   │
│   └── utils/
│       ├── config.py               # All paths and hyperparameters
│       └── helpers.py              # Shared utility functions
│
├── main.py                         # Pipeline entry point
├── requirements.txt                # Python dependencies
├── project_structure.md            # Extended structure documentation
├── workflow.md                     # Pipeline workflow documentation
└── README.md

📦 Dataset

This project uses the Credit Card Fraud Detection dataset published by the Machine Learning Group at Université Libre de Bruxelles (ULB).

Download: https://www.kaggle.com/datasets/mlg-ulb/creditcardfraud

After downloading, rename creditcard.csv to transactions.csv and place it at: data/raw/transactions.csv

| Property | Value |
|---|---|
| Total Transactions | 284,807 |
| Fraudulent Transactions | 492 (0.17%) |
| Legitimate Transactions | 284,315 (99.83%) |
| Raw Features | 30 (Time · V1–V28 PCA · Amount) |
| Target Column | Class (0 = legitimate · 1 = fraud) |

Note: The V1–V28 columns are PCA-transformed by the dataset authors to protect cardholder privacy. Original feature names are not available.
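
A loader that checks the columns and reports the fraud rate could look roughly like this. It is a sketch, not the repository's load_data.py: `load_transactions` and `EXPECTED_COLUMNS` are hypothetical names, and the column list is taken from the table above.

```python
import pandas as pd

# Column layout from the dataset table: Time, V1..V28, Amount, plus the Class target.
EXPECTED_COLUMNS = ["Time"] + [f"V{i}" for i in range(1, 29)] + ["Amount", "Class"]


def load_transactions(path):
    """Read the raw CSV, verify the expected columns, and return (df, fraud_rate)."""
    df = pd.read_csv(path)
    missing = [c for c in EXPECTED_COLUMNS if c not in df.columns]
    if missing:
        raise ValueError(f"Missing columns: {missing}")
    fraud_rate = float(df["Class"].mean())  # Class is 0/1, so the mean is the rate
    return df, fraud_rate


# Sanity check of the published counts: 492 frauds out of 284,807 rows.
print(f"{492 / 284_807:.2%}")
```

On the full Kaggle file the returned rate should match the 0.17% shown in the table.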


🔧 Feature Engineering

Three engineered features are derived from the raw dataset in src/data/feature_engineering.py:

| Feature | Source | Description |
|---|---|---|
| Amount_Log | Amount | Log-transformed transaction amount; compresses the wide, right-skewed distribution |
| Hour | Time | Hour of day (0–23) derived from the Time column, which records seconds elapsed since the first transaction in the dataset |
| Is_Large_Amount | Amount | Binary flag: 1 if Amount > $200 (configurable via LARGE_AMOUNT_THRESHOLD) |
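
The three derived columns can be sketched in a few lines of pandas. This is an approximation of what feature_engineering.py might do, not its actual code: the function name `add_features` is hypothetical, and `log1p` is an assumed choice of log transform (it keeps zero-amount transactions finite). In the Kaggle dataset, Time counts seconds from the first transaction, so the hour is obtained by integer division and wrapping at 24.

```python
import numpy as np
import pandas as pd

LARGE_AMOUNT_THRESHOLD = 200.0  # default listed in the configuration section


def add_features(df, large_amount_threshold=LARGE_AMOUNT_THRESHOLD):
    """Return a copy of df with Amount_Log, Hour and Is_Large_Amount added."""
    out = df.copy()
    # log1p(x) = log(1 + x); assumed variant, avoids -inf for zero amounts.
    out["Amount_Log"] = np.log1p(out["Amount"])
    # Seconds since the first transaction -> hour of day in 0..23.
    out["Hour"] = ((out["Time"] // 3600) % 24).astype(int)
    # Strictly greater than the threshold, as described in the table.
    out["Is_Large_Amount"] = (out["Amount"] > large_amount_threshold).astype(int)
    return out


sample = pd.DataFrame({"Time": [0, 90_000, 86_399], "Amount": [0.0, 200.0, 250.5]})
print(add_features(sample)[["Amount_Log", "Hour", "Is_Large_Amount"]])
```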

🤖 Models Trained

All models are trained on SMOTE-resampled data where the fraud class is resampled to 20% of the majority class size (SMOTE_RATIO = 0.2).
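
To make the 0.2 ratio concrete, here is a simplified NumPy illustration of the interpolation idea behind SMOTE. It is not imbalanced-learn's implementation and not the repository's code; the function name `smote_sketch` is hypothetical. In practice the same effect would more likely come from something like `SMOTE(sampling_strategy=0.2, random_state=42)` in imbalanced-learn, which is an assumption about how the project calls it.

```python
import numpy as np


def smote_sketch(X, y, ratio=0.2, k=5, seed=42):
    """Oversample class 1 until n_minority is about ratio * n_majority."""
    rng = np.random.default_rng(seed)
    X_min = X[y == 1]
    n_majority = int((y == 0).sum())
    n_new = int(ratio * n_majority) - len(X_min)
    if n_new <= 0 or len(X_min) < 2:
        return X, y  # already at the target ratio, or too few points to interpolate
    # Brute-force squared distances among minority points (fine for a few hundred rows).
    dist = ((X_min[:, None, :] - X_min[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(dist, np.inf)  # a point is not its own neighbour
    k = min(k, len(X_min) - 1)
    neighbours = np.argsort(dist, axis=1)[:, :k]
    # Pick a random minority point and one of its k nearest minority neighbours.
    base = rng.integers(0, len(X_min), size=n_new)
    picked = neighbours[base, rng.integers(0, k, size=n_new)]
    # Each synthetic row lies on the segment between the two chosen points.
    gap = rng.random((n_new, 1))
    synthetic = X_min[base] + gap * (X_min[picked] - X_min[base])
    X_out = np.vstack([X, synthetic])
    y_out = np.concatenate([y, np.ones(n_new, dtype=y.dtype)])
    return X_out, y_out
```

With 1,000 legitimate rows and 10 fraud rows, a ratio of 0.2 would add 190 synthetic fraud rows to reach 200. Resampling belongs on the training split only; the test split should keep its original class balance.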

| Model | Key Settings | Class Imbalance Strategy |
|---|---|---|
| Logistic Regression | max_iter=2000 | class_weight=balanced |
| Random Forest | n_estimators=200 | class_weight=balanced_subsample |
| XGBoost | n_estimators=200 · learning_rate=0.05 · max_depth=5 | SMOTE pre-training |
| LightGBM | n_estimators=200 · learning_rate=0.05 | SMOTE pre-training |
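
The settings in the table map onto estimator constructors roughly as follows. This is a sketch rather than the contents of train_model.py: `build_models` and the dictionary keys are hypothetical names, and passing the seed as `random_state` is an assumption based on the "seeded RNGs" feature. XGBoost and LightGBM are imported lazily so the snippet still runs where they are not installed.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression

RANDOM_STATE = 42  # default listed in the configuration section


def build_models(seed=RANDOM_STATE):
    """Return untrained estimators configured with the settings from the table."""
    models = {
        "logistic_regression": LogisticRegression(
            max_iter=2000, class_weight="balanced", random_state=seed
        ),
        "random_forest": RandomForestClassifier(
            n_estimators=200, class_weight="balanced_subsample", random_state=seed
        ),
    }
    try:
        from xgboost import XGBClassifier

        models["xgboost"] = XGBClassifier(
            n_estimators=200, learning_rate=0.05, max_depth=5, random_state=seed
        )
    except (ImportError, OSError):
        pass  # xgboost not available in this environment
    try:
        from lightgbm import LGBMClassifier

        models["lightgbm"] = LGBMClassifier(
            n_estimators=200, learning_rate=0.05, random_state=seed
        )
    except (ImportError, OSError):
        pass  # lightgbm not available in this environment
    return models
```

Each estimator exposes `fit` and `predict_proba`, so the training loop can treat all four uniformly.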

📊 Evaluation Metrics

Standard accuracy is deliberately excluded from model selection: a model predicting "legitimate" for every transaction would achieve 99.83% accuracy while catching zero fraud.

| Metric | Purpose | Used For |
|---|---|---|
| ROC-AUC | Ranking quality across all thresholds | ✅ Primary model selection |
| PR-AUC | Precision-recall trade-off on the fraud class; more informative than ROC-AUC under severe imbalance | ✅ Model selection |
| Precision | Of all flagged transactions, how many were actual fraud | ✅ Business impact |
| Recall | Of all actual fraud cases, how many were caught | ✅ Business impact |
| F1 Score | Harmonic mean of precision and recall | ✅ Balanced comparison |
| Accuracy | Reference only | ❌ Not used for selection |
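
A small scoring helper shows why accuracy is kept for reference only. The function name `score_model` is hypothetical, and PR-AUC is approximated here by scikit-learn's average precision, which may differ from how evaluate_model.py computes it. The toy labels are illustrative, not project results.

```python
import numpy as np
from sklearn.metrics import (
    accuracy_score,
    average_precision_score,
    f1_score,
    precision_score,
    recall_score,
    roc_auc_score,
)


def score_model(y_true, y_prob, threshold=0.5):
    """Compute threshold-free and thresholded metrics for one model."""
    y_true = np.asarray(y_true)
    y_prob = np.asarray(y_prob, dtype=float)
    y_pred = (y_prob >= threshold).astype(int)  # THRESHOLD semantics: >= means fraud
    return {
        "roc_auc": roc_auc_score(y_true, y_prob),
        "pr_auc": average_precision_score(y_true, y_prob),
        "precision": precision_score(y_true, y_pred, zero_division=0),
        "recall": recall_score(y_true, y_pred, zero_division=0),
        "f1": f1_score(y_true, y_pred, zero_division=0),
        "accuracy": accuracy_score(y_true, y_pred),
    }


y_true = [0, 0, 0, 1, 1]
model_scores = score_model(y_true, [0.1, 0.2, 0.6, 0.7, 0.9])
always_legit = score_model(y_true, [0.0] * 5)  # naive "never fraud" baseline
print(model_scores)
print(always_legit)
```

The baseline keeps a respectable accuracy on any imbalanced sample while its recall is zero and its ROC-AUC sits at chance level, which is the behaviour the ranking metrics are meant to expose.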

🔍 SHAP Explainability

Fraud detection systems must be explainable: regulators and end users need to understand why a specific transaction was flagged.

SHAP assigns each feature a contribution score for every individual prediction:

  • Positive score → pushes the model toward predicting fraud
  • Negative score → pushes the model toward predicting legitimate

Example SHAP output:

V14           (-2.341) → pushes toward safe      🟢
Amount_Log    (+1.823) → pushes toward fraud     🔴
Hour          (+0.912) → pushes toward fraud     🔴
V17           (-0.748) → pushes toward safe      🟢
V4            (+0.631) → pushes toward fraud     🔴

Generated SHAP outputs:

  • Per-prediction top feature contributions (shown in the Streamlit dashboard)
  • Global beeswarm summary plot saved to reports/figures/shap_summary.png
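
Ranking contributions by absolute impact and labelling their direction can be sketched without the shap library itself. The helper names `top_contributions` and `format_contribution` are hypothetical, and the numbers are the example values shown above. In the real pipeline the per-feature values would presumably come from a SHAP explainer such as `shap.TreeExplainer` applied to the best model, which is an assumption about shap_explainer.py.

```python
def top_contributions(features, shap_values, top_n=5):
    """Pair feature names with SHAP values and keep the largest by absolute value."""
    pairs = sorted(zip(features, shap_values), key=lambda p: abs(p[1]), reverse=True)
    return pairs[:top_n]


def format_contribution(name, value):
    """Render one contribution in the same style as the example output."""
    direction = "pushes toward fraud" if value > 0 else "pushes toward safe"
    return f"{name:<12} ({value:+.3f}) → {direction}"


# Example values copied from the sample output, deliberately unsorted.
example = {"Hour": 0.912, "V4": 0.631, "V14": -2.341, "V17": -0.748, "Amount_Log": 1.823}
for name, value in top_contributions(example.keys(), example.values()):
    print(format_contribution(name, value))
```

Sorting by absolute value rather than signed value matters: a strongly negative feature such as V14 should rank above weaker positive ones.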

🤖 Hugging Face Plain-English Explanations

The SHAP feature summary is passed as a structured prompt to DistilGPT-2, which generates a short, beginner-friendly explanation of the model's decision.

Example output:

"This transaction looks suspicious because the amount is unusually large for
this time of day, and several anonymised signals are elevated. The model
estimated a fraud risk of 0.91."

If the Hugging Face model is unavailable (no internet or download fails), a clean, readable static fallback explanation is returned automatically.
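
The prompt-then-fallback behaviour described above might be structured like this. All function names and the prompt wording are hypothetical, not taken from huggingface_model.py. The only library call assumed is the transformers `pipeline("text-generation", ...)` factory, whose result is called with a prompt and returns a list of dicts containing `generated_text`.

```python
def build_prompt(contributions, probability):
    """Turn (feature, shap_value) pairs into a short structured prompt."""
    lines = [f"- {name}: {value:+.3f}" for name, value in contributions]
    return (
        f"Fraud probability: {probability:.2f}\n"
        "Top feature contributions:\n" + "\n".join(lines) + "\n"
        "Plain-English explanation:"
    )


def fallback_explanation(contributions, probability):
    """Static explanation used when no language model is available."""
    up = [name for name, value in contributions if value > 0]
    down = [name for name, value in contributions if value < 0]
    parts = [f"The model estimated a fraud risk of {probability:.2f}."]
    if up:
        parts.append("Raising the risk: " + ", ".join(up) + ".")
    if down:
        parts.append("Lowering the risk: " + ", ".join(down) + ".")
    return " ".join(parts)


def load_generator(model_name="distilgpt2"):
    """Return a text-generation pipeline, or None if it cannot be loaded."""
    try:
        from transformers import pipeline

        return pipeline("text-generation", model=model_name)
    except Exception:  # missing package, no network, failed download
        return None


def explain(contributions, probability, generator=None):
    """Generate an explanation, falling back to the static text on any failure."""
    if generator is None:
        return fallback_explanation(contributions, probability)
    prompt = build_prompt(contributions, probability)
    try:
        text = generator(prompt, max_new_tokens=60)[0]["generated_text"]
    except Exception:
        return fallback_explanation(contributions, probability)
    # The pipeline echoes the prompt by default; keep only the continuation.
    completion = text[len(prompt):] if text.startswith(prompt) else text
    return completion.strip() or fallback_explanation(contributions, probability)
```

A caller would typically run `explain(pairs, prob, generator=load_generator())` once per prediction, caching the generator between calls.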


📊 Streamlit Dashboard

Once main.py has completed at least once, launch the interactive dashboard:

streamlit run app/streamlit_app.py

Three input modes:

| Mode | Description | Best For |
|---|---|---|
| Manual | Type raw transaction values into the form directly | Testing specific or hypothetical transactions |
| Random Sample | Picks a random row from the test dataset | Quick demo on real data |
| Upload CSV | Upload a single-row CSV file | Integration testing |

Every prediction shows:

  • Fraud probability score (0.00 – 1.00)
  • Risk level: Low / Medium / High
  • Pass ✅ or Fail 🚨 banner
  • DistilGPT-2 plain-English explanation
  • Top SHAP feature contributions ranked by absolute impact
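
How a probability becomes a risk level and a pass/fail banner could be sketched as below. The README does not state the Low / Medium / High boundaries, so the 0.3 and 0.7 cut-offs are purely hypothetical placeholders, as is the function name `summarize_prediction`; only the 0.5 decision threshold comes from the configuration defaults.

```python
THRESHOLD = 0.5          # decision threshold from the configuration defaults
MEDIUM_RISK_FROM = 0.3   # hypothetical cut-off, not taken from the project
HIGH_RISK_FROM = 0.7     # hypothetical cut-off, not taken from the project


def summarize_prediction(probability, threshold=THRESHOLD):
    """Map a fraud probability to the fields shown for each prediction."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be within [0, 1]")
    if probability >= HIGH_RISK_FROM:
        risk = "High"
    elif probability >= MEDIUM_RISK_FROM:
        risk = "Medium"
    else:
        risk = "Low"
    return {
        "probability": round(probability, 2),
        "risk_level": risk,
        "flagged": probability >= threshold,  # True -> fail banner, False -> pass
    }


print(summarize_prediction(0.91))
```

Keeping the risk bands separate from the decision threshold lets the banner logic change without touching how levels are displayed.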

📁 Generated Outputs

After running main.py, the following are created automatically:

| File | Description |
|---|---|
| data/processed/X_train.csv | Scaled training feature matrix |
| data/processed/X_test.csv | Scaled test feature matrix |
| data/processed/scaler.pkl | Fitted StandardScaler for inference |
| data/processed/feature_names.json | Ordered feature column list |
| models/fraud_model.pkl | Best model selected by ROC-AUC |
| models/random_forest.pkl | Saved Random Forest |
| models/xgboost_model.pkl | Saved XGBoost |
| models/lightgbm_model.pkl | Saved LightGBM |
| reports/figures/class_balance.png | Fraud vs legitimate count chart |
| reports/figures/correlation_heatmap.png | Feature correlation heatmap |
| reports/figures/*_confusion_matrix.png | Confusion matrix per model |
| reports/figures/*_roc_curve.png | ROC curve per model |
| reports/figures/*_pr_curve.png | Precision-Recall curve per model |
| reports/figures/*_feature_importance.png | Feature importance per model |
| reports/figures/shap_summary.png | SHAP beeswarm plot for best model |
| reports/report.md | Full Markdown training summary |
| reports/metrics.csv | Model comparison table (CSV) |

⚙️ Configuration

All paths and hyperparameters are centralised in src/utils/config.py. Nothing is hardcoded elsewhere.

| Setting | Default | Description |
|---|---|---|
| RANDOM_STATE | 42 | Seeds all RNGs for reproducible runs |
| TEST_SIZE | 0.2 | Fraction of data reserved for evaluation |
| THRESHOLD | 0.5 | Minimum probability to classify as fraud |
| SMOTE_RATIO | 0.2 | Target ratio of fraud to legitimate samples after resampling |
| LARGE_AMOUNT_THRESHOLD | 200.0 | USD threshold for Is_Large_Amount flag |
| HF_MODEL_NAME | distilgpt2 | Hugging Face model for explanation generation |
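
A centralised config module with these defaults might look like the sketch below. The six setting names and values come from the table; the path constant names (`PROJECT_ROOT`, `RAW_DATA_PATH`, and so on) and the use of the current working directory as the project root are assumptions, with the paths themselves taken from the project layout described in this README.

```python
from pathlib import Path

# Assumed: the pipeline is launched from the repository root.
PROJECT_ROOT = Path.cwd()

# Paths (directory layout as documented in this README; constant names are hypothetical).
RAW_DATA_PATH = PROJECT_ROOT / "data" / "raw" / "transactions.csv"
PROCESSED_DIR = PROJECT_ROOT / "data" / "processed"
MODELS_DIR = PROJECT_ROOT / "models"
BEST_MODEL_PATH = MODELS_DIR / "fraud_model.pkl"
FIGURES_DIR = PROJECT_ROOT / "reports" / "figures"
REPORT_PATH = PROJECT_ROOT / "reports" / "report.md"

# Hyperparameters and settings (defaults from the table above).
RANDOM_STATE = 42
TEST_SIZE = 0.2
THRESHOLD = 0.5
SMOTE_RATIO = 0.2
LARGE_AMOUNT_THRESHOLD = 200.0
HF_MODEL_NAME = "distilgpt2"
```

Other modules would then import these constants instead of repeating literal values, which keeps a change such as a different SMOTE ratio to a single line.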

🚀 Installation

1. Clone the repository:

git clone https://github.com/ibtesaamaslam/Fraud-Detection-Model.git
cd Fraud-Detection-Model

2. Create a virtual environment (recommended):

python -m venv .venv
source .venv/bin/activate        # macOS / Linux
.venv\Scripts\activate           # Windows

3. Install dependencies:

pip install -r requirements.txt

4. Place the dataset:

Download creditcard.csv from Kaggle, rename it to transactions.csv, and place at:

data/raw/transactions.csv

▶️ Running the Project

Run the full 8-stage pipeline:

python main.py

This executes all stages in order and produces:

  • Processed data in data/processed/
  • Trained models in models/
  • EDA and evaluation charts in reports/figures/
  • Final report at reports/report.md

📊 Running the Dashboard

After main.py completes:

streamlit run app/streamlit_app.py

→ Opens at http://localhost:8501


📋 Requirements

pandas
numpy
scikit-learn
imbalanced-learn
xgboost
lightgbm
shap
transformers
joblib
matplotlib
seaborn
streamlit
tabulate

🗺️ Roadmap

  • FastAPI endpoint: expose /predict for banking system integration
  • LIME explainability: add alongside SHAP for comparison
  • Threshold optimisation: auto-tune decision threshold by maximising F1
  • Deep learning baseline: neural network benchmark vs tree models
  • Drift detection: flag when retraining is needed
  • Docker containerisation: Dockerfile for reproducible deployment
  • MLflow experiment tracking: log all runs and metrics
  • Hugging Face Spaces deployment: public demo

📜 License

MIT License. Copyright (c) 2024 Ibtesaam Aslam

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software to use, copy, modify, merge, publish, distribute, sublicense,
and/or sell copies, subject to the copyright notice appearing in all copies.
Provided "as is", without warranty of any kind.

Built with ❤️ by Ibtesaam Aslam

⭐ If this project helped you learn fraud detection or ML pipelines, please give it a star!

Explainable AI for financial fraud detection.
