Explainable Telecom Fraud Detection Platform using Machine Learning, Rule-Based Intelligence, SHAP, FastAPI, and Streamlit
TeleSentry AI is an end-to-end Telecom Fraud Detection Platform designed to identify suspicious calling behavior using a combination of:
- Rule-Based Fraud Intelligence
- Isolation Forest Anomaly Detection
- Random Forest Classification
- SHAP Explainability
- Interactive Streamlit Dashboard
- FastAPI Prediction Service
The system simulates realistic telecom users and fraudsters, engineers behavioral telecom features, detects suspicious activities, explains predictions, and exposes results through a dashboard and API.
Telecommunication fraud has become increasingly sophisticated.
Common fraud patterns include:
- Digital Arrest Scams
- Mass Calling Operations
- Long Distance Fraud Rings
- Social Engineering Networks
- Automated Calling Bots
Traditional rule-based systems often miss fraud patterns they were not explicitly written for, while pure machine learning systems often lack interpretability.
TeleSentry AI combines both approaches to deliver:
- High detection accuracy
- Transparent predictions
- Real-time fraud assessment
┌─────────────────────────────────────────┐
│ Synthetic Data Generator │
│ (Telecom User Simulation) │
└──────────────────┬──────────────────────┘
│
▼
┌─────────────────────────────────────────┐
│ Raw Synthetic Dataset │
│ generated_dataset.csv (13k+) │
└──────────────────┬──────────────────────┘
│
▼
┌─────────────────────────────────────────┐
│ Data Preprocessing Layer │
│ │
│ • Cleaning │
│ • Validation │
│ • Train/Test Split │
└──────────────────┬──────────────────────┘
│
▼
┌─────────────────────────────────────────┐
│ Feature Engineering Layer │
│ │
│ • call_intensity │
│ • distance_per_call │
│ • contact_circle_ratio │
│ • delivery_pattern │
│ • high_freq_long_distance │
└──────────────────┬──────────────────────┘
│
▼
┌─────────────────────────────────────────┐
│ Rule Engine Layer │
│ │
│ • Digital Arrest Detection │
│ • Mass Calling Detection │
│ • Long Distance Scam Detection │
│ • Traveler Detection │
│ • Business User Detection │
└──────────────────┬──────────────────────┘
│
▼
┌─────────────────────────┐
│ ML Layer │
│ │
│ Isolation Forest │
│ Random Forest │
└───────────┬─────────────┘
│
▼
┌─────────────────────────────────────────┐
│ Evaluation Layer │
│ │
│ Accuracy │
│ Precision │
│ Recall │
│ F1 Score │
│ ROC-AUC │
└──────────────────┬──────────────────────┘
│
▼
┌─────────────────────────────────────────┐
│ Explainability Layer │
│ │
│ SHAP Summary │
│ SHAP Waterfall │
│ Feature Importance │
└──────────────────┬──────────────────────┘
│
┌────────┴─────────┐
▼ ▼
┌────────────────┐ ┌──────────────────┐
│ Streamlit UI │ │ FastAPI API │
│ │ │ │
│ Dashboard │ │ /predict │
│ Analytics │ │ /health │
│ Live Predict │ │ Swagger Docs │
└────────────────┘ └──────────────────┘
Synthetic Data Generation
↓
Data Preprocessing
↓
Feature Engineering
↓
Rule Engine
↓
Machine Learning Layer
↓
Evaluation Layer
↓
SHAP Explainability
↓
Streamlit Dashboard + FastAPI
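As an illustration of the preprocessing step in this workflow (cleaning, validation, train/test split), here is a minimal sketch using pandas and scikit-learn. The column names are borrowed from the API request fields; the `is_fraud` label column, the validation rule, and the 80/20 split are assumptions for illustration, not the project's actual schema or settings.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Feature names follow the API request fields; LABEL is a hypothetical column name.
FEATURES = ["avg_duration", "call_frequency", "unique_contacts",
            "avg_distance", "circle_diversity"]
LABEL = "is_fraud"

def preprocess(df, test_size=0.2, seed=42):
    """Clean, validate, and split a raw telecom dataset."""
    # Cleaning: drop exact duplicates and rows with missing values.
    df = df.drop_duplicates().dropna(subset=FEATURES + [LABEL])
    # Validation (assumed rule): activity metrics cannot be negative.
    df = df[(df[FEATURES] >= 0).all(axis=1)]
    # Stratified split keeps the fraud ratio similar in both partitions.
    return train_test_split(df[FEATURES], df[LABEL], test_size=test_size,
                            stratify=df[LABEL], random_state=seed)

# Toy data: 100 rows, one with a missing value and one with a negative value.
rng = np.random.default_rng(0)
raw = pd.DataFrame(rng.uniform(1, 100, size=(100, 5)), columns=FEATURES)
raw[LABEL] = [0] * 80 + [1] * 20
raw.loc[0, "avg_duration"] = np.nan
raw.loc[1, "avg_distance"] = -5.0

X_train, X_test, y_train, y_test = preprocess(raw)
print(len(X_train), len(X_test))
```

Stratifying on the label matters for fraud data because the positive class is usually a minority, and an unstratified split can leave the test set with too few fraud cases to evaluate on.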
TeleSentry-AI/
│
├── api/
├── dashboard/
├── data/
├── notebooks/
├── reports/
├── saved_models/
├── src/
├── tests/
│
├── README.md
├── requirements.txt
├── requirements-lock.txt
├── LICENSE
├── VERSION
└── .env.example
Generates realistic telecom profiles:
- Delivery Partners
- Business Users
- Regular Subscribers
- Traveling Professionals
- Digital Arrest Bots
- Traditional Scammers
- Low Volume Fraudsters
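A profile-driven generator of this kind could look roughly like the sketch below. The per-profile parameters, units (calls per day, minutes, kilometres), and the chosen distributions are all hypothetical; only four of the profiles above are included to keep it short.

```python
import numpy as np

# Hypothetical parameters per profile:
# (mean calls per day, mean call duration in minutes, mean call distance in km, fraud label)
PROFILES = {
    "regular_subscriber": (8, 4.0, 30, 0),
    "delivery_partner": (60, 0.8, 10, 0),
    "business_user": (40, 6.0, 200, 0),
    "digital_arrest_bot": (150, 5.0, 600, 1),
}

def generate_users(n_per_profile, seed=0):
    """Return a list of simulated user records, n_per_profile for each profile."""
    rng = np.random.default_rng(seed)
    rows = []
    for name, (calls, duration, distance, label) in PROFILES.items():
        for _ in range(n_per_profile):
            rows.append({
                "user_type": name,
                # Call counts are non-negative integers, so Poisson is a natural choice.
                "call_frequency": int(rng.poisson(calls)),
                # Gamma with shape 2 and scale duration/2 has mean `duration`.
                "avg_duration": float(rng.gamma(shape=2.0, scale=duration / 2.0)),
                # Exponential with the given scale has mean `distance`.
                "avg_distance": float(rng.exponential(distance)),
                "is_fraud": label,
            })
    return rows

users = generate_users(50)
print(len(users), users[0]["user_type"])
```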
Generated telecom intelligence features:
| Feature | Description |
|---|---|
| call_intensity | Calling activity level |
| distance_per_call | Average call distance ratio |
| contact_circle_ratio | Contact diversity ratio |
| delivery_pattern | Delivery behavior pattern |
| high_freq_long_distance | Suspicious high-volume calling |
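The table gives descriptions but not formulas, so the sketch below shows one plausible way such features could be derived from the raw activity metrics. Every formula and threshold here is an assumption made for illustration; the project's actual definitions may differ.

```python
import pandas as pd

def engineer_features(df):
    """Add the five behavioral features using assumed (hypothetical) definitions."""
    out = df.copy()
    # Guard against division by zero for inactive users.
    calls = out["call_frequency"].clip(lower=1)
    circles = out["circle_diversity"].clip(lower=1)

    out["call_intensity"] = out["call_frequency"] * out["avg_duration"]
    out["distance_per_call"] = out["avg_distance"] / calls
    out["contact_circle_ratio"] = out["unique_contacts"] / circles
    # Assumed delivery signature: many short, local calls.
    out["delivery_pattern"] = (
        (out["call_frequency"] >= 50)
        & (out["avg_duration"] <= 2)
        & (out["avg_distance"] <= 50)
    ).astype(int)
    # Assumed suspicious signature: high volume combined with long distance.
    out["high_freq_long_distance"] = (
        (out["call_frequency"] >= 100) & (out["avg_distance"] >= 300)
    ).astype(int)
    return out

sample = pd.DataFrame([{
    "avg_duration": 5, "call_frequency": 150, "unique_contacts": 100,
    "avg_distance": 600, "circle_diversity": 8,
}])
features = engineer_features(sample)
print(features.iloc[0].to_dict())
```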
Fraud intelligence layer:
- Digital Arrest Detection
- Mass Calling Detection
- Long Distance Scam Detection
- Traveler Detection
- Business User Detection
- Delivery Pattern Detection
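A rule layer like this is often written as a list of named predicates. The sketch below assumes that the fraud rules raise flags while the traveler and business-user rules mark known benign profiles; the rule names, thresholds, and that split are hypothetical, not taken from the project's code.

```python
# Each rule is (name, kind, predicate). All thresholds are hypothetical.
RULES = [
    ("digital_arrest", "fraud",
     lambda u: u["avg_duration"] >= 30 and u["avg_distance"] >= 500),
    ("mass_calling", "fraud",
     lambda u: u["call_frequency"] >= 100 and u["unique_contacts"] >= 80),
    ("long_distance_scam", "fraud",
     lambda u: u["avg_distance"] >= 500 and u["circle_diversity"] >= 5),
    ("traveler", "benign",
     lambda u: u["circle_diversity"] >= 5 and u["call_frequency"] < 30),
    ("business_user", "benign",
     lambda u: u["call_frequency"] >= 30
     and u["unique_contacts"] <= 0.3 * u["call_frequency"]),
]

def evaluate_rules(user):
    """Return the names of triggered rules, grouped by kind."""
    result = {"fraud": [], "benign": []}
    for name, kind, predicate in RULES:
        if predicate(user):
            result[kind].append(name)
    return result

user = {
    "avg_duration": 5, "call_frequency": 150, "unique_contacts": 100,
    "avg_distance": 600, "circle_diversity": 8,
}
flags = evaluate_rules(user)
print(flags)  # {'fraud': ['mass_calling', 'long_distance_scam'], 'benign': []}
```

Keeping rules as data rather than nested conditionals makes each trigger easy to report individually, which is useful when a dashboard needs to show which rule fired.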
Isolation Forest purpose:
- Unsupervised anomaly detection
- Detection of unusual telecom behavior
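Unsupervised anomaly detection of this kind can be sketched with scikit-learn's `IsolationForest`. The data below is synthetic and the `contamination` value is an assumed setting, chosen only to make the example self-contained.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
# Toy columns: call_frequency, avg_duration, avg_distance (hypothetical scales).
normal = rng.normal(loc=[10, 4, 30], scale=[3, 1, 10], size=(500, 3))
unusual = rng.normal(loc=[150, 5, 600], scale=[40, 2, 200], size=(10, 3))
X = np.vstack([normal, unusual])

# contamination is the assumed share of anomalies in the data.
model = IsolationForest(n_estimators=100, contamination=0.02, random_state=0)
labels = model.fit_predict(X)     # -1 = anomaly, +1 = inlier
scores = model.score_samples(X)   # lower score = more anomalous

print("flagged:", int((labels == -1).sum()))
print("mean score, normal vs unusual:",
      scores[:500].mean(), scores[500:].mean())
```

No labels are used during fitting, which is why this layer can surface behavior that no rule or labeled example anticipated.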
Random Forest purpose:
- Supervised fraud classification
- Fraud probability estimation
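Supervised classification with probability output can be sketched with scikit-learn's `RandomForestClassifier`, where `predict_proba` supplies the fraud probability. The toy data and hyperparameters are assumptions for illustration and say nothing about the project's real training setup.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
# Toy columns: call_frequency, avg_duration, avg_distance (hypothetical scales).
legit = rng.normal(loc=[10, 4, 30], scale=[3, 1, 10], size=(400, 3))
fraud = rng.normal(loc=[150, 5, 600], scale=[40, 2, 200], size=(100, 3))
X = np.vstack([legit, fraud])
y = np.array([0] * 400 + [1] * 100)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)

# Column 1 of predict_proba is the probability of the positive (fraud) class.
fraud_probability = clf.predict_proba(X_test)[:, 1]
print("test accuracy:", clf.score(X_test, y_test))
print("first five fraud probabilities:", fraud_probability[:5])
```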
| Metric | Score |
|---|---|
| Accuracy | 98%+ |
| Precision | 97%+ |
| Recall | 98%+ |
| F1 Score | 98%+ |
| ROC-AUC | 99%+ |
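For readers who want to see how these five metrics are computed, here is a small worked example with scikit-learn. The labels and probabilities are made up and unrelated to the scores reported in the table; the 0.5 decision threshold is an assumption.

```python
from sklearn.metrics import (accuracy_score, f1_score, precision_score,
                             recall_score, roc_auc_score)

# Made-up ground truth and predicted fraud probabilities for eight users.
y_true = [0, 0, 0, 0, 1, 1, 1, 1]
y_prob = [0.1, 0.2, 0.6, 0.3, 0.9, 0.8, 0.4, 0.7]
# Assumed decision threshold of 0.5.
y_pred = [int(p >= 0.5) for p in y_prob]

metrics = {
    "accuracy": accuracy_score(y_true, y_pred),    # (TP + TN) / total = 6/8
    "precision": precision_score(y_true, y_pred),  # TP / (TP + FP) = 3/4
    "recall": recall_score(y_true, y_pred),        # TP / (TP + FN) = 3/4
    "f1": f1_score(y_true, y_pred),                # harmonic mean of the two
    # ROC-AUC uses probabilities, not thresholded labels: 15 of 16
    # (fraud, legit) pairs are ranked correctly.
    "roc_auc": roc_auc_score(y_true, y_prob),
}
print(metrics)
# accuracy 0.75, precision 0.75, recall 0.75, f1 0.75, roc_auc 0.9375
```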
TeleSentry AI uses SHAP (SHapley Additive Explanations).
Generated explanations include:
- SHAP Summary Plot
- SHAP Waterfall Plot
- Feature Importance Analysis
Top fraud indicators:
- avgCallDistance
- circleDiversity
- call_intensity
- avgDuration
- high_freq_long_distance
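To make clear what a SHAP value represents, the sketch below computes exact Shapley values by brute force for a tiny hand-written scoring function. It does not use the `shap` library and is not how the project generates its plots; the scoring function, feature order, and background data are all hypothetical. In practice the `shap` package uses far more efficient algorithms for tree models.

```python
import itertools
import math
import numpy as np

def toy_score(X):
    """Hypothetical risk score over columns: call_frequency, avg_distance, circle_diversity."""
    return (0.004 * X[:, 0] + 0.0005 * X[:, 1]
            + 0.00001 * X[:, 0] * X[:, 1] + 0.01 * X[:, 2])

def shapley_values(f, x, background):
    """Exact Shapley values of f at x, averaging absent features over background rows."""
    n = len(x)

    def value(subset):
        # Features in `subset` take x's values; the rest keep background values.
        data = background.copy()
        idx = list(subset)
        if idx:
            data[:, idx] = x[idx]
        return f(data).mean()

    phi = np.zeros(n)
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = (math.factorial(size) * math.factorial(n - size - 1)
                      / math.factorial(n))
            for subset in itertools.combinations(others, size):
                phi[i] += weight * (value(subset + (i,)) - value(subset))
    return phi

rng = np.random.default_rng(0)
background = rng.uniform([1, 1, 1], [50, 200, 5], size=(100, 3))
x = np.array([150.0, 600.0, 8.0])

phi = shapley_values(toy_score, x, background)
baseline = toy_score(background).mean()
print("contributions:", phi)
print("baseline + contributions:", baseline + phi.sum())
print("model output:", toy_score(x[None, :])[0])
```

The last two printed values agree: the contributions always sum to the gap between the prediction and the average prediction, which is the property a waterfall plot visualizes.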
Interactive Streamlit dashboard provides:
- Dataset statistics
- Fraud distribution
- User type analysis
- Operator analysis
- Accuracy
- Precision
- Recall
- F1 Score
- ROC Curve
- Confusion Matrix
Predict fraud risk using telecom activity metrics.
Visualize fraud intelligence triggers.
Interpret model decisions.
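Live prediction ultimately turns a model probability into the three fields seen in the API example response (`prediction`, `fraud_probability`, `risk_level`). A minimal sketch of that mapping follows. The risk bands, the 0.5 decision threshold, and the `LEGITIMATE` label are assumptions; only `FRAUD` and `CRITICAL` appear in this README.

```python
from dataclasses import asdict, dataclass

@dataclass
class Prediction:
    prediction: str
    fraud_probability: float
    risk_level: str

# Hypothetical cut-offs, checked from highest to lowest.
RISK_BANDS = [(0.9, "CRITICAL"), (0.7, "HIGH"), (0.4, "MEDIUM"), (0.0, "LOW")]

def to_response(probability, threshold=0.5):
    """Map a fraud probability to a label and a risk level."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    # First band whose floor the probability reaches.
    level = next(name for floor, name in RISK_BANDS if probability >= floor)
    label = "FRAUD" if probability >= threshold else "LEGITIMATE"
    return Prediction(label, round(probability, 2), level)

print(asdict(to_response(0.98)))
# {'prediction': 'FRAUD', 'fraud_probability': 0.98, 'risk_level': 'CRITICAL'}
```

In a FastAPI service, a function like this would typically be called inside the `/predict` handler after the model produces its probability.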
Endpoints:
- GET /
- GET /health
- POST /predict
Example Request:
{
  "avg_duration": 5,
  "call_frequency": 150,
  "unique_contacts": 100,
  "avg_distance": 600,
  "circle_diversity": 8
}
Example Response:
{
  "prediction": "FRAUD",
  "fraud_probability": 0.98,
  "risk_level": "CRITICAL"
}
Installation and usage:
git clone https://github.com/7vik2005/TeleSentry-AI.git
cd TeleSentry-AI
pip install -r requirements.txt
python -m src.data_generation.generator
python -m src.rule_engine.rules
python -m src.models.random_forest
python -m src.explainability.shap_explainer
python -m streamlit run dashboard/app.py
python -m uvicorn api.app:app --reload
Technology stack:
- Python
- Pandas
- NumPy
- Scikit-Learn
- SHAP
- FastAPI
- Streamlit
- Plotly
- Matplotlib
- Faker
Planned enhancements:
- XGBoost Integration
- Real Telecom Data Support
- Real-Time Streaming Detection
- Docker Deployment
- Cloud Deployment
- Automated Retraining Pipeline
- MLOps Integration
Machine Learning | Data Science | AI Engineering
This project is licensed under the MIT License.