An end-to-end hate speech detection system that combines classical machine learning and transformer-based NLP models, exposed through a real-time FastAPI inference API.
The project emphasizes production-ready ML workflows, model benchmarking, and deployment readiness, making it suitable for real-world moderation use cases.
Online platforms struggle to automatically detect hate speech and offensive language due to:
- Informal and noisy text (social media)
- Context-dependent language
- Class imbalance across hate, offensive, and neutral content
This project addresses the problem by:
- Benchmarking multiple NLP models
- Leveraging transformer-based contextual embeddings
- Deploying a real-time inference API
- Source: Twitter hate speech dataset
- Classes:
  - Hate speech
  - Offensive language
  - No hate or offensive language
- Preprocessing: Lowercasing, URL removal, stopword removal, stemming
- Dataset file: `data/twitter.csv`
```
hate-speech-detection/
│
├── data/              # Dataset files
├── notebooks/         # EDA and baseline experiments
├── src/               # Preprocessing, training, evaluation scripts
├── transformers/      # Transformer (DistilBERT) training notebook
├── api/               # FastAPI inference service
├── models/            # Saved ML models and vectorizers
├── README.md
└── requirements.txt
```
Implemented a reusable preprocessing pipeline:
- Lowercasing
- URL and punctuation removal
- Stopword removal
- Stemming
Used consistently across:
- Classical ML models
- Transformer models
- FastAPI inference API
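The cleaning steps can be sketched as a single function. This is a minimal stand-in: the project itself uses NLTK's stopword corpus and stemmer, so the inline stopword list and the crude suffix-stripping stemmer below are simplified placeholders, not the actual pipeline.

```python
import re
import string

# Small inline stopword list as a stand-in for NLTK's stopword corpus
# (the real pipeline uses nltk.corpus.stopwords).
STOPWORDS = {"a", "an", "the", "and", "or", "is", "are", "this",
             "to", "of", "in", "on", "for", "at", "all"}

def crude_stem(token: str) -> str:
    # Crude suffix stripping as a placeholder for NLTK's Porter stemmer.
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token

def preprocess(text: str) -> str:
    text = text.lower()                                # lowercasing
    text = re.sub(r"https?://\S+|www\.\S+", "", text)  # URL removal
    text = text.translate(                             # punctuation removal
        str.maketrans("", "", string.punctuation))
    tokens = [t for t in text.split() if t not in STOPWORDS]  # stopwords
    return " ".join(crude_stem(t) for t in tokens)     # stemming
```

Applying the same function at training time and inside the API keeps the feature space consistent between the two.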
Implemented and benchmarked multiple classical NLP models using TF-IDF features:
- Logistic Regression (baseline)
- Support Vector Machine (SVM)
- Random Forest
Best classical performance:
- Logistic Regression + TF-IDF
- Accuracy: ~89.5%
These models provide:
- Fast inference
- Low memory usage
- Suitability for real-time APIs
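A baseline like the TF-IDF + Logistic Regression model can be wired up in a few lines with scikit-learn. The toy texts and labels below are illustrative placeholders, not the actual Twitter dataset:

```python
# Minimal sketch of the TF-IDF + Logistic Regression baseline.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

# Toy stand-in data; the real model trains on the preprocessed tweets.
texts = [
    "i hate you and your kind",
    "you are such an idiot",
    "have a wonderful day everyone",
    "this sunset is beautiful",
]
labels = [
    "Hate speech",
    "Offensive language",
    "No hate or offensive language",
    "No hate or offensive language",
]

model = Pipeline([
    ("tfidf", TfidfVectorizer(ngram_range=(1, 2))),
    ("clf", LogisticRegression(max_iter=1000)),
])
model.fit(texts, labels)

print(model.predict(["what a beautiful day"])[0])
```

Bundling the vectorizer and classifier in one `Pipeline` is also what makes the saved model easy to reuse from the API: a single object handles both featurization and prediction.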
Implemented a transformer-based classifier using DistilBERT via HuggingFace.
- Model: `distilbert-base-uncased`, fine-tuned on the hate speech dataset
- Used the HuggingFace `Trainer` API
Performance:
- Accuracy: 91.5%
- Outperformed classical ML baselines
- Improved contextual understanding of offensive language
Transformer training is documented in `transformers/bert_training.ipynb`.
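The `Trainer` wiring looks roughly like the sketch below. The hyperparameters, output path, and the `train_ds`/`eval_ds` dataset objects are illustrative assumptions; the actual configuration lives in `transformers/bert_training.ipynb`.

```python
# Configuration-only sketch; assumes pre-tokenised train_ds / eval_ds exist.
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased",
    num_labels=3,  # hate / offensive / neither
)

args = TrainingArguments(
    output_dir="models/distilbert-hate-speech",  # illustrative path
    num_train_epochs=3,
    per_device_train_batch_size=16,
)

trainer = Trainer(model=model, args=args,
                  train_dataset=train_ds, eval_dataset=eval_ds)
trainer.train()
```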
| Model | Accuracy |
|---|---|
| TF-IDF + Logistic Regression | ~89.5% |
| TF-IDF + SVM | ~88% |
| TF-IDF + Random Forest | ~84% |
| DistilBERT (Transformer) | 91.5% |
A FastAPI service exposes the trained classical ML model for real-time moderation.
POST `/predict`

Example request:

```json
{ "text": "Let's unite and kill all the people protesting" }
```

Example response:

```json
{ "prediction": "Hate speech" }
```
Why serve the classical model:
- Faster inference
- Lower latency
- Suitable for production moderation pipelines

Transformer models are retained for offline analysis and benchmarking.
Technologies Used:
- Languages: Python
- ML & NLP: scikit-learn, NLTK, HuggingFace Transformers
- Deep Learning: PyTorch
- API: FastAPI, Uvicorn
- Deployment: Docker (optional), Render/Railway
- Version Control: Git, GitHub
```bash
pip install -r requirements.txt
uvicorn api.main:app --reload
```

Then open: http://127.0.0.1:8000/docs
- Built a complete NLP pipeline from data to deployment
- Benchmarked classical ML vs. transformer-based models
- Fine-tuned DistilBERT using HuggingFace
- Deployed a real-time hate speech detection API using FastAPI
- Deploy the transformer model for async batch inference
- Add confidence scores and thresholds
- Integrate model monitoring and logging
- Extend to multilingual hate speech detection