atrip0305/hate-speech-detection

Hate Speech Detection System

An end-to-end hate speech detection system that combines classical machine learning and transformer-based NLP models, exposed through a real-time FastAPI inference API.

The project focuses on production-oriented ML workflows, model benchmarking, and deployment readiness, with real-world content moderation as the target use case.


Problem Statement

Online platforms struggle to automatically detect hate speech and offensive language due to:

  • Informal and noisy text (social media)
  • Context-dependent language
  • Class imbalance across hate, offensive, and neutral content

This project addresses the problem by:

  • Benchmarking multiple NLP models
  • Leveraging transformer-based contextual embeddings
  • Deploying a real-time inference API

Dataset

  • Source: Twitter hate speech dataset
  • Classes:
    • Hate speech
    • Offensive language
    • No hate or offensive language
  • Preprocessing: Lowercasing, URL and punctuation removal, stopword removal, stemming

Dataset file: data/twitter.csv


Project Structure

hate-speech-detection/
│
├── data/            # Dataset files
├── notebooks/       # EDA and baseline experiments
├── src/             # Preprocessing, training, evaluation scripts
├── transformers/    # Transformer (DistilBERT) training notebook
├── api/             # FastAPI inference service
├── models/          # Saved ML models and vectorizers
├── README.md
└── requirements.txt

Model Pipeline

1. Text Preprocessing

Implemented a reusable preprocessing pipeline:

  • Lowercasing
  • URL and punctuation removal
  • Stopword removal
  • Stemming

Used consistently across:

  • Classical ML models
  • Transformer models
  • FastAPI inference API
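The four steps above can be sketched as a single function. This is a minimal illustration using only the standard library, not the project's actual code: the name `clean_text`, the tiny `STOPWORDS` set, and the pluggable `stem` argument are all assumptions made for the example.

```python
import re
import string
from typing import Callable, FrozenSet

# Tiny illustrative stopword set (an assumption); a fuller list such as
# NLTK's English stopwords would normally be used instead.
STOPWORDS: FrozenSet[str] = frozenset(
    {"a", "an", "the", "and", "or", "is", "are", "it", "this", "that",
     "to", "of", "in", "on", "for"}
)

URL_RE = re.compile(r"https?://\S+|www\.\S+")
PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def clean_text(
    text: str,
    stem: Callable[[str], str] = lambda token: token,
    stopwords: FrozenSet[str] = STOPWORDS,
) -> str:
    """Lowercase, strip URLs and punctuation, drop stopwords, then stem."""
    text = text.lower()                 # 1. lowercasing
    text = URL_RE.sub(" ", text)        # 2a. URL removal
    text = text.translate(PUNCT_TABLE)  # 2b. punctuation removal
    tokens = [tok for tok in text.split() if tok not in stopwords]  # 3. stopwords
    return " ".join(stem(tok) for tok in tokens)                    # 4. stemming


cleaned = clean_text("Check THIS out: https://example.com, it is GREAT!")
```

The default `stem` is the identity function so the sketch runs without extra packages. To get real stemming, a stemmer such as `nltk.stem.PorterStemmer().stem` could be passed as the `stem` argument.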

2. Classical Machine Learning Models

Implemented and benchmarked multiple classical NLP models using TF-IDF features:

  • Logistic Regression (baseline)
  • Support Vector Machine (SVM)
  • Random Forest

Best classical performance:

  • Logistic Regression + TF-IDF
  • Accuracy: ~89.5%

These models provide:

  • Fast inference
  • Low memory usage
  • Suitability for real-time APIs
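A TF-IDF + Logistic Regression baseline of this kind could look roughly like the scikit-learn pipeline below. The hyperparameters (`ngram_range`, `sublinear_tf`, `max_iter`, `class_weight`) and the toy training sentences are assumptions for illustration; they are not taken from the project and will not reproduce the reported accuracy.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline


def build_baseline() -> Pipeline:
    """TF-IDF features followed by a multiclass logistic regression."""
    return Pipeline(
        [
            # Unigrams and bigrams with sublinear TF scaling (assumed settings).
            ("tfidf", TfidfVectorizer(ngram_range=(1, 2), sublinear_tf=True)),
            # class_weight="balanced" is one common way to handle class imbalance.
            ("clf", LogisticRegression(max_iter=1000, class_weight="balanced")),
        ]
    )


# Synthetic placeholder data; the real data lives in data/twitter.csv.
texts = [
    "placeholder hateful example one",
    "placeholder hateful example two",
    "placeholder offensive example one",
    "placeholder offensive example two",
    "have a nice day everyone",
    "the weather is lovely today",
]
labels = [
    "Hate speech",
    "Hate speech",
    "Offensive language",
    "Offensive language",
    "No hate or offensive language",
    "No hate or offensive language",
]

model = build_baseline()
model.fit(texts, labels)
prediction = model.predict(["have a lovely day"])[0]
```

Keeping the vectorizer and classifier in one `Pipeline` means a single object can be saved and reused at inference time, which avoids mismatches between training-time and serving-time features.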

3. Transformer-Based Model (HuggingFace)

Implemented a transformer-based classifier using DistilBERT via HuggingFace.

  • Model: distilbert-base-uncased
  • Fine-tuned on the hate speech dataset
  • Used HuggingFace Trainer API

Performance:

  • Accuracy: 91.5%
  • Outperformed classical ML baselines
  • Improved contextual understanding of offensive language

Transformer training is documented in: transformers/bert_training.ipynb
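For readers who do not open the notebook, a fine-tuning setup with the HuggingFace Trainer API might be sketched as follows. This is not the notebook's code: the label-to-id mapping, `max_length`, epochs, batch size, learning rate, and `output_dir` are assumed values, and the helper name `build_trainer` is hypothetical. The sketch also assumes the `datasets` package is installed alongside `transformers`.

```python
MODEL_NAME = "distilbert-base-uncased"

# Assumed id order; check it against the label column of the actual dataset.
ID2LABEL = {
    0: "Hate speech",
    1: "Offensive language",
    2: "No hate or offensive language",
}
LABEL2ID = {name: idx for idx, name in ID2LABEL.items()}


def build_trainer(train_texts, train_labels, eval_texts, eval_labels,
                  output_dir="distilbert-hate-speech"):
    """Build a Trainer that fine-tunes DistilBERT for 3-class classification."""
    # Heavy imports are deferred so the module loads without them.
    from datasets import Dataset
    from transformers import (
        AutoModelForSequenceClassification,
        AutoTokenizer,
        Trainer,
        TrainingArguments,
    )

    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)

    def encode(batch):
        # Fixed-length padding lets the default data collator batch examples.
        return tokenizer(batch["text"], truncation=True,
                         padding="max_length", max_length=128)

    def to_dataset(texts, labels):
        ds = Dataset.from_dict({"text": list(texts), "label": list(labels)})
        return ds.map(encode, batched=True)

    model = AutoModelForSequenceClassification.from_pretrained(
        MODEL_NAME,
        num_labels=len(ID2LABEL),
        id2label=ID2LABEL,
        label2id=LABEL2ID,
    )
    args = TrainingArguments(
        output_dir=output_dir,
        num_train_epochs=3,              # assumed
        per_device_train_batch_size=16,  # assumed
        learning_rate=2e-5,              # assumed
    )
    return Trainer(
        model=model,
        args=args,
        train_dataset=to_dataset(train_texts, train_labels),
        eval_dataset=to_dataset(eval_texts, eval_labels),
    )
```

Calling `build_trainer(...).train()` would download the pretrained weights and start fine-tuning; labels are expected as integer ids matching `ID2LABEL`.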

Performance Comparison

Model                           Accuracy
TF-IDF + Logistic Regression    ~89.5%
TF-IDF + SVM                    ~88%
TF-IDF + Random Forest          ~84%
DistilBERT (Transformer)        91.5%

Real-Time Inference API (FastAPI)

A FastAPI service exposes the trained classical ML model for real-time moderation.

Endpoint

POST /predict

Example Request

{ "text": "Let's unite and kill all the people protesting" } Example Response json Copy code { "prediction": "Hate speech" }

Why Classical ML for the API?

  • Faster inference
  • Lower latency
  • Suitable for production moderation pipelines

Transformer models are retained for offline analysis and benchmarking.

Technologies Used

  • Languages: Python
  • ML & NLP: scikit-learn, NLTK, HuggingFace Transformers
  • Deep Learning: PyTorch
  • API: FastAPI, Uvicorn
  • Deployment: Docker (optional), Render/Railway
  • Version Control: Git, GitHub

How to Run Locally

Install dependencies

pip install -r requirements.txt

Run the API

uvicorn api.main:app --reload

Open: http://127.0.0.1:8000/docs

Key Takeaways

  • Built a complete NLP pipeline from data to deployment
  • Benchmarked classical ML vs transformer-based models
  • Fine-tuned DistilBERT using HuggingFace
  • Deployed a real-time hate speech detection API using FastAPI

Future Improvements

  • Deploy transformer model for async batch inference
  • Add confidence scores and thresholds
  • Integrate model monitoring and logging
  • Extend to multilingual hate speech detection

About

A machine learning–based hate speech detection system that classifies text into hateful and non-hateful categories using transformer-based NLP models. Built with an end-to-end pipeline including preprocessing, training, evaluation, and inference.
