This project performs sentiment analysis, classifying IMDB movie reviews as positive or negative using transfer learning with a pretrained DistilBERT model from the Hugging Face Transformers library.
- ⚙️ Pretrained DistilBERT backbone — a lightweight version of BERT for faster inference
- 🎯 Binary classification head for sentiment prediction
- 📉 Cross-Entropy Loss and AdamW optimizer with learning rate scheduling
- 🧠 Configurable fine-tuning — control how much of DistilBERT to train
- 🧱 Modular architecture — clean separation of model, training, and data logic
- 📊 Real-time training & validation loss visualization
- 🌐 Gradio-powered web interface for live inference demos
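In practice, configurable fine-tuning usually means choosing which parameter groups receive gradient updates. A minimal sketch of that pattern in PyTorch (the module below is a hypothetical stand-in for the DistilBERT backbone, not the project's actual `model.py`):

```python
import torch.nn as nn

# Hypothetical stand-in for the DistilBERT backbone; the real project
# loads distilbert-base-uncased via Hugging Face Transformers.
backbone = nn.Sequential(
    nn.Linear(768, 768),  # "lower" transformer block (stand-in)
    nn.Linear(768, 768),  # "upper" transformer block (stand-in)
)

# Freeze the whole backbone, then unfreeze only the last block,
# so gradient updates touch just the top of the network.
for param in backbone.parameters():
    param.requires_grad = False
for param in backbone[-1].parameters():
    param.requires_grad = True

trainable = sum(p.numel() for p in backbone.parameters() if p.requires_grad)
total = sum(p.numel() for p in backbone.parameters())
print(f"Trainable parameters: {trainable}/{total}")
```

Only the parameters with `requires_grad=True` are passed to the optimizer, which is what makes fine-tuning cheaper than full training.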
- PyTorch – model, training, and inference
- Transformers – pretrained DistilBERT model and tokenizer
- pandas, numpy – data handling
- torch.utils.data – Dataset and DataLoader utilities
- matplotlib – training/validation loss visualization
- Gradio – interactive web interface for real-time model demos
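As a sketch of how `torch.utils.data` fits into the stack above, the example below wraps reviews in a `Dataset` and batches them with a `DataLoader`. A toy whitespace tokenizer stands in for the real Hugging Face tokenizer, and all names are illustrative rather than taken from the project's `dataset.py`:

```python
import torch
from torch.utils.data import Dataset, DataLoader

class ReviewDataset(Dataset):
    """Toy dataset: maps review text to padded token-id tensors."""

    def __init__(self, texts, labels, vocab, max_len=8):
        self.texts, self.labels = texts, labels
        self.vocab, self.max_len = vocab, max_len

    def __len__(self):
        return len(self.texts)

    def __getitem__(self, idx):
        # Stand-in for the DistilBERT tokenizer: whitespace split + vocab lookup.
        ids = [self.vocab.get(tok, 0) for tok in self.texts[idx].lower().split()]
        ids = (ids + [0] * self.max_len)[: self.max_len]  # pad / truncate
        return torch.tensor(ids), torch.tensor(self.labels[idx])

vocab = {"great": 1, "movie": 2, "terrible": 3, "plot": 4}
ds = ReviewDataset(["Great movie", "Terrible plot"], [1, 0], vocab)
loader = DataLoader(ds, batch_size=2)
ids, labels = next(iter(loader))
print(ids.shape, labels.tolist())  # torch.Size([2, 8]) [1, 0]
```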
Below is a preview of the Gradio Interface used for real-time classification:
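As a hedged sketch of what sits behind such an interface, Gradio wraps a plain predict function. The model call below is a stub (real inference would tokenize the review and run the fine-tuned DistilBERT), and the `gr.Interface` wiring is left in comments so the snippet stays dependency-free:

```python
def predict(review: str) -> dict:
    # Stub scoring in place of the real DistilBERT forward pass.
    positive = 0.9 if "great" in review.lower() else 0.1
    return {"positive": positive, "negative": round(1.0 - positive, 3)}

# Gradio wiring (sketch; requires `import gradio as gr`):
# gr.Interface(fn=predict, inputs="text", outputs="label").launch()

print(predict("A great film"))  # {'positive': 0.9, 'negative': 0.1}
```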
- Python 3.13+
- Recommended editor: VS Code
- Clone the repository:

  ```bash
  git clone https://github.com/hurkanugur/BERT-Sentiment-Classifier.git
  ```

- Navigate to the `BERT-Sentiment-Classifier` directory:

  ```bash
  cd BERT-Sentiment-Classifier
  ```

- Install dependencies:

  ```bash
  pip install -r requirements.txt
  ```

Alternatively, set up the environment from within VS Code:

- Open View → Command Palette → Python: Create Environment
- Choose Venv and your Python version
- Select `requirements.txt` to install dependencies
- Click OK
```
assets/
└── app_screenshot.png                # Screenshot of the application
data/
└── huggingface_datasets              # Hugging Face datasets (IMDB)
model/
└── imdb_sentiment_classifier.pth     # Trained model
src/
├── config.py                         # Paths, hyperparameters, split ratios
├── dataset.py                        # Data loading & preprocessing
├── device_manager.py                 # Selects and manages compute device
├── train.py                          # Training pipeline
├── inference.py                      # Inference pipeline
├── model.py                          # Neural network definition
└── visualize.py                      # Training/validation plots
main/
├── main_train.py                     # Entry point for training
└── main_inference.py                 # Entry point for inference
```
```
requirements.txt                      # Python dependencies
```

```
Pretrained Transformer (distilbert-base-uncased)
                 ↓
      [CLS] Token Representation
                 ↓
 Classification Head (added automatically)
    → Linear Layer (hidden_size → 2)
    → Softmax (for binary classification)
```

Navigate to the project directory:
```bash
cd BERT-Sentiment-Classifier
```

Run the training script:

```bash
python -m main.main_train
```

or

```bash
python3 -m main.main_train
```

Navigate to the project directory:

```bash
cd BERT-Sentiment-Classifier
```

Run the app:

```bash
python -m main.main_inference
```

or

```bash
python3 -m main.main_inference
```
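The classification head in the architecture diagram above can be sketched in a few lines of PyTorch. The hidden size 768 matches `distilbert-base-uncased`; the backbone itself is omitted here, so a random vector stands in for the [CLS] representation:

```python
import torch
import torch.nn as nn

HIDDEN_SIZE = 768  # distilbert-base-uncased hidden size

# Classification head: one linear layer mapping the [CLS] vector to 2 logits.
head = nn.Linear(HIDDEN_SIZE, 2)

cls_vector = torch.randn(1, HIDDEN_SIZE)  # stand-in for the [CLS] output
logits = head(cls_vector)                 # shape: (1, 2)
probs = torch.softmax(logits, dim=-1)     # positive/negative probabilities

print(probs.shape)  # torch.Size([1, 2])
```

Note that during training, `nn.CrossEntropyLoss` consumes the raw logits directly (it applies log-softmax internally), so the explicit softmax is only needed when reporting probabilities at inference time.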