🎬 IMDB Sentiment Analysis Microservice

This repository contains a complete machine learning pipeline for Sentiment Analysis on the IMDB Dataset. The project covers everything from experimental research and model fine-tuning to deploying a production-ready API and an interactive user interface.

🚀 Overview

The goal of this project was to compare three popular Transformer architectures and understand their performance trade-offs:

  1. BERT (The standard baseline)
  2. DistilBERT (The lightweight, fast alternative)
  3. RoBERTa (The optimized, robust version)

🛠 Tech Stack

  • Modeling: Hugging Face Transformers, PyTorch
  • API: FastAPI, Uvicorn
  • Web UI: Gradio
  • Containerization: Docker
  • Analysis: Jupyter Notebooks, Pandas

📂 Project Structure

  • imdb_notebook.ipynb — The research core: Data preprocessing, Fine-tuning, and Evaluation.
  • main.py — FastAPI backend that serves the model predictions.
  • app.py — Gradio interface for real-time visual testing.
  • requirements.txt — Python dependencies.
  • Dockerfile — Configuration for containerized deployment.

⚙️ Installation & Usage

1. Local Setup

# Clone the repository
git clone https://github.com/morikonon/imdb_service.git
cd imdb_service

# Create and activate virtual environment
python -m venv venv
source venv/bin/activate  # macOS/Linux
# venv\Scripts\activate   # Windows

# Install dependencies
pip install -r requirements.txt

2. Run the API (FastAPI)

Launch the backend server. Once it is running, the interactive Swagger documentation is available at http://localhost:8000/docs.

uvicorn main:app --reload
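
Once the server is up, it can be queried programmatically. The sketch below shows one way to build a request body and parse a reply; the /predict route and the "text"/"label"/"score" field names are assumptions for illustration — the actual schema is defined in main.py (check the Swagger docs above).

```python
import json

def build_request(text: str) -> bytes:
    """Serialize a review into a JSON body (assumed field name: "text")."""
    return json.dumps({"text": text}).encode("utf-8")

def parse_response(body: bytes) -> tuple[str, float]:
    """Extract label and confidence from an assumed JSON reply shape."""
    data = json.loads(body)
    return data["label"], data["score"]

# Against a running server, something like:
#   req = urllib.request.Request(
#       "http://localhost:8000/predict",          # hypothetical route
#       data=build_request("Great movie!"),
#       headers={"Content-Type": "application/json"},
#   )
#   label, score = parse_response(urllib.request.urlopen(req).read())

# Offline demonstration with a sample reply:
sample = json.dumps({"label": "POSITIVE", "score": 0.98}).encode()
label, score = parse_response(sample)
print(label, score)  # POSITIVE 0.98
```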

3. Run the UI (Gradio)

Launch the web interface for a user-friendly way to test sentences.

python app.py

4. Docker Deployment

docker build -t imdb-service .
docker run -p 8000:8000 imdb-service
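
For reference, a minimal Dockerfile matching these commands might look like the following sketch — the base image, Python version, and CMD are assumptions, not the repository's actual Dockerfile:

```dockerfile
FROM python:3.10-slim

WORKDIR /app

# Install dependencies first so this layer is cached across code changes
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY . .

EXPOSE 8000
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]
```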

📈 Key Results

  • DistilBERT offered the best accuracy-to-latency trade-off, making it well suited for edge deployment.
  • RoBERTa achieved the highest overall accuracy after 3 epochs of fine-tuning.
  • All models were evaluated using Accuracy and F1-Score.
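
For clarity, the two evaluation metrics reduce to the following definitions (a plain-Python sketch for binary sentiment labels; the notebook itself likely uses a library implementation):

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

y_true = [1, 0, 1, 1, 0]
y_pred = [1, 0, 0, 1, 1]
print(accuracy(y_true, y_pred))  # 0.6
print(f1_score(y_true, y_pred))  # ≈ 0.6667
```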

📝 Roadmap

  • Fine-tune BERT, DistilBERT, and RoBERTa
  • Build REST API with FastAPI
  • Develop interactive Gradio UI
  • Implement ONNX Runtime for faster inference
  • Add Prometheus metrics for monitoring

Author: Mukhamedali

