A Full-Stack AI Web Application
This project is a movie recommendation engine that suggests films based on content similarity. It features a decoupled architecture with a FastAPI backend and a Streamlit frontend, both deployed in the cloud.
The system follows a microservices-style architecture with two independently deployed services:
- Backend (FastAPI): Handles the machine learning logic, similarity scores, and TMDB API integration. Hosted on Render.
- Frontend (Streamlit): Provides a responsive UI for users to search movies and view recommendations. Hosted on Streamlit Community Cloud.
The recommendation engine uses the following NLP pipeline to compare movies by their text content:
- Data Engineering:
- Cleaned a dataset of 45,000+ movies.
- Parsed nested fields with ast.literal_eval to extract genres from stringified dictionaries.
- Text Preprocessing (NLP):
- Tokenization & Regex: Removed noise and punctuation.
- Stopword Removal: Filtered non-informative English words.
- Lemmatization: Used WordNetLemmatizer to reduce words to their root form (e.g., "running" to "run").
- Vectorization:
- Implemented TF-IDF (Term Frequency-Inverse Document Frequency) with an ngram_range of (1,2). This captures both individual words and adjacent word pairs (bigrams).
- Similarity Engine:
- Used Cosine Similarity to measure how closely movie vectors align in a 50,000-dimensional space. Higher scores indicate more similar movies.
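The genre extraction step in Data Engineering can be illustrated with a minimal sketch. The helper name `extract_genres` and the sample string are hypothetical, and the `id`/`name` keys are an assumption about the shape of the dataset's genre column; the project's own code may differ.

```python
import ast


def extract_genres(raw):
    """Return genre names from a stringified list of dicts, or [] if unparsable.

    Hypothetical helper: assumes each genre entry is a dict with a 'name' key.
    """
    try:
        # literal_eval safely parses Python literals without executing code.
        parsed = ast.literal_eval(raw)
    except (ValueError, SyntaxError, TypeError):
        return []
    if not isinstance(parsed, list):
        return []
    return [g["name"] for g in parsed if isinstance(g, dict) and "name" in g]


# Made-up sample value in the assumed format.
sample = "[{'id': 16, 'name': 'Animation'}, {'id': 35, 'name': 'Comedy'}]"
genres = extract_genres(sample)
print(genres)
```

Returning an empty list for malformed rows keeps one bad record from stopping the whole cleaning pass.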
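The Vectorization and Similarity Engine steps can be sketched roughly as follows, assuming scikit-learn is installed. The titles, text, and the `recommend` function are toy stand-ins invented for illustration, not the project's data or API; only `ngram_range=(1, 2)` comes from the description above.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Toy stand-ins for preprocessed movie text (hypothetical data).
titles = ["Space Quest", "Star Voyage", "Baking Day"]
docs = [
    "astronaut crew explore distant planet space mission",
    "space mission crew travel distant star",
    "pastry chef bake cake small town bakery",
]

# Unigrams and bigrams, matching the ngram_range described above.
vectorizer = TfidfVectorizer(ngram_range=(1, 2))
tfidf_matrix = vectorizer.fit_transform(docs)

# Pairwise cosine similarity between all movie vectors.
similarity = cosine_similarity(tfidf_matrix)


def recommend(title, top_n=2):
    """Return the top_n titles most similar to `title`, excluding itself."""
    idx = titles.index(title)
    ranked = similarity[idx].argsort()[::-1]  # highest score first
    return [titles[i] for i in ranked if i != idx][:top_n]


print(recommend("Space Quest", top_n=1))
```

On a catalog of 45,000+ movies, the full pairwise matrix is large, so computing similarity for a single query row at a time may be a more practical choice.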
├── data/ # Raw Data
├── models/ # Serialized .pkl files (TF-IDF matrix, indices, model)
├── notebooks/ # recommendation.ipynb (Research & EDA)
├── src/
│ ├── app.py # Streamlit UI Source Code
│ └── main.py # FastAPI Backend Source Code
├── requirements.txt # Library dependencies
└── runtime.txt # Python version specification (3.11.9)
- Language: Python 3.11.9
- Machine Learning: Scikit-learn (TF-IDF Vectorizer & Cosine Similarity), Pandas, NumPy, NLTK, Joblib
- Web Frameworks: FastAPI (Backend), Streamlit (Frontend)
- APIs: TMDB (The Movie Database) for posters and metadata
- Deployment: Render (Backend), Streamlit Cloud (Frontend)
Step 1: Clone the Repository
git clone https://github.com/itsdakshjain/Movie-Recommendation-System
cd Movie-Recommendation-System
Step 2: Create a Virtual Environment
python -m venv venv
# On Windows:
venv\Scripts\activate
# On Mac/Linux:
source venv/bin/activate
Step 3: Install Dependencies
pip install -r requirements.txt
Step 4: Run the Application
# Run Backend
uvicorn src.main:app --reload
# Run Frontend
streamlit run src/app.py
This project is licensed under the MIT License, a permissive license that allows personal and commercial use while providing a disclaimer of warranty. See the LICENSE file for the full text.