This is a content-based Movie Recommendation System built with Python and machine learning libraries. Given a movie title, it suggests movies whose metadata (genres, keywords, tagline, cast, and director) is most similar.
- Content-based filtering using movie metadata
- Movie similarity calculated using TF-IDF and Cosine Similarity
- Input a movie name and get similar movie recommendations
- Handles missing metadata by replacing missing values with empty strings
- Programming Language: Python
- Libraries:
  - pandas – data manipulation
  - numpy – numerical operations
  - scikit-learn – TF-IDF Vectorization and Cosine Similarity
  - difflib – finding close matches for user input
- Source: [movies.csv](https://raw.githubusercontent.com/Sourav-10x/MovieRecommendationSystem/refs/heads/main/movies.csv) (from TMDB or a similar open-source movie metadata dataset)
- Size: 4803 movies with 24 columns of metadata
- Selected Features for Recommendation:
  - genres
  - keywords
  - tagline
  - cast
  - director
Data Preprocessing:
- Load movie dataset
- Select key metadata features
- Replace missing values with empty strings
- Combine all selected features into a single string
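
The preprocessing steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the two-row DataFrame is made-up toy data standing in for movies.csv, and the variable names `selected_features` and `combined_features` are assumptions.

```python
import pandas as pd

# Toy stand-in for movies.csv (made-up rows); the column names follow the
# selected features listed above.
movies = pd.DataFrame({
    "title": ["Movie A", "Movie B"],
    "genres": ["Action Adventure", "Drama"],
    "keywords": ["robot future", None],       # missing keywords
    "tagline": [None, "A quiet story."],      # missing tagline
    "cast": ["Actor One", "Actor Two"],
    "director": ["Director X", "Director Y"],
})

selected_features = ["genres", "keywords", "tagline", "cast", "director"]

# Replace missing values with empty strings so string joining never sees NaN.
movies[selected_features] = movies[selected_features].fillna("")

# Combine the selected columns into one space-separated string per movie.
combined_features = movies[selected_features].agg(" ".join, axis=1)

print(combined_features.tolist())
```

With the real dataset, the DataFrame would instead come from `pd.read_csv("movies.csv")`, assuming the file sits in the working directory.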
Feature Vectorization:
- Use TfidfVectorizer to convert text data into numerical feature vectors
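
As a rough sketch of this step, the snippet below fits scikit-learn's `TfidfVectorizer` with its default settings on three made-up combined-feature strings. The strings and variable names are illustrative assumptions; the notebook may configure the vectorizer differently.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Made-up combined-feature strings, one per movie.
combined_features = [
    "action adventure superhero armor",
    "action adventure superhero sequel",
    "romance drama paris",
]

# Learn the vocabulary and convert each string into a sparse TF-IDF row vector.
vectorizer = TfidfVectorizer()
feature_vectors = vectorizer.fit_transform(combined_features)

# One row per movie, one column per distinct term in the vocabulary.
print(feature_vectors.shape)
print(sorted(vectorizer.vocabulary_))
```

Terms shared by many movies (here "action") receive lower weights than terms that appear in only one, which is what lets distinctive keywords dominate the comparison.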
Similarity Measurement:
- Compute cosine similarity between all movies based on their feature vectors
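
Cosine similarity is the dot product of two vectors divided by the product of their lengths, so for non-negative TF-IDF vectors it ranges from 0 (no shared terms) to 1 (same direction). A minimal sketch, again using made-up strings rather than the real dataset:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Made-up combined-feature strings, one per movie.
combined_features = [
    "action adventure superhero armor",
    "action adventure superhero sequel",
    "romance drama paris",
]

feature_vectors = TfidfVectorizer().fit_transform(combined_features)

# Pairwise similarity: entry [i, j] compares movie i with movie j.
similarity = cosine_similarity(feature_vectors)

print(similarity.round(2))
```

The result is a square matrix with ones on the diagonal. For the full dataset of 4803 movies it would have 4803 × 4803 entries.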
Recommendation:
- Take user input (movie title)
- Find the closest matching title
- Recommend top N most similar movies based on cosine similarity
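
The lookup and ranking steps might look something like the sketch below. The `recommend` function, its `top_n` parameter, and the toy titles and feature strings are all hypothetical; only `difflib.get_close_matches`, `TfidfVectorizer`, and `cosine_similarity` are real library calls.

```python
import difflib

import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Toy data: titles and made-up combined-feature strings, aligned by position.
titles = ["Iron Man", "Iron Man 2", "The Notebook"]
combined_features = [
    "action superhero armor",
    "action superhero armor sequel",
    "romance drama letters",
]
similarity = cosine_similarity(TfidfVectorizer().fit_transform(combined_features))


def recommend(query, top_n=2):
    """Return up to top_n titles most similar to the closest match for query."""
    # Tolerate typos by taking the single closest known title, if any.
    matches = difflib.get_close_matches(query, titles, n=1)
    if not matches:
        return []
    index = titles.index(matches[0])
    # Sort all movies by similarity to the matched one, highest first.
    ranked = np.argsort(similarity[index])[::-1]
    # Skip the matched movie itself, then keep the top_n remaining titles.
    return [titles[i] for i in ranked if i != index][:top_n]


print(recommend("Iron Mann", top_n=1))
```

Note that `get_close_matches` is case-sensitive and uses a default similarity cutoff of 0.6, so a query that differs a lot from every title returns no match and the sketch returns an empty list.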
This project runs best in Google Colab:
- Open the notebook in Google Colab.
- Upload the movies.csv file when prompted.
- Run all the cells and enter a movie name when asked.
- Clone the repo:
  git clone https://github.com/Sourav-10x/movie-recommendation-system.git
  cd movie-recommendation-system
- Install requirements: pip install -r requirements.txt
- Open the Jupyter Notebook: jupyter notebook
Example Usage
Enter your favourite movie name : Iron Man
Recommended movies:
- Iron Man 2
- Avengers: Age of Ultron
- Captain America: The First Avenger
...