This project implements a machine learning-driven movie recommendation system. It preprocesses movie and credits data, applies natural language processing for feature extraction, and uses cosine similarity for personalized movie suggestions through a streamlined Streamlit web interface.

shrutikakapade/Movie-Recommendation-System-With-Using-Machine-Learning

Project Description: Movie Recommendation System Using Machine Learning

Overview:

This project aims to build a movie recommendation system using machine learning techniques. The system leverages various Python packages such as numpy, pandas, ast, sklearn, and nltk for data processing, feature extraction, and natural language processing. Additionally, streamlit is used for developing a user-friendly web application to interact with the recommendation system.

Data Description:

We use two primary datasets:

  • Movies Data: This dataset contains information about movies, including attributes such as budget, genres, homepage, id, keywords, original language, original title, overview, popularity, production companies, production countries, release date, revenue, runtime, spoken languages, status, tagline, title, vote average, and vote count.
  • Credits Data: This dataset includes information about the cast and crew of the movies, with attributes such as movie_id, title, cast, and crew.

Data Preprocessing:

  • Merging Datasets: Merge the movies and credits datasets on the movie identifier (id in the movies data, movie_id in the credits data).
  • Cleaning and Transforming Data: Handle missing values and inconsistencies, then extract the relevant information from nested columns such as genres, keywords, cast, and crew using the ast library.
  • Natural Language Processing: Use the nltk library, particularly the 'PorterStemmer' class, to stem text data. Stemming reduces words to a common root form, so that variants such as "running" and "runs" both become "run" and are treated as the same feature.
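
The preprocessing steps above can be sketched as follows. This is a minimal illustration, not the project's actual code: the two small DataFrames are made-up stand-ins for the real datasets, the helper names extract_names and stem_text are hypothetical, and combining the features into a single "tags" column is an assumption about how the text is prepared for vectorization.

```python
import ast

import pandas as pd
from nltk.stem.porter import PorterStemmer

# Tiny stand-in frames; the real datasets have many more rows and columns.
movies = pd.DataFrame({
    "id": [1, 2, 3],
    "title": ["Movie A", "Movie B", "Movie C"],
    "overview": ["Explorers travelling through space", "A family wedding", None],
    "genres": [
        '[{"id": 12, "name": "Adventure"}, {"id": 878, "name": "Science Fiction"}]',
        '[{"id": 35, "name": "Comedy"}]',
        '[{"id": 18, "name": "Drama"}]',
    ],
})
credits = pd.DataFrame({
    "movie_id": [1, 2, 3],
    "title": ["Movie A", "Movie B", "Movie C"],
    "cast": [
        '[{"name": "Actor One"}, {"name": "Actor Two"}]',
        '[{"name": "Actor Three"}]',
        '[{"name": "Actor Four"}]',
    ],
})

# Merge on the movie identifier; drop the duplicate title column first.
df = movies.merge(credits.drop(columns=["title"]), left_on="id", right_on="movie_id")

# Drop rows with a missing overview (one simple way to handle missing values).
df = df.dropna(subset=["overview"]).reset_index(drop=True)


def extract_names(cell):
    """Parse a stringified list of dicts and return each 'name' value."""
    return [item["name"] for item in ast.literal_eval(cell)]


stemmer = PorterStemmer()


def stem_text(text):
    """Lowercase the text and stem every whitespace-separated word."""
    return " ".join(stemmer.stem(word) for word in text.lower().split())


for column in ["genres", "cast"]:
    df[column] = df[column].apply(extract_names)

# Assumption: combine the text features into one "tags" string per movie.
df["tags"] = (
    df["overview"] + " " + df["genres"].apply(" ".join) + " " + df["cast"].apply(" ".join)
).apply(stem_text)
```

ast.literal_eval is used here because it only evaluates Python literals, which makes it a safer choice than eval for parsing the stringified lists.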

Feature Engineering:

  • Text Vectorization: Convert textual data such as genres, keywords, overview, cast, and crew into numerical vectors using techniques such as TF-IDF (Term Frequency-Inverse Document Frequency), available in sklearn.feature_extraction.text.
  • Distance Calculation: Compute the cosine similarity between movie vectors. Cosine similarity measures the angle between two vectors rather than their magnitude, so two movies that share vocabulary score highly even when their descriptions differ in length. For this reason it is generally preferred over Euclidean distance for sparse, high-dimensional text vectors.
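
As a rough sketch of these two steps, the snippet below vectorizes three made-up, already-stemmed tag strings and computes their pairwise cosine similarity. The sample strings and the max_features and stop_words settings are illustrative assumptions, not values taken from the project.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Made-up, already-stemmed tag strings standing in for real movie features.
tags = [
    "space explor adventur alien planet",
    "space station astronaut rescu adventur",
    "romanc wed comedi famili",
]

# Each row of `vectors` is the TF-IDF representation of one movie.
vectorizer = TfidfVectorizer(max_features=5000, stop_words="english")
vectors = vectorizer.fit_transform(tags)

# similarity[i, j] is the cosine of the angle between movie i and movie j.
similarity = cosine_similarity(vectors)
```

The first two strings share the terms "space" and "adventur", so their similarity is greater than zero, while the third shares no terms with the first and scores zero against it.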

Machine Learning Model:

  • Similarity Calculation: For each movie, calculate its similarity with every other movie using cosine similarity. Recommend the movies with the highest similarity scores, excluding the selected movie itself.
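
One way to turn a similarity matrix into recommendations is sketched below. The function name recommend, the titles, and the similarity values are all hypothetical and chosen only to make the ranking easy to follow.

```python
import numpy as np

# Hypothetical titles and a hand-written symmetric similarity matrix.
titles = ["Movie A", "Movie B", "Movie C", "Movie D"]
similarity = np.array([
    [1.0, 0.8, 0.1, 0.4],
    [0.8, 1.0, 0.2, 0.3],
    [0.1, 0.2, 1.0, 0.6],
    [0.4, 0.3, 0.6, 1.0],
])


def recommend(title, titles, similarity, n=5):
    """Return up to n titles most similar to `title`, best match first."""
    idx = titles.index(title)  # raises ValueError for an unknown title
    ranked = np.argsort(similarity[idx])[::-1]  # indices by descending score
    return [titles[i] for i in ranked if i != idx][:n]


print(recommend("Movie A", titles, similarity, n=2))  # ['Movie B', 'Movie D']
```

The selected movie is filtered out explicitly rather than by skipping the first ranked entry, so the result stays correct even if another movie ties with a score of 1.0.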

Web Application:

  • Building the App: Use streamlit to develop an interactive web application. Load the preprocessed data and model using pickle, then create a user interface to input a movie title and fetch recommendations.
  • API Integration: Use the requests library to fetch additional data from external APIs if needed (e.g., movie posters, additional details).
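
A possible shape for the app is sketched below. It is not the project's actual code: the file name artifacts.pkl, the helper names, and the layout of the pickled dictionary are assumptions, and fetch_poster_url targets a hypothetical endpoint and JSON field, since the external API is not named here.

```python
import pickle
from pathlib import Path

ARTIFACTS = Path("artifacts.pkl")  # hypothetical file name


def save_artifacts(path, titles, similarity):
    """Write the titles and similarity matrix to one pickle file."""
    with open(path, "wb") as fh:
        pickle.dump({"titles": list(titles), "similarity": similarity}, fh)


def load_artifacts(path):
    """Read back the dictionary written by save_artifacts."""
    with open(path, "rb") as fh:
        return pickle.load(fh)


def top_matches(title, titles, similarity, n=5):
    """Return the n titles with the highest similarity to `title`."""
    idx = titles.index(title)
    order = sorted(range(len(titles)), key=lambda i: similarity[idx][i], reverse=True)
    return [titles[i] for i in order if i != idx][:n]


def fetch_poster_url(movie_id, endpoint):
    """Look up a poster URL; the endpoint and 'poster_url' field are hypothetical."""
    import requests  # imported lazily so the helpers above work without it

    response = requests.get(endpoint, params={"id": movie_id}, timeout=10)
    response.raise_for_status()
    return response.json().get("poster_url")


def main():
    import streamlit as st  # only needed when the UI actually runs

    data = load_artifacts(ARTIFACTS)
    st.title("Movie Recommendation System")
    selected = st.selectbox("Choose a movie", data["titles"])
    if st.button("Recommend"):
        for name in top_matches(selected, data["titles"], data["similarity"]):
            st.write(name)


# Start the UI only when launched as a script and the artifacts exist.
if __name__ == "__main__" and ARTIFACTS.exists():
    main()
```

Saved as, say, app.py, this would be launched with `streamlit run app.py` after the artifacts file has been written. Because pickle can execute arbitrary code while loading, only files produced by your own preprocessing step should be loaded this way.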

Output:

[Screenshot: Output UI]

Conclusion:

This movie recommendation system combines natural language processing with cosine similarity to suggest movies similar to a title the user selects. Python libraries handle the data processing, feature extraction, and the interactive Streamlit web interface.
