A portfolio-quality movie recommendation system built with Streamlit, using content-based filtering and machine learning to suggest films based on user preferences.
Coded by Nate
- Smart Recommendations: Content-based filtering using TF-IDF and cosine similarity
- Rich Metadata: Integrates with TMDB API for movie posters, cast, and plot information
- Explainable AI: Clear explanations for why each movie was recommended
- Flexible Input: Select 1-5 favorite movies to get personalized suggestions
- Professional UI: Clean, responsive design with visual movie cards
- Performance Optimized: Caching for sub-2-second recommendation generation
This project demonstrates:
- Data Science: Content-based recommendation algorithm with weighted feature engineering
- Machine Learning: TF-IDF vectorization and cosine similarity calculations using scikit-learn
- API Integration: TMDB API for real-time movie data
- Software Engineering: Modular code architecture with separation of concerns
- User Experience: Intuitive interface with visual feedback and explanations
| Category | Technologies |
|---|---|
| Frontend | Streamlit, HTML/CSS |
| Backend | Python 3.11 |
| ML/Data Science | scikit-learn, pandas, NumPy |
| APIs | TMDB API |
| DevOps | GitHub Actions (automated cache updates) |
| Deployment | Streamlit Cloud |
- Python 3.9 or higher
- TMDB API key (free at themoviedb.org)
- Clone the repository:

      git clone https://github.com/yourusername/movie-recommender.git
      cd movie-recommender

- Install dependencies:

      pip install -r requirements.txt

- Set up environment variables:

      cp .env.example .env
      # Edit .env and add your TMDB API key

- Run the app:

      streamlit run app.py

- Open http://localhost:8501 in your browser
- Sign up at TMDB
- Go to Settings → API
- Request an API key (choose "Developer" option)
- Add the key to your .env file:

      TMDB_API_KEY=your_api_key_here
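As a minimal sketch of how the app might read this key at startup: the function name `get_tmdb_api_key` is hypothetical, and the sketch assumes the values in `.env` have already been exported into the process environment (for example by a loader such as python-dotenv; this README does not say which mechanism the project uses).

```python
import os


def get_tmdb_api_key(environ=None):
    """Return the TMDB API key from the environment or raise a clear error.

    `environ` defaults to os.environ; passing a plain dict makes testing easy.
    """
    env = os.environ if environ is None else environ
    key = env.get("TMDB_API_KEY", "").strip()
    # Treat the template placeholder from .env.example as "not configured".
    if not key or key == "your_api_key_here":
        raise RuntimeError(
            "TMDB_API_KEY is not set. Copy .env.example to .env and add your key."
        )
    return key
```

Failing early with a descriptive message tends to be friendlier than letting the first TMDB request fail with an authentication error.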
Customize the look in .streamlit/config.toml:
[theme]
primaryColor = "#E50914"
backgroundColor = "#141414"
secondaryBackgroundColor = "#2D2D2D"
textColor = "#FFFFFF"
font = "sans serif"

The recommendation engine uses content-based filtering with the following approach:

- Feature Extraction
  - Plot descriptions → TF-IDF vectors
  - Genres → Binary feature vectors
  - Director → Exact match indicator
  - Cast → Overlap calculation
- Similarity Calculation
  - Weighted combination of features:
    - Plot similarity (TF-IDF + cosine): 40%
    - Genre overlap (Jaccard): 30%
    - Director match: 15%
    - Cast overlap: 15%
- Ranking & Filtering
  - Exclude already-selected movies
  - Sort by combined similarity score
  - Return top N recommendations
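The three steps above can be sketched in a few lines of scikit-learn. This is an illustration under stated assumptions, not the code in `recommender.py`: the `recommend` and `jaccard` names and the dict-based movie records are hypothetical, cast overlap is computed with Jaccard (the README only says "overlap"), and scores for several selected movies are combined by a simple mean.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Weights taken from the 40/30/15/15 split described above.
WEIGHTS = {"plot": 0.40, "genre": 0.30, "director": 0.15, "cast": 0.15}


def jaccard(a, b):
    """Jaccard overlap |A ∩ B| / |A ∪ B|; 0.0 when both collections are empty."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(movies, selected_titles, top_n=5):
    """Rank unselected movies by mean weighted similarity to the selected ones."""
    selected = [i for i, m in enumerate(movies) if m["title"] in selected_titles]
    if not selected:
        return []

    # Plot similarity: TF-IDF vectors compared pairwise with cosine similarity.
    tfidf = TfidfVectorizer(stop_words="english").fit_transform(
        [m["plot"] for m in movies]
    )
    plot_sim = cosine_similarity(tfidf)

    scores = []
    for j, cand in enumerate(movies):
        if j in selected:
            continue  # exclude already-selected movies
        per_seed = []
        for i in selected:
            seed = movies[i]
            per_seed.append(
                WEIGHTS["plot"] * plot_sim[i, j]
                + WEIGHTS["genre"] * jaccard(seed["genres"], cand["genres"])
                + WEIGHTS["director"] * float(seed["director"] == cand["director"])
                + WEIGHTS["cast"] * jaccard(seed["cast"], cand["cast"])  # assumption
            )
        # Assumption: average the score over all selected movies.
        scores.append((cand["title"], float(np.mean(per_seed))))

    # Sort by combined score, highest first, and keep the top N.
    return sorted(scores, key=lambda s: s[1], reverse=True)[:top_n]
```

Because every component lies in [0, 1] and the weights sum to 1, the combined score also stays in [0, 1], which makes it easy to display as a percentage match.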
If you select:
- "Inception" (2010)
- "Interstellar" (2014)
- "The Prestige" (2006)
The app will:
- Identify common themes (Christopher Nolan, complex narratives, sci-fi)
- Calculate similarity to every movie in the database
- Recommend similar films like "Arrival", "Shutter Island", "Memento"
movie-recommender/
├── app.py # Main Streamlit application
├── recommender.py # Recommendation engine logic
├── data_loader.py # TMDB API integration
├── utils.py # Helper functions
├── requirements.txt # Python dependencies
├── README.md # This file
├── .env.example # Environment variables template
├── .streamlit/
│ └── config.toml # Streamlit configuration
└── tests/ # Unit tests (optional)
    └── test_recommender.py
Run unit tests:
    pytest tests/

Test the app with diverse movie selections:
- Single movie input
- 5 movies from different genres
- Movies from same franchise
- Obscure titles
- Initial load: < 3 seconds (with caching)
- Recommendation generation: < 2 seconds
- Dataset size: 500 movies (configurable)
- API calls: Cached for 1 hour
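To illustrate what "cached for 1 hour" means in practice, here is a small time-to-live cache in plain Python. It is a sketch only: `ttl_cache` and `fetch_movie_details` are hypothetical names, and the fetch function returns dummy data instead of calling TMDB. In a Streamlit app, the built-in `st.cache_data(ttl=3600)` decorator would be the more idiomatic way to get the same behavior; this README does not state which mechanism the project uses.

```python
import time
from functools import wraps


def ttl_cache(seconds, clock=time.monotonic):
    """Cache a function's results per argument tuple for `seconds` seconds."""

    def decorator(func):
        store = {}  # maps args -> (expiry_time, value)

        @wraps(func)
        def wrapper(*args):
            now = clock()
            hit = store.get(args)
            if hit is not None and hit[0] > now:
                return hit[1]  # still fresh: skip the expensive call
            value = func(*args)
            store[args] = (now + seconds, value)
            return value

        return wrapper

    return decorator


@ttl_cache(seconds=3600)  # one hour, matching the figure above
def fetch_movie_details(movie_id):
    # Placeholder for a real TMDB request; returns dummy data here.
    return {"id": movie_id, "fetched_at": time.time()}
```

Repeated calls with the same `movie_id` within the hour return the stored result, so the API is contacted at most once per movie per hour.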
- Push code to GitHub
- Go to share.streamlit.io
- Deploy your repository
- Add TMDB_API_KEY as a secret in the dashboard

- Heroku: Use setup.sh and Procfile
- AWS/GCP: Deploy with Docker
- Vercel: Use serverless deployment
- Mood Filters (Phase 3)
  - Edit app.py to add checkboxes for moods
  - Update recommender.py to filter by mood tags
- Data Visualizations
  - Add Plotly charts to show genre distributions
  - Compare user preferences vs. recommendations
- User Accounts
  - Integrate Streamlit authentication
  - Save user preferences and history
Edit recommender.py to adjust:
- Feature weights: Change the 40/30/15/15 split
- Similarity metric: Try Pearson correlation instead of cosine
- Additional features: Add movie runtime, budget, keywords
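For the similarity-metric option, a useful fact is that Pearson correlation equals the cosine similarity of mean-centered vectors, so swapping one for the other can be a small change. The sketch below uses hypothetical helper names (`cosine_sim`, `pearson_sim`) and plain NumPy arrays; it does not reflect how `recommender.py` is actually organized.

```python
import numpy as np


def cosine_sim(u, v):
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    u, v = np.asarray(u, dtype=float), np.asarray(v, dtype=float)
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return float(u @ v / denom) if denom else 0.0


def pearson_sim(u, v):
    """Pearson correlation, computed as cosine similarity of centered vectors."""
    u, v = np.asarray(u, dtype=float), np.asarray(v, dtype=float)
    return cosine_sim(u - u.mean(), v - v.mean())
```

Note that Pearson ranges over [-1, 1] while cosine on non-negative TF-IDF vectors stays in [0, 1], so the feature weights may need retuning after a switch.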
- Collaborative filtering (user-based recommendations)
- Hybrid approach (content + collaborative)
- Movie watchlist functionality
- Export recommendations to PDF
- Integration with streaming services (show where to watch)
- Multi-language support
Contributions are welcome! Please:
- Fork the repository
- Create a feature branch (git checkout -b feature/AmazingFeature)
- Commit changes (git commit -m 'Add AmazingFeature')
- Push to the branch (git push origin feature/AmazingFeature)
- Open a Pull Request
This project is licensed under the MIT License; see the LICENSE file for details.
- TMDB for the excellent movie database API
- Streamlit for the intuitive web framework
- scikit-learn for ML tools
Your Name
- Portfolio: yourportfolio.com
- LinkedIn: linkedin.com/in/yourprofile
- GitHub: github.com/yourusername
⭐ If you found this project helpful, please consider giving it a star!
