PixelPlay is a multimodal AI application that translates images into music recommendations. Using computer vision and cross-modal vector embeddings, it lets users discover music that semantically matches the "vibe" of an input image.
Live Demo: https://pixelplay-demo.streamlit.app/
Test Credentials:
- Username: lord
- Password: 123456
Unlike traditional tag-based search engines, PixelPlay uses cross-modal vector embeddings to perform semantic matching. The core architecture relies on OpenAI's CLIP (Contrastive Language-Image Pre-training) model to project both images and text descriptions of music into a shared 512-dimensional vector space.
When a user uploads an image, the system:
- Encodes the image into a 512-dimensional vector using the CLIP visual encoder.
- Computes the cosine similarity between the input image vector and a pre-indexed vector database of 10,000 songs.
- Retrieves the tracks with the highest semantic correlation to the visual input.
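The ranking step above can be sketched in a few lines. This is a minimal, dependency-free illustration, not the project's actual code: `top_k_songs`, `song_index`, and the toy 3-dimensional vectors are hypothetical stand-ins, and a real implementation would more likely operate on 512-dimensional CLIP embeddings with NumPy or PyTorch.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)

def top_k_songs(image_vec, song_index, k=5):
    """Return the k (title, score) pairs most similar to image_vec.

    song_index is a hypothetical list of (title, vector) pairs standing in
    for the pre-indexed song database.
    """
    scored = [(title, cosine_similarity(image_vec, vec)) for title, vec in song_index]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Toy 3-dimensional vectors; real CLIP embeddings would have 512 dimensions.
toy_index = [
    ("Calm Piano", [0.9, 0.1, 0.0]),
    ("Club Anthem", [0.0, 0.2, 0.9]),
    ("Acoustic Sunset", [0.8, 0.3, 0.1]),
]
results = top_k_songs([1.0, 0.2, 0.0], toy_index, k=2)
```

Because cosine similarity depends only on direction, vectors of different magnitudes are compared fairly; if the stored embeddings are already unit-normalized, the score reduces to a plain dot product.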
- Image-to-Audio Search: Upload any image file (JPG, PNG) to receive a curated list of song recommendations that match the visual content and mood.
- Hybrid Text Refinement: Refine visual search results by adding text context (e.g., "Energetic," "Warm," "Fast-paced"). The system blends the image vector with the text vector so that the combined query vector steers the search toward the described mood.
- Audio Pivot Search: The "Find Similar" button allows users to pivot from visual search to audio-based search. Selecting this option uses the specific vector of a recommended song to find other tracks with similar audio profiles.
- Integrated Audio Player: Preview recommended tracks directly within the application using the native HTML5 audio player (supports MP3/M4A).
- Spotify Integration: Each song card includes a direct "Open on Spotify" link, allowing users to instantly transition from discovery to full playback on their preferred streaming platform.
- Real-Time Metadata: Displays accurate song titles, artist names, genres, and release years. The system prioritizes metadata fetched via the iTunes API during data enrichment, falling back to dataset values only when necessary.
- Visual Data Analytics: Every recommendation includes a dynamic radar chart visualizing key audio metrics:
  - Energy: Intensity and activity level.
  - Valence: Musical positiveness.
  - Danceability: Suitability for dancing.
  - Acousticness: Confidence that the track is acoustic.
- Secure Authentication: Complete user management system featuring secure login, account registration, and password recovery.
- Password Hashing: User passwords are hashed with bcrypt before storage.
- Session Management: Persistent session states ensure users remain logged in and retain their search context during navigation.
- Data Enrichment Pipeline: A custom-built ETL pipeline fetches and updates metadata (album art, preview URLs) offline, so the app displays high-quality data without slowing down runtime performance.
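The Hybrid Text Refinement feature above blends an image vector with a text vector, but the exact formula is not stated. One common approach, shown here purely as an assumption, is a weighted linear mix of the two unit-normalized vectors followed by re-normalization. The names `blend_vectors` and `text_weight` are hypothetical and do not come from the project.

```python
import math

def normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return list(vec)
    return [x / norm for x in vec]

def blend_vectors(image_vec, text_vec, text_weight=0.3):
    """Weighted mix of unit-normalized image and text vectors, re-normalized.

    text_weight is an assumed tuning knob: 0.0 ignores the text entirely,
    1.0 ignores the image entirely.
    """
    if not 0.0 <= text_weight <= 1.0:
        raise ValueError("text_weight must be between 0 and 1")
    img = normalize(image_vec)
    txt = normalize(text_vec)
    mixed = [(1.0 - text_weight) * i + text_weight * t for i, t in zip(img, txt)]
    return normalize(mixed)

# Toy 2-dimensional example: equal weighting lands halfway between the inputs.
image_vec = [1.0, 0.0]
text_vec = [0.0, 1.0]
query = blend_vectors(image_vec, text_vec, text_weight=0.5)
```

The blended `query` can then be ranked against the song vectors exactly as an image-only query would be. The "Find Similar" pivot fits the same pattern: the stored vector of the selected song simply takes the place of the query vector.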
The codebase follows a modular architecture, separating the production application logic from the data engineering pipelines.
- app.py: The central controller that orchestrates the Streamlit interface and application flow.
- logic.py: Contains the core business logic, including CLIP model inference, vector calculations, and data loading routines.
- ui_components.py: Manages the frontend design system, including custom CSS injection, card rendering, and chart generation.
- auth_manager.py: Encapsulates all authentication logic, config handling, and security protocols.

- clean_data.py: Handles initial cleaning and normalization of the raw CSV dataset.
- embed_songs.py: Runs the inference batch job to generate 512-dimensional vector embeddings for the entire music catalog.
- enrich_data.py: A post-processing script that queries external APIs (iTunes) to hydrate the dataset with high-quality metadata, album artwork, and audio preview URLs.

- data/: Stores the raw source datasets and intermediate files used during the build process. The production app relies on an optimized pickle file (songs_enriched.pkl) generated by the pipeline.
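As a rough illustration of the enrichment step, the sketch below builds an iTunes Search API request URL and merges a result with a dataset row, preferring iTunes values and falling back to the dataset when a field is missing. It is an assumption about how such a step could look, not the contents of enrich_data.py: the function names and the dataset keys (`title`, `artist`, `genre`) are hypothetical, while `trackName`, `artistName`, `primaryGenreName`, `artworkUrl100`, and `previewUrl` are fields returned by the iTunes Search API.

```python
from urllib.parse import urlencode

ITUNES_SEARCH_URL = "https://itunes.apple.com/search"

def build_itunes_query(title, artist):
    """Build a search URL asking iTunes for the single best-matching song."""
    params = {"term": f"{title} {artist}", "entity": "song", "limit": 1}
    return f"{ITUNES_SEARCH_URL}?{urlencode(params)}"

def merge_metadata(dataset_row, itunes_result):
    """Prefer iTunes values; fall back to dataset values when a field is missing.

    dataset_row uses hypothetical column names; itunes_result is one entry
    from the API's "results" list, or None when the lookup found nothing.
    """
    itunes_result = itunes_result or {}
    return {
        "title": itunes_result.get("trackName") or dataset_row.get("title"),
        "artist": itunes_result.get("artistName") or dataset_row.get("artist"),
        "genre": itunes_result.get("primaryGenreName") or dataset_row.get("genre"),
        "artwork_url": itunes_result.get("artworkUrl100"),  # no dataset fallback
        "preview_url": itunes_result.get("previewUrl"),     # no dataset fallback
    }

row = {"title": "Blue Sky", "artist": "Some Artist", "genre": "pop"}
url = build_itunes_query(row["title"], row["artist"])
enriched = merge_metadata(row, {"trackName": "Blue Sky (Remastered)", "previewUrl": "https://example.com/p.m4a"})
fallback = merge_metadata(row, None)
```

Running this kind of merge offline and saving the result means the app never waits on an external API while serving recommendations.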
- Clone the repository:
    git clone https://github.com/Prateek-845/PixelPlay.git
    cd PixelPlay
- Install dependencies. Ensure you have Python installed, then run:
    pip install -r requirements.txt
- Run the application:
    streamlit run app.py
Note: This project is a portfolio demonstration. User accounts created in the live demo environment are ephemeral and may be reset during system updates or redeployments.