MIWA - Music Indexing with AI

MIWA (Music Indexing with AI) is an intelligent music search system that helps users find songs using natural language queries. Instead of requiring exact song titles or artist names, users can describe songs using partial lyrics, mood, genre, artist characteristics, location, or any combination of these attributes.

🎯 Features

- Natural Language Query Processing: Extract structured information from conversational queries using Claude AI
- Multi-Modal Search: Search by:
  - Song title keywords
  - Lyrics (exact keywords or semantic similarity)
  - Artist name and characteristics
  - Genre
  - Release year (ranges)
  - Geographic location (country or region)
  - Album information
  - Featured artists
- GraphRAG Integration: Powered by Neo4j graph database for efficient relationship-based retrieval
- Hybrid Retrieval: Combines full-text search (Lucene) and vector embeddings (Jina embeddings) for optimal results
- Interactive Streamlit Demo: User-friendly web interface for testing queries
- Benchmarking & Evaluation: Tools for testing extraction accuracy and retrieval performance

🏗️ Architecture

System Components

LLM Extraction Layer (app.py)
- Uses Anthropic Claude API with COSTAR-formatted prompts
- Extracts structured JSON from natural language queries
- Outputs XML for reliable parsing
GraphRAG Database (Neo4j)
- Stores tracks, artists, albums, genres, locations, lyrics, and descriptions
- Full-text indexes for keyword search
- Vector indexes for semantic similarity search
- Relationship-based queries for complex filtering
Retrieval Engine (app.py, calculate_accuracy.py)
- Multi-stage filtering and scoring system
- Combines multiple score types (title, lyrics, artist, album)
- Softmax normalization for balanced scoring
- Returns top-K results ranked by relevance
Evaluation Tools
- benchmark_extraction.py: Tests field extraction accuracy
- calculate_accuracy.py: Evaluates retrieval performance (Top-1, Top-3, Top-10, Top-15)
- generate_prompts.py: Generates diverse test prompts using Ollama
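
The full-text and vector indexes listed under the GraphRAG database component could be queried roughly as follows. This is a minimal sketch, not the project's actual queries: the index names `lyrics_fulltext` and `lyrics_embedding` are hypothetical (the real ones are created by the project's own code), and the helper functions are illustrative. The two procedures, `db.index.fulltext.queryNodes` and `db.index.vector.queryNodes`, are standard in recent Neo4j 5 releases.

```python
# Hypothetical index names; the real ones are defined by the project's graph setup.
FULLTEXT_INDEX = "lyrics_fulltext"
VECTOR_INDEX = "lyrics_embedding"

# Full-text lookup (Lucene query syntax): returns matching nodes with a relevance score.
FULLTEXT_QUERY = (
    "CALL db.index.fulltext.queryNodes($index, $query) "
    "YIELD node, score "
    "RETURN elementId(node) AS id, score "
    "LIMIT $k"
)

# Vector lookup: returns the k nearest neighbours of the query embedding.
VECTOR_QUERY = (
    "CALL db.index.vector.queryNodes($index, $k, $embedding) "
    "YIELD node, score "
    "RETURN elementId(node) AS id, score"
)


def fulltext_params(keywords, k=10):
    """Build parameters for FULLTEXT_QUERY; keywords are OR-ed so any may match."""
    return {"index": FULLTEXT_INDEX, "query": " OR ".join(keywords), "k": k}


def vector_params(embedding, k=10):
    """Build parameters for VECTOR_QUERY from a query embedding."""
    return {"index": VECTOR_INDEX, "k": k, "embedding": list(embedding)}
```

With the official Neo4j Python driver, each query would typically be run as `session.run(FULLTEXT_QUERY, fulltext_params(["love", "heart"]))`. Keywords containing Lucene special characters would need escaping first.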

📋 Prerequisites

- Python 3.8+
- Neo4j database (version 5.x or later)
- Anthropic API key (for Claude)
- (Optional) Ollama (for prompt generation)

🚀 Installation

Clone the repository

```
git clone <repository-url>
cd MusicIndexingWithAI
```

Install dependencies
```
pip install -r requirements.txt
```

Set up environment variables

Create a .env file in the project root:

```
# Anthropic API (required)
ANTHROPIC_API_KEY=your_anthropic_api_key_here

# Neo4j Connection (required)
NEO4J_URI=bolt://localhost:7687
NEO4J_USERNAME=neo4j
NEO4J_PASSWORD=your_password_here
```
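
Before starting the app, it can help to confirm that all four variables are actually visible to Python. The sketch below is an illustration rather than part of the repository. It assumes the `.env` values have already been loaded into the process environment (for example by a loader such as python-dotenv, which is an assumption about how the project reads the file); `missing_env_vars` is a hypothetical helper name.

```python
import os

# Variable names taken from the .env example above.
REQUIRED_VARS = (
    "ANTHROPIC_API_KEY",
    "NEO4J_URI",
    "NEO4J_USERNAME",
    "NEO4J_PASSWORD",
)


def missing_env_vars(env=None):
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]
```

Calling `missing_env_vars()` with no arguments checks the current environment; an empty list means every required variable is set.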

Set up Neo4j Database
- Install and start Neo4j
- Create the graph structure using src/create_graph.py
- Load your music data into Neo4j
- The system will automatically create necessary indexes

💻 Usage

Running the Streamlit Demo

```
streamlit run app.py
```

The demo will open in your browser. You can:

- Enter natural language queries about songs
- View extracted structured information
- See retrieval results and AI-generated suggestions
- Enable debug mode to inspect extraction details

Example Queries

"Find songs with 'love' in the title from the 2000s"
"Rock songs by Norwegian artists with lyrics about mountains"
"Songs featuring Drake released after 2015"
"Pop songs from Europe with lyrics mentioning 'dancing'"
"Songs by artists founded in the 1960s from the United Kingdom"

Benchmarking Extraction

Test the extraction system on generated prompts:

```
python3 benchmark_extraction.py
```

This will:

- Load prompts from data_parsing/data/generated_prompts.json
- Extract fields for each prompt
- Generate a benchmark report
- Update the prompts file with extracted JSON
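
One way to score extraction is field by field: flatten the expected and extracted JSON and count how many expected fields match. The sketch below illustrates that idea only. How benchmark_extraction.py actually scores results is not described here, and the function names and the case-insensitive, order-insensitive comparison are assumptions.

```python
def flatten(obj, prefix=""):
    """Flatten nested dicts into dotted paths, e.g. {"track.year_from": 2000}."""
    flat = {}
    for key, value in obj.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


def normalise(value):
    """Compare strings case-insensitively and keyword lists regardless of order."""
    if isinstance(value, list):
        return sorted(str(v).lower() for v in value)
    if isinstance(value, str):
        return value.lower()
    return value


def field_accuracy(expected, extracted):
    """Fraction of expected fields whose extracted value matches."""
    exp, got = flatten(expected), flatten(extracted)
    if not exp:
        return 1.0
    hits = sum(
        1 for path, value in exp.items()
        if normalise(got.get(path)) == normalise(value)
    )
    return hits / len(exp)
```

Averaging `field_accuracy` over all prompts gives a single extraction score, while keeping the per-path hits shows which fields the model most often gets wrong.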

Calculating Retrieval Accuracy

Evaluate retrieval performance:

```
python3 calculate_accuracy.py
```

This will:

- Load prompts with expected track information
- Run retrieval for each prompt
- Calculate Top-1, Top-3, Top-10, Top-15 accuracy
- Generate accuracy results JSON files
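
Top-K accuracy is the share of prompts whose expected track appears among the first K retrieved results. The sketch below shows the computation under an assumed input shape (pairs of expected track ID and ranked result IDs); the data format used by calculate_accuracy.py may differ.

```python
def top_k_accuracy(results, ks=(1, 3, 10, 15)):
    """Compute Top-K accuracy for each K.

    results: list of (expected_track_id, ranked_track_ids) pairs,
    where ranked_track_ids is ordered from most to least relevant.
    """
    if not results:
        return {k: 0.0 for k in ks}
    return {
        k: sum(expected in ranked[:k] for expected, ranked in results) / len(results)
        for k in ks
    }
```

By construction the values are non-decreasing in K, so a Top-15 score lower than Top-10 would point to a bug in the evaluation.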

Generating Test Prompts

Generate diverse prompts for testing:

```
python3 generate_prompts.py
```

Note: Requires Ollama running locally with the qwen3:14b model.
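
For orientation, a request to a local Ollama server might look like the sketch below. It uses Ollama's `/api/generate` endpoint on the default port 11434. The instruction text, the shape of the `track` dictionary, and both function names are hypothetical, so generate_prompts.py may work quite differently.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint


def build_request(track, model="qwen3:14b"):
    """Build a non-streaming generate request for one track's metadata."""
    # Hypothetical instruction text; the real script may phrase it differently.
    prompt = (
        "Write one natural-language search query a listener might use to find "
        f"this song without naming its title: {json.dumps(track)}"
    )
    body = json.dumps({"model": model, "prompt": prompt, "stream": False})
    return urllib.request.Request(
        OLLAMA_URL,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate_prompt(track):
    """Send the request; requires a running Ollama server with the model pulled."""
    with urllib.request.urlopen(build_request(track)) as resp:
        return json.loads(resp.read())["response"]
```

Setting `"stream": False` makes the server return a single JSON object whose `response` field holds the generated text, which keeps the client code short.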

📁 Project Structure

```
MusicIndexingWithAI/
├── app.py                      # Main Streamlit application
├── benchmark_extraction.py     # Extraction benchmarking script
├── calculate_accuracy.py       # Retrieval accuracy evaluation
├── generate_prompts.py         # Test prompt generation (Ollama)
├── requirements.txt            # Python dependencies
├── README.md                   # This file
│
├── data/                       # Sample data files
│   ├── artists_locations_countries_only.json
│   ├── genres_sample.json
│   └── locations_sample.json
│
├── data_parsing/               # Data processing scripts
│   ├── data/
│   │   ├── generated_prompts.json    # Generated test prompts
│   │   ├── sample_100.json           # Sample tracks for prompt generation
│   │   └── datascripts/              # Data processing utilities
│   ├── eda/                    # Exploratory data analysis
│   └── utils/                  # Parsing utilities
│
├── results/                    # Evaluation results
│   ├── benchmark_report_*.json
│   └── accuracy_results_*.json
│
└── src/                        # Source code
    └── create_graph.py         # Neo4j graph creation script
```

🔧 Key Components

Extraction System

The extraction system uses a COSTAR-formatted prompt to extract structured information:

```
{
  "track": {
    "title_keywords": ["love", "heart"],
    "year_from": 2000,
    "year_to": 2010,
    "genres": ["rock", "pop"],
    "lyrics_keywords": ["tears"],
    "lyrics_text": "songs about crying and sadness"
  },
  "artist": {
    "name_keywords": ["beatles"],
    "country": "United Kingdom",
    "region": null,
    "description_text": "legendary rock band"
  },
  "features": ["Drake"],
  "album": {
    "name_keywords": ["thriller"]
  }
}
```
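
The architecture notes say the model output uses XML for reliable parsing. One plausible reading is that the JSON above arrives wrapped in an XML tag, which is easy to locate even when the model adds surrounding text. The sketch below assumes that reading; the tag name `extraction` and the function `parse_extraction` are hypothetical, and the real prompt in app.py may use a different format.

```python
import json
import re

# Hypothetical tag name; the actual prompt may use a different one.
TAG = "extraction"


def parse_extraction(response_text):
    """Pull the JSON payload out of <extraction>...</extraction> and decode it."""
    match = re.search(rf"<{TAG}>(.*?)</{TAG}>", response_text, re.DOTALL)
    if match is None:
        raise ValueError(f"no <{TAG}> block found in model response")
    data = json.loads(match.group(1))
    # Fill in missing sections so downstream code can rely on their presence.
    data.setdefault("track", {})
    data.setdefault("artist", {})
    data.setdefault("album", {})
    data.setdefault("features", [])
    return data
```

Filling in empty defaults means the retrieval code can read `data["artist"]` or `data["features"]` without checking whether the model omitted them.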

Retrieval System

The retrieval system uses a multi-stage approach:

- Filtering: Apply numeric filters (year ranges, views) first
- Scoring: Calculate relevance scores for:
  - Title match (full-text search)
  - Lyrics match (full-text + vector similarity)
  - Artist name match (full-text search)
  - Artist description (vector similarity)
  - Album match (full-text search)
- Ranking: Combine normalized scores and return top-K results
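
The scoring and ranking stages can be sketched as follows. Each score type is softmax-normalized across its candidate tracks so that differently scaled scores (Lucene relevance versus cosine similarity) become comparable, then the normalized values are summed per track. The equal weighting, the plain sum, and the names `softmax` and `rank_tracks` are assumptions made for illustration; the actual combination in app.py may differ.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of raw scores."""
    if not scores:
        return []
    peak = max(scores)  # subtract the maximum to avoid overflow in exp()
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def rank_tracks(score_table, top_k=10):
    """Rank tracks from {score_type: {track_id: raw_score}}.

    Each score type is normalized independently, then summed per track
    with equal weight (an assumption, not the project's confirmed scheme).
    """
    combined = {}
    for per_track in score_table.values():
        ids = list(per_track)
        weights = softmax([per_track[i] for i in ids])
        for track_id, weight in zip(ids, weights):
            combined[track_id] = combined.get(track_id, 0.0) + weight
    ranked = sorted(combined, key=combined.get, reverse=True)
    return ranked[:top_k]
```

For example, `rank_tracks({"title": {"t1": 2.0, "t2": 1.0}, "lyrics": {"t2": 3.0, "t3": 0.5}}, top_k=2)` places `t2` first because it scores in both categories.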

👥 Contributors

Serhii Dmytryshyn - GraphRAG and retrieval implementation, Neo4j integration
Zakhar Kohut - Evaluation tools, benchmarking scripts, LLM integration and prompt generation
Andrii Kravchuk - Data parsing and preprocessing

🙏 Acknowledgments

Anthropic for Claude API
Neo4j for graph database
Jina AI for embeddings model
