asigalov61/midiharmony

midiharmony

Fast, data‑driven harmony analysis for any MIDI file

Abstract

midiharmony provides fast, stand‑alone harmony analysis for MIDI files by comparing their chord and note relationships against a high‑quality database of extracted chord quads. This approach enables reliable detection of strong harmonic structure, musically coherent progressions, and potential inconsistencies, making it a practical tool for music‑AI pipelines, composition analysis, and large‑scale MIDI processing.


Install

Standard processing on CPU (default)

!pip install -U midiharmony

Accelerated processing on GPU

!pip install -U midiharmony[gpu]

Basic use examples

Using midiharmony is easy and requires just two lines of code :)

Analyze a single MIDI

import midiharmony

midi_harmony_dict = midiharmony.analyze_midi('Come To My Window.mid')

# This code will return a single dictionary with harmony stats
{'midi_path': 'Come To My Window.mid',
  'midi_name': 'Come To My Window',
  'total_chords_count': 2287,
  'bad_chords_count': 11,
  'grouped_chords_count': 1861,
  'total_quads_count': 1861,
  'unique_quads_count': 644,
  'harmonic_quads_count': 470,
  'harmony_ratio': 0.7298136645962733
}
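In the sample output above, harmony_ratio equals harmonic_quads_count divided by unique_quads_count (470 / 644). The sketch below recomputes the ratio from a returned dictionary. The relationship is inferred from the sample values only, not from the library's source, and recompute_harmony_ratio is a hypothetical helper name.

```python
import math

def recompute_harmony_ratio(stats):
    """Recompute the ratio from the counts (relationship assumed from the sample output)."""
    unique = stats['unique_quads_count']
    if unique == 0:
        # Avoid division by zero for files with no quads
        return 0.0
    return stats['harmonic_quads_count'] / unique

# Values copied from the sample output above
sample = {'unique_quads_count': 644,
          'harmonic_quads_count': 470,
          'harmony_ratio': 0.7298136645962733}

print(math.isclose(recompute_harmony_ratio(sample), sample['harmony_ratio']))  # True
```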

Analyze MIDI folder(s)

import midiharmony

midi_harmony_dicts_list = midiharmony.analyze_midi_folders(['./midi_folder_1',
                                                            './midi_folder_2',
                                                            './midi_folder_3'
                                                           ])

# This code will return a list of dictionaries with harmony stats
# for all processed MIDIs from all specified folders
[
 {'midi_path': 'midi_folder_1/Bach Violin 2.mid',
  'midi_name': 'Bach Violin 2',
  'total_chords_count': 1089,
  'bad_chords_count': 71,
  'grouped_chords_count': 1045,
  'total_quads_count': 1045,
  'unique_quads_count': 974,
  'harmonic_quads_count': 867,
  'harmony_ratio': 0.8901437371663244},
 {'midi_path': 'midi_folder_2/Camping at Aylm.mid',
  'midi_name': 'Camping at Aylm',
  'total_chords_count': 417,
  'bad_chords_count': 7,
  'grouped_chords_count': 383,
  'total_quads_count': 383,
  'unique_quads_count': 369,
  'harmonic_quads_count': 326,
  'harmony_ratio': 0.8834688346883469},
 {'midi_path': 'midi_folder_3/Come To My Window.mid',
  'midi_name': 'Come To My Window',
  'total_chords_count': 2287,
  'bad_chords_count': 11,
  'grouped_chords_count': 1861,
  'total_quads_count': 1861,
  'unique_quads_count': 644,
  'harmonic_quads_count': 470,
  'harmony_ratio': 0.7298136645962733}
]

NOTES

  • The most important value in each returned dictionary is harmony_ratio
  • A high harmony_ratio (>= 0.75) indicates good harmony
  • Exceptional harmony is indicated by harmony_ratio >= 0.9 together with a high unique_quads_count
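The thresholds in these notes can be turned into a small labeling helper. This is only a sketch: harmony_label is a hypothetical function, and min_unique_quads=500 is an arbitrary stand-in, since the notes do not say what counts as a "high" unique_quads_count.

```python
def harmony_label(stats, min_unique_quads=500):
    """Label a harmony dict using the thresholds from the notes.

    min_unique_quads is an assumed cut-off, not a value from midiharmony.
    """
    ratio = stats['harmony_ratio']
    if ratio >= 0.9 and stats['unique_quads_count'] >= min_unique_quads:
        return 'exceptional'
    if ratio >= 0.75:
        return 'good'
    return 'below threshold'

# Values copied from the folder example output
bach = {'harmony_ratio': 0.8901437371663244, 'unique_quads_count': 974}
print(harmony_label(bach))
```

Adjust min_unique_quads to suit your dataset; short pieces naturally contain fewer unique quads.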

Advanced use example

The code snippet below works great if you want to analyze a large MIDI dataset

# Import midiharmony, TMIDIX and tqdm
import midiharmony
from midiharmony import TMIDIX
import tqdm

# Create large MIDI dataset files list
filez = TMIDIX.create_files_list(['clean_midi'])

# Define a tiny wrapper for fast MIDI multi-processing
def process(file):
    return midiharmony.process_midi(file, verbose=False)

# Use TMIDIX multi-processing wrapper (Linux)
# Or use joblib if you are on Windows
# Alternatively, you can use a simple loop as well
output = TMIDIX.multiprocessing_wrapper(process, filez)

# Remove all empty dicts
output = [o for o in output if o]

# Analyze pre-processed MIDI dicts
all_harmony_dicts = []

for o in tqdm.tqdm(output):

    if o['quads']:
        res = midiharmony.analyze_processed_midi(o)
    
        if res:
            all_harmony_dicts.append(res)

# Final sort by harmony ratio
all_harmony_dicts.sort(key=lambda x: -x['harmony_ratio'])

# Save results to a JSONL file
TMIDIX.write_jsonl(all_harmony_dicts, 'all_harmony_dicts')
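The comments in the snippet above mention a simple loop as an alternative to the multi-processing wrapper. A minimal sketch of such a loop follows. process_sequentially is a hypothetical helper, not part of midiharmony, and catching exceptions is a defensive assumption, since the README does not say how process_midi reports unreadable files.

```python
def process_sequentially(files, process_fn):
    """Run process_fn on each file in turn and keep only non-empty results."""
    results = []
    for path in files:
        try:
            res = process_fn(path)
        except Exception as exc:
            # Assumption: a failing file should be skipped rather than stop the run
            print(f'Skipping {path}: {exc}')
            continue
        if res:
            results.append(res)
    return results
```

With the names from the snippet above, this would be called as output = process_sequentially(filez, process). It also makes the later "remove all empty dicts" step redundant, because empty results are dropped inside the loop.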

midiharmony API reference list

midiharmony.find_quads_fast_cupy
Count matching 4‑chord rows between two arrays using a GPU‑accelerated FNV‑1a hash.

midiharmony.find_quads_fast_numpy
Count matching 4‑chord rows between two arrays using NumPy on CPU.

midiharmony.get_trg_array
Load and cache the target harmonic‑quad array in NumPy or CuPy form.

midiharmony.process_midi
Extract chords, grouped chords, and unique 4‑chord quads from a MIDI file.

midiharmony.analyze_processed_midi
Compare extracted quads against the target database and compute harmony metrics.

midiharmony.analyze_midi
Run the full pipeline: process a MIDI file and evaluate its harmonic quality.

midiharmony.analyze_midi_folders
Batch‑analyze all MIDI files in one or more folders and return harmony reports.

midiharmony.helpers.get_package_data
Return a sorted list of packaged .npz data files with their resolved paths.

midiharmony.helpers.get_normalized_midi_md5_hash
Compute original and normalized MD5 hashes for any MIDI file.

midiharmony.helpers.normalize_midi_file
Normalize a MIDI file and write the normalized version to disk.

midiharmony.helpers.is_installed
Check whether a Debian package is installed using dpkg-query.

midiharmony.helpers._run_apt_get
Internal helper to run apt-get commands with consistent flags and timeout.

midiharmony.helpers.install_apt_package
Idempotently install an apt package with retries, optional sudo, and optional python‑apt.
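
The quad-matching functions at the top of this list count matching 4‑chord rows via hashing. To illustrate the general idea, here is a pure-Python sketch that hashes each row with 64-bit FNV‑1a and counts source rows whose hash appears among the target rows. It is not the library's implementation: the names fnv1a_64, row_hash and count_matching_rows are hypothetical, the 2-byte little-endian packing is an assumption (values must fit in 0..65535), and the exact counting semantics of the real functions may differ.

```python
FNV_OFFSET_BASIS_64 = 0xcbf29ce484222325  # standard 64-bit FNV offset basis
FNV_PRIME_64 = 0x100000001b3              # standard 64-bit FNV prime
MASK_64 = 0xFFFFFFFFFFFFFFFF

def fnv1a_64(data: bytes) -> int:
    """64-bit FNV-1a: XOR each byte into the hash, then multiply by the prime."""
    h = FNV_OFFSET_BASIS_64
    for byte in data:
        h ^= byte
        h = (h * FNV_PRIME_64) & MASK_64
    return h

def row_hash(row):
    """Hash one row of chord ids (assumed packing: 2 little-endian bytes per value)."""
    return fnv1a_64(b''.join(int(v).to_bytes(2, 'little') for v in row))

def count_matching_rows(src_rows, trg_rows):
    """Count source rows whose hash occurs among the target row hashes."""
    trg_hashes = {row_hash(r) for r in trg_rows}
    return sum(1 for r in src_rows if row_hash(r) in trg_hashes)

src = [(1, 2, 3, 4), (5, 6, 7, 8), (9, 9, 9, 9)]
trg = [(5, 6, 7, 8), (1, 2, 3, 4), (0, 0, 0, 0)]
print(count_matching_rows(src, trg))  # 2
```

Comparing hashes instead of full rows turns each lookup into a single integer comparison, at the cost of a very small chance of counting a hash collision as a match.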


Project Los Angeles

Tegridy Code 2026