
fix(discovery): fall back to directory name for audio model type detection#849

Open
ryancee wants to merge 1 commit into jundot:main from ryancee:fix/audio-model-discovery-dirname-fallback

Conversation

@ryancee

@ryancee ryancee commented Apr 18, 2026

Summary

Models whose `config.json` omits a top-level `model_type` field were being classified as `llm` by `detect_model_type()`, causing a `KeyError('model_type')` crash at inference time.

Root Cause

`detect_model_type()` in `model_discovery.py` identifies audio models exclusively by matching `config["model_type"]` and `config["architectures"]` against the mlx-audio model-type sets. Models using NeMo-format config files — notably parakeet-tdt models — do not include a top-level `model_type` field. This causes:

  1. `detect_model_type()` falls through all audio checks and returns `"llm"`
  2. A `BatchedEngine` (LLM engine) is loaded for the model
  3. The LLM loading path calls `config["model_type"]` without a `.get()` fallback
  4. `KeyError: 'model_type'` is raised — reported to the client as HTTP 500:

```
POST /v1/audio/transcriptions → 500: 'model_type'
```
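The failure mode is easy to reproduce in isolation. A minimal sketch (the config dicts below are illustrative stand-ins, not the real file contents): a bare `config["model_type"]` lookup raises `KeyError` on a NeMo-style config, which is exactly what surfaced to the client as the HTTP 500 above.

```python
# Illustrative configs: a typical HF-style config vs. a NeMo-style
# config that has no top-level "model_type" key (hypothetical contents).
hf_config = {"model_type": "llama", "architectures": ["LlamaForCausalLM"]}
nemo_config = {"encoder": {}, "decoder": {}}  # no "model_type" key

def load_llm(config: dict) -> str:
    # The crashing path: a bare key lookup with no .get() fallback.
    return config["model_type"]

load_llm(hf_config)  # "llama"
try:
    load_llm(nemo_config)
except KeyError as exc:
    # Reported to the client as HTTP 500: 'model_type'
    print(f"500: {exc}")
```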

The model itself works fine when loaded directly via `mlx_audio.stt.utils.load_model`, which already has its own directory-name fallback in `base_load_model`.

Fix

After all config-based checks, extract the first hyphen-separated segment of the model directory name and match it against `AUDIO_STT/TTS/STS_MODEL_TYPES` — the same sets already used for config-based detection. The segment is guarded against `_LLM_TYPE_COLLISIONS` to prevent false positives on directories named `llama-...`, `qwen3-...`, etc.

```python
# e.g. "parakeet-tdt-0.6b-v2" → stem = "parakeet" → matches AUDIO_STT_MODEL_TYPES → "audio_stt"
dir_stem = model_path.name.lower().split("-")[0]
if dir_stem and dir_stem not in _LLM_TYPE_COLLISIONS:
    if dir_stem in AUDIO_STT_MODEL_TYPES:
        return "audio_stt"
    ...
```

This mirrors the existing logic in `mlx_audio.utils.base_load_model`, which already resolves the model type from the directory name when `config.json` is missing the field.

Affected Models

Any audio model with a NeMo-format config, including:

  • `parakeet-tdt-0.6b-v2` (confirmed: returned HTTP 500 before the fix, now correctly detected as `audio_stt`)
  • Any future parakeet or NeMo-converted model added to `~/.omlx/models/`

Note

A companion PR, Blaizzy/mlx-audio#657, adds `model_type: "parakeet"` injection in `ModelConfig.from_dict()`, which addresses the root cause at the mlx-audio level. This omlx fix provides defense in depth for any other audio models that may have the same omission.

…ction

Models whose config.json omits a top-level model_type field (e.g.
parakeet-tdt models, which use NeMo-format config files) were being
classified as 'llm' because detect_model_type() only inspects the
architectures[] and model_type fields from config.json.

This caused a KeyError('model_type') at inference time: the wrong
engine (BatchedEngine/LLM) was loaded for the model, and that engine's
LLM loading path called config['model_type'] without a .get() fallback.

Fix: after all config-based checks, extract the first hyphen-separated
segment of the model directory name and match it against the same
mlx-audio AUDIO_STT/TTS/STS_MODEL_TYPES sets used for config-based
detection.  The stem is checked against _LLM_TYPE_COLLISIONS to avoid
false positives (e.g. a directory named 'llama-...' should not be
detected as audio).

This matches how mlx_audio.utils.base_load_model already resolves
model type from path when config is missing the field.
@ryancee ryancee force-pushed the fix/audio-model-discovery-dirname-fallback branch from 90ca033 to 32d6ec7 Compare April 18, 2026 16:53
@jundot jundot force-pushed the main branch 2 times, most recently from 7844f15 to b078330 Compare April 28, 2026 02:11
