fix(providers): container slots advertise hal0 registry id via --alias (cutover routing) #675

Merged: thinmintdev merged 1 commit into main from fix/container-slot-model-alias on Jun 9, 2026

@thinmintdev

Contributor

Why

Found during the #662 live cutover: hal0/* virtual model names never route to container slots.

The chat-normalization gate (_normalize_chat_body) resolves hal0/chat → the registry id qwopus3.6-27b-v2, but the container's llama-server advertises the raw GGUF basename (Qwopus3.6-27B-v2-…gguf). dispatcher.dispatch() finds no upstream advertising qwopus3.6-27b-v2 → NoRouteFound → lemonade fall-through → 404 (lemond tries to load it from HuggingFace).

Confirmed live: routing via the bare GGUF name reached the container fine, so the only gap is the name the container advertises.
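
The failure mode is easier to see in miniature. The sketch below assumes the dispatcher matches the requested id exactly against the names each upstream advertises. `Upstream`, the `dispatch` signature, and the placeholder basename `model-file.gguf` are hypothetical stand-ins, not hal0's actual code (the real basename is elided above).

```python
from dataclasses import dataclass


class NoRouteFound(Exception):
    """No upstream advertises the requested model id."""


@dataclass(frozen=True)
class Upstream:
    # Hypothetical record: a base URL plus the model ids the server reports.
    base_url: str
    advertised: frozenset


def dispatch(model_id, upstreams):
    """Return the base URL of the first upstream advertising model_id (exact match)."""
    for upstream in upstreams:
        if model_id in upstream.advertised:
            return upstream.base_url
    raise NoRouteFound(model_id)


REGISTRY_ID = "qwopus3.6-27b-v2"   # what hal0/chat normalizes to
GGUF_BASENAME = "model-file.gguf"  # placeholder for the raw GGUF basename
CONTAINER = "http://127.0.0.1:8102"

# Container advertising only the GGUF basename: the registry id has no route.
before = [Upstream(CONTAINER, frozenset({GGUF_BASENAME}))]
# Container advertising the registry id: the normalized name matches.
with_alias = [Upstream(CONTAINER, frozenset({REGISTRY_ID}))]

try:
    dispatch(REGISTRY_ID, before)
except NoRouteFound:
    print("registry id has no route while only the basename is advertised")

print(dispatch(GGUF_BASENAME, before))    # the bare GGUF name still routes
print(dispatch(REGISTRY_ID, with_alias))  # the registry id routes once advertised
```

Under that exact-match assumption, the bare GGUF name routes and the registry id does not, which is consistent with what was observed live.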

Fix

Pass llama-server --alias <registry-model-id> (from model_info._model_key, fallback [model].default) so the container advertises its hal0 id. resolved_command_for_slot surfaces it too.

After this: hal0/chat → qwopus3.6-27b-v2 → matches the container's advertised alias → routes to 127.0.0.1:8102. Same for hal0/agent → chadrock-35b-ace-saber.
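
A minimal sketch of the alias resolution and the resulting command line. `--model`, `--port`, and `--alias` are existing llama-server flags; the helper names, the dict-shaped `model_info` and `model_config`, the `fallback-id` value, and the GGUF path are assumptions for illustration, not the actual _render_unit implementation.

```python
def resolve_alias(model_info, model_config):
    """Prefer the registry key; fall back to the [model].default value."""
    # Assumption: both inputs are plain dicts; the real objects may differ.
    return model_info.get("_model_key") or model_config.get("default")


def llama_server_argv(gguf_path, port, alias):
    """Build the llama-server argv, appending --alias only when one is known."""
    argv = ["llama-server", "--model", gguf_path, "--port", str(port)]
    if alias:
        argv += ["--alias", alias]
    return argv


alias = resolve_alias({"_model_key": "qwopus3.6-27b-v2"}, {"default": "fallback-id"})
# Hypothetical path; the container would mount the real GGUF file here.
print(llama_server_argv("/models/model-file.gguf", 8102, alias))
```

Omitting `--alias` when no id resolves keeps the previous behavior (the server advertises the GGUF basename) rather than passing an empty flag value.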

Tests

TDD: _render_unit emits --alias, load_sync threads _model_key, resolved_command_for_slot includes it. 195 passed; ruff clean.

Gating fix for the container-runtime cutover. Relates to #652, #662.

🤖 Generated with Claude Code

Discovered during the #662 live cutover: hal0/* virtual names never reached
container slots. The chat-normalization gate resolves e.g. hal0/chat ->
"qwopus3.6-27b-v2" (registry id), but the container's llama-server advertises
the raw GGUF basename ("Qwopus3.6-27B-v2-...gguf"), so dispatcher.dispatch()
finds no matching upstream -> NoRouteFound -> lemonade fall-through (which then
404s trying to load the model from HuggingFace). Routing via the bare GGUF
name worked, confirming the only gap is the advertised name.

Fix: pass llama-server `--alias <registry-model-id>` (from model_info._model_key,
falling back to [model].default) so the container advertises its hal0 id and the
dispatcher matches it. resolved_command_for_slot shows --alias too.

Verified on CT105: bare-GGUF route reached the container; normalization emits the
exact registry id the alias now advertises.

Gating fix for the container-runtime cutover (#662).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@thinmintdev thinmintdev merged commit 1f392bf into main Jun 9, 2026
4 checks passed
@thinmintdev thinmintdev deleted the fix/container-slot-model-alias branch June 9, 2026 01:42