feat: profiles page + container edit drawer + resolved_command (#658) #671
Merged
…ackend (#658)

Backend
- container.py: add `resolved_command_for_slot()`, a pure helper returning the llama-server argv list (image + port + model + profile flags) without fabricating anything client-side. Shared source of truth with _render_unit.
- slots.py: _container_state_enrichment now emits `runtime`, `profile`, `image` (from profiles catalog), and `resolved_command` for every container slot on GET /api/slots. Lemonade slots are unaffected.
- tests: 3 new tests covering runtime/profile/image/resolved_command fields + flag presence.

Frontend: Profiles page (new)
- useProfiles hook + barrel export + profiles endpoint constant.
- ProfilesView: lists profiles by intent label (MoE agents · ROCmFP4 · ~52.8 tok/s, Dense chat + MTP · ~24.4 tok/s, Vulkan std fallback); custom profiles derive label from image tag + mtp flag.
- nav item (Profiles, between Slots and Models), route "profiles" wired.
- CSS: .pf-card / .pf-intent / .pf-meta / .pf-flags.

Frontend: EditSlotDrawer (container branching)
- Provider strip: shows image tag instead of "lemonade" for container slots.
- Backend strip: shows profile + image_status instead of declared/actual.
- Mismatch banner: hidden for container slots.
- Device + Runtime Backend selectors: replaced with read-only profile strip.
- n_gpu_layers, rope_freq_base, extra_args: read-only ("defined by profile") for container slots; lemonade slots unchanged.
- idle_timeout_s + workers: hidden for container slots (no lemond idle-unload).
- ctx_size warning: reworded to "⟳ restarts the container (~model-load seconds)".
- Flags preview: container slots show backend-provided `resolved_command` (real podman argv); lemonade slots keep `effectiveFlagsFor()` unchanged.

Frontend: CreateSlotModal (container compat)
- Runtime selector (lemonade/container) added; profile picker replaces device selector for container slots.
- Model filter: container slots accept any model of the right type.
- canSave requires profile for container slots.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
npm approve-scripts adds this entry so `npm run build` can run esbuild's postinstall step in clean worktrees. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Backend now emits resolved_command[] on container slots via _container_state_enrichment. Wire through TypeScript interface and normalizeSlot() passthrough so the EditSlotDrawer can read it. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- resolved_command_for_slot: model lives in cfg["model"]["default"]
(nested TOML table), not at top-level; str() of the dict was emitting
'--model {default: ...}'. Now extracts the string correctly.
- onSaveClick: container slots no longer send n_gpu_layers, rope_freq_base,
device, idle_timeout_s, workers, or llamacpp_args — those are defined by
the profile and must not be overwritten on save.
- Test: add --model assertion so the model-token bug can't regress silently.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
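The dict-repr bug fixed in this commit can be reproduced in a few lines. This is a minimal sketch, not the project's code: a plain dict stands in for the parsed slot TOML, and every key other than the nested `model.default` is an assumption made for illustration.

```python
# Stand-in for a parsed slot config. Only the nested [model] table with a
# `default` key comes from the commit message; the other key is assumed.
cfg = {
    "runtime": "container",
    "model": {"default": "llama-3b"},
}

# Buggy form: str() of the nested table yields a dict repr, not a model id.
buggy_argv = ["--model", str(cfg["model"])]

# Fixed form: index into the nested table to get the plain string.
fixed_argv = ["--model", cfg["model"]["default"]]

print(" ".join(buggy_argv))
print(" ".join(fixed_argv))
```

Joining the argv and asserting on the `--model <id>` substring, as the added test does, catches the regression because the dict repr never contains that exact token pair.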
…iner drawer - profiles-page-v3.spec.ts: 9 tests covering the Profiles nav item, ProfileCard rendering, intent labels, and resolved_command correctness - mock-data.ts: add MOCK_DATA.profiles (3 seed profiles) - apiMock.ts: intercept /api/profiles in installDefaultMocks - All 9 Playwright tests pass (23s, chromium) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
thinmintdev
commented
Jun 8, 2026
Contributor
Author
VERDICT: APPROVE-ON-GREEN — merge-ready once the python matrix + γ-suite pass
ui already green; python (3.11/3.12) + γ-suite pending. Solid backend + UI work; lemond parity is preserved (the #668 lesson). Three non-blocking notes below.
SPEC (vs #658): ACs met.
- resolved_command_for_slot() correctly reads the model from cfg["model"]["default"] (nested TOML table), not str(cfg["model"]) — the dict-repr bug is fixed and guarded by test asserting "--model llama-3b" in the joined argv. Image + flags (via resolve_profile_flags) + port all assembled correctly; podman preamble omitted by design (needs root GIDs, not useful for debugging).
- /api/slots emits runtime/profile/image/resolved_command on both list + detail paths for container slots; lemonade slots get none of these (test_lemonade_slot_has_no_runtime_container_fields). Graceful degradation to None on profile-lookup failure / absent profile.
- Profiles page: useProfiles hook + ProfilesView with intent labels (MoE agents · ROCmFP4 · ~52.8 / Dense chat + MTP · ~24.4 / Vulkan std fallback), custom-profile fallback to image tag + MTP. Nav/route/TopBar wired. Graceful loading/error/empty states.
- Container edit drawer: profile picker replaces device/backend; n_gpu_layers/rope_freq_base/extra_args read-only with "defined by profile" hint; idle_timeout_s/workers hidden; provider→image tag, declared/actual→profile+image_status, mismatch banner hidden; resolved_command shown instead of effectiveFlagsFor; ctx_size restart copy reworded. Create modal: runtime selector + profile picker, canSave gated on profile for container.
- Container SAVE body correctly excludes profile-owned knobs: isContainerSave gates out n_gpu_layers/device/rope_freq_base/idle_timeout_s/workers/llamacpp_args — only ctx_size + default sent.
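The enrichment behavior checked in the SPEC list above (container-only fields, nothing on lemonade slots, degradation to None when the profile is missing) could look roughly like the sketch below. The function name, the catalog shape, the image tag, and the choice to null out both `image` and `resolved_command` on a failed lookup are assumptions for illustration; the real `_container_state_enrichment` may differ.

```python
from typing import Any, Optional

# Hypothetical in-memory profiles catalog; the image tag and flag values
# are placeholders, not values taken from the PR.
CATALOG: dict[str, dict[str, Any]] = {
    "moe-rocmfp4": {
        "image": "example/llama-server:rocmfp4",
        "flags": ["--n-gpu-layers", "99"],
    },
}


def enrich_slot(slot: dict[str, Any],
                catalog: dict[str, dict[str, Any]]) -> dict[str, Any]:
    """Return extra API fields for a slot; empty for non-container slots."""
    if slot.get("runtime") != "container":
        return {}  # lemonade slots get none of the container fields

    profile_name: Optional[str] = slot.get("profile")
    profile = catalog.get(profile_name) if profile_name else None
    if profile is None:
        # Absent or unknown profile: degrade to None rather than invent data.
        return {"runtime": "container", "profile": profile_name,
                "image": None, "resolved_command": None}

    command = [profile["image"], "--port", str(slot["port"]),
               "--model", slot["model"]["default"], *profile["flags"]]
    return {"runtime": "container", "profile": profile_name,
            "image": profile["image"], "resolved_command": command}
```

Returning an empty dict for lemonade slots keeps their payload unchanged, which is the parity property the review cares about.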
STANDARDS — lemond parity PRESERVED (the #668 lesson):
- Every container change in EditSlotDrawer/CreateSlotModal is gated on slot.runtime==="container"; the lemond `else` branch contains the ORIGINAL code verbatim (Device row, Runtime Backend IIFE, idle/workers rows, extra_args hint, effectiveFlagsFor preview).
- Lemond save body is byte-equivalent: !isContainerSave && X → X, and the ternary spreads resolve to the original {device}/{n_gpu_layers} for lemond. Verified.
- Create modal: runtime defaults to "lemonade", so an operator who ignores the new Runtime dropdown gets identical behavior (device selector + device in body + original rocmfp4 model filter). The added Runtime dropdown is additive, not a regression.
- effectiveFlagsFor() correctly KEPT for lemond (issue AC "lemond keeps its form" outranks the audit's "delete it"). This is a conscious, documented tradeoff: lemond still shows the approximate client-side preview while container shows the real resolved_command.
- No fabricated data; container fields degrade to null/"—" when absent.
NON-BLOCKING NOTES (P2 / maintainer's call):
- resolved_command "--model" token shows the model ID (cfg["model"]["default"], e.g. "llama-3b"), NOT the absolute .gguf path the container actually launches with. The real load path (_render_unit via load_sync) uses _resolve_model_path(model_info) → /mnt/ai-models/llama-3b.gguf, so the displayed "real podman argv" diverges from reality on the model arg. The unit test enshrines the id form (--model llama-3b) while the Playwright fixture hand-writes a full .gguf path; the two are inconsistent, so nothing pins the divergence. It's a recognizable read-only preview, AC arguably met in spirit; decide id-vs-path. (Note: the absent --ctx-size is FAITHFUL to #666's _render_unit, which also omits it; not a #671 bug.)
- Hot read-path cost: _container_state_enrichment now calls load_profiles_config() AND resolved_command_for_slot() (which calls load_profiles_config again) per container slot per /api/slots poll — two TOML disk-reads/parses each, on top of the #667 systemctl+health probes. Cache or pass the catalog through in P2.
- Minor: `except (KeyError, Exception)` is redundant (== `except Exception`); container create hardcodes device:"gpu-rocm" even for a vulkan-std profile (cosmetic: container ignores device, but writes mildly misleading TOML). One-liners.
CONTEXT NOTE: #668 merged with my prior CHANGES open (isSlotLive/stateChipClass-warming parity). #671 branches from main and inherits those but does NOT touch them — out of scope here; they remain a #668 follow-up.
MERGE-READY: yes, on green python + γ. The three notes are P2 polish / a maintainer id-vs-path decision, not blockers. Unblocks #659/#660.
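The id-vs-path decision can be made concrete with a small sketch showing both candidate forms of the `--model` token. The helper name and the `<models_dir>/<id>.gguf` layout are assumptions inferred from the single example path in the note; the real `_resolve_model_path(model_info)` may resolve paths differently.

```python
from pathlib import PurePosixPath


def model_arg(model_id: str,
              models_dir: str = "/mnt/ai-models",
              as_path: bool = False) -> list[str]:
    """Return the --model token pair in either id form or absolute-path form."""
    if as_path:
        # Assumed layout: one .gguf file per model id directly under models_dir.
        return ["--model", str(PurePosixPath(models_dir) / f"{model_id}.gguf")]
    return ["--model", model_id]


print(model_arg("llama-3b"))
print(model_arg("llama-3b", as_path=True))
```

Whichever form is chosen, the unit test and the Playwright fixture would need to assert the same one so the preview cannot drift from what is displayed.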
Summary
Implements issue #658: Profiles page, container-slot edit drawer, and backend `resolved_command`.
Backend (`src/hal0/providers/container.py`, `src/hal0/api/routes/slots.py`)
- `resolved_command_for_slot()`: new pure helper that builds the real llama-server argv for a container slot (image tag + `--host`, `--port`, `--model`, profile flags). Omits the podman preamble (requires root/GIDs, not useful for debugging).
- The model is read from `cfg["model"]["default"]` (nested TOML table); it was erroneously `str(cfg["model"])`, which emitted a dict repr like `{'default': 'llama-3b'}`.
- `_container_state_enrichment()`: emits `runtime`, `profile`, `image`, `resolved_command` on every container slot list entry. Applied on both the list (`GET /api/slots`) and detail (`GET /api/slots/{name}`) paths.
Frontend
- Profiles page (`ui/src/dash/profiles.jsx`): `ProfilesView` component with a `ProfileCard` per profile. Intent labels: `moe-rocmfp4` → "MoE agents · ROCmFP4 · ~52.8 tok/s", `dense-mtp-rocmfp4` → "Dense chat + MTP · ~24.4 tok/s", `vulkan-std` → "Vulkan std (fallback)". Image tag + profile name as secondary metadata. Registered in nav, routing, and TopBar.
- `useProfiles` hook (`ui/src/api/hooks/useProfiles.ts`): queries `GET /api/profiles`, 60s stale time.
- Slot modals (`ui/src/dash/slot-modals.jsx`), branching on `runtime=container` slots:
  - `n_gpu_layers`, `rope_freq_base`, `extra_args` are read-only with "defined by profile" hint.
  - `idle_timeout_s` and `workers` hidden for container slots.
  - Container save body sends only `ctx_size` + `default`; explicitly excludes `n_gpu_layers`, `device`, `rope_freq_base`, `idle_timeout_s`, `workers`, `llamacpp_args` (owned by profile).
  - Flags preview: container slots show the backend `resolved_command` (real argv); lemond slots keep `effectiveFlagsFor()` preview unchanged.
  - Create modal: container slots send `{ runtime: "container", profile }`.
Decision notes
- `effectiveFlagsFor()` is kept and used unchanged for lemond slots; only container slots use `resolved_command`. The issue spec says "DELETE effectiveFlagsFor()" but also "lemond unchanged", so keeping it for lemond is the correct interpretation.
- `resolved_command` is on the list path (via `_container_state_enrichment`) because `EditSlotDrawer` gets its `slot` from the list payload through `useSlots()`.
Tests
- `tests/api/test_slots_container_state.py`: 10 tests, 3 of them new for the `runtime`/`profile`/`image`/`resolved_command` fields, including a `--model llama-3b` assertion that guards the nested-dict bug.
- `ui/tests/e2e/specs/profiles-page-v3.spec.ts`: 9 Playwright tests for profiles page and container drawer.
Closes #658
Parent epic: #652