
feat: profiles page + container edit drawer + resolved_command (#658) #671

Merged
thinmintdev merged 5 commits into main from issue-658-profiles
Jun 8, 2026

Conversation

@thinmintdev thinmintdev commented Jun 8, 2026

Summary

Implements issue #658 — Profiles page, container-slot edit drawer, and backend resolved_command.

Backend (src/hal0/providers/container.py, src/hal0/api/routes/slots.py)

  • resolved_command_for_slot(): new pure helper that builds the real llama-server argv for a container slot (image tag + --host, --port, --model, profile flags). Omits the podman preamble (requires root/GIDs, not useful for debugging).
  • Model extraction: now reads cfg["model"]["default"] (nested TOML table). The previous code used str(cfg["model"]), which emitted a dict repr like {'default': 'llama-3b'} as the --model value.
  • _container_state_enrichment(): emits runtime, profile, image, resolved_command on every container slot list entry. Applied on both the list (GET /api/slots) and detail (GET /api/slots/{name}) paths.
  • Lemonade slots: unaffected — no new fields added.
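
A minimal sketch of the argv assembly and the nested-model fix described above. This is not the code in container.py: the function name, its parameters, the `port` key, the image tag, and the `--example-flag` placeholder are all assumptions made for illustration.

```python
from typing import Any, Mapping, Sequence


def resolved_command_sketch(
    cfg: Mapping[str, Any],
    image: str,
    profile_flags: Sequence[str],
    host: str = "0.0.0.0",
) -> list[str]:
    """Build a llama-server-style argv preview for a container slot.

    Hypothetical stand-in for resolved_command_for_slot(); the real
    signature is not shown in this PR description.
    """
    # The model id lives in the nested [model] table, so read the
    # "default" key instead of stringifying the whole table.
    model = cfg["model"]["default"]
    return [
        image,
        "--host", host,
        "--port", str(cfg["port"]),  # "port" key is an assumed config shape
        "--model", model,
        *profile_flags,
    ]


# Assumed slot config shape: a nested [model] table with a "default" key.
cfg = {"port": 8081, "model": {"default": "llama-3b"}}
argv = resolved_command_sketch(cfg, "example/image:tag", ["--example-flag"])

# What the pre-fix code effectively passed as the --model value.
buggy_token = str(cfg["model"])
```

Joining `argv` with spaces yields a string containing `--model llama-3b`, which is the form the new unit test asserts on, whereas `buggy_token` is the dict repr that the old code leaked into the command.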

Frontend

  • Profiles page (ui/src/dash/profiles.jsx): ProfilesView component with ProfileCard per profile. Intent labels: moe-rocmfp4 → "MoE agents · ROCmFP4 · ~52.8 tok/s", dense-mtp-rocmfp4 → "Dense chat + MTP · ~24.4 tok/s", vulkan-std → "Vulkan std (fallback)". Image tag + profile name as secondary metadata. Registered in nav, routing, and TopBar.
  • useProfiles hook (ui/src/api/hooks/useProfiles.ts): queries GET /api/profiles, 60s stale time.
  • Edit drawer container branch (ui/src/dash/slot-modals.jsx):
    • Profile picker replaces Device/Backend selectors for runtime=container slots.
    • n_gpu_layers, rope_freq_base, extra_args are read-only with "defined by profile" hint.
    • idle_timeout_s and workers hidden for container slots.
    • Save body: container slots only send ctx_size + default — explicitly excludes n_gpu_layers, device, rope_freq_base, idle_timeout_s, workers, llamacpp_args (owned by profile).
    • Flags section: container slots show backend-provided resolved_command (real argv); lemond slots keep effectiveFlagsFor() preview unchanged.
    • ctx_size restart warning reworded: "⟳ restarts the container (~model-load seconds)" for container slots.
    • Backend mismatch banner hidden for container slots.
  • Create modal: runtime selector (lemonade/container); profile picker for container runtime; creates with { runtime: "container", profile }.
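
The save-body rule for container slots can be sketched as an allow-list. The real logic lives in the JSX drawer (ui/src/dash/slot-modals.jsx); Python is used here only to illustrate the rule, and `build_save_body`, `CONTAINER_SAVE_KEYS`, and the form shape are hypothetical names.

```python
from typing import Any, Mapping

# Per the PR description, container slots send only these two keys;
# everything else is owned by the profile.
CONTAINER_SAVE_KEYS = ("ctx_size", "default")


def build_save_body(runtime: str, form: Mapping[str, Any]) -> dict[str, Any]:
    """Return the payload a save would send for the given runtime."""
    if runtime == "container":
        # Allow-list: profile-owned knobs can never leak into the body.
        return {k: form[k] for k in CONTAINER_SAVE_KEYS if k in form}
    # Lemonade slots keep sending the full form unchanged.
    return dict(form)


form = {
    "ctx_size": 8192,
    "default": "llama-3b",
    "n_gpu_layers": 99,
    "device": "gpu-rocm",
    "workers": 1,
}
container_body = build_save_body("container", form)
lemonade_body = build_save_body("lemonade", form)
```

An allow-list is used in this sketch rather than a deny-list so that a knob added to the form later is excluded from container saves by default.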

Decision notes

  • effectiveFlagsFor() is kept and used unchanged for lemond slots — only container slots use resolved_command. The issue spec says "DELETE effectiveFlagsFor()" but also "lemond unchanged" — keeping it for lemond is the correct interpretation.
  • resolved_command is on the list path (via _container_state_enrichment) because EditSlotDrawer gets its slot from the list payload through useSlots().

Tests

  • tests/api/test_slots_container_state.py: 10 tests — 3 new for runtime/profile/image/resolved_command fields including a --model llama-3b assertion that guards the nested-dict bug.
  • ui/tests/e2e/specs/profiles-page-v3.spec.ts: 9 Playwright tests for profiles page and container drawer.
  • Full API suite: 787 passed, 3 skipped, 0 failed.
  • UI build: clean (vite, 132 modules).
  • ruff: all branch-changed files pass.

Closes #658
Parent epic: #652

thinmintdev and others added 5 commits June 8, 2026 08:14
…ackend (#658)

Backend
- container.py: add `resolved_command_for_slot()` — pure helper returning
  the llama-server argv list (image + port + model + profile flags) without
  fabricating anything client-side. Shared source of truth with _render_unit.
- slots.py: _container_state_enrichment now emits `runtime`, `profile`,
  `image` (from profiles catalog), and `resolved_command` for every container
  slot on GET /api/slots. Lemonade slots are unaffected.
- tests: 3 new tests covering runtime/profile/image/resolved_command
  fields + flag presence.

Frontend — Profiles page (new)
- useProfiles hook + barrel export + profiles endpoint constant.
- ProfilesView: lists profiles by intent label (MoE agents · ROCmFP4 ·
  ~52.8 tok/s, Dense chat + MTP · ~24.4 tok/s, Vulkan std fallback);
  custom profiles derive label from image tag + mtp flag.
- nav item (Profiles, between Slots and Models), route "profiles" wired.
- CSS: .pf-card / .pf-intent / .pf-meta / .pf-flags.

Frontend — EditSlotDrawer (container branching)
- Provider strip: shows image tag instead of "lemonade" for container slots.
- Backend strip: shows profile + image_status instead of declared/actual.
- Mismatch banner: hidden for container slots.
- Device + Runtime Backend selectors: replaced with read-only profile strip.
- n_gpu_layers, rope_freq_base, extra_args: read-only ("defined by profile")
  for container slots; lemonade slots unchanged.
- idle_timeout_s + workers: hidden for container slots (no lemond idle-unload).
- ctx_size warning: reworded to "⟳ restarts the container (~model-load seconds)".
- Flags preview: container slots show backend-provided `resolved_command`
  (real podman argv); lemonade slots keep `effectiveFlagsFor()` unchanged.

Frontend — CreateSlotModal (container compat)
- Runtime selector (lemonade/container) added; profile picker replaces
  device selector for container slots.
- Model filter: container slots accept any model of the right type.
- canSave requires profile for container slots.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

npm approve-scripts adds this entry so `npm run build` can run
esbuild's postinstall step in clean worktrees.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Backend now emits resolved_command[] on container slots via
_container_state_enrichment. Wire through TypeScript interface and
normalizeSlot() passthrough so the EditSlotDrawer can read it.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

- resolved_command_for_slot: model lives in cfg["model"]["default"]
  (nested TOML table), not at top-level; str() of the dict was emitting
  '--model {default: ...}'. Now extracts the string correctly.
- onSaveClick: container slots no longer send n_gpu_layers, rope_freq_base,
  device, idle_timeout_s, workers, or llamacpp_args — those are defined by
  the profile and must not be overwritten on save.
- Test: add --model assertion so the model-token bug can't regress silently.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…iner drawer

- profiles-page-v3.spec.ts: 9 tests covering the Profiles nav item,
  ProfileCard rendering, intent labels, and resolved_command correctness
- mock-data.ts: add MOCK_DATA.profiles (3 seed profiles)
- apiMock.ts: intercept /api/profiles in installDefaultMocks
- All 9 Playwright tests pass (23s, chromium)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

@thinmintdev thinmintdev left a comment

VERDICT: APPROVE-ON-GREEN — merge-ready once the python matrix + γ-suite pass

ui already green; python (3.11/3.12) + γ-suite pending. Solid backend + UI work; lemond parity is preserved (the #668 lesson). Three non-blocking notes below.

SPEC (vs #658): ACs met.

  • resolved_command_for_slot() correctly reads the model from cfg["model"]["default"] (nested TOML table), not str(cfg["model"]) — the dict-repr bug is fixed and guarded by test asserting "--model llama-3b" in the joined argv. Image + flags (via resolve_profile_flags) + port all assembled correctly; podman preamble omitted by design (needs root GIDs, not useful for debugging).
  • /api/slots emits runtime/profile/image/resolved_command on both list + detail paths for container slots; lemonade slots get none of these (test_lemonade_slot_has_no_runtime_container_fields). Graceful degradation to None on profile-lookup failure / absent profile.
  • Profiles page: useProfiles hook + ProfilesView with intent labels (MoE agents · ROCmFP4 · ~52.8 / Dense chat + MTP · ~24.4 / Vulkan std fallback), custom-profile fallback to image tag + MTP. Nav/route/TopBar wired. Graceful loading/error/empty states.
  • Container edit drawer: profile picker replaces device/backend; n_gpu_layers/rope_freq_base/extra_args read-only with "defined by profile" hint; idle_timeout_s/workers hidden; provider→image tag, declared/actual→profile+image_status, mismatch banner hidden; resolved_command shown instead of effectiveFlagsFor; ctx_size restart copy reworded. Create modal: runtime selector + profile picker, canSave gated on profile for container.
  • Container SAVE body correctly excludes profile-owned knobs: isContainerSave gates out n_gpu_layers/device/rope_freq_base/idle_timeout_s/workers/llamacpp_args — only ctx_size + default sent.
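
The enrichment contract in the second bullet (container-only fields, degrading to None when the profile cannot be resolved) might look roughly like the sketch below. The function name, the catalog shape (`image` and `flags` keys), and the slot dict layout are assumptions, not the actual `_container_state_enrichment` implementation.

```python
from typing import Any, Mapping


def enrich_slot(
    slot: Mapping[str, Any],
    catalog: Mapping[str, Mapping[str, Any]],
) -> dict[str, Any]:
    """Add container-only fields to a slot entry (hypothetical shape)."""
    out = dict(slot)
    if slot.get("runtime") != "container":
        # Lemonade slots are returned untouched: no new keys at all.
        return out

    profile = slot.get("profile")
    entry = catalog.get(profile) if profile else None

    out["profile"] = profile
    # Degrade to None instead of raising when the profile is absent
    # or not present in the catalog.
    out["image"] = entry.get("image") if entry else None
    out["resolved_command"] = (
        [entry["image"], *entry.get("flags", [])] if entry else None
    )
    return out


catalog = {"vulkan-std": {"image": "example/image:tag", "flags": ["--example-flag"]}}

lemonade = enrich_slot({"name": "chat"}, catalog)
known = enrich_slot({"name": "a", "runtime": "container", "profile": "vulkan-std"}, catalog)
unknown = enrich_slot({"name": "b", "runtime": "container", "profile": "missing"}, catalog)
```

In this sketch a container slot always carries the four keys, so the UI can render a placeholder for None values, while a lemonade slot carries none of them.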

STANDARDS — lemond parity PRESERVED (the #668 lesson):

  • Every container change in EditSlotDrawer/CreateSlotModal is gated on slot.runtime==="container"; the lemond else branch contains the ORIGINAL code verbatim (Device row, Runtime Backend IIFE, idle/workers rows, extra_args hint, effectiveFlagsFor preview).
  • Lemond save body is byte-equivalent: !isContainerSave && X → X, and the ternary spreads resolve to the original {device}/{n_gpu_layers} for lemond. Verified.
  • Create modal: runtime defaults to "lemonade", so an operator who ignores the new Runtime dropdown gets identical behavior (device selector + device in body + original rocmfp4 model filter). The added Runtime dropdown is additive, not a regression.
  • effectiveFlagsFor() correctly KEPT for lemond (issue AC "lemond keeps its form" outranks the audit's "delete it"). This is a conscious, documented tradeoff: lemond still shows the approximate client-side preview while container shows the real resolved_command.
  • No fabricated data; container fields degrade to null/"—" when absent.

NON-BLOCKING NOTES (P2 / maintainer's call):

  1. resolved_command "--model" token shows the model ID (cfg["model"]["default"], e.g. "llama-3b"), NOT the absolute .gguf path the container actually launches with. The real load path (_render_unit via load_sync) uses _resolve_model_path(model_info) → /mnt/ai-models/llama-3b.gguf, so the displayed "real podman argv" diverges from reality on the model arg. The unit test enshrines the id form (--model llama-3b) while the Playwright fixture hand-writes a full .gguf path; the two are inconsistent, so nothing pins the divergence. It's a recognizable read-only preview, AC arguably met in spirit; decide id-vs-path. (Note: the absent --ctx-size is FAITHFUL to #666's _render_unit, which also omits it, so it is not a #671 bug.)
  2. Hot read-path cost: _container_state_enrichment now calls load_profiles_config() AND resolved_command_for_slot() (which calls load_profiles_config again) per container slot per /api/slots poll — two TOML disk-reads/parses each, on top of the #667 systemctl+health probes. Cache or pass the catalog through in P2.
  3. Minor: except (KeyError, Exception) is redundant (== except Exception); container create hardcodes device:"gpu-rocm" even for a vulkan-std profile (cosmetic — container ignores device, but writes mildly misleading TOML). One-liners.
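
One way to address note 2 is to load the catalog once per request and pass it down, rather than re-reading it inside each helper. The sketch below assumes a loader callable and a catalog keyed by profile name; `enrich_all`, `fake_loader`, and the `image` key are hypothetical and stand in for whatever load_profiles_config() actually returns.

```python
from typing import Any, Callable, Mapping

Catalog = Mapping[str, Mapping[str, Any]]


def enrich_all(
    slots: list[dict[str, Any]],
    load_catalog: Callable[[], Catalog],
) -> list[dict[str, Any]]:
    """Enrich every slot using a single catalog read per request."""
    catalog = load_catalog()  # one disk read/parse, shared by all slots
    enriched = []
    for slot in slots:
        entry = catalog.get(slot.get("profile", ""), {})
        enriched.append({**slot, "image": entry.get("image")})
    return enriched


# Count loader invocations to show the catalog is read only once.
load_calls: list[int] = []


def fake_loader() -> Catalog:
    load_calls.append(1)
    return {"vulkan-std": {"image": "example/image:tag"}}


result = enrich_all(
    [{"profile": "vulkan-std"}, {"profile": "vulkan-std"}],
    fake_loader,
)
```

With two slots the loader runs once instead of four times. A time-based cache would also work, but passing the catalog through keeps the helper pure, which matches how resolved_command_for_slot() is described in the summary.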

CONTEXT NOTE: #668 merged with my prior CHANGES open (isSlotLive/stateChipClass-warming parity). #671 branches from main and inherits those but does NOT touch them — out of scope here; they remain a #668 follow-up.

MERGE-READY: yes, on green python + γ. The three notes are P2 polish / a maintainer id-vs-path decision, not blockers. Unblocks #659/#660.

@thinmintdev thinmintdev merged commit 235c9a0 into main Jun 8, 2026
4 checks passed
@thinmintdev thinmintdev deleted the issue-658-profiles branch June 8, 2026 12:42

Development

Successfully merging this pull request may close these issues.

Dashboard: Profiles page + Create/Edit profile-picker + resolved-command preview
