feat: container image-pull progress SSE + UI pull bar #673

Merged
thinmintdev merged 1 commit into main from issue-659-pull
Jun 8, 2026

@thinmintdev

Contributor

Summary

  • Backend: image_status (present|pulling|missing) per container slot in /api/slots; POST /api/slots/{name}/pull starts a background docker/podman pull; GET /api/slots/{name}/pull/stream SSE streams layer progress (layer N/M) until terminal
  • Frontend: SlotImagePullBar in SlotCard (shows when image_status==="pulling"); ImagePullBar with layer N/M label in ErrorSlotCardBanner; Re-pull button in error banner wired to real pull endpoint (was dead __hal0Toast); useSlotImagePull() hook mirrors usePullJob pattern; @keyframes hal0-indeterminate CSS animation
  • Tests: 12 targeted pytest tests covering image_status enrichment, pull route idempotency, SSE stream terminal frames, image_present() unit tests with fake runtime scripts, pull_image_stream() completed/failed frames

Acceptance criteria

  • Pull SSE streams layer progress; image_status surfaced (present|pulling|missing)
  • UI shows a pull bar while the profile image downloads
  • Graceful when image already present (stream emits state=present immediately)
  • aria-live=polite on progress text; labels end "…"; distinct from model download
  • Re-pull wired to real endpoint (not dead toast)

Test plan

  • PYTHONPATH=src pytest tests/api/test_slots_image_pull.py -q — 12 pass
  • ruff check src/hal0/api/routes/slots.py src/hal0/providers/container.py tests/api/test_slots_image_pull.py — clean
  • cd ui && npm run build — clean
  • On CT105 with a container slot: POST /api/slots/{name}/pull + GET /api/slots/{name}/pull/stream should stream docker layer lines and complete
  • Dashboard: container slot card shows indeterminate bar while pulling

Closes #659
Parent: #652

🤖 Generated with Claude Code

Backend:
- ContainerProvider.image_present(): inspect syscall for present/missing check
- ContainerProvider.pull_image_stream(): async generator yielding layer progress
  from docker/podman pull stdout (Pulling fs layer → Pull complete heuristic)
- _container_state_enrichment: image_status (present|pulling|missing) per slot,
  reads active slot_pull_jobs registry first, falls back to image_present()
- POST /api/slots/{name}/pull (202): start background image pull, idempotent
- GET /api/slots/{name}/pull/stream (SSE): 0.5s poll loop; terminal frame
  (present|missing) when no pull active; state|layer|total_layers frames in-flight
- app.state.slot_pull_jobs registry initialised in lifespan
- endpoints.ts: slotPull + slotPullStream added
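
The poll-based stream described above can be sketched roughly as follows. This is a minimal illustration, not the code in slots.py: `PullJob`, `sse_frame`, and `pull_stream` are hypothetical names, the registry is a plain dict, and the caller is assumed to pass in the result of the image-present check.

```python
import asyncio
import json
from dataclasses import dataclass
from typing import AsyncIterator, Dict

TERMINAL_STATES = {"completed", "failed", "present", "missing"}


@dataclass
class PullJob:
    """Hypothetical stand-in for the job object kept in slot_pull_jobs."""
    state: str = "pulling"
    layer: int = 0
    total_layers: int = 0


def sse_frame(payload: dict) -> str:
    """Encode one Server-Sent Events message: a data line, then a blank line."""
    return f"data: {json.dumps(payload)}\n\n"


async def pull_stream(
    jobs: Dict[str, PullJob],
    name: str,
    image_present: bool,
    poll_s: float = 0.5,
) -> AsyncIterator[str]:
    job = jobs.get(name)
    if job is None:
        # No pull in flight: emit a single terminal frame and stop.
        yield sse_frame({"state": "present" if image_present else "missing"})
        return

    last = None
    while True:
        snapshot = (job.state, job.layer, job.total_layers)
        if snapshot != last:
            # The first pass always emits, giving the client an immediate snapshot.
            yield sse_frame(
                {"state": job.state, "layer": job.layer, "total_layers": job.total_layers}
            )
            last = snapshot
        if job.state in TERMINAL_STATES:
            return
        await asyncio.sleep(poll_s)
```

Comparing snapshots before yielding keeps the stream quiet between layer changes, at the cost of up to one poll interval of latency.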

Frontend:
- useSlotImagePull() hook (useSlots.ts): mirrors usePullJob pattern, owns one
  EventSource; start(name) POSTs then opens SSE stream; invalidates ['slots']
  on terminal state
- SlotImagePullBar (slots.jsx): shows when image_status==="pulling"; aria-live
  polite; label ends "…"; indeterminate bar animation
- ImagePullBar (slot-modals.jsx): layer N/M progress + pct bar for active pull
- ErrorSlotCardBanner Re-pull button: wired to pull.start(slot.name), disabled
  while in-flight, shows inline ImagePullBar during/after pull
- @keyframes hal0-indeterminate in dashboard.css

Tests (tests/api/test_slots_image_pull.py): 12 tests covering image_status
fields, POST pull idempotency, SSE stream terminal frames, image_present unit
tests with fake runtime scripts, pull_image_stream completed/failed frames.
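
One way to make the present/missing check testable without a real runtime is to inject the process runner. The sketch below is an assumption about shape, not the PR's ContainerProvider code: the `run` parameter and the fake runners are hypothetical, and the check relies on `docker image inspect` / `podman image inspect` exiting non-zero when the image is absent.

```python
import subprocess
from typing import Callable


def image_present(
    runtime: str,
    image: str,
    run: Callable[..., subprocess.CompletedProcess] = subprocess.run,
) -> bool:
    """Return True if `<runtime> image inspect <image>` exits with status 0."""
    try:
        result = run(
            [runtime, "image", "inspect", image],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            check=False,
        )
    except OSError:
        # Runtime binary missing or not executable: treat as "missing".
        return False
    return result.returncode == 0


# Fake runners standing in for the fake runtime scripts used by the tests.
def fake_present(cmd, **kwargs):
    return subprocess.CompletedProcess(cmd, 0)


def fake_missing(cmd, **kwargs):
    return subprocess.CompletedProcess(cmd, 1)
```

Calling `image_present("docker", "some/image:tag", run=fake_missing)` exercises the missing path without touching the host.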

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

@thinmintdev thinmintdev left a comment

Contributor Author

VERDICT: APPROVE-ON-GREEN — merge-ready once the python Test substep + γ-suite pass

ui green; python Lint + Format check already GREEN on both 3.11/3.12 (confirmed: the #664 lint-failure class does NOT recur; note the docstring claims "marked ARG001 to suppress the linter" on _run_image_pull without a visible # noqa, but Lint passed, so ruff's ARG ruleset isn't enabled here; moot). Only the Test substep remains pending → APPROVE-ON-GREEN. No overlap with #672 (capacity.py vs slots.py; they merge in parallel cleanly).

SPEC (vs #659): ACs met.

  • image_status (present|pulling|missing) added in _container_state_enrichment, which already skips lemond slots — so image_status is CONTAINER-ONLY; lemond slots never get it (verified; tests cover present/missing/pulling + lemond absence). Pulling is read from slot_pull_jobs without an extra inspect syscall; otherwise inspect via image_present (executor-dispatched), error→"missing".
  • POST /{name}/pull: 202, idempotent (existing pulling job → resumed:true, no second pull), 404 for unknown slot (await sm.status(name) first), BadRequest when no profile/image. BackgroundTasks runs _run_image_pull.
  • GET /{name}/pull/stream SSE: immediate snapshot, per-layer frames, terminal frame; graceful present/missing terminal when no job active. app.state.slot_pull_jobs registered in lifespan.
  • UI: SlotImagePullBar (card, image_status==="pulling", aria-live polite, indeterminate hal0-indeterminate), ImagePullBar (banner, N/M % + completed/failed), useSlotImagePull hook (POST→SSE, terminal closes stream + invalidates slots), Re-pull wired to pull.start (was dead __hal0Toast).
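
The idempotent POST behaviour in the second bullet can be illustrated with a small registry sketch. The names `SlotPull` and `start_pull` and the response shape are assumptions for illustration; the real route also performs the 404 and BadRequest checks and schedules the work through BackgroundTasks.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class SlotPull:
    """Hypothetical per-slot pull record."""
    image: str
    state: str = "pulling"


def start_pull(
    jobs: Dict[str, SlotPull],
    name: str,
    image: str,
    launch: Callable[[SlotPull], None],
) -> dict:
    existing = jobs.get(name)
    if existing is not None and existing.state == "pulling":
        # A pull is already in flight: report it instead of starting another.
        return {"slot": name, "state": "pulling", "resumed": True}

    job = SlotPull(image=image)
    jobs[name] = job  # Overwrites any finished job left from an earlier pull.
    launch(job)       # Stand-in for handing the job to a background task.
    return {"slot": name, "state": "pulling", "resumed": False}
```

Keying the registry by slot name is what bounds it to one entry per slot, as the notes further down observe.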

STANDARDS — clean:

  • Lemond unaffected: image_status only on container slots; no new lemond fields. No fabricated data on lemond cards.
  • No SSE leak: on client disconnect, CancelledError propagates through the await asyncio.sleep(0.5) and the generator stops; the background task OWNS the subprocess and reaps it via finally: proc.kill(), so the pull continues to completion (intended: a UI disconnect shouldn't abort a pull) with no subprocess leak.
  • Idempotency + 404 verified by tests; image_present/pull_image_stream unit-tested with fake-runtime scripts (completed + failed frames).
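
The ownership pattern in the second bullet (the background task, not the SSE generator, holds the subprocess) might look like the sketch below. `run_pull` is a hypothetical name, and the `returncode is None` guard before `kill()` is an assumption added here; the PR is only described as calling proc.kill() in a finally block.

```python
import asyncio
from typing import List


async def run_pull(cmd: List[str]) -> int:
    """Run a pull command to completion and always reap the child process."""
    proc = await asyncio.create_subprocess_exec(
        *cmd,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.STDOUT,
    )
    try:
        assert proc.stdout is not None
        async for _line in proc.stdout:
            # Real code would update the job's layer counters from each line.
            pass
        return await proc.wait()
    finally:
        # Only reached with a live process if the task was cancelled or failed.
        if proc.returncode is None:
            proc.kill()
            await proc.wait()
```

Because SSE subscribers only read the job state, cancelling a subscriber never reaches this task, so the pull keeps running after a browser tab closes.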

NON-BLOCKING NOTES:

  1. Layer N/M counting is a logic bug on BOTH runtimes (re: "no fabricated data"), not just unverified. The heuristic counts lifecycle lines as layers: one downloaded layer emits "Pulling fs layer" + "Waiting" + "Verifying Checksum" (each +1 total_layers) AND "Download complete" + "Pull complete" (each +1 done_layers), so N and M inflate by different factors → the ImagePullBar percentage is wrong on docker. On podman (the PREFERRED runtime) those keywords mostly don't match → total_layers≈0 → pct=null → indeterminate. "Downloading" progress lines are ignored, so the count doesn't advance during the actual ~6 GB transfer. Recommend: drop or caveat the N/M fraction in ImagePullBar and keep it indeterminate like the card's SlotImagePullBar (which is already honest). NON-BLOCKING because the AC-relevant card surface is already indeterminate and the banner self-heals to "Image ready".
  2. _ImagePullJob class docstring describes "an asyncio.Event used to wake SSE subscribers" but the implementation POLLS (asyncio.sleep(0.5) + layer-delta compare); no Event in slots. Stale doc — cosmetic.
  3. Minor UI/API (single-user LAN-acceptable): EventSource onerror immediately sets state="failed"+closes, so a transient network blip shows "failed" though the background pull continues (self-heals on next /api/slots poll); completed _ImagePullJob entries linger in slot_pull_jobs (bounded — one per slot, overwritten on re-pull); the stream endpoint emits "missing" for an unknown slot while POST returns 404 (minor inconsistency).
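
If the N/M fraction from note 1 is kept rather than dropped, one possible fix is to count distinct layer IDs instead of lifecycle lines. This is a sketch under the assumption that lines have docker's `<12-hex-id>: <status>` shape; `count_layers` is a hypothetical helper, and podman output is not handled, so it would yield (0, 0) and the bar would stay indeterminate.

```python
import re
from typing import Iterable, Tuple

# Assumed docker-style progress line, e.g. "a1b2c3d4e5f6: Pull complete".
LAYER_LINE = re.compile(r"^(?P<id>[0-9a-f]{12}): (?P<status>.+)$")
DONE_STATUSES = {"Pull complete", "Already exists"}


def count_layers(lines: Iterable[str]) -> Tuple[int, int]:
    """Return (done_layers, total_layers), counting each layer ID once."""
    seen = set()
    done = set()
    for raw in lines:
        match = LAYER_LINE.match(raw.strip())
        if match is None:
            continue  # Header, digest, and status lines carry no layer ID.
        seen.add(match["id"])
        if match["status"] in DONE_STATUSES:
            done.add(match["id"])
    return len(done), len(seen)
```

With sets keyed by layer ID, "Waiting" and "Verifying Checksum" no longer inflate the total, and "Download complete" no longer counts a layer as finished before extraction.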

MERGE-READY: yes, on green python Test + γ. The notes are P2 polish (#1 worth a quick follow-up to stop showing a wrong layer fraction). Closes the #659 slice and the #652 container-runtime epic UI chain.

@thinmintdev thinmintdev merged commit 4dcffbe into main Jun 8, 2026
4 checks passed
@thinmintdev thinmintdev deleted the issue-659-pull branch June 8, 2026 13:23
Development

Successfully merging this pull request may close these issues.

Dashboard: container image-pull progress (SSE)