fix(deploy): install workspace npm + service pip deps in deploy.sh #196
schutera wants to merge 2 commits
Senior review — PR #196
| Check | Result |
|---|---|
| `bash -n scripts/deploy.sh` | ✅ pass |
| `make check-citations` | ✅ 7 OK, 0 problems |
| `changed_match` regex matrix (10 scenarios) | ✅ gates fire as intended |
| CI on the PR | ✅ all 9 checks green |
| Merge state | MERGEABLE / CLEAN (against main, ignoring staleness) |
| `shellcheck` | ⏭️ not installed locally |
Couldn't test locally (needs the prod host): the actual end-to-end deploy.sh run (pm2 / systemd timer / /var/www/highfive), a real root npm ci resolve+build, and pip install resolving an onnxruntime cp310 wheel on the real 3.10 interpreter.
Reviewed with Claude Code (senior-reviewer gate). Findings are input, not a verdict — worth a second look at the P0 against your intended requirements strategy.
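For readers who have not opened the script, the gate matrix in the checks table can be pictured with a minimal sketch. It assumes each gate is an extended regex matched against the list of files changed between the deployed and the incoming commit. The helper names `gate_fires` and `pip_gate`, the `NPM_GATE` variable, and the regexes themselves are illustrative assumptions, not the actual `changed_match` from `scripts/deploy.sh`.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of a changed-file gate; NOT the real changed_match.

# gate_fires REGEX FILE...
# Succeeds (exit 0) if at least one of the listed paths matches REGEX.
gate_fires() {
  local regex=$1
  shift
  printf '%s\n' "$@" | grep -Eq -- "$regex"
}

# Assumed npm gate: the root lockfile or any workspace package.json.
# The workspace names come from the PR description.
NPM_GATE='^(package-lock\.json|(contracts|backend|homepage)/package\.json)$'

# Assumed pip gate: the requirements file of one named service.
pip_gate() {
  printf '^%s/requirements[.]txt$' "$1"
}

# Example paths below are made up for illustration.
if gate_fires "$NPM_GATE" README.md homepage/package.json; then
  echo "npm gate fires: would run root npm ci"
fi
if ! gate_fires "$(pip_gate image-service)" duckdb-service/requirements.txt; then
  echo "image-service pip gate stays closed"
fi
```

Anchoring the patterns with `^` and `$` keeps a nested path that merely ends in `package.json` from tripping the gate, which is the kind of case a scenario matrix would want to cover.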
Reviewed + pushed a fix —
deploy.sh rebuilt/reloaded but never installed new deps, so a dep-adding release failed its build and auto-rolled-back.

npm: it gated npm ci on backend/package-lock.json, but this is a workspaces monorepo with one ROOT lockfile, so new backend/homepage deps were missed (broke on rotating-file-stream, #178). Now a single root 'npm ci' gated on the root lockfile / any workspace package.json, before the builds; dropped the wrong per-prefix ci.

pip: never ran. Now 'python3 -m pip install -r <svc>/requirements.txt' for duckdb-service/image-service when their requirements changed, into the system python3 pm2 uses; non-fatal, the post-reload health check is the real gate (graceful degradation on a missing optional dep).

Also rewrites the stale production-runbook 'Updates & Redeployment' section to match reality (main branch, root npm ci, pip into system python3, all 4 pm2 apps, health checks, Python 3.10 / onnxruntime 1.23.2 note).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…R-029)

The PR's runbook rewrite asserted onnxruntime is "pinned to 1.23.2" under a "Python 3.10 ceiling". After folding in main (#195 / ADR-029), the real requirements float numpy>=2.0.0 / onnxruntime>=1.23.2 / pydantic>=2.12.5 for a 3.10-3.14 matrix: a floor, not a pin. Rewrote the step-2 comment and the "Python 3.10 floor" paragraph to match, citing ADR-029, and noted that a pip upgrade is not reverted on rollback.

Added a ch11 lessons-learned entry for the workspace-lockfile npm-ci miss this PR corrects (per CLAUDE.md's mandatory doc gate).

Addresses the senior-reviewer P0/P2 findings on the PR.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
1951ddc to 135dc1f
Problem

`scripts/deploy.sh` rebuilds + reloads but never installs new dependencies, so a release that adds a dep fails its build and auto-rolls-back:

- npm: `npm --prefix backend ci` runs only when `backend/package-lock.json` changed. But this is an npm workspaces monorepo (`contracts`/`backend`/`homepage` share one root `package-lock.json`), so a new backend/homepage dep changes the root lockfile: the condition never fires, `npm ci` is skipped, and `tsc`/`vite` build against missing deps → rollback. Hit on `rotating-file-stream` (Turn the admin "Server Logs" panel into a real live log console #178). `npm --prefix <pkg> ci` is also the wrong command for workspaces.
- pip: the Python services' deps (`opencv`/`numpy`/`onnxruntime`) are never installed before `pm2 reload`.

Changes: `scripts/deploy.sh`

- npm: a single root `npm ci` when the root `package-lock.json` or any workspace `package.json` changed, run before the builds; dropped the per-prefix `npm --prefix backend|homepage ci`. Fatal on failure (rollback): a broken install means a broken build anyway.
- pip: `pip install` for `duckdb-service`/`image-service` when their `requirements.txt` changed, into the system `python3` pm2 runs them with (no venv). Non-fatal: a resolver miss (e.g. an onnxruntime with no wheel for this Python) must not block the deploy. The existing post-reload health check is the real gate (services degrade gracefully on a missing optional dep; a genuinely-required missing module crashes the reload → health fails → rollback).

Changes: `docs/07-deployment-view/production-runbook.md`

Rewrote the stale "Updates & Redeployment" section to match reality: deploys from `main` (not a `production` branch), root `npm ci` (workspaces), `pip install` for both Python services into system `python3`, the Node builds, `pm2 reload` of all four apps, and the four health checks. Added the Python-3.10 ceiling note (onnxruntime pinned to 1.23.2 = max cp310 wheel; ESP runs no models, see ADR-028).

Verification

- `bash -n scripts/deploy.sh` passes.
- `onnxruntime==1.23.2` loads and runs the real `hole_detector.onnx`; image-service also boots with opencv/numpy/onnxruntime all absent (graceful no-op), which is why the pip step is non-fatal.
- `rotating-file-stream` (Turn the admin "Server Logs" panel into a real live log console #178) failure; root `npm ci` resolves it (the dep is in the root lockfile).

Pairs with #191 (Python 3.10 compat): together they let `main` redeploy cleanly on the 3.10 host.

🤖 Generated with Claude Code
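The split between a fatal npm install and a non-fatal pip install described in this PR can be illustrated with a small sketch. It is not the code in `scripts/deploy.sh`: the `run_step` helper, its `fatal`/`nonfatal` modes, and the log wording are assumptions made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical sketch; run_step and its arguments are illustrative only.

# run_step MODE LABEL CMD...
# MODE=fatal:    a failing CMD fails the step, so the caller can roll back.
# MODE=nonfatal: a failing CMD is logged and the step still succeeds, leaving
#                the post-reload health check as the real gate.
run_step() {
  local mode=$1 label=$2
  shift 2
  if "$@"; then
    echo "ok: $label"
    return 0
  fi
  if [ "$mode" = nonfatal ]; then
    echo "warn: $label failed, continuing" >&2
    return 0
  fi
  echo "error: $label failed" >&2
  return 1
}

# Intended use (left as comments so the sketch has no side effects):
#   run_step fatal    "root npm ci"       npm ci || exit 1
#   run_step nonfatal "pip image-service" \
#     python3 -m pip install -r image-service/requirements.txt
```

Keeping the policy in one helper makes the asymmetry explicit at each call site: a broken npm install stops the deploy before the builds, while a pip resolver miss only produces a warning.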