chore: adopt production as the gated release source (closes #152) #194

Open
cofade wants to merge 2 commits into main from claude/issue-152-production-plan-vnmbrv

@cofade cofade commented Jun 27, 2026

Collaborator

@schutera — this implements the decision you approved for #152: production becomes the future release source.

What changed

  • scripts/deploy.sh: BRANCH="main" → BRANCH="production"; branch-agnostic notify text; firmware-publish + prod-* tags now ride production; success notify reminds operators to merge an auto-bump back to main (cherry-pick won't restore fast-forwardability).
  • systemd units + .deploy.env.example: descriptions/comments updated from main to production.
  • Deploy docs (production-deployment.md, production-runbook.md, firmware-release.md, ch7 README.md, esp-flashing.md, CLAUDE.md): rewritten to one coherent model — production is the single gated release branch for both web services and firmware OTA; main is the integration line; a release is a fast-forward of production onto a chosen main commit. Added a release/promotion workflow + one-time host cutover; corrected the Docker-vs-PM2 attribution (scripts/deploy.sh is the bare-metal PM2 path — no Docker).
  • chapter 11: the "production drifted" entry marked RESOLVED, with the root cause recorded (main's history was rebuilt → unrelated git ancestry → production could never fast-forward → silent drift) and the fast-forwardable-deploy-branch rule.
  • New ADR-028 documenting the model, reconciliation, and the auto-bump trade-off.

Why

Investigation for #152 found the documented services deploy source (production) didn't match reality (the live scripts/deploy.sh timer pulled main), firmware OTA and services were documented as separate tracks, and — the root cause — main and production shared no common git ancestor (main's history was rebuilt 2026-05-21), so production was structurally unable to fast-forward and silently rotted. This change reconciles all of that into a single gated production release branch carrying both tracks. Addresses #152.
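
The "no common git ancestor" condition described above can be checked directly. The following is a minimal sketch, assuming both branches have already been fetched as origin/main and origin/production; the helper name shares_ancestor is hypothetical and not part of this PR. It relies on git merge-base exiting non-zero when two commits have no common ancestor.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of this PR): report whether two refs
# share any common ancestor.
shares_ancestor() {
  local a="$1" b="$2"
  # Fail loudly on a missing ref so it is not mistaken for "unrelated".
  git rev-parse --verify --quiet "${a}^{commit}" >/dev/null \
    || { echo "unknown ref: $a" >&2; return 2; }
  git rev-parse --verify --quiet "${b}^{commit}" >/dev/null \
    || { echo "unknown ref: $b" >&2; return 2; }
  # git merge-base exits non-zero when no common ancestor exists.
  if git merge-base "$a" "$b" >/dev/null; then
    echo "related"
  else
    echo "unrelated"
    return 1
  fi
}
```

Invoked as `shares_ancestor origin/main origin/production`, this would be expected to report unrelated histories before the reconciliation and related histories afterwards.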

Operator steps NOT done in this PR (require pushes outside this branch / access to the prod host), documented in ADR-028 + production-deployment.md:

  1. Archive + force-reset: git tag archive/production-2026-05-02 origin/production (recovery point), then force-update origin/production to main's tip. After this the histories share an ancestor and all future promotions are clean fast-forwards.
  2. One-time prod-host cutover: git checkout production so the deploy timer (BRANCH=production) tracks it.
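
After the host cutover in step 2, the checkout can be sanity-checked before the timer next fires. This is a sketch only: assert_deploy_branch is a hypothetical helper, not something scripts/deploy.sh is known to contain, and it assumes it is run from inside the live checkout.

```shell
#!/usr/bin/env bash
# Hypothetical post-cutover check (not part of this PR): confirm the
# working tree is on the branch the deploy timer is expected to track.
assert_deploy_branch() {
  local want="${1:-production}" have
  # symbolic-ref fails on a detached HEAD, which should also be rejected.
  have=$(git symbolic-ref --short -q HEAD) \
    || { echo "detached HEAD, expected $want" >&2; return 1; }
  if [ "$have" != "$want" ]; then
    echo "on $have, expected $want" >&2
    return 1
  fi
  echo "ok: on $want"
}
```

Running `assert_deploy_branch production` in the live checkout prints a confirmation on success and returns non-zero otherwise, so it can gate a manual first deploy.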

How tested

  • ESP32-CAM native (pio test -e native)
  • End-to-end (pytest tests/e2e)
  • Backend unit (Node 22 + TS)
  • image-service unit (Python 3.11)
  • duckdb-service unit (Python 3.11)
  • Homepage unit (React 19 + Vite)
  • Manual / verification (describe below)

No unit/e2e layer exercises deploy-branch configuration. Ran bash -n scripts/deploy.sh (clean) and make check-citations (7 OK, 0 problems). Passed three rounds of the senior-reviewer gate (final: clean, no P0/P1). The live deploy cutover is an operator step and cannot be exercised from CI — see the operator steps above.
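
For context on what the bash -n check above does and does not cover: it parses the script without executing it, so it catches syntax errors but not a wrong branch name or a missing command. A small illustration, using a hypothetical wrapper named syntax_check:

```shell
#!/usr/bin/env bash
# Hypothetical wrapper (illustration only): bash -n parses a script
# without running any of its commands.
syntax_check() {
  if bash -n "$1" 2>/dev/null; then
    echo "syntax ok"
  else
    echo "syntax error"
    return 1
  fi
}
```

A script with an unterminated `if` fails this check, while a script that calls a nonexistent command or names a nonexistent branch passes it, which is why the live cutover still has to be verified by an operator.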

Checklist

  • Tests added or updated — N/A: no test layer covers the deploy-branch/config change; verification was bash -n + make check-citations + review gate.
  • Documentation updated where applicable
  • No secrets, credentials, or large binaries committed
  • CI is green on this branch — pending (just pushed)
  • Breaking changes called out — operationally significant: switches the live deploy source. Requires the one-time reconciliation + host cutover above before the model is live; firmware prod-* tags now cut on production.

🤖 Generated with Claude Code

https://claude.ai/code/session_017drgAN84qrn61eZ1yZTdgS


@cofade cofade requested a review from schutera June 27, 2026 10:50

cofade commented Jun 28, 2026

Collaborator Author

@schutera — this PR is a policy/governance change and needs your explicit approval, not just a code review.

The decision you're approving

From now on, all production releases — web services and firmware OTA — ship from the production branch, never from main. main stays the continuous-integration line; a release becomes a deliberate fast-forward of a reviewed main commit onto production (git push origin <sha>:production), which the on-host scripts/deploy.sh timer (BRANCH=production) then deploys. prod-* tags are cut on production. Rationale and the full model are in ADR-030.
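
The promotion rule above (a release is a fast-forward of production onto a reviewed main commit) can be pre-checked locally before pushing. This is a hedged sketch: can_promote is a hypothetical helper, and the default ref names origin/production and origin/main assume a prior `git fetch origin`. A plain, non-forced `git push origin <sha>:production` already refuses a non-fast-forward update; the check only surfaces the problem earlier and also confirms the commit is on main.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check (not part of this PR) for a promotion.
can_promote() {
  local sha="$1" prod="${2:-origin/production}" main="${3:-origin/main}"
  # production must be an ancestor of the candidate, otherwise the
  # push would not be a fast-forward.
  git merge-base --is-ancestor "$prod" "$sha" \
    || { echo "not a fast-forward of $prod" >&2; return 1; }
  # the candidate must itself be reachable from main.
  git merge-base --is-ancestor "$sha" "$main" \
    || { echo "not reachable from $main" >&2; return 1; }
  echo "ok"
}
```

Typical use would be `git fetch origin && can_promote <sha> && git push origin <sha>:production`, with `<sha>` being the reviewed main commit.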

If you'd rather keep deploying from main (option 1 in the issue), say so and I'll close this instead — the branch model is the part only you can sign off on.

Two operator steps that are yours (can't be done from this PR)

These need push access to origin/production and the prod host; they're documented in ADR-030 + production-deployment.md:

  1. Reconcile production (one-time, from a maintainer clone — must be a force-reset, the histories share no common ancestor):
    git fetch origin
    git tag archive/production-2026-05-02 origin/production   # recovery point
    git push origin origin/main:production --force            # reset production onto main's tip
  2. Cut over the prod host so the deploy timer tracks production:
    cd /var/www/highfive            # wherever the live checkout is
    git fetch origin; git checkout production; git reset --hard origin/production

After this, every future promotion is a clean fast-forward.
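
The reconciliation in step 1 can also be wrapped in a single function. This is a sketch under stated assumptions, not the documented procedure: reconcile_production is a hypothetical name, and two things here go beyond the commands above. It pushes the archive tag to the remote, since `git tag` alone creates the recovery point only in the local clone, and it uses --force-with-lease instead of --force so the reset is refused if origin/production moved after the fetch.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper (not part of this PR) around the one-time reset.
reconcile_production() {
  local remote="${1:-origin}" tag="${2:-archive/production-2026-05-02}"
  git fetch "$remote" || return 1
  # Recovery point for the old production history.
  git tag "$tag" "$remote/production" || return 1
  # Assumption: publish the tag so the recovery point is not local-only.
  git push "$remote" "refs/tags/$tag" || return 1
  # Reset production onto main's tip; the lease makes the push fail if
  # the remote branch no longer matches the fetched remote-tracking ref.
  git push --force-with-lease=production "$remote" \
    "$remote/main:refs/heads/production"
}
```

After it completes, `git ls-remote origin refs/heads/production refs/heads/main` should list the same commit for both branches.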

Status of the PR itself

  • Merge conflict with main resolved. main had since merged its own ADR-028 (ML inference) + ADR-029 (Python matrix), colliding with this PR's ADR-028. Per the repo's "numbering across parallel branches" rule, this PR's ADR was renumbered 028 → 030 and all references updated; the ADR-index conflict was resolved keeping 028/029/030. Now MERGEABLE.
  • All local test layers green — backend, homepage (vitest + build), duckdb-service (232), image-service (98), ESP32-CAM native (291/291), make check-citations (0 problems), bash -n scripts/deploy.sh.
  • CLAUDE.md now carries a concise top-level "Critical rules" entry — never deploy/release from main — so the policy is discoverable every session, with links to the mechanics.

🤖 Generated with Claude Code

claude and others added 2 commits June 29, 2026 22:30
Reconcile the documented services deploy source with reality and unify it
with firmware OTA on a single gated `production` branch.

Investigation for #152 found three stacked problems: the docs named
`production` while the live auto-deploy pulled `main`; firmware OTA and the
services track were documented as separate; and `main`/`production` shared
no common git ancestor (main's history was rebuilt), so `production` could
never fast-forward and silently rotted.

Decision (per maintainer): `production` becomes the single gated release
branch for both web services and firmware OTA. `main` is the integration
line; a release is a fast-forward of `production` onto a chosen `main`
commit. `prod-*` tags are cut on `production`.

- scripts/deploy.sh: BRANCH main -> production; branch-agnostic notify text
- production-deployment.md: drop drift warning; add release/promotion +
  one-time host cutover section
- production-runbook.md: document the promote-then-pull model
- firmware-release.md: rewrite the branch & tag model (both tracks on
  production); replace the "known drift" callout with a history note;
  update the release-checklist commit/tag step
- chapter 11: mark the drift lesson RESOLVED; record the unrelated-history
  root cause and the fast-forwardable-deploy-branch rule
- new ADR-028; update README/esp-flashing/CLAUDE.md pointers

The branch reconciliation (archive tag + force-reset of origin/production)
and the one-time prod-host checkout are operator steps documented in
ADR-028 and production-deployment.md, to run after this lands on main.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_017drgAN84qrn61eZ1yZTdgS

The production-as-gated-release-branch model (#152 / ADR-030) was only
spelled out in the firmware-OTA section. Add a concise hard rule to the
top-level "Critical rules (do NOT violate)" list so every session knows
prod releases ship from `production`, never from `main`. Links to the
full mechanics rather than duplicating them.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@cofade cofade force-pushed the claude/issue-152-production-plan-vnmbrv branch from c0981d1 to ebfde09 June 29, 2026 20:31