ci: port openclaw publish, deploy, and tlonbot-smoke workflows #5931

Open
patosullivan wants to merge 5 commits into po/move-openclaw-tlon-plugin-to-monorepo from po/openclaw-workflows
@patosullivan

Member

Summary

Stacked on #5930. Ports the remaining three CI/CD workflows from tloncorp/openclaw-tlon so the old repo's automation can be switched off at the end of the transition window: npm publish, ship-restart deploy, and the tlonbot smoke dispatch. Companion PR in tloncorp/tlonbot (linked below) teaches the smoke harness to check out the plugin from this monorepo.

All three workflows are inert or fail-safe until ops steps land (see the secrets table): publish needs the npm Trusted Publisher flip, deploy warn-skips without its GCP secrets, and the smoke dispatch warn-skips without its token.
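
The warn-skip behaviour mentioned above could be implemented as a first step along these lines. This is a minimal sketch, not the step from the actual workflows: the function name `warn_skip_if_unset` and the output key `skip` are hypothetical, while the `::warning::` workflow command and the `GITHUB_OUTPUT` file are standard GitHub Actions mechanisms.

```shell
# Hypothetical guard: warn and flag the job as skippable when a secret is unset.
# Usage: warn_skip_if_unset <secret-name> <secret-value>
warn_skip_if_unset() {
  name="$1"
  value="$2"
  # Outside of GitHub Actions GITHUB_OUTPUT is unset, so fall back to /dev/null.
  out="${GITHUB_OUTPUT:-/dev/null}"
  if [ -z "$value" ]; then
    # Shows up as a warning annotation on the run; the step itself still succeeds.
    echo "::warning::${name} is not set; skipping."
    echo "skip=true" >> "$out"
  else
    echo "skip=false" >> "$out"
  fi
}
```

In a workflow, the secret would be passed to the step through `env:` and the remaining steps would be gated with an `if:` on the step's `skip` output, so a missing secret leaves the run green with a visible warning.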

Changes

  1. .github/workflows/openclaw-publish.yml (port of publish.yml): triggers on tags matching openclaw-v* (precedent: desktop-v*) or on workflow_dispatch with dry_run/npm_tag inputs. The build job runs the root install plus the plugin test/build and uploads packages/openclaw/dist; the publish job stages .publish/ via prepare-publish-package.js (which resolves workspace:^ deps; added in #5930, "move openclaw tlon plugin to the monorepo"), hard-fails on any leftover workspace: spec, then runs npm publish --provenance.
    • Deviation: build job uses Node 22 (repo .nvmrc) instead of the old repo's Node 24 — the monorepo install rebuilds better-sqlite3, which has no Node 24 prebuilds. The publish job keeps Node 24 for npm ≥ 11.5 (OIDC trusted publishing); it does no installs.
  2. .github/workflows/openclaw-deploy.yml (port of deploy.yml): pushes to develop touching packages/openclaw/src/** restart internal ships; tlawn (prod) is workflow_dispatch-only — the old stable → tlawn auto-trigger is intentionally dropped so prod restarts stay a deliberate action, decoupled from the develop→master release sync. Restart job body is otherwise verbatim (GKE + OVH contexts, jq ship filters).
    • Deviation: a first step warn-skips the whole job when GCP_SA_KEY is unset, so develop pushes stay green until secrets are provisioned.
  3. .github/workflows/openclaw-dispatch-tlonbot-smoke.yml (port of dispatch-tlonbot-smoke.yml): on develop pushes touching packages/openclaw/**, dispatches tlonbot-smoke-test with the commit SHA, now also sending repo: tloncorp/tlon-apps and path: packages/openclaw so tlonbot checks out the monorepo. Keeps the warn-skip when the token secret is unset.
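
The "hard-fails on any leftover workspace: spec" guard in the publish workflow can be approximated with a plain text scan of the staged manifest. The sketch below is an assumption about how such a check might look, not the step in openclaw-publish.yml; the function name `check_no_workspace_specs` is hypothetical.

```shell
# Hypothetical check: fail if a staged package.json still contains a
# "workspace:" version spec such as "workspace:^" or "workspace:*".
# Usage: check_no_workspace_specs <path-to-package.json>
check_no_workspace_specs() {
  manifest="$1"
  # grep -n prints offending lines with line numbers to aid debugging.
  if grep -n '"workspace:' "$manifest"; then
    echo "error: unresolved workspace: spec in ${manifest}" >&2
    return 1
  fi
  return 0
}
```

A text match is deliberately conservative: it flags the spec wherever it appears (dependencies, peerDependencies, and so on) without needing a JSON parser in the job.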

Companion PR: tloncorp/tlonbot#97 — both plugin consumers (smoke harness and production tlawn.py) accept the monorepo layout via env vars (TLON_PLUGIN_REPO / TLON_PLUGIN_SUBPATH): sparse blobless checkout, workspace-dep resolution before install, repo-aware ref defaults (develop/master for tlon-apps). Legacy defaults everywhere, so merging changes nothing until the deployment env is flipped.
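
The env-gated behaviour with legacy defaults can be pictured roughly as follows. This is a sketch only: the real logic lives in tloncorp/tlonbot and may differ. The function names are hypothetical, the empty legacy subpath is an assumption, and the pairing of refs to modes for tlon-apps (develop for internal, master for tlawn) is inferred from the old repo's master/stable pairing rather than confirmed.

```shell
# Sketch only. Legacy defaults apply whenever the env vars are unset.
plugin_repo() {
  echo "${TLON_PLUGIN_REPO:-tloncorp/openclaw-tlon}"
}

plugin_subpath() {
  # Empty means the plugin sits at the repository root (assumed legacy layout).
  echo "${TLON_PLUGIN_SUBPATH:-}"
}

# Usage: default_plugin_ref <internal|tlawn>
default_plugin_ref() {
  mode="$1"
  case "$(plugin_repo):${mode}" in
    tloncorp/tlon-apps:tlawn) echo "master" ;;   # assumed pairing
    tloncorp/tlon-apps:*)     echo "develop" ;;  # assumed pairing
    *:tlawn)                  echo "stable" ;;   # old repo: stable for tlawn
    *)                        echo "master" ;;   # old repo: master for internal
  esac
}
```

With neither variable set, every function returns the old-repo value, which is why merging changes nothing until the deployment env is flipped.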

How did I test?

  • actionlint (via rhysd/actionlint Docker image) passes on all three workflows.
  • Trigger semantics, job bodies, and if: gates diffed against the old repo's workflows; deviations are only the ones listed above.
  • Live validation is structurally limited pre-merge: workflow_dispatch only appears once the workflow exists on the default branch, and push triggers only fire on develop. Post-merge runbook below covers the first live exercises (publish dry-run, manual internal restart).

Risks and impact

  • Safe to rollback without consulting PR author? Yes
  • Affects important code area:
    • Other: CI/CD workflows only (no app or package code)

Secrets / ops checklist (none block merging; workflows skip gracefully)

| Item | For | Provisioning |
| --- | --- | --- |
| GCP_SA_EMAIL, GCP_SA_KEY | openclaw-deploy | Same SA the old repo uses (project prod-f0181862); mint a fresh key via IAM. The existing GCP_SERVICE_KEY secret is a different SA (glob uploads); don't reuse. |
| KUBECONFIG_B64 | openclaw-deploy (OVH) | From the OVH cluster admins (base64 kubeconfig with admin@ovh-oregon-{1,2} contexts). |
| TLONBOT_DISPATCH_TOKEN | smoke dispatch | Fine-grained PAT on tloncorp/tlonbot with Contents: read & write (repository_dispatch needs write; the read-only TLONBOT_TOKEN PAT won't work). Add only after the companion tlonbot PR lands. |
| npm Trusted Publisher | openclaw-publish | npm-side setting: re-point @tloncorp/openclaw from tloncorp/openclaw-tlon/publish.yml to tloncorp/tlon-apps/openclaw-publish.yml. Until flipped, only dry-runs succeed. Never both repos publishable at once. |

Transition note: production still deploys old-repo code until the env flip

tlonbot's production tlawn.py re-fetches the plugin from openclaw-tlon git (master for internal, stable for tlawn) on every pod restart. The companion tlonbot PR adds env-gated monorepo support; until TLON_PLUGIN_REPO/TLON_PLUGIN_SUBPATH are set on the ship deployments, restarts — from either repo's deploy workflow — keep deploying openclaw-tlon code, so plugin changes landing only in this monorepo do not reach ships. Do the env flip early in the transition (one internal ship first), not at freeze.

First-release runbook (post-merge)

  1. Land the companion tlonbot PR; add TLONBOT_DISPATCH_TOKEN.
  2. Provision deploy secrets; run one manual workflow_dispatch with mode=internal and observe the pod restarts before trusting the push trigger.
  3. Dispatch openclaw-publish.yml with dry_run=true; verify the .publish contents in the logs.
  4. Flip the npm trusted publisher; disable the old repo's publish.yml the same day.
  5. Bump plugin to 0.4.4, tag openclaw-v0.4.4, publish to a test dist-tag first (npm_tag: dist-test), smoke-install, then promote to latest.
  6. Flip production to the monorepo: set TLON_PLUGIN_REPO=tloncorp/tlon-apps + TLON_PLUGIN_SUBPATH=packages/openclaw on one internal ship's deployment, verify clone/build/start, then roll out; disable the old repo's deploy.yml once proven (avoid double restarts).

Rollback plan

Revert

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 3a77a05f9c

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread .github/workflows/openclaw-deploy.yml
Comment thread .github/workflows/openclaw-publish.yml Outdated

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ae4c31b350

Comment thread .github/workflows/openclaw-deploy.yml
Comment thread .github/workflows/openclaw-publish.yml
…odel in deploy trigger

Publish: verify openclaw-v* tags match packages/openclaw's package.json
version before building (same guard as tlon-skill-publish.yml), so a
mistagged commit can't publish a mislabeled artifact.

Deploy: record why packages/api and packages/tlon-skill are deliberately
not in the path filter — ships install published npm versions of the
workspace deps (resolve-workspace-deps.mjs --registry), so develop merges
to those packages change nothing on restart; dep changes reach production
via npm publish + restart.
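
The tag/version guard described in this commit might look roughly like the following. It is a sketch under assumptions, not the step from the workflow: the function name `check_tag_matches_version` is hypothetical, and how the workflow actually reads the package version is not shown in this PR.

```shell
# Hypothetical guard: the version embedded in an openclaw-v* tag must equal
# the version field of packages/openclaw/package.json.
# Usage: check_tag_matches_version <tag> <package-version>
# In a workflow the arguments could come from, for example:
#   "$GITHUB_REF_NAME" and
#   "$(node -p "require('./packages/openclaw/package.json').version")"
check_tag_matches_version() {
  tag="$1"
  version="$2"
  case "$tag" in
    openclaw-v*) tag_version="${tag#openclaw-v}" ;;  # strip the tag prefix
    *)
      echo "error: tag '${tag}' does not match openclaw-v*" >&2
      return 1
      ;;
  esac
  if [ "$tag_version" != "$version" ]; then
    echo "error: tag version ${tag_version} != package.json version ${version}" >&2
    return 1
  fi
}
```

Running this before the build means a mistagged commit fails fast, before any artifact is produced or uploaded.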

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 6d1608f1e3

Comment thread .github/workflows/openclaw-publish.yml Outdated
…freshness

The publish job previously staged .publish/ from a fresh checkout, so a
commit whose checked-in openclaw.plugin.json lagged src/config-schema.ts
could publish a stale manifest even though the build regenerated the
right one (codex finding). Stage .publish/ in the build job next to the
build outputs and ship it as the artifact; the publish job now has no
checkout at all — it verifies and publishes the staged package verbatim.

Also fail openclaw CI when the checked-in openclaw.plugin.json doesn't
match what the build regenerates, so manifest drift can't land on
develop in the first place.
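
One common way to express the manifest-drift check is to let the build regenerate the file in place and then ask git whether the working tree changed. The sketch below assumes that approach; the function name `check_manifest_fresh` is hypothetical and the actual CI step may be implemented differently.

```shell
# Hypothetical drift check: run after the build has regenerated the manifest.
# Usage: check_manifest_fresh <path-to-manifest>
check_manifest_fresh() {
  manifest="$1"
  # git diff --exit-code exits non-zero when the working tree copy differs
  # from the checked-in (indexed) copy, and prints the difference.
  if ! git diff --exit-code -- "$manifest"; then
    echo "error: ${manifest} is stale; rebuild and commit the regenerated file" >&2
    return 1
  fi
}
```

Because the diff is printed on failure, the CI log shows exactly which fields of the manifest lag behind the schema.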

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: fc5ded5951

Comment thread .github/workflows/openclaw-publish.yml
…load

actions/upload-artifact treats everything under a dot-directory as
hidden and excludes it by default, so uploading .publish/ produced an
empty artifact and the publish job had nothing to verify or publish
(codex finding). Safe to include: .publish/ is exactly the public npm
package contents. Also set if-no-files-found: error so an empty staging
dir fails the build job instead of surfacing downstream.

@latter-bolden latter-bolden left a comment

Member

Lgtm, @yapishu might want to take a look
