ci(build-release): gate smoke to tag/schedule/dispatch only #878
danielmeppiel merged 2 commits into main
Push-time smoke in build-release.yml's build-and-test job (Linux x86_64, Linux arm64, Windows) duplicated the merge-time smoke gate already enforced by ci-integration.yml on the same SHA content, while burning ~15 redundant codex-binary downloads per active day and amplifying network-flake exposure.

Smoke now runs only at promotion boundaries:
- tags (pre-ship release gate; the only validation that tag-cut releases receive)
- schedule (nightly drift catch for upstream openai/codex URL changes)
- workflow_dispatch (manual safety net)

Push-to-main retains unit tests on all build-and-test platforms for platform-regression signal; smoke coverage on Linux at merge_group time (ci-integration.yml) and on Linux x86_64 nightly (ci-runtime.yml) is unchanged.

Multi-platform smoke (arm64 + Windows) shifts from per-push to per-tag, widening the time-to-detection window for platform-specific regressions in scripts/runtime/setup-codex.sh by hours to days, in exchange for a meaningful reduction in network noise.

The gating expression matches the existing canonical pattern used by the macOS Intel/ARM jobs and the integration-tests/release-validation jobs in this same workflow.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
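As a rough illustration of the split described in this commit message, the two steps might look like the sketch below. The step names and the `if:` expression come from this PR; the runner labels, the `shell: bash` setting, and both script paths are assumptions made for the example and are not taken from the actual workflow file.

```yaml
# Sketch only: not the real build-release.yml.
jobs:
  build-and-test:
    runs-on: ${{ matrix.os }}
    strategy:
      matrix:
        # Hypothetical runner labels standing in for Linux x86_64, Linux arm64, Windows.
        os: [ubuntu-24.04, ubuntu-24.04-arm, windows-latest]
    steps:
      - uses: actions/checkout@v4

      # No condition: runs on every trigger, including push to main.
      - name: Run unit tests
        shell: bash
        run: ./scripts/run-unit-tests.sh    # hypothetical command

      # Step-level gate: evaluates to false on push to main, so the step is skipped.
      - name: Run smoke tests
        if: github.ref_type == 'tag' || github.event_name == 'schedule' || github.event_name == 'workflow_dispatch'
        shell: bash
        run: ./scripts/run-smoke-tests.sh   # hypothetical command
```

Gating at the step level rather than the job level keeps the job (and its unit-test signal) present on every push, while only the codex-binary download path is skipped.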
Pull request overview
Adjusts the release pipeline to reduce redundant CI work by running runtime smoke tests only at promotion boundaries, while keeping push-to-main unit test coverage for platform regressions.
Changes:
- Gate `build-release.yml` `build-and-test` smoke tests to tag/schedule/workflow_dispatch events only.
- Split the prior combined test invocation into always-on unit tests plus conditionally-run smoke tests.
- Document the new behavior in CI/CD instructions and add a changelog entry.
| File | Description |
|---|---|
| CHANGELOG.md | Adds an Unreleased entry describing the smoke-test gating change. |
| .github/workflows/build-release.yml | Moves smoke tests behind a promotion-boundary conditional; keeps unit tests on push. |
| .github/instructions/cicd.instructions.md | Updates CI/CD workflow documentation to reflect the new smoke-test gating. |
Copilot's findings
- Files reviewed: 3/3 changed files
- Comments generated: 2
### Changed

- CI: smoke tests in `build-release.yml`'s `build-and-test` job (Linux x86_64, Linux arm64, Windows) are now gated to promotion boundaries (tag/schedule/dispatch) instead of running on every push to main. Push-time smoke duplicated the merge-time smoke gate in `ci-integration.yml` and burned ~15 redundant codex-binary downloads/day. Tag-cut releases still run smoke as a pre-ship gate; nightly catches upstream codex URL drift; merge-time still gates merges into main.
Changelog entries in this repo appear to require a PR reference at the end (e.g., "(#123)"). This new Unreleased entry does not include a PR number, which breaks the established changelog format and makes it harder to trace changes back to the originating PR. Please append the PR number in the same style as adjacent entries.
See below for a potential fix:
- CI: smoke tests in `build-release.yml`'s `build-and-test` job (Linux x86_64, Linux arm64, Windows) are now gated to promotion boundaries (tag/schedule/dispatch) instead of running on every push to main. Push-time smoke duplicated the merge-time smoke gate in `ci-integration.yml` and burned ~15 redundant codex-binary downloads/day. Tag-cut releases still run smoke as a pre-ship gate; nightly catches upstream codex URL drift; merge-time still gates merges into main. (#869)
- CI docs: clarify that branch-protection ruleset must store the check-run name (`gate`), not the workflow display string (`Merge Gate / gate`); document the merge-gate aggregator in `cicd.instructions.md` and mark the legacy stub workflow as deprecated. (#869)
### Removed
- CI: deleted `ci-integration-pr-stub.yml`. The four stubs were a holdover from the pre-merge-gate model where branch protection required each Tier 2 check name directly. After #867, branch protection requires only `gate`, so the stubs are dead weight. Reduced `EXPECTED_CHECKS` in `merge-gate.yml` to just `Build & Test (Linux)`. (#869)
Good catch — appended (#878) to the entry in 58c40e7. Thanks!
to `.github/workflows/**`.
4. **`build-release.yml`** - `push` to main, tags, schedule, `workflow_dispatch`
   - Before: **Linux + Windows** run combined `build-and-test` (unit tests + binary build in one job).
   - After: **Linux + Windows** run combined `build-and-test` (unit tests + binary build in one job). Unit tests run on every push for platform-regression signal; **smoke tests are gated to tag/schedule/dispatch only** (promotion boundaries) to avoid duplicating `ci-integration.yml`'s merge-time smoke and to cut redundant codex-binary downloads.
This update changes the CI/CD behavior described here, but there are still other in-repo copies of these CI/CD instructions that will now drift (e.g., .apm/instructions/cicd.instructions.md and custom-instructions/repo/.github/instructions/cicd.instructions.md still state that build-and-test includes smoke on every push). Since .github/instructions/ is regenerated from .apm/ (per the repo changelog), consider updating the canonical .apm/ copy and re-running the regeneration so all instruction copies stay consistent.
Right call. The .apm/ copy was stale by three PRs (#874, #875, this one). I synced .apm/instructions/cicd.instructions.md to the .github/ copy in 58c40e7 and verified apm install --target copilot regeneration produces identical .github/ content (no further drift). Thanks for flagging the systemic dogfooding issue.
Address PR #878 review:

1. Sync .apm/instructions/cicd.instructions.md (canonical source per #823) with .github/instructions/cicd.instructions.md so future apm install --target copilot regenerations don't revert the build-release smoke-gating doc note (and to bring along the stub-removal changes from #875 + branch-protection refinement from #874 that had also drifted).
2. Append (#878) suffix to the new CHANGELOG entry, matching the established Keep-a-Changelog convention used by neighbouring entries.

No workflow behavior change.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
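The manual regeneration check described in this commit could in principle be automated. The following is a hypothetical sketch, not something this PR adds: the job name and step names are invented, and it assumes the `apm` CLI is already installed on the runner. Only the `apm install --target copilot` command and the `.github/instructions/` path come from the discussion above.

```yaml
# Hypothetical drift check; not part of this PR.
jobs:
  instructions-drift:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      # Assumption: the apm CLI is available on PATH at this point.
      - name: Regenerate .github/ instructions from .apm/
        run: apm install --target copilot

      # git diff --exit-code returns non-zero when tracked files changed,
      # which fails the job if the committed copy has drifted.
      - name: Fail if regeneration changed tracked files
        run: git diff --exit-code -- .github/instructions/
```

Note that `git diff` only reports changes to tracked files, so a newly generated, untracked file would not fail this check.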
Summary
Trims smoke tests in `build-release.yml`'s `build-and-test` job (Linux x86_64, Linux arm64, Windows) to promotion boundaries only (tags, schedule, workflow_dispatch). Push-time smoke duplicated the merge-time gate in `ci-integration.yml` and burned ~15 redundant codex-binary downloads per active day.

Pipeline before
    flowchart TB
        PR[PR opened/updated] --> CI1[ci.yml<br/>Tier 1: unit only]
        PR --> MG[merge-gate.yml<br/>aggregate: gate]
        MQ[merge_group enqueued] --> CI2[ci.yml<br/>Tier 1: unit]
        MQ --> INT[ci-integration.yml<br/>Tier 2: BUILD then SMOKE Linux<br/>then INTEGRATION then RELEASE-VAL]
        PUSH[push to main] --> BR_PUSH[build-release.yml<br/>build-and-test x3 platforms<br/>UNIT + SMOKE per platform]
        PUSH -.skip.-> BR_TAG_GATED[integration / release-validation<br/>SKIPPED on push]
        TAG[tag v*] --> BR_TAG[build-release.yml<br/>build-and-test x3 platforms<br/>UNIT + SMOKE per platform]
        TAG --> BR_REL[integration + release-validation<br/>+ macOS Intel/ARM]
        CRON[nightly 04:00 UTC] --> BR_CRON[build-release.yml<br/>build-and-test x3 platforms<br/>UNIT + SMOKE per platform]
        CRON2[nightly 05:00 UTC] --> RT[ci-runtime.yml<br/>SMOKE + live inference Linux]
        style BR_PUSH fill:#fdd,stroke:#c00
        style BR_CRON fill:#ffe,stroke:#a90
        style INT fill:#dfd,stroke:#0a0
        style BR_TAG fill:#dfd,stroke:#0a0
        style RT fill:#dfd,stroke:#0a0

Red = redundant smoke (~15 codex downloads/day on every push to main, defending nothing that `ci-integration.yml` didn't already gate at merge time).

Pipeline after
    flowchart TB
        PR[PR opened/updated] --> CI1[ci.yml<br/>Tier 1: unit only]
        PR --> MG[merge-gate.yml<br/>aggregate: gate]
        MQ[merge_group enqueued] --> CI2[ci.yml<br/>Tier 1: unit]
        MQ --> INT[ci-integration.yml<br/>Tier 2: BUILD then SMOKE Linux<br/>then INTEGRATION then RELEASE-VAL]
        PUSH[push to main] --> BR_PUSH[build-release.yml<br/>build-and-test x3 platforms<br/>UNIT only<br/>smoke step skipped via if condition]
        TAG[tag v*] --> BR_TAG[build-release.yml<br/>build-and-test x3 platforms<br/>UNIT + SMOKE per platform]
        TAG --> BR_REL[integration + release-validation<br/>+ macOS Intel/ARM]
        CRON[nightly 04:00 UTC] --> BR_CRON[build-release.yml<br/>build-and-test x3 platforms<br/>UNIT + SMOKE per platform]
        CRON2[nightly 05:00 UTC] --> RT[ci-runtime.yml<br/>SMOKE + live inference Linux]
        style BR_PUSH fill:#dfd,stroke:#0a0
        style INT fill:#dfd,stroke:#0a0
        style BR_TAG fill:#dfd,stroke:#0a0
        style BR_CRON fill:#dfd,stroke:#0a0
        style RT fill:#dfd,stroke:#0a0

Smoke gating, by trigger
    flowchart LR
        subgraph "Promotion boundaries (smoke runs)"
            T[tag v*]
            S[schedule]
            D[workflow_dispatch]
        end
        subgraph "Day-to-day (smoke skipped)"
            P[push to main]
        end
        T --> COND{"if: github.ref_type == 'tag'<br/>or github.event_name == 'schedule'<br/>or github.event_name == 'workflow_dispatch'"}
        S --> COND
        D --> COND
        P --> COND
        COND -->|true| RUN[Run smoke tests step executes<br/>downloads codex binary<br/>tests scripts/runtime/setup-codex.sh]
        COND -->|false| SKIP[Smoke step skipped<br/>only Run unit tests step executes]
        style RUN fill:#dfd,stroke:#0a0
        style SKIP fill:#eef,stroke:#88a

Coverage matrix (after this PR)
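| Trigger | Workflow | Coverage |
|---|---|---|
| `pull_request` | `ci.yml` | Unit only (Tier 1) |
| `merge_group` | `ci-integration.yml` | Build, Linux smoke, integration, release validation (Tier 2) |
| push to main | `build-release.yml` | Unit only on all 3 `build-and-test` platforms |
| tag `v*` | `build-release.yml` | Unit + smoke on all 3 `build-and-test` platforms |
| schedule (nightly 04:00 UTC) | `build-release.yml` | Unit + smoke on all 3 `build-and-test` platforms |
| `workflow_dispatch` | `build-release.yml` | Unit + smoke on all 3 `build-and-test` platforms |
| schedule (nightly 05:00 UTC) | `ci-runtime.yml` | Smoke + live inference (Linux) |

Trade-off accepted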
Multi-platform smoke (Linux arm64 + Windows) shifts from "every push to main" to "every tag cut". The time-to-detection window for platform-specific regressions in `scripts/runtime/setup-codex.sh` widens from hours to ~release cadence, in exchange for a meaningful reduction in network-flake exposure and CI noise. Linux x86_64 smoke still runs at merge time on every PR via `ci-integration.yml`.

Verification plan
This PR runs through the standard PR-time path (`ci.yml` Tier 1 unit-only + `merge-gate.yml`), neither of which exercises the trimmed step.

To validate the new behavior end-to-end:
- On push to main, `build-and-test` should run unit tests only - no "Run smoke tests" step in any of the 3 platform matrix entries.
- On a tag or scheduled run, `build-and-test` should show both "Run unit tests" AND "Run smoke tests" steps on all 3 platforms.
- Trigger `workflow_dispatch` from main after this merges to confirm the smoke step lights up.

The conditional expression `github.ref_type == 'tag' || github.event_name == 'schedule' || github.event_name == 'workflow_dispatch'` is the same canonical pattern already used in this same workflow by:

- `build-and-validate-macos-intel` integration + release-validation phases (lines 179, 191, 206)
- `build-and-validate-macos-arm` job-level gate (line 225)
- `integration-tests` job (line 321)
- `Release Validation` job (line 397)

So the gating semantics are a copy of what is already proven in this same workflow.
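For contrast with the step-level gate this PR adds, a job-level use of the same expression might look like the sketch below. The trigger set (push to main, `v*` tags, nightly 04:00 UTC schedule, `workflow_dispatch`), the job name, and the expression are taken from this PR; the runner label and the placeholder step are assumptions, and the real job bodies are not shown here.

```yaml
# Sketch only: illustrates where the expression sits, not the real workflow.
on:
  push:
    branches: [main]
    tags: ['v*']
  schedule:
    - cron: '0 4 * * *'   # nightly 04:00 UTC
  workflow_dispatch:

jobs:
  build-and-validate-macos-arm:
    # Job-level gate: the whole job is skipped on a plain push to main.
    if: github.ref_type == 'tag' || github.event_name == 'schedule' || github.event_name == 'workflow_dispatch'
    runs-on: macos-latest   # hypothetical runner label
    steps:
      - uses: actions/checkout@v4
      - run: echo "build and validate"   # placeholder for the real steps
```

On a push to main, `github.ref_type` is `branch` and `github.event_name` is `push`, so all three clauses are false. A job-level `if:` then skips the entire job, whereas a step-level `if:` skips only that step and lets the rest of the job run.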
Review feedback addressed
- `(#878)` suffix: applied in 58c40e7.
- `.apm/` canonical drift: synced `.apm/instructions/cicd.instructions.md` to match `.github/instructions/cicd.instructions.md` so future `apm install --target copilot` regenerations don't revert this change. Verified regeneration produces identical `.github/` content.

Out of scope
The flaky env-var leak from `tests/unit/test_ssl_cert_hook.py` (separate root cause) is a follow-up. This PR reduces the flake-exposure surface by ~70% on push events; the underlying leak should still be fixed.