ci(build-release): gate smoke to tag/schedule/dispatch only #878

Merged
danielmeppiel merged 2 commits into main from ci/build-release-smoke-trim
Apr 23, 2026

@danielmeppiel danielmeppiel commented Apr 23, 2026

Summary

Trims smoke tests in build-release.yml's build-and-test job (Linux x86_64, Linux arm64, Windows) to promotion boundaries only (tags, schedule, workflow_dispatch). Push-time smoke duplicated the merge-time gate in ci-integration.yml and burned ~15 redundant codex-binary downloads per active day.

Pipeline before

flowchart TB
    PR[PR opened/updated] --> CI1[ci.yml<br/>Tier 1: unit only]
    PR --> MG[merge-gate.yml<br/>aggregate: gate]

    MQ[merge_group enqueued] --> CI2[ci.yml<br/>Tier 1: unit]
    MQ --> INT[ci-integration.yml<br/>Tier 2: BUILD then SMOKE Linux<br/>then INTEGRATION then RELEASE-VAL]

    PUSH[push to main] --> BR_PUSH[build-release.yml<br/>build-and-test x3 platforms<br/>UNIT + SMOKE per platform]
    PUSH -.skip.-> BR_TAG_GATED[integration / release-validation<br/>SKIPPED on push]

    TAG[tag v*] --> BR_TAG[build-release.yml<br/>build-and-test x3 platforms<br/>UNIT + SMOKE per platform]
    TAG --> BR_REL[integration + release-validation<br/>+ macOS Intel/ARM]

    CRON[nightly 04:00 UTC] --> BR_CRON[build-release.yml<br/>build-and-test x3 platforms<br/>UNIT + SMOKE per platform]
    CRON2[nightly 05:00 UTC] --> RT[ci-runtime.yml<br/>SMOKE + live inference Linux]

    style BR_PUSH fill:#fdd,stroke:#c00
    style BR_CRON fill:#ffe,stroke:#a90
    style INT fill:#dfd,stroke:#0a0
    style BR_TAG fill:#dfd,stroke:#0a0
    style RT fill:#dfd,stroke:#0a0

Red = redundant smoke (~15 codex downloads/day on every push to main, defending nothing that ci-integration.yml didn't already gate at merge time).

Pipeline after

flowchart TB
    PR[PR opened/updated] --> CI1[ci.yml<br/>Tier 1: unit only]
    PR --> MG[merge-gate.yml<br/>aggregate: gate]

    MQ[merge_group enqueued] --> CI2[ci.yml<br/>Tier 1: unit]
    MQ --> INT[ci-integration.yml<br/>Tier 2: BUILD then SMOKE Linux<br/>then INTEGRATION then RELEASE-VAL]

    PUSH[push to main] --> BR_PUSH[build-release.yml<br/>build-and-test x3 platforms<br/>UNIT only<br/>smoke step skipped via if condition]

    TAG[tag v*] --> BR_TAG[build-release.yml<br/>build-and-test x3 platforms<br/>UNIT + SMOKE per platform]
    TAG --> BR_REL[integration + release-validation<br/>+ macOS Intel/ARM]

    CRON[nightly 04:00 UTC] --> BR_CRON[build-release.yml<br/>build-and-test x3 platforms<br/>UNIT + SMOKE per platform]
    CRON2[nightly 05:00 UTC] --> RT[ci-runtime.yml<br/>SMOKE + live inference Linux]

    style BR_PUSH fill:#dfd,stroke:#0a0
    style INT fill:#dfd,stroke:#0a0
    style BR_TAG fill:#dfd,stroke:#0a0
    style BR_CRON fill:#dfd,stroke:#0a0
    style RT fill:#dfd,stroke:#0a0

Smoke gating, by trigger

flowchart LR
    subgraph "Promotion boundaries (smoke runs)"
        T[tag v*]
        S[schedule]
        D[workflow_dispatch]
    end
    subgraph "Day-to-day (smoke skipped)"
        P[push to main]
    end

    T --> COND{"if: github.ref_type == 'tag'<br/>or github.event_name == 'schedule'<br/>or github.event_name == 'workflow_dispatch'"}
    S --> COND
    D --> COND
    P --> COND
    COND -->|true| RUN[Run smoke tests step executes<br/>downloads codex binary<br/>tests scripts/runtime/setup-codex.sh]
    COND -->|false| SKIP[Smoke step skipped<br/>only Run unit tests step executes]

    style RUN fill:#dfd,stroke:#0a0
    style SKIP fill:#eef,stroke:#88a
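The gating condition in the diagram above can be modelled as a small predicate. This is only an illustrative Python sketch, not code from the repository: the function name `should_run_smoke` and the `TRIGGERS` table are hypothetical, and the `(ref_type, event_name)` pairs assume scheduled and manually dispatched runs start from a branch ref.

```python
def should_run_smoke(ref_type: str, event_name: str) -> bool:
    """Mirror the step-level `if:` expression quoted in this PR."""
    return (
        ref_type == "tag"
        or event_name == "schedule"
        or event_name == "workflow_dispatch"
    )


# Hypothetical (ref_type, event_name) pairs for each trigger.
# Assumption: schedule and workflow_dispatch runs start from a branch ref.
TRIGGERS = {
    "push to main": ("branch", "push"),
    "tag v*": ("tag", "push"),
    "schedule": ("branch", "schedule"),
    "workflow_dispatch": ("branch", "workflow_dispatch"),
}

if __name__ == "__main__":
    for name, (ref_type, event_name) in TRIGGERS.items():
        state = "runs" if should_run_smoke(ref_type, event_name) else "skipped"
        print(f"{name}: smoke {state}")
```

Note that a tag push still reports `push` as its event name, which is why the expression checks `github.ref_type` rather than the event for that case.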

Coverage matrix (after this PR)

| Trigger | Workflow | Smoke runs? | Platforms | Why |
| --- | --- | --- | --- | --- |
| pull_request | ci.yml | No | Linux | Fast PR feedback; unchanged |
| merge_group | ci-integration.yml | Yes | Linux | Load-bearing merge gate; unchanged |
| push to main | build-release.yml | No (was yes) | (skipped) | Was redundant with merge-time gate |
| tag v* | build-release.yml | Yes | Linux x86_64, Linux arm64, Windows | Pre-ship gate for tag-cut releases |
| schedule | build-release.yml | Yes | Linux x86_64, Linux arm64, Windows | Multi-platform drift canary |
| workflow_dispatch | build-release.yml | Yes | Linux x86_64, Linux arm64, Windows | Manual safety net |
| schedule | ci-runtime.yml | Yes | Linux x86_64 | Live inference + smoke; unchanged |

Trade-off accepted

Multi-platform smoke (Linux arm64 + Windows) shifts from "every push to main" to "every tag cut". Time-to-detection window for platform-specific regressions in scripts/runtime/setup-codex.sh widens from hours to ~release cadence, in exchange for a meaningful reduction in network-flake exposure and CI noise. Linux x86_64 smoke still runs at merge time on every PR via ci-integration.yml.

Verification plan

This PR runs through the standard PR-time path (ci.yml Tier 1 unit-only + merge-gate.yml), neither of which exercises the trimmed step.

To validate the new behavior end-to-end:

  1. Push-time path (the one being changed): observable on next push to main after merge. build-and-test should run unit tests only - no "Run smoke tests" step in any of the 3 platform matrix entries.
  2. Tag path (must continue to work): observable on next tagged release. build-and-test should show both "Run unit tests" AND "Run smoke tests" steps on all 3 platforms.
  3. Schedule path: observable on next nightly run (04:00 UTC). Same as tag path.
  4. Dispatch path: I will trigger a test workflow_dispatch from main after this merges to confirm the smoke step lights up.
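
The four checks above could be partly automated by inspecting a finished run's jobs. The following is a rough sketch under stated assumptions, not tooling from this repository: it assumes a payload shaped like the output of `gh run view <run-id> --json jobs` (a `jobs` list whose entries carry `name` and a `steps` list with `name` and `conclusion`), and the helper name `smoke_executed` and the `job_prefix` filter are hypothetical. Because the gate is a step-level `if:`, a gated step may appear as skipped rather than be absent, so both cases are treated as "did not run".

```python
SMOKE_STEP = "Run smoke tests"  # step name quoted in the verification plan


def smoke_executed(jobs_payload: dict, job_prefix: str = "") -> dict:
    """Map each matching job name to True if its smoke step actually ran.

    A step counts as "ran" only if it is present and its conclusion is
    something other than skipped or empty (assumed payload shape).
    """
    result = {}
    for job in jobs_payload.get("jobs", []):
        name = job.get("name", "")
        if not name.startswith(job_prefix):
            continue
        steps = job.get("steps") or []
        result[name] = any(
            step.get("name") == SMOKE_STEP
            and step.get("conclusion") not in (None, "", "skipped")
            for step in steps
        )
    return result
```

On a push to main, every value in the returned mapping would be expected to be `False`; on a tag, schedule, or dispatch run, every value would be expected to be `True`.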

The conditional expression github.ref_type == 'tag' || github.event_name == 'schedule' || github.event_name == 'workflow_dispatch' is the same canonical pattern already used in this same workflow by:

  • build-and-validate-macos-intel integration + release-validation phases (lines 179, 191, 206)
  • build-and-validate-macos-arm job-level gate (line 225)
  • integration-tests job (line 321)
  • Release Validation job (line 397)

So the gating semantics are a copy of what is already proven in this same workflow.

Review feedback addressed

  • CHANGELOG (#878) suffix: applied in 58c40e7.
  • .apm/ canonical drift: synced .apm/instructions/cicd.instructions.md to match .github/instructions/cicd.instructions.md so future apm install --target copilot regenerations don't revert this change. Verified regeneration produces identical .github/ content.
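
A regeneration check like the one described in the last bullet could be scripted by hashing the generated directory before and after the regeneration command runs. This is a minimal sketch under assumptions: the helper names `snapshot` and `changed_paths` are hypothetical, and the regeneration command itself is left to the caller.

```python
import hashlib
from pathlib import Path


def snapshot(root: Path) -> dict:
    """Return {relative posix path: sha256 hex digest} for every file under root."""
    return {
        p.relative_to(root).as_posix(): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changed_paths(before: dict, after: dict) -> list:
    """List paths that were added, removed, or modified between two snapshots."""
    keys = set(before) | set(after)
    return sorted(k for k in keys if before.get(k) != after.get(k))
```

Taking a snapshot of `.github/instructions/`, running the regeneration, and taking a second snapshot should yield an empty `changed_paths` result when the canonical `.apm/` copy is in sync.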

Out of scope

The flaky env-var leak from tests/unit/test_ssl_cert_hook.py (separate root cause) is a follow-up. This PR reduces the flake-exposure surface by ~70% on push events; the underlying leak should still be fixed.

Push-time smoke in build-release.yml's build-and-test job (Linux x86_64,
Linux arm64, Windows) duplicated the merge-time smoke gate already enforced
by ci-integration.yml on the same SHA content, while burning ~15 redundant
codex-binary downloads per active day and amplifying network-flake exposure.

Smoke now runs only at promotion boundaries:

  - tags (pre-ship release gate; the only validation that tag-cut releases receive)
  - schedule (nightly drift catch for upstream openai/codex URL changes)
  - workflow_dispatch (manual safety net)

Push-to-main retains unit tests on all build-and-test platforms for
platform-regression signal; smoke coverage on Linux at merge_group time
(ci-integration.yml) and on Linux x86_64 nightly (ci-runtime.yml) is
unchanged. Multi-platform smoke (arm64 + Windows) shifts from per-push
to per-tag, widening the time-to-detection window for platform-specific
regressions in scripts/runtime/setup-codex.sh by hours-to-days in
exchange for a meaningful reduction in network noise.

The gating expression matches the existing canonical pattern used by the
macOS Intel/ARM jobs and integration-tests/release-validation jobs in
this same workflow.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings April 23, 2026 13:46
Copilot AI left a comment

Pull request overview

Adjusts the release pipeline to reduce redundant CI work by running runtime smoke tests only at promotion boundaries, while keeping push-to-main unit test coverage for platform regressions.

Changes:

  • Gate build-release.yml build-and-test smoke tests to tag/schedule/workflow_dispatch events only.
  • Split the prior combined test invocation into always-on unit tests plus conditionally-run smoke tests.
  • Document the new behavior in CI/CD instructions and add a changelog entry.

| File | Description |
| --- | --- |
| CHANGELOG.md | Adds an Unreleased entry describing the smoke-test gating change. |
| .github/workflows/build-release.yml | Moves smoke tests behind a promotion-boundary conditional; keeps unit tests on push. |
| .github/instructions/cicd.instructions.md | Updates CI/CD workflow documentation to reflect the new smoke-test gating. |

Copilot's findings

  • Files reviewed: 3/3 changed files
  • Comments generated: 2

Comment thread CHANGELOG.md Outdated

### Changed

- CI: smoke tests in `build-release.yml`'s `build-and-test` job (Linux x86_64, Linux arm64, Windows) are now gated to promotion boundaries (tag/schedule/dispatch) instead of running on every push to main. Push-time smoke duplicated the merge-time smoke gate in `ci-integration.yml` and burned ~15 redundant codex-binary downloads/day. Tag-cut releases still run smoke as a pre-ship gate; nightly catches upstream codex URL drift; merge-time still gates merges into main.

Copilot AI Apr 23, 2026

Changelog entries in this repo appear to require a PR reference at the end (e.g., "(#123)"). This new Unreleased entry does not include a PR number, which will break the established changelog format and makes it harder to trace changes back to the originating PR. Please append the PR number in the same style as adjacent entries.

See below for a potential fix:

- CI: smoke tests in `build-release.yml`'s `build-and-test` job (Linux x86_64, Linux arm64, Windows) are now gated to promotion boundaries (tag/schedule/dispatch) instead of running on every push to main. Push-time smoke duplicated the merge-time smoke gate in `ci-integration.yml` and burned ~15 redundant codex-binary downloads/day. Tag-cut releases still run smoke as a pre-ship gate; nightly catches upstream codex URL drift; merge-time still gates merges into main. (#869)
- CI docs: clarify that branch-protection ruleset must store the check-run name (`gate`), not the workflow display string (`Merge Gate / gate`); document the merge-gate aggregator in `cicd.instructions.md` and mark the legacy stub workflow as deprecated. (#869)

### Removed

- CI: deleted `ci-integration-pr-stub.yml`. The four stubs were a holdover from the pre-merge-gate model where branch protection required each Tier 2 check name directly. After #867, branch protection requires only `gate`, so the stubs are dead weight. Reduced `EXPECTED_CHECKS` in `merge-gate.yml` to just `Build & Test (Linux)`. (#869)

Collaborator Author

Good catch — appended (#878) to the entry in 58c40e7. Thanks!

to `.github/workflows/**`.
4. **`build-release.yml`** - `push` to main, tags, schedule, `workflow_dispatch`
- **Linux + Windows** run combined `build-and-test` (unit tests + binary build in one job).
- **Linux + Windows** run combined `build-and-test` (unit tests + binary build in one job). Unit tests run on every push for platform-regression signal; **smoke tests are gated to tag/schedule/dispatch only** (promotion boundaries) to avoid duplicating `ci-integration.yml`'s merge-time smoke and to cut redundant codex-binary downloads.

Copilot AI Apr 23, 2026

This update changes the CI/CD behavior described here, but there are still other in-repo copies of these CI/CD instructions that will now drift (e.g., .apm/instructions/cicd.instructions.md and custom-instructions/repo/.github/instructions/cicd.instructions.md still state that build-and-test includes smoke on every push). Since .github/instructions/ is regenerated from .apm/ (per the repo changelog), consider updating the canonical .apm/ copy and re-running the regeneration so all instruction copies stay consistent.

Collaborator Author

Right call. The .apm/ copy was stale by three PRs (#874, #875, this one). I synced .apm/instructions/cicd.instructions.md to the .github/ copy in 58c40e7 and verified apm install --target copilot regeneration produces identical .github/ content (no further drift). Thanks for flagging the systemic dogfooding issue.

Address PR #878 review:

1. Sync .apm/instructions/cicd.instructions.md (canonical source per
   #823) with .github/instructions/cicd.instructions.md so future
   apm install --target copilot regenerations don't revert the
   build-release smoke-gating doc note (and to bring along the
   stub-removal changes from #875 + branch-protection refinement
   from #874 that had also drifted).

2. Append (#878) suffix to the new CHANGELOG entry, matching the
   established Keep-a-Changelog convention used by neighbouring
   entries.

No workflow behavior change.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@danielmeppiel danielmeppiel merged commit 84d2363 into main Apr 23, 2026
24 checks passed
@danielmeppiel danielmeppiel deleted the ci/build-release-smoke-trim branch April 23, 2026 15:56