ci(main): run Phase A (local gating) in parallel with Phase B (staging deploy) #191
Draft
deucalioncodes wants to merge 1 commit into
…g deploy) Phase B used to wait for layered-e2e-local to finish. The two phases test different things — Phase A is correctness against an ephemeral local replica, Phase B is the real staging deploy — so making them sequential added ~10 min of pure wall-clock with no signal benefit. The 'concurrency: ci-main' group still serializes main pushes so two staging deploys never race. Co-authored-by: Jose Perez <deucalioncodes@users.noreply.github.com>
What
Drop the `needs: layered-e2e-local` on `bootstrap-infra-staging` so Phase A and Phase B run in parallel.
Why
The recent ~34m22s `ci-main` run breaks down as:
- `layered-e2e-local` (Phase A)
- `bootstrap-infra-staging`
- `publish-artifacts-staging`
- `install-mundus-staging`
- `verify-mundus-staging` (parse + e2e)

Phase A and Phase B test different things (correctness against an ephemeral local replica vs. a real staging deploy), and waiting for Phase A before kicking off Phase B currently costs ~10 minutes of pure wall-clock with no signal benefit. If Phase A fails, the commit is still bad; the staging deploy will either succeed (in which case the bug isn't deploy-time) or surface its own failure independently.
Expected wall-clock after this change: ~24m (max of Phase A's 10m and Phase B's ~24m), compared to ~34m today. Phase B itself is unchanged.
The `concurrency: ci-main` group is unchanged, so two staging deploys still never race each other.
Notes on rejected ideas
While analyzing this I considered four other optimizations and dropped them after discussion:
- `_install-mundus.yml`. All install messages go through the single `realm_installer` canister, so WASM installs serialize there anyway. Only the canister→canister extension installs would actually parallelize, for a net win of ~4–5 min, which is not worth the YAML complexity for now.
- `_bootstrap-infra.yml` when all infra ids are pinned. The staging descriptor already pins all three, so stage 0 is metadata-only, but the conditional adds enough complexity that ~35s isn't worth it.
- `cache: pip` / `cache: npm` on the reusable workflows. Explicitly rejected: caches have been a source of flakes in the past.

Risk
Low. The change only removes a `needs:` edge in the job graph; both phases already work standalone (e.g. `layered-deploy-dominion.yml` runs Phase B without Phase A).
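
For reviewers who want to picture the edge being removed, here is a minimal sketch of how the relevant part of the `ci-main` workflow might look after this change. It is not copied from the repository: the trigger, the reusable-workflow path for Phase A (`_layered-e2e-local.yml`), and the exact location of `_bootstrap-infra.yml` are assumptions for illustration, and the remaining Phase B jobs are omitted.

```yaml
# Illustrative sketch only; paths, trigger and job bodies are assumptions.
name: ci-main

on:
  push:
    branches: [main]   # assumed trigger

# Unchanged by this PR: every run joins the same concurrency group, so a
# later push to main waits for the in-progress run rather than racing its
# staging deploy.
concurrency: ci-main

jobs:
  # Phase A: correctness against an ephemeral local replica.
  layered-e2e-local:
    uses: ./.github/workflows/_layered-e2e-local.yml   # hypothetical path

  # Phase B, first job: the real staging deploy.
  bootstrap-infra-staging:
    # needs: layered-e2e-local   # the only edge this PR removes
    uses: ./.github/workflows/_bootstrap-infra.yml     # assumed location
```

With no `needs:` between them, GitHub Actions schedules both jobs as soon as the run starts, which is where the expected max(Phase A, Phase B) wall-clock comes from.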