ci: publish multi-arch (amd64+arm64) Docker images via native runners #39
Merged
The publish-runners workflow previously built and pushed amd64-only images
because docker/setup-qemu-action, docker/setup-buildx-action, and the
platforms: key were all absent from every docker/build-push-action step.
Apple Silicon users on macOS pulled the amd64 image and ran it under
Rosetta 2 / QEMU emulation.
Refactor to a build+merge pattern using native runners (ubuntu-latest for
amd64, ubuntu-24.04-arm for arm64), which avoids QEMU's 3-5x slowdown on the
backend images' npm install steps. The graph has 5 job definitions
(19 job runs once the matrices expand):
build-base      (2 jobs: per-arch builds of helix-evo-runner-base)
    |
    v
merge-base      (1 job: combines 2 arch digests into manifest list)
    |
    v
build-backends  (10 jobs: 5 backends x 2 arches; each builds against
                 the merged multi-arch base)
    |
    v
merge-backends  (5 jobs: one per backend; combines its 2 arch digests
                 into a multi-arch manifest list)
    |
    v
verify          (smoke-tests pulled images, unchanged)
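The 5 x 2 expansion of build-backends can be sketched in shell as a rough illustration of which runner each matrix cell lands on. The function names `runner_for` and `expand_matrix` are hypothetical, and only the `claude` backend is named in this PR; the other backend names below are placeholders, not the real image names.

```shell
#!/usr/bin/env bash
# Sketch: enumerate the build-backends matrix (image x platform) and the
# runner each combination would be pinned to.

# Map a platform to the runner label named in this PR.
runner_for() {
  case "$1" in
    linux/amd64) echo "ubuntu-latest" ;;
    linux/arm64) echo "ubuntu-24.04-arm" ;;
    *) echo "unknown platform: $1" >&2; return 1 ;;
  esac
}

# Print one "image platform runner" line per matrix cell.
# Arguments: backend names.
expand_matrix() {
  local image platform
  for image in "$@"; do
    for platform in linux/amd64 linux/arm64; do
      printf '%s %s %s\n' "$image" "$platform" "$(runner_for "$platform")"
    done
  done
}

# Five backends expand to ten build jobs.
# Only "claude" appears in this PR; the other names are placeholders.
expand_matrix claude backend-b backend-c backend-d backend-e
```

Piping the output through `wc -l` gives the number of build-backends jobs for a given backend list.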
Each build job pushes by digest (push-by-digest=true, name-canonical=true)
and uploads the digest as an artifact. Each merge job downloads the
relevant digests and runs `docker buildx imagetools create` to combine them
into a multi-arch manifest list at the canonical tag (latest / version /
ref) computed by docker/metadata-action@v5.
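As a minimal sketch of the merge step, the function below assembles the `docker buildx imagetools create` command from a directory of downloaded digests. It is not the workflow's actual step: the function name `merge_command` is hypothetical, it assumes each digest was stored as an empty file named after the digest hex (without the `sha256:` prefix), and it takes bare tags as arguments rather than reading them from docker/metadata-action output. It prints the command rather than running it.

```shell
#!/usr/bin/env bash
# Sketch: assemble the `docker buildx imagetools create` command for a merge job.
# Assumption: every per-arch digest is an empty file named after the digest
# hex (no "sha256:" prefix) inside one directory.

# Usage: merge_command IMAGE DIGEST_DIR TAG [TAG...]
# Prints the command instead of running it, so it can be reviewed first.
merge_command() {
  local image="$1" digest_dir="$2"
  shift 2
  local cmd="docker buildx imagetools create"
  local tag file
  # One -t flag per tag that the manifest list should be published under.
  for tag in "$@"; do
    cmd="$cmd -t ${image}:${tag}"
  done
  # One image@sha256:<hex> source per digest file found in the directory.
  for file in "$digest_dir"/*; do
    [ -e "$file" ] || continue   # skip the unexpanded glob when the directory is empty
    cmd="$cmd ${image}@sha256:$(basename "$file")"
  done
  printf '%s\n' "$cmd"
}
```

For example, `merge_command ghcr.io/ke7/helix-evo-runner-base /tmp/digests latest` prints a single command with one `-t` flag and one `@sha256:` source per file in `/tmp/digests` (a hypothetical path).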
Verification (after the next v* tag triggers this workflow):
docker buildx imagetools inspect ghcr.io/ke7/helix-evo-runner-base:latest
# Must show MediaType: application/vnd.oci.image.index.v1+json
# plus TWO Platform: entries (linux/amd64, linux/arm64).
docker pull ghcr.io/ke7/helix-evo-runner-claude:latest
docker run --rm ghcr.io/ke7/helix-evo-runner-claude:latest uname -m
# On Apple Silicon: aarch64 (was: x86_64).
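The manual check above could be scripted roughly as follows. This is a sketch, not part of the workflow: the helper name `has_platforms` is hypothetical, and it assumes the human-readable `imagetools inspect` output lists each manifest with a `Platform:` line.

```shell
#!/usr/bin/env bash
# Sketch: confirm that `docker buildx imagetools inspect` output lists every
# required platform. Reads the inspect text on stdin.

# Usage:
#   docker buildx imagetools inspect IMAGE | has_platforms linux/amd64 linux/arm64
# Returns 0 if every platform has a "Platform:" line, 1 otherwise.
has_platforms() {
  local report platform
  report="$(cat)"
  for platform in "$@"; do
    # Accept an optional variant suffix such as linux/arm64/v8.
    if ! printf '%s\n' "$report" |
        grep -Eq "Platform:[[:space:]]+${platform}(/.*)?[[:space:]]*\$"; then
      echo "missing platform: ${platform}" >&2
      return 1
    fi
  done
}
```

A non-zero exit status makes the helper usable as a gate in a follow-up script or CI step.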
Reviewer (sibling agent) approved all 16 review criteria. Polish S1
applied (removed extraneous actions/checkout@v6 from merge-base and
merge-backends — no repo files used there). Polish S2 (adding OCI
labels to per-arch builds via metadata-action duplication) deferred
as cosmetic.
Refs: helix-arm64-image-debug.md, helix-multiarch-publish-review.md
Summary
The `publish-runners` workflow previously built and pushed amd64-only images because `docker/setup-qemu-action`, `docker/setup-buildx-action`, and the `platforms:` key were all absent from every `docker/build-push-action` step. Apple Silicon users on macOS pull the amd64 image and run it under Rosetta 2 / QEMU emulation.

This PR refactors the workflow to publish true multi-arch (linux/amd64 + linux/arm64) manifest lists using a native-runner build+merge pattern. Native runners (`ubuntu-latest` for amd64, `ubuntu-24.04-arm` for arm64) avoid the ~3-5× QEMU slowdown that would otherwise plague the backend images' npm install steps.

Job graph

`build-base` -> `merge-base` -> `build-backends` -> `merge-backends` -> `verify` (per-job details are in the commit message above).

Verification after release tag

After the next `v*` tag triggers this workflow, run the verification commands listed in the commit message above.

Reviewer pass (16 criteria, all PASS)

- `on:` trigger unchanged, `env:` block preserved, permissions preserved
- `needs:` chain correct, `ubuntu-24.04-arm` runner name correct
- `build-backends` 2D matrix (image × platform + `include` for runner pinning) expands to exactly 10 jobs
- `push-by-digest=true`, `name-canonical=true`, `push=true` on build steps (no `tags:` key on build)
- `docker buildx imagetools create` jq invocation correct; tag scheme preserved (latest/ref/tag/semver)
- `verify` job preserved + retargeted at `merge-backends`
- `fail-fast: false` on both build matrices
- `load: true`

S1 polish applied (removed unused `actions/checkout@v6` from `merge-base` and `merge-backends`, saving ~10-15s per merge run). S2 deferred (adding OCI labels via metadata-action in build jobs is cosmetic; images are functional without).

Root-cause report

See `/Users/ke/helix-arm64-image-debug.md` §6 for the design rationale (native-arm64-runner vs QEMU emulation trade-off) and full diagnostic evidence (manifest inspect output, config blob arch=amd64 confirmation).