update to latest buildkit #442
Signed-off-by: Giles Cope <gilescope@gmail.com>
| Branch | Total Count |
|---|---|
| main | 5346 |
| This PR | 5532 |
| Difference | +186 (3.48%) |
📁 Changes by file type:
| File Type | Change |
|---|---|
| Go files (.go) | ❌ +6 |
| Documentation (.md) | ❌ +14 |
| Earthfiles | ➖ No change |
Keep up the great work migrating from Earthly to Earthbuild! 🚀
💡 Tips for finding more occurrences
Run locally to see detailed breakdown:
./.github/scripts/count-earthly.sh
Note that the goal is not to reach 0.
At least some occurrences of earthly are expected to remain in the source code, for backwards compatibility with config files and language constructs.
Code Review
This pull request updates repository references and URLs from the 'earthly' organization to 'EarthBuild' across documentation and build configurations. It also performs a significant update of Go dependencies, including upgrading gRPC to v1.80.0 and updating various containerd and Docker-related packages. A bug was identified in the buildkitd/Earthfile where a log message references an undefined variable ${BUILDKIT_BRANCH} instead of ${BUILDKIT_GIT_BRANCH}.
      echo "looking up branch $BUILDKIT_GIT_BRANCH"; \
      buildkit_sha1=$(git ls-remote --refs -q https://github.com/$BUILDKIT_GIT_ORG/buildkit.git "$BUILDKIT_GIT_BRANCH" | awk 'BEGIN { FS = "[ \t]+" } {print $1}'); \
    - echo "pinning github.com/earthly/buildkit@${BUILDKIT_BRANCH} to reference git sha1: $buildkit_sha1"; \
    + echo "pinning github.com/${BUILDKIT_GIT_ORG}/buildkit@${BUILDKIT_BRANCH} to reference git sha1: $buildkit_sha1"; \
The variable ${BUILDKIT_BRANCH} is used in this log message, but the argument defined in this scope is BUILDKIT_GIT_BRANCH. This will result in an empty string being printed for the branch name in the logs. Since you are already modifying this line to parameterize the organization, you should also correct the branch variable name.
echo "pinning github.com/${BUILDKIT_GIT_ORG}/buildkit@${BUILDKIT_GIT_BRANCH} to reference git sha1: $buildkit_sha1"; \
Alpine 3.22 moved iptables from /sbin to /usr/sbin. Signed-off-by: Giles Cope <gilescope@gmail.com>
- Update buildkit fork ref to f4ec24bc (includes GRPC_ENFORCE_ALPN_ENABLED=false)
- Disable gRPC ALPN enforcement in earthly client and buildkitd entrypoint for backwards compat with older grpc-go during upgrade transition
- Search /usr/sbin in addition to /sbin for iptables (Alpine 3.22 change)
- Bump all CI EARTHLY_BUILDKIT_IMAGE refs to v0.8.17-fix.4

Signed-off-by: Giles Cope <gilescope@gmail.com>
Force-pushed from 990ef27 to c42959c.
Older earth binaries pass EARTHLY_ADDITIONAL_BUILDKIT_CONFIG with the TOML section header and key on the same line (e.g. [registry."docker.io"] mirrors = [...]). The new buildkit's TOML parser requires a newline after section headers. Post-process the generated buildkitd.toml to split these. Signed-off-by: Giles Cope <gilescope@gmail.com>
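The post-processing step described above can be sketched in Go. This is only an illustration of the idea, not the code in this PR: the function name `splitInlineHeaders`, the regular expression, and the sample mirror value are all assumptions, and the sketch assumes table names contain no `]` character.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// headerWithKey matches a TOML table header ([a.b] or [[a.b]]) that is
// followed on the same line by more content, such as a key/value pair.
// Assumption: the header name itself contains no '[' or ']'.
var headerWithKey = regexp.MustCompile(`^(\s*\[\[?[^\[\]]+\]\]?)[ \t]+(\S.*)$`)

// splitInlineHeaders (hypothetical name) rewrites every
// "[section] key = value" line as two lines, leaving other lines untouched.
func splitInlineHeaders(config string) string {
	lines := strings.Split(config, "\n")
	out := make([]string, 0, len(lines))
	for _, line := range lines {
		m := headerWithKey.FindStringSubmatch(line)
		// A trailing comment after a header is already valid TOML; keep it.
		if m != nil && !strings.HasPrefix(m[2], "#") {
			out = append(out, m[1], m[2])
			continue
		}
		out = append(out, line)
	}
	return strings.Join(out, "\n")
}

func main() {
	// Illustrative input in the shape older binaries are described as sending.
	old := `[registry."docker.io"] mirrors = ["mirror.gcr.io"]`
	fmt.Println(splitInlineHeaders(old))
}
```

A line-based rewrite like this avoids needing a TOML parser that accepts the old form, at the cost of not understanding multi-line values.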
Upstream buildkit added a strict verifier that rejects multiple refs without platform mapping. Earthly's multi-BUILD pattern legitimately produces this. The buildkit fork now downgrades this to a warning. Signed-off-by: Giles Cope <gilescope@gmail.com>
21 green CI jobs! It's a start.
Three concurrent Go compilations (buildkitd, ticktock-buildkitd, earthly) exceed the 16GB runner memory under Podman, causing OOM kills that manifest as silent cancellations. Signed-off-by: Giles Cope <gilescope@gmail.com>
The old SHA 88ecf5d6 is incompatible with the current codebase: missing client/llb/sourceresolver package and containerd API version conflicts. Point to the same buildkit fork commit (da92d3419) used by the main build. Signed-off-by: Giles Cope <gilescope@gmail.com>
Three concurrent Go compilations (buildkitd, buildctl, earthly) exceed runner memory under Podman. Limiting max-parallelism to 2 serialises the heaviest compilation steps, keeping peak memory within bounds. Signed-off-by: Giles Cope <gilescope@gmail.com>
The build-earthly parallelism fix only applied to the build step. Test jobs bootstrap their own buildkitd via stage2-setup, so they also need the parallelism limit to avoid OOM during Go compilations. Signed-off-by: Giles Cope <gilescope@gmail.com>
The earthly-next build is even heavier than normal (update-buildkit + two buildkitd variants + earthly). Apply max-parallelism=2 to prevent OOM on standard CI runners. Signed-off-by: Giles Cope <gilescope@gmail.com>
Move GCR mirror config and max-parallelism before bootstrap so buildkitd starts with the correct settings first time, avoiding a restart that may not pick up max-parallelism correctly. Signed-off-by: Giles Cope <gilescope@gmail.com>
Docker earthly-next tests also OOM with default parallelism of 20. Apply the limit for all CI builds, not just Podman/earthly-next. Signed-off-by: Giles Cope <gilescope@gmail.com>
…uildkit # Conflicts: # go.mod # go.sum
builder/solver.go ran two errgroup goroutines: bkClient.Build and MonitorProgress. MonitorProgress also returns errors from earth's own status processing (e.g. bp.NewCommand). When it aborts, it cancels the shared errgroup context, so bkClient.Build then returns a bare 'context canceled' — and the old code preferred that buildErr, discarding the real monitor error in eg.Wait()'s result. An earth-side self-cancellation was thus misreported as 'BuildKit lost the session'. chooseSolveError now prefers a non-cancellation monitor error over a canceled build error. Doubles as a probe: a class-3 failure caused this way will now name its real cause instead of the cancellation veil. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
not-a-unit-test.sh ran 'go test' with the default -p (GOMAXPROCS = host CPUs), compiling and linking many test binaries at once. Nested inside an earthly build on a 4-core/16G CI runner, that RSS spike is what tips the box into memory pressure; the kill then cascades as a lost solve session at this exact vertex. Cap -p to 2 (override via GO_TEST_PARALLELISM) to flatten the peak. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…uildkit Adopts main's base->go/node Earthfile refactor (FROM golang:1.26.4-alpine3.24 eliminates the Go-tarball wget that was flake class #1), keeping the EarthBuild/buildkit diagnostics pin (79762ff4c), the WITH-DOCKER nested test wrapper plus EARTHLY_SKIP_BUILDKIT_CLI_TESTS, and the Docker Hub login continue-on-error. Action versions taken from main. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The Earthfile merge resolver dropped the newline between 'RUN apk add ... git' and 'ENV EARTHLY_IMAGE=true', joining them onto one line. That swallowed the ENV into the RUN command and corrupted +earthly-docker, cascading 'requires a FROM' failures through every target that builds the inner earthly via +earthly-integration-test-base. Split the lines back (keeping main's --no-cache apk flag). The IF-before-FROM in the integration base is valid and unchanged. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
main's base->go refactor removed the file-level 'FROM alpine' base recipe. That base set RanFromLike=true for every target, which the interpreter's checkAllowed guard requires before an IF (the condition runs in a shell). Two targets rely on an IF before their first FROM to choose a buildkit image conditionally: +earthly-docker and +earthly-integration-test-base. Without the implicit base they now fail 'requires a FROM', cascading through every test that builds the inner earthly. Add a scaffold 'FROM alpine:3.24.0' to each (replaced by the in-IF FROM), mirroring buildkitd/Earthfile which kept its file-level FROM. Verified against converter.go checkAllowed: a prior FROM sets RanFromLike and unblocks the IF. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Resolves go.mod fsutil require (the active version is the EarthBuild fsutil replace directive regardless). Carries the IF-before-FROM scaffold fix so the A+B probe can finally run clean. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ons)
Root cause of the recurring 'BuildKit canceled or lost the solve session'
failures, finally surfaced by the errgroup attribution fix
(chooseSolveError): the build aborts with
earth progress monitor aborted the build: failed decoding stats
stream: unexpected stats stream protocol version 123
123 is 0x7B, '{': the daemon's runc stats collector hits EOF (the
recurring 'runc stats collection error: EOF' in buildkitd logs) and
emits a raw/partial frame where the versioned framing
([0x01][uint32 len][JSON]) is expected. vertexMonitor.Write returned
that decode error, which propagates through MonitorProgress, cancels the
errgroup, kills the running exec (exit 137), and reports a bogus lost
session.
Stats are diagnostic telemetry and must never abort a build. Drop the
bad batch and re-sync (new Parser.Reset) instead. Red test reproduces a
raw '{' frame and the recovery.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
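The framing and the drop-and-re-sync behaviour described in this commit can be sketched as follows. This is an illustration under stated assumptions, not the actual `statsstream` code: the `Parse` method, the `feed` and `frame` helpers, and big-endian byte order for the length field are all hypothetical; only the `[0x01][uint32 len][JSON]` layout, the error text, and the existence of a `Parser.Reset` come from the message above.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

const protocolVersion = 0x01

// Parser buffers bytes until complete [version][uint32 len][JSON] frames arrive.
type Parser struct{ pending []byte }

// Parse appends data and returns the payloads of all complete frames.
func (p *Parser) Parse(data []byte) ([][]byte, error) {
	p.pending = append(p.pending, data...)
	var payloads [][]byte
	for len(p.pending) > 0 {
		if p.pending[0] != protocolVersion {
			return nil, fmt.Errorf("unexpected stats stream protocol version %d", p.pending[0])
		}
		if len(p.pending) < 5 {
			break // header incomplete; wait for more data
		}
		n := int(binary.BigEndian.Uint32(p.pending[1:5])) // byte order is an assumption
		if len(p.pending) < 5+n {
			break // payload incomplete
		}
		payloads = append(payloads, append([]byte(nil), p.pending[5:5+n]...))
		p.pending = p.pending[5+n:]
	}
	return payloads, nil
}

// Reset discards buffered bytes so the next batch starts on a frame boundary.
func (p *Parser) Reset() { p.pending = nil }

// feed treats stats as best-effort telemetry: a bad batch is dropped and the
// parser re-synced, and no error is propagated to the build.
func feed(p *Parser, batch []byte) [][]byte {
	payloads, err := p.Parse(batch)
	if err != nil {
		p.Reset()
		return nil
	}
	return payloads
}

// frame wraps a JSON payload in the versioned framing (helper for the demo).
func frame(payload []byte) []byte {
	b := make([]byte, 5, 5+len(payload))
	b[0] = protocolVersion
	binary.BigEndian.PutUint32(b[1:5], uint32(len(payload)))
	return append(b, payload...)
}

func main() {
	p := &Parser{}
	fmt.Println(len(feed(p, []byte(`{"raw":true}`))))       // raw frame: dropped
	fmt.Println(len(feed(p, frame([]byte(`{"cpu":1}`)))))   // well-formed frame
}
```

The raw batch starts with `{` (byte 123), which is exactly the "protocol version 123" reported in the error.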
build-earthly bootstraps with the released earth (v0.8.17), which lacks the stats-stream non-fatal fix, so it can still hit a class-3 'Canceled' when driving the fork's buildkitd. It also occasionally hits a transient exit-126 exec failure in the go build. With only 2 attempts a run can exhaust both on different flakes (seen on 00bd4b0: attempt 1 Canceled, attempt 2 exit 126), skipping the whole downstream suite. A third attempt makes the bootstrap absorb these non-deterministic failures. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
main bumped the moby/buildkit *require* to v0.30.0 (#563); kept that line but preserved our replace directive pinning the EarthBuild diagnostics fork (79762ff4c, the stats/cancellation work this branch depends on) and the docker-image-spec require. go.sum regenerated via go mod tidy. Also brings docker/cli v29.5.3, alpine 3.24.0, npm 11.17.0. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
… hooks Signed-off-by: Giles Cope <gilescope@gmail.com>
Makes ubuntu-latest CI reliably green by fixing the actual flake sources, and points buildkitd at a diagnostics-enabled BuildKit fork revision so the next unknown failure names its root cause instead of `context canceled`.

Companion fork PR: EarthBuild/buildkit#14
Merge these PRs first to reduce diff size
Each is an independent, self-contained extraction that lands on current `main` without the bump:

- `util/stringutil` Caser (+ `-race` regression test)
- `VERSION` parsing → Go tests; drop superseded integration tests
- … `statsstream` `parser.Reset`)

Once the above land, this PR is reduced to the genuine bump: `go.mod`/`go.sum` (buildkit/containerd v2/grpc 1.80), `buildkitd`, the BuildKit-API adaptations (entitlements, protobuf getters, ALPN), and the OOM/retry CI tuning.

Flake classes fixed

1. `+base` fetched the Go toolchain with a bare `wget`; `dl.google.com` drops connections mid-transfer, `+base` dies, and every dependent target reports `context canceled`, which made BuildKit look guilty. Downloads now resume (`wget -c`) and retry with backoff; GNU curl sites get `--retry --retry-all-errors`. Same treatment for zig (ziglang.org throttles CI), gh, kind, antlr, golangci-lint. (Split out as ci: retry/back off toolchain downloads #577.)
2. `cases.Caser` in `util/stringutil` (x/text Casers are stateful, not goroutine-safe) panicked `-race` jobs with `slice bounds out of range`. Now constructed per call; concurrent regression test added. (Split out as fix: data race squashed #567.)
3. `85c7359` preserves first non-cancellation root causes across cancellation fan-out (exec/cache/gateway/session paths), so genuine BuildKit failures surface with target/command context. Workflow retry harness restarts buildkitd between attempts. (earth-side diagnostics split out as feat: surface BuildKit root-cause on cancellation #574.)

Also in this branch
Verification
`go test -race ./util/stringutil/` red before the caser fix, green after (×3 runs).