feat: surface BuildKit root-cause on cancellation#574

Draft
gilescope wants to merge 1 commit into main from giles-failure-diagnostics

Conversation

@gilescope

Depends on #572 (CI memory telemetry) — land this after #572 merges. It calls statsstreamparser.Parser.Reset(), introduced there. This branch bundles parser.go/parser_test.go so it builds standalone; once #572 is in main, those two files drop out as identical (no conflict).

Extracted from #442 (the buildkit upgrade) so the failure-visibility work can land independently of the bump.

What

Surfaces the real root cause of a failed build instead of a bare context canceled. When BuildKit cancels or loses the solve session after a vertex has already failed, earth now reports the original failing target/command and BuildKit error, and distinguishes client-side from daemon-side cancellation.

  • logbus/solvermon/first_failure.go — captures the first fatal BuildKit vertex failure (scrubbed), preserved across the cancellation fan-out.
  • logbus/solvermon/{solvermon,vertexmon}.go — record per-vertex failures + logs; reset the stats parser on a desynced stream.
  • cmd/earthly/app/run.go: printCancellationOrigin reports whether cancellation began locally (signal / dead build context) or in BuildKit/the session layer; wires AsFirstFailureError into the fatal-error path.
  • builder/solver.go: withBuildkitFailureContext attaches target/command context to the returned error.

Deliberately excluded (stays in #442)

The buildkit-API entitlements change in builder/{solver,image_solver}.go (AllowedEntitlements: s.enttlmnts becomes entitlementsToStrings(...), a []string), which won't compile against current buildkit.

Verification

  • go build ./... — green.
  • go test ./logbus/solvermon/ ./builder/ — green.

Signed-off-by: Giles Cope <gilescope@gmail.com>
@github-actions

➖ Are we earthbuild yet?

No change in "earthly" occurrences

📈 Overall Progress

| Branch | Total Count |
| --- | --- |
| main | 5320 |
| This PR | 5320 |
| Difference | +0 |

Keep up the great work migrating from Earthly to Earthbuild! 🚀

💡 Tips for finding more occurrences

Run locally to see detailed breakdown:

./.github/scripts/count-earthly.sh

Note that the goal is not to reach 0.
At least some occurrences of earthly are expected to remain in the source code for backwards compatibility with config files and language constructs.

@gilescope added the ai-assisted (Authored with AI assistance) label on Jun 15, 2026
@gilescope gilescope mentioned this pull request Jun 15, 2026
10 tasks

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request enhances error reporting and cancellation handling in Earthly by tracking the first fatal vertex failure, first cancellation, and active operations during a build. It introduces detailed error types (FirstFailureError, FirstCancellationError, and CancellationDetailsError) and reports the origin of cancellations. Additionally, it improves the robustness of the stats stream parser by allowing it to recover from malformed frames instead of failing the build. The review feedback suggests avoiding if statements with initializers in builder/solver.go to comply with Go linting preferences.
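
The parser recovery mentioned in the review can be illustrated with a small sketch. It assumes a length-prefixed framing (4-byte big-endian length, then payload) and a sanity bound on frame size; the `frameParser` type, `feed` helper, and `maxFrame` constant are hypothetical and are not the real `statsstreamparser.Parser`. Only the idea of calling `Reset()` on a desynced stream comes from the PR description. The code also avoids `if` statements with initializers, in line with the review feedback.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// maxFrame is an assumed sanity bound; a larger length means the stream desynced.
const maxFrame = 1 << 20

var errDesynced = errors.New("stats stream desynced")

// frameParser is a hypothetical length-prefixed parser used for illustration.
type frameParser struct {
	buf []byte
}

// Parse appends data to the buffer and returns every complete frame payload.
func (p *frameParser) Parse(data []byte) ([][]byte, error) {
	p.buf = append(p.buf, data...)
	var frames [][]byte
	for len(p.buf) >= 4 {
		n := binary.BigEndian.Uint32(p.buf[:4])
		if n > maxFrame {
			return frames, errDesynced
		}
		if uint32(len(p.buf)-4) < n {
			break // payload not fully received yet
		}
		frames = append(frames, append([]byte(nil), p.buf[4:4+n]...))
		p.buf = p.buf[4+n:]
	}
	return frames, nil
}

// Reset drops buffered bytes so parsing can resume at the next chunk.
func (p *frameParser) Reset() { p.buf = p.buf[:0] }

// feed parses a chunk and, on a malformed frame, resets the parser
// instead of propagating the error and failing the build.
func feed(p *frameParser, chunk []byte) [][]byte {
	frames, err := p.Parse(chunk)
	if err != nil {
		p.Reset()
	}
	return frames
}

// frame encodes a payload with its 4-byte length prefix.
func frame(payload string) []byte {
	out := make([]byte, 4, 4+len(payload))
	binary.BigEndian.PutUint32(out, uint32(len(payload)))
	return append(out, payload...)
}

func main() {
	p := &frameParser{}
	fmt.Println(len(feed(p, frame("cpu=1"))))                   // a valid frame
	fmt.Println(len(feed(p, []byte{0xff, 0xff, 0xff, 0xff, 1}))) // a malformed frame
	fmt.Println(len(feed(p, frame("cpu=2"))))                   // parsing resumes
}
```

The trade-off in this sketch is that stats buffered at the moment of the desync are lost, which is usually acceptable for telemetry and preferable to aborting the build.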

Comment thread builder/solver.go
Comment thread builder/solver.go

Labels

ai-assisted Authored with AI assistance better-errors
