feat: process-memory + buildkitd OTEL telemetry for CI memory diagnos…#572

Open
gilescope wants to merge 1 commit into main from giles-ci-telemetry

@gilescope gilescope commented Jun 14, 2026

Extracted from #442 (the buildkit upgrade) so the CI memory-diagnostics telemetry can land independently of the bump.

What

Adds OpenTelemetry instrumentation to help diagnose CI memory pressure (the OOM-driven flakiness that #442 fights), plus a stats-stream recovery fix:

  • Process-memory metrics (internal/telemetry) — registers OTEL gauges for process memory (RSS/heap), tagged with process role/nesting/target so inner vs outer earth processes are distinguishable. Wired into the existing telemetry Start path.
  • buildkitd OTEL env plumbing (buildkitd) — addBuildkitTelemetryEnv forwards OTEL_* exporter config into the buildkitd container, sets OTEL_SERVICE_NAME=EarthBuild-buildkitd, and appends resource attributes (role, nesting, container name, installation). appendOTELResourceAttributes merges with any caller-supplied OTEL_RESOURCE_ATTRIBUTES, dropping malformed entries. Start now uses its installationName parameter (previously discarded).
  • Stats-stream Reset() (util/statsstreamparser) — discards a buffered partial frame and re-initialises the reader, so a desynced/malformed docker stats frame (e.g. runc collector EOF emitting a partial frame) is recoverable instead of fatal.
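
To illustrate the Reset() recovery described in the last bullet, here is a minimal sketch. It is not the actual util/statsstreamparser code: the `frameParser` type, the `stats` struct, and the newline-delimited JSON framing are all assumptions made for the example. It only shows why discarding a buffered partial frame lets the next complete frame parse cleanly.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// stats is a hypothetical, heavily reduced stand-in for a docker stats frame.
type stats struct {
	MemoryUsage uint64 `json:"memory_usage"`
}

// frameParser is a hypothetical parser that buffers bytes until a complete
// newline-terminated JSON frame is available.
type frameParser struct {
	buf bytes.Buffer
}

// Parse appends data and decodes every complete frame currently buffered.
// A malformed frame is consumed and reported as an error.
func (p *frameParser) Parse(data []byte) ([]stats, error) {
	p.buf.Write(data)

	var out []stats

	for {
		idx := bytes.IndexByte(p.buf.Bytes(), '\n')
		if idx < 0 {
			return out, nil // only a partial frame is left: wait for more data
		}

		line := p.buf.Next(idx + 1)

		var s stats

		err := json.Unmarshal(bytes.TrimSpace(line), &s)
		if err != nil {
			return out, fmt.Errorf("malformed stats frame: %w", err)
		}

		out = append(out, s)
	}
}

// Reset discards any buffered partial frame so parsing can resume at the
// start of the next frame.
func (p *frameParser) Reset() {
	p.buf.Reset()
}

func main() {
	p := &frameParser{}

	// The stream is cut off mid-frame, leaving a partial frame in the buffer.
	_, _ = p.Parse([]byte("{\"memory_"))

	// Without Reset, the next frame would be glued onto the stale bytes.
	p.Reset()

	frames, err := p.Parse([]byte("{\"memory_usage\": 42}\n"))
	fmt.Println(len(frames), err)
}
```

If the call to `p.Reset()` is removed from `main`, the stale `{"memory_` prefix is prepended to the next frame and `Parse` reports a malformed frame instead.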

Deliberately excluded (stay in #442)

  • The buildkit-API rename GCPolicy.KeepBytes → ReservedSpace in buildkitd.go — won't compile against current buildkit.
  • Cosmetic hint.Wrap / printBuildkitInfo line-reflows from the bump's linter.

Verification

  • go build ./... — green.
  • go test ./util/statsstreamparser/ — green (new Reset coverage).
  • go test ./buildkitd/ -run Telemetry — green (×2: env forwarding + no-op without a metrics exporter).

…tics

Signed-off-by: Giles Cope <gilescope@gmail.com>
@gilescope gilescope requested a review from a team as a code owner June 14, 2026 14:39
@gilescope gilescope requested review from janishorsts and removed request for a team June 14, 2026 14:39
@github-actions

⚠️ Are we earthbuild yet?

Warning: "earthly" occurrences have increased by 5 (0.09%)

📈 Overall Progress

| Branch | Total Count |
| --- | --- |
| main | 5320 |
| This PR | 5325 |
| Difference | +5 (0.09%) |

📁 Changes by file type:

| File Type | Change |
| --- | --- |
| Go files (.go) | ❌ +5 |
| Documentation (.md) | ➖ No change |
| Earthfiles | ➖ No change |

Keep up the great work migrating from Earthly to Earthbuild! 🚀

💡 Tips for finding more occurrences

Run locally to see detailed breakdown:

./.github/scripts/count-earthly.sh

Note that the goal is not to reach 0.
At least some occurrences of earthly are expected to remain in the source code for backwards compatibility with config files and language constructs.

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request introduces OpenTelemetry metrics for tracking process memory (such as Alloc, HeapAlloc, HeapSys, and Sys) and adds telemetry environment configuration for buildkitd. It also adds a Reset method to the stats stream parser to allow recovery from malformed frames, complete with unit tests. Feedback from the review suggests avoiding if statements with initializers across several files to prevent potential linting issues, and tracking already-seen keys in appendOTELResourceAttributes to avoid duplicate resource attributes.
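
For readers unfamiliar with the values named above, the following sketch shows where Alloc, HeapAlloc, HeapSys, and Sys come from in the Go standard library. It is not the internal/telemetry code from this PR: the `memorySample` function and the metric names are hypothetical, and the OTEL gauge registration is left out so the example needs only the standard library. In an OTEL setup, a function like this would typically be called from a gauge callback.

```go
package main

import (
	"fmt"
	"runtime"
)

// memorySample reads the Go runtime's memory statistics and returns them
// keyed by a hypothetical metric name. All values are in bytes.
func memorySample() map[string]uint64 {
	var m runtime.MemStats

	runtime.ReadMemStats(&m)

	return map[string]uint64{
		"process.memory.alloc":      m.Alloc,     // bytes of allocated heap objects
		"process.memory.heap_alloc": m.HeapAlloc, // same value as Alloc
		"process.memory.heap_sys":   m.HeapSys,   // heap memory obtained from the OS
		"process.memory.sys":        m.Sys,       // total memory obtained from the OS by the runtime
	}
}

func main() {
	fmt.Println(memorySample())
}
```

Note that `runtime.MemStats` describes memory as seen by the Go runtime. `Sys` is not the same as the operating system's RSS figure, so an RSS gauge would need a separate, platform-specific source.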

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment thread buildkitd/buildkitd.go
Comment on lines +705 to +709
	if _, ok := envOpts["OTEL_METRICS_EXPORTER"]; !ok {
		if envOpts["OTEL_EXPORTER_OTLP_ENDPOINT"] != "" || envOpts["OTEL_EXPORTER_OTLP_METRICS_ENDPOINT"] != "" {
			envOpts["OTEL_METRICS_EXPORTER"] = "otlp"
		}
	}

medium

Avoid using an if statement with an initializer. Declare the variable on a preceding line to prevent potential conflicts with project linting rules.

Suggested change
-	if _, ok := envOpts["OTEL_METRICS_EXPORTER"]; !ok {
-		if envOpts["OTEL_EXPORTER_OTLP_ENDPOINT"] != "" || envOpts["OTEL_EXPORTER_OTLP_METRICS_ENDPOINT"] != "" {
-			envOpts["OTEL_METRICS_EXPORTER"] = "otlp"
-		}
-	}
+	_, ok := envOpts["OTEL_METRICS_EXPORTER"]
+	if !ok {
+		if envOpts["OTEL_EXPORTER_OTLP_ENDPOINT"] != "" || envOpts["OTEL_EXPORTER_OTLP_METRICS_ENDPOINT"] != "" {
+			envOpts["OTEL_METRICS_EXPORTER"] = "otlp"
+		}
+	}
References
  1. In Go, avoid using an if statement with an initializer. Instead, declare the variable on a preceding line to prevent potential conflicts with project linting rules.
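
The behaviour of the snippet under discussion may be easier to review as a standalone function. The sketch below lifts the same logic into a hypothetical helper, `applyDefaultMetricsExporter`, which is not a name from the PR, and uses a made-up collector endpoint. It shows one detail that both versions share: the check is on key presence, so an explicitly supplied value, even "none", is never overwritten.

```go
package main

import "fmt"

// applyDefaultMetricsExporter is a hypothetical helper that defaults
// OTEL_METRICS_EXPORTER to "otlp" when the caller configured an OTLP
// endpoint but did not choose a metrics exporter.
func applyDefaultMetricsExporter(envOpts map[string]string) {
	_, ok := envOpts["OTEL_METRICS_EXPORTER"]
	if ok {
		return // the caller chose an exporter explicitly; leave it alone
	}

	if envOpts["OTEL_EXPORTER_OTLP_ENDPOINT"] != "" || envOpts["OTEL_EXPORTER_OTLP_METRICS_ENDPOINT"] != "" {
		envOpts["OTEL_METRICS_EXPORTER"] = "otlp"
	}
}

func main() {
	// The endpoint value is a placeholder for illustration only.
	withEndpoint := map[string]string{"OTEL_EXPORTER_OTLP_ENDPOINT": "http://collector:4317"}
	applyDefaultMetricsExporter(withEndpoint)
	fmt.Println(withEndpoint["OTEL_METRICS_EXPORTER"])

	explicit := map[string]string{
		"OTEL_METRICS_EXPORTER":       "none",
		"OTEL_EXPORTER_OTLP_ENDPOINT": "http://collector:4317",
	}
	applyDefaultMetricsExporter(explicit)
	fmt.Println(explicit["OTEL_METRICS_EXPORTER"])
}
```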

Comment thread buildkitd/buildkitd.go
Comment on lines +734 to +759
func appendOTELResourceAttributes(base string, attrs map[string]string) string {
	parts := make([]string, 0, len(attrs)+1)

	for attr := range strings.SplitSeq(base, ",") {
		attr = strings.TrimSpace(attr)
		if attr == "" {
			continue
		}

		if _, value, ok := strings.Cut(attr, "="); !ok || strings.TrimSpace(value) == "" {
			continue
		}

		parts = append(parts, attr)
	}

	for key, value := range attrs {
		if value == "" {
			continue
		}

		parts = append(parts, key+"="+value)
	}

	return strings.Join(parts, ",")
}

medium

To prevent duplicate resource attributes when keys are defined in both base and attrs, we should keep track of the keys we have already seen. Additionally, we should avoid using an if statement with an initializer on line 743 to comply with the project's Go guidelines.

func appendOTELResourceAttributes(base string, attrs map[string]string) string {
	parts := make([]string, 0, len(attrs)+1)
	seen := make(map[string]bool)

	for attr := range strings.SplitSeq(base, ",") {
		attr = strings.TrimSpace(attr)
		if attr == "" {
			continue
		}

		key, value, ok := strings.Cut(attr, "=")
		if !ok || strings.TrimSpace(value) == "" {
			continue
		}

		parts = append(parts, attr)
		seen[strings.TrimSpace(key)] = true
	}

	for key, value := range attrs {
		if value == "" || seen[key] {
			continue
		}

		parts = append(parts, key+"="+value)
	}

	return strings.Join(parts, ",")
}
References
  1. In Go, avoid using an if statement with an initializer. Instead, declare the variable on a preceding line to prevent potential conflicts with project linting rules.
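
One further point that may be worth considering alongside the de-duplication above: Go map iteration order is randomised, so both the original and the suggested version can emit the attrs entries in a different order on each run. The sketch below is a variant, under the hypothetical name `mergeResourceAttributes`, that sorts the keys so the resulting string is stable, which tends to make tests and log comparisons easier. It uses `strings.Split` instead of `strings.SplitSeq` only so that it builds on older Go toolchains. As in the suggestion above, a key already present in base wins over attrs; whether that precedence is the desired one is a decision for the PR.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// mergeResourceAttributes is a hypothetical variant of
// appendOTELResourceAttributes with de-duplication and stable ordering.
func mergeResourceAttributes(base string, attrs map[string]string) string {
	parts := make([]string, 0, len(attrs)+1)
	seen := make(map[string]bool)

	// Keep well-formed key=value entries from the caller-supplied base.
	for _, attr := range strings.Split(base, ",") {
		attr = strings.TrimSpace(attr)

		key, value, ok := strings.Cut(attr, "=")
		key = strings.TrimSpace(key)

		if !ok || key == "" || strings.TrimSpace(value) == "" {
			continue // drop malformed or empty entries
		}

		parts = append(parts, attr)
		seen[key] = true
	}

	// Sort the generated keys so the output does not depend on map order.
	keys := make([]string, 0, len(attrs))
	for key := range attrs {
		keys = append(keys, key)
	}

	sort.Strings(keys)

	for _, key := range keys {
		value := attrs[key]
		if value == "" || seen[key] {
			continue
		}

		parts = append(parts, key+"="+value)
	}

	return strings.Join(parts, ",")
}

func main() {
	merged := mergeResourceAttributes("a=1, bad, b=", map[string]string{
		"a": "9", "z": "2", "c": "3", "e": "",
	})
	fmt.Println(merged)
}
```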

Comment on lines +309 to +311
		if value, ok := otelResourceAttributeFromEnv(key); ok {
			attrs = append(attrs, attribute.String(key, value))
		}

medium

Avoid using an if statement with an initializer. Declare the variable on a preceding line to prevent potential conflicts with project linting rules.

		value, ok := otelResourceAttributeFromEnv(key)
		if ok {
			attrs = append(attrs, attribute.String(key, value))
		}
References
  1. In Go, avoid using an if statement with an initializer. Instead, declare the variable on a preceding line to prevent potential conflicts with project linting rules.

Comment on lines +314 to +316
	if target := earthbuildTargetFromArgs(os.Args); target != "" {
		attrs = append(attrs, attribute.String("earthbuild.target", target))
	}

medium

Avoid using an if statement with an initializer. Declare the variable on a preceding line to prevent potential conflicts with project linting rules.

Suggested change
-	if target := earthbuildTargetFromArgs(os.Args); target != "" {
-		attrs = append(attrs, attribute.String("earthbuild.target", target))
-	}
+	target := earthbuildTargetFromArgs(os.Args)
+	if target != "" {
+		attrs = append(attrs, attribute.String("earthbuild.target", target))
+	}
References
  1. In Go, avoid using an if statement with an initializer. Instead, declare the variable on a preceding line to prevent potential conflicts with project linting rules.

Comment on lines +322 to +324
	if value, _ := strconv.ParseBool(os.Getenv("EARTHLY_WITH_DOCKER")); value {
		return "inner"
	}

medium

Avoid using an if statement with an initializer. Declare the variable on a preceding line to prevent potential conflicts with project linting rules.

Suggested change
-	if value, _ := strconv.ParseBool(os.Getenv("EARTHLY_WITH_DOCKER")); value {
-		return "inner"
-	}
+	value, _ := strconv.ParseBool(os.Getenv("EARTHLY_WITH_DOCKER"))
+	if value {
+		return "inner"
+	}
References
  1. In Go, avoid using an if statement with an initializer. Instead, declare the variable on a preceding line to prevent potential conflicts with project linting rules.

Comment thread buildkitd/buildkitd.go
"BUILDKIT_MAX_PARALLELISM": strconv.Itoa(settings.MaxParallelism),
}

withDocker, _ := strconv.ParseBool(os.Getenv("EARTHLY_WITH_DOCKER"))

Collaborator

Can you please add error handling here? I can see that the original code did not handle the error either, but we should.
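
One possible shape for that error handling is sketched below. The `parseEnvBool` helper is hypothetical and not part of the PR. The point it illustrates is that `strconv.ParseBool` returns an error for an empty string, so simply propagating the error would fail whenever EARTHLY_WITH_DOCKER is unset. The sketch treats an empty value as false and reports only malformed values.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// parseEnvBool is a hypothetical helper that interprets the raw value of an
// environment variable as a bool. An empty value (typically an unset
// variable) means false; anything strconv.ParseBool rejects is an error.
func parseEnvBool(name, raw string) (bool, error) {
	if raw == "" {
		return false, nil
	}

	value, err := strconv.ParseBool(raw)
	if err != nil {
		return false, fmt.Errorf("parse %s=%q as bool: %w", name, raw, err)
	}

	return value, nil
}

func main() {
	const name = "EARTHLY_WITH_DOCKER"

	withDocker, err := parseEnvBool(name, os.Getenv(name))
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}

	fmt.Println("with docker:", withDocker)
}
```

Whether a malformed value should abort startup, as in `main` here, or only log a warning and fall back to false is a design choice left to the PR.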
