test(compat/loki): pin k8s-family stream labels (pod/namespace/service_name/cluster/container) in seeder #715

Closed
tsouza wants to merge 1 commit into main from fix/loki-seeder-k8s-labels

@tsouza tsouza commented May 22, 2026

Summary

PR #712 retires the historic cerberus-test-queries.yml should_skip overlay, surfacing the six "Plain seed-gap" rows listed below, whose rationale comments attributed them to the seeder missing the five k8s-family stream labels:

  • exhaustive/aggregations.yaml#Count aggregated by pod
  • exhaustive/aggregations.yaml#Count aggregated by namespace
  • exhaustive/aggregations.yaml#Count aggregated by service_name
  • exhaustive/aggregations.yaml#Count aggregated by cluster and namespace
  • exhaustive/aggregations.yaml#Count aggregated by service_name and container
  • exhaustive/unwrap-aggregations.yaml#Without multiple labels (sum without (namespace, cluster))

Investigation: the seeder's shape has been correct since PR #525 hoisted the stream-identity labels into ResourceAttributes. All five required keys are present in both the pushLoki stream label set (so reference Loki indexes the streams) and the insertCHLogs ResourceAttributes map (so cerberus's LogQL group-by lowering resolves ResourceAttributes[<key>] via logql.levelAwareGroupKey). The should_skip rationale comments were stale — they survived past PR #525 without re-validation.
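
As a rough illustration of the shape described above, the sketch below derives both label maps from a single per-service config, so the Loki side and the ClickHouse side cannot disagree by construction. Only the names serviceConfig, pushLoki and insertCHLogs come from this PR; the struct fields and the three helper functions are hypothetical and may not match the real seeder.

```go
package main

// serviceConfig is a hypothetical stand-in for the seeder's per-service
// config; the real struct's fields are an assumption here.
type serviceConfig struct {
	Pod         string
	Namespace   string
	ServiceName string
	Cluster     string
	Container   string
}

// identityLabels returns a fresh map holding the five k8s-family
// stream-identity labels for one service.
func identityLabels(c serviceConfig) map[string]string {
	return map[string]string{
		"pod":          c.Pod,
		"namespace":    c.Namespace,
		"service_name": c.ServiceName,
		"cluster":      c.Cluster,
		"container":    c.Container,
	}
}

// lokiStreamLabels models the label set a pushLoki-style call would send,
// so that the reference Loki indexes the stream under each key.
func lokiStreamLabels(c serviceConfig) map[string]string {
	return identityLabels(c)
}

// chResourceAttributes models the ResourceAttributes map an
// insertCHLogs-style call would write, so that a group-by on
// ResourceAttributes[<key>] finds a value for every key.
func chResourceAttributes(c serviceConfig) map[string]string {
	return identityLabels(c)
}
```

Routing both call sites through one helper is only one way to keep the maps aligned; the PR itself keeps the two call sites separate and relies on the regression test instead.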

This PR pins the shape with a unit-level regression test (TestSeederWritesK8sStreamLabels) so any future trim of the seed map surfaces at PR-review time rather than on the compatibility lane (which is where every previous regression of this shape was caught). The seeder itself gains clarifying comments at both call sites (buildStreams for the Loki push, insertCHLogs for the CH-side ResourceAttributes) tying each call-site to the corpus rows it underpins.

What the test asserts

For every stream produced by buildStreams:

  • The pushLoki label map has all five required keys (pod, namespace, service_name, cluster, container) with non-empty values.
  • The matching insertCHLogs ResourceAttributes map (re-built from the same serviceConfig) has all five required keys with non-empty values.
  • The per-key values are identical across the two maps — silent value drift would surface in the differ as a value-mismatch shape instead of the more legible empty-series shape.
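
The three assertions above can be sketched as a single check over one stream's two maps. This is a minimal illustration, not the body of TestSeederWritesK8sStreamLabels: the function name checkStreamLabels, the requiredKeys variable, and the problem-message wording are all assumptions.

```go
package main

import "fmt"

// requiredKeys lists the five k8s-family labels named in this PR.
var requiredKeys = []string{"pod", "namespace", "service_name", "cluster", "container"}

// checkStreamLabels (hypothetical helper) compares the Loki stream label map
// with the ClickHouse ResourceAttributes map for one stream and returns a
// human-readable problem for every violated assertion.
func checkStreamLabels(lokiLabels, resourceAttrs map[string]string) []string {
	var problems []string
	for _, key := range requiredKeys {
		lokiVal, chVal := lokiLabels[key], resourceAttrs[key]
		// A missing key reads as "" from a Go map, so one comparison
		// covers both "absent" and "present but empty".
		if lokiVal == "" {
			problems = append(problems, fmt.Sprintf("loki stream label %q missing or empty", key))
		}
		if chVal == "" {
			problems = append(problems, fmt.Sprintf("ResourceAttributes %q missing or empty", key))
		}
		// Report drift only when both sides carry a value, so a missing
		// key is not counted twice.
		if lokiVal != "" && chVal != "" && lokiVal != chVal {
			problems = append(problems, fmt.Sprintf("value drift for %q: loki=%q ch=%q", key, lokiVal, chVal))
		}
	}
	return problems
}
```

In a real test, a loop over the streams returned by buildStreams would call a check like this once per stream and fail with t.Errorf for each returned problem.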

Labels touched + how

| Label | pushLoki stream label | insertCHLogs ResourceAttributes |
| --- | --- | --- |
| pod | already present | already present |
| namespace | already present | already present |
| service_name | already present | already present |
| cluster | already present | already present |
| container | already present | already present |

No values change. The seeder shape is identical pre/post this PR; only the comments and the test are new.

Test plan

  • CI check runs TestSeederWritesK8sStreamLabels and it passes.
  • CI compatibility/loki runs the six corpus rows above without the retired should_skip overlay, and the differ reports parity. This depends on PR #712 (chore(tests): delete every test-suite escape-hatch / allowlist mechanism) landing first; this PR is the seeder-side guarantee that closes the diagnosis loop.
  • CI forbid-skip, lint, compose-smoke stay green — no source changes outside the compat-loki seeder package.

🤖 Generated with Claude Code

…n seeder

The six "Plain seed-gap" rows in cerberus-test-queries.yml that PR #712
retires (exhaustive/aggregations.yaml#Count aggregated by pod / namespace
/ service_name / cluster and namespace / service_name and container, plus
exhaustive/unwrap-aggregations.yaml#Without multiple labels) require the
seeder to write the five k8s-family labels (pod, namespace, service_name,
cluster, container) into both:

  - the pushLoki stream label set (so the reference Loki indexes the
    stream under each key), AND
  - the insertCHLogs ResourceAttributes map (so cerberus's LogQL
    group-by lowering resolves ResourceAttributes[<key>] to the
    matching per-stream value via logql.levelAwareGroupKey).

The shape was already correct after PR #525 hoisted the stream-identity
labels into ResourceAttributes; this commit pins it so a future trim of
the seed map fails at unit-test review rather than on the compatibility
lane (which is where every previous regression of this shape surfaced).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@tsouza tsouza enabled auto-merge (squash) May 22, 2026 21:46

tsouza commented May 23, 2026

Superseded by #721, which replaces the seed-settle heuristics with authoritative /metrics polling. The label-presence regression test that #715 added is no longer load-bearing, since #721's gate uses metric deltas rather than cardinality. Closing.

@tsouza tsouza closed this May 23, 2026
auto-merge was automatically disabled May 23, 2026 07:08

Pull request was closed

@tsouza tsouza deleted the fix/loki-seeder-k8s-labels branch May 23, 2026 07:08