fix(promql): union histogram + sum scans for _count/_sum suffixed metrics#716
Merged
…rics PR #710's classic-histogram companion-suffix routing forced the scan to the histogram table only, returning empty for OTel-hostmetrics counters that ship under suffixed names in the sum table (system_cpu_logical_count, system_processes_count, system_filesystem_inodes_count, system_processes_created_count, ...). schema.Metrics.TablesFor now returns [Histogram, Sum] for _count/_sum suffixes. The lowering emits a chplan.UnionAll of two per-arm Projects: the histogram arm filters on the BARE name and projects toFloat64(Count or Sum) AS Value with MetricName synthesized as the suffixed literal; the sum arm filters on the SUFFIXED name and passes Value through. The existing merge() table-function path can't fan disjoint-schema tables (histogram has Count/Sum but no Value; sum has Value but no Count), so the per-arm UnionAll is the only way to address both physical layouts in one query. Verified pre-fix empty and post-fix non-empty on the live compose stack against system_cpu_logical_count. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
tsouza added a commit that referenced this pull request May 22, 2026
… (#720) PR #706 / #707 added per-arm canonical-shape projection + instant-mode TimeUnix synthesis to fix SQL emission errors on nested PromQL or-chains. The 2026-05-22 dirty-fixes audit asked whether these helpers became unreachable after PR #710 (TableFor -> TablesFor for unsuffixed OTel sums) and PR #716 (extended to _count/_sum suffixes). Reachability test on the post-#710/#716 tree lowered the PR #706 failing-shape query and recorded vectorSetOpCanonicalArmFrag invocations on both arms with derived=true and the synthesised anchor branch taken. The workarounds remain on the live emit path because #710/#716 fix which Scan tables get resolved (orthogonal to UNION ALL column-type unification + missing-identifier-on-derived-inner). Adds a NOTE block to vectorSetOpCanonicalArmFrag's docstring documenting the audit so a future reviewer does not delete the helpers on the assumption they were per-shape patches for the gauge/sum routing bug.
Summary
Direct follow-up to PR #710. The classic-histogram companion-suffix routing introduced there forced the scan to the histogram table only, returning empty for OTel-hostmetrics counters that ship under suffixed names in the sum table: `system_cpu_logical_count`, `system_processes_count`, `system_filesystem_inodes_count`, `system_processes_created_count`, etc.

- `schema.Metrics.TablesFor` now returns `[Histogram, Sum]` for `_count` / `_sum` suffixed names (unchanged for `_total` / `_bucket`).
- `chplan.UnionAll` node (N-way arms, emits as `(SELECT ...) UNION ALL (SELECT ...)`) wired through the chsql emitter + `test/spec/chplan_print.go` IR snapshot helper.
- `lowerVectorSelector` builds the union when both physical layouts apply: the histogram arm filters on the BARE name and projects `toFloat64(Count|Sum) AS Value` with `MetricName` synthesised as the suffixed literal; the sum arm filters on the SUFFIXED name and passes `Value` through. Single-arm fallback (no Sum, or Sum equals Histogram by config) preserves PR #710's byte-stable shape.

The existing `merge()` table-function path can't fan disjoint-schema tables (histogram has `Count` / `Sum` but no `Value`; sum has `Value` but no `Count`), so the per-arm `UnionAll` is the only way to address both physical layouts in one query.
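The routing and two-arm shape described above can be sketched in a few lines of Go. This is an illustration only, not the code in this PR: `splitCompanion`, `tablesFor`, `quote`, and `unionSQL` are hypothetical helpers, the histogram table name `otel_metrics_histogram` is an assumption (only `otel_metrics_sum` is named in this PR), and the real emitter projects more columns than `MetricName` and `Value`.

```go
package main

import (
	"fmt"
	"strings"
)

// Table is a hypothetical stand-in for a physical metrics table.
type Table string

const (
	Histogram Table = "otel_metrics_histogram" // assumed name
	Sum       Table = "otel_metrics_sum"
)

// splitCompanion returns the bare metric name and the histogram column
// ("Count" or "Sum") for a _count/_sum suffixed name.
func splitCompanion(name string) (bare, column string, ok bool) {
	switch {
	case strings.HasSuffix(name, "_count"):
		return strings.TrimSuffix(name, "_count"), "Count", true
	case strings.HasSuffix(name, "_sum"):
		return strings.TrimSuffix(name, "_sum"), "Sum", true
	}
	return "", "", false
}

// tablesFor models only the _count/_sum case from this PR; every other
// name returns nil because its routing is not modelled in this sketch.
func tablesFor(name string) []Table {
	if _, _, ok := splitCompanion(name); ok {
		return []Table{Histogram, Sum}
	}
	return nil
}

// quote is naive single-quote escaping, for illustration only.
func quote(s string) string {
	return "'" + strings.ReplaceAll(s, "'", "''") + "'"
}

// unionSQL builds the two-arm shape: the histogram arm filters on the
// bare name and synthesises MetricName as the suffixed literal; the sum
// arm filters on the suffixed name and passes Value through.
func unionSQL(name string) (string, bool) {
	bare, column, ok := splitCompanion(name)
	if !ok {
		return "", false
	}
	histArm := fmt.Sprintf(
		"SELECT %s AS MetricName, toFloat64(%s) AS Value FROM %s WHERE MetricName = %s",
		quote(name), column, Histogram, quote(bare))
	sumArm := fmt.Sprintf(
		"SELECT MetricName, Value FROM %s WHERE MetricName = %s",
		Sum, quote(name))
	return "(" + histArm + ") UNION ALL (" + sumArm + ")", true
}

func main() {
	if sql, ok := unionSQL("system_cpu_logical_count"); ok {
		fmt.Println(sql)
	}
}
```

Both arms project the same two columns with the same types, which is what lets a plain `UNION ALL` combine tables whose native schemas are disjoint.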
Live-stack verification
Compose stack at this branch's HEAD, with `system_cpu_logical_count` data present in `otel_metrics_sum`:

`curl ... 'query=system_cpu_logical_count'`

- Pre-fix response: `{"status":"success","data":{"resultType":"vector","result":[]}}`
- Post-fix result: `[{"metric":{"__name__":"system_cpu_logical_count"},"value":[1779485700,"8"]}]`

Pre-existing histogram-companion queries (`cerberus_queries_duration_seconds_count`, `http_server_request_duration_count`) still resolve correctly. Pre-existing unsuffixed cases (`otelcol_process_uptime`) still resolve correctly.

Test plan
- `go test ./...`: 4105 passed across 38 packages
- `go test ./test/spec/promql/ -tags=chdb`: 269 passed (chDB round-trip exercises the new fixture's union shape)
- `golangci-lint run ./...`: no issues
- `test/spec/promql/scan_unions_histogram_sum_for_suffixed_metric.txtar` pins the multi-table union shape end-to-end (plan IR + SQL + chDB roundtrip with an empty histogram table + populated sum table; a regression that dropped the sum arm returns zero rows).
- `TestLower_HistogramCompanion_RoutesToHistogramAndSumUnion` covers all four `_count` / `_sum` shapes plus the OTel-hostmetrics-style `system_cpu_logical_count` case.
- `TestDefaultOTelMetricsTablesFor` covers the new `[Histogram, Sum]` candidate set with explicit OTel-hostmetrics names.
- `histogram_count_basic` / `histogram_sum_basic` / `histogram_mean_latency` / `rate_http_count` regenerated to reflect the new two-arm union plan.
- `chplan.UnionAll` Equal-invariant coverage added under `internal/chplan/equal_invariants_test.go` (positive + 4 negatives: length, content, order, type).
- `system_cpu_logical_count` resolves end-to-end through the cerberus Prom API at the new image.
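
The Equal-invariant item in the test plan (one positive case plus negatives for length, content, order, and type) can be pictured with a small stand-in. The `Node`, `Scan`, and `UnionAll` types below are hypothetical simplifications written for this sketch; the real `chplan` types and their `Equal` signatures may differ.

```go
package main

import "fmt"

// Node is a hypothetical plan-node interface with structural equality.
type Node interface {
	Equal(other Node) bool
}

// Scan is a hypothetical leaf node: one table filtered on one metric name.
type Scan struct {
	Table  string
	Metric string
}

// Equal holds only for another Scan with identical fields.
func (s Scan) Equal(other Node) bool {
	o, ok := other.(Scan)
	return ok && s == o
}

// UnionAll is a hypothetical N-way union of arms; arm order is significant.
type UnionAll struct {
	Arms []Node
}

// Equal requires the same node type, the same number of arms, and
// pairwise-equal arms in the same positions.
func (u UnionAll) Equal(other Node) bool {
	o, ok := other.(UnionAll)
	if !ok || len(u.Arms) != len(o.Arms) {
		return false
	}
	for i := range u.Arms {
		if !u.Arms[i].Equal(o.Arms[i]) {
			return false
		}
	}
	return true
}

func main() {
	histArm := Scan{Table: "otel_metrics_histogram", Metric: "system_cpu_logical"}
	sumArm := Scan{Table: "otel_metrics_sum", Metric: "system_cpu_logical_count"}
	plan := UnionAll{Arms: []Node{histArm, sumArm}}

	fmt.Println("same arms:   ", plan.Equal(UnionAll{Arms: []Node{histArm, sumArm}}))
	fmt.Println("fewer arms:  ", plan.Equal(UnionAll{Arms: []Node{histArm}}))
	fmt.Println("swapped arms:", plan.Equal(UnionAll{Arms: []Node{sumArm, histArm}}))
	fmt.Println("other type:  ", plan.Equal(histArm))
}
```

Treating arm order as significant keeps equality aligned with the emitted SQL text, which matters when plan snapshots are compared byte for byte.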