
fix: remove project label from projectstorage metrics to reduce cardinality#558

Merged
zachsmith1 merged 1 commit into main from
fix/reduce-projectstorage-metric-cardinality
Apr 2, 2026
Conversation

@zachsmith1
Contributor

Summary

  • Remove the project label from all three projectstorage_* metrics

Problem

The project label causes a cardinality explosion in VictoriaMetrics production storage (datum-cloud/infra#2113); the storage PVCs are at 93% capacity.

| Metric | Before (per pod) | After (per pod) |
| --- | --- | --- |
| projectstorage_first_ready_seconds | 414 × 82 × 12 = 407K series | 82 × 12 = 984 series |
| projectstorage_child_creations_total | 414 × 82 = 34K series | 82 series |
| projectstorage_reinitializing_errors_total | 414 × 82 × 7 = 237K series | 82 × 7 = 574 series |
| **Total** | ~678K / pod, ~6.1M across 9 pods | ~1.6K / pod, ~14K across 9 pods |

~414× reduction in cardinality (exactly the number of projects, since dropping the label collapses the 414 per-project series sets into one).
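The per-pod arithmetic in the table can be checked directly. A minimal sketch (the 414 / 82 / 12 / 7 label cardinalities and metric names are taken from this PR; a series count is just the product of its label-value cardinalities):

```python
# Series count per metric = product of label-value cardinalities
# (histogram bucket count behaves like one more label dimension).
PROJECTS, KINDS, BUCKETS, ERROR_REASONS, PODS = 414, 82, 12, 7, 9

before = {
    "projectstorage_first_ready_seconds": PROJECTS * KINDS * BUCKETS,
    "projectstorage_child_creations_total": PROJECTS * KINDS,
    "projectstorage_reinitializing_errors_total": PROJECTS * KINDS * ERROR_REASONS,
}
# Dropping the `project` label divides every count by PROJECTS.
after = {name: count // PROJECTS for name, count in before.items()}

total_before = sum(before.values())
total_after = sum(after.values())
print(total_before, total_before * PODS)  # 678960 per pod, 6110640 across 9 pods
print(total_after, total_after * PODS)    # 1640 per pod, 14760 across 9 pods
print(total_before // total_after)        # 414x reduction
```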

What changed

  • Removed project from label dimensions on all three metrics
  • Removed project field from instrumentedStorage struct
  • Updated recordFirstReady, incrReinit, and childCreations.WithLabelValues call sites

The distribution by resource_group and resource_kind is the useful signal for understanding storage init performance. Per-project granularity is not actionable and is the source of the cardinality problem.
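To illustrate why the call-site change collapses the series set, here is a hedged stdlib-only sketch (not the actual Go code; the real metrics are Prometheus client vectors, and the project/kind names below are hypothetical stand-ins): a labeled counter creates one series per distinct label tuple, so removing `project` from the `WithLabelValues` key shrinks the series set by the project count while preserving every increment.

```python
from collections import Counter
from itertools import product

# Hypothetical label values standing in for the real 414 projects / 82 kinds.
projects = [f"project-{i}" for i in range(414)]
kinds = [f"kind-{i}" for i in range(82)]

# Before: childCreations.WithLabelValues(project, kind) -> one series per pair.
series_before = Counter()
for project, kind in product(projects, kinds):
    series_before[(project, kind)] += 1  # simulated Inc()

# After: childCreations.WithLabelValues(kind) -> one series per kind.
series_after = Counter()
for project, kind in product(projects, kinds):
    series_after[(kind,)] += 1  # same events land in far fewer series

print(len(series_before))  # 33948 series (414 x 82)
print(len(series_after))   # 82 series
# No increments are lost, they are just aggregated across projects:
assert sum(series_before.values()) == sum(series_after.values())
```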

Test plan

  • go build ./internal/apiserver/storage/project/ passes
  • Deploy to staging, verify metrics still emit with reduced labels
  • Confirm VictoriaMetrics series count drops after old series expire
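For the last verification step, a query along these lines can track the series count before and after rollout (a sketch; the `projectstorage_` prefix is from this PR, but the exact query depends on your VictoriaMetrics setup and retention, and old series only disappear once they expire):

```promql
# Active series across all projectstorage_* metrics
count({__name__=~"projectstorage_.+"})
```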

The project label on projectstorage_first_ready_seconds,
projectstorage_child_creations_total, and
projectstorage_reinitializing_errors_total creates a cardinality
explosion: 414 projects × 82 resource kinds × histogram buckets ×
9 pods yields ~6.1M series, consuming 28% of all VictoriaMetrics
storage in production.

Drop the project label from all three metrics. The distribution
of storage init latency by resource_group and resource_kind is
the useful signal; per-project granularity is not needed and
causes the cardinality problem.

Reduces total series from ~678K to ~1.6K per pod (~6.1M to ~14K
across 9 pods).

Ref: datum-cloud/infra#2113
@joggrbot
Contributor

joggrbot bot commented Apr 2, 2026

📝 Documentation Analysis

All docs are up to date! 🎉


✅ Latest commit analyzed: d54ebef | Powered by Joggr

@zachsmith1 zachsmith1 requested a review from scotwells April 2, 2026 04:55
@zachsmith1 zachsmith1 merged commit 8eee3af into main Apr 2, 2026
7 of 9 checks passed
@zachsmith1 zachsmith1 deleted the fix/reduce-projectstorage-metric-cardinality branch April 2, 2026 15:56
