Recommended approaches to handle high cardinality from workload metrics (e.g., qos_read_io_type)? #4247
Hi Harvest Team,

We are using Harvest to monitor our NetApp storage systems, but we have run into high-cardinality issues after enabling `workload.yaml` and `workload_volume.yaml`. Our monitoring team cautions that high cardinality is an anti-pattern in the observability world. Specifically, metrics like `qos_read_io_type` emit one timeseries per combination of volume and `metric` label value:

`qos_read_io_type{..., volume="volume_A", ..., metric="hya_non_cache"}`

For environments with thousands of volumes, this combinatorial explosion (number of volumes × distinct values of the `metric` label, such as `cache`, `disk`, `bamboo_ssd`) results in extremely high cardinality that easily exceeds our telemetry ingestion limits (e.g., 50,000 timeseries); for example, 5,000 volumes × 10 io types would already produce 50,000 series from this one metric alone. Furthermore, we noticed that many of these timeseries continuously report a value of 0.

For now, we have disabled `workload.yaml` and `workload_volume.yaml`, although we would ideally like to re-enable them in the future. We would love your suggestions on recommended approaches to handle this.
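P.S. For anyone wanting to gauge the scale first, a count query like the sketch below (assuming Prometheus is the backing TSDB; the label names come from the series shown above) shows the per-io-type breakdown:

```promql
# How many qos_read_io_type series exist per io-type value?
count by (metric) (qos_read_io_type)

# Total series emitted by this one metric across all volumes.
count(qos_read_io_type)
```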
Replies: 1 comment
hi @songlin-rgb, good question. Here are some general thoughts on high cardinality.
Other templates, such as `CIFSSession`, `CIFSShare`, `NetConnections`, etc., can be high cardinality too.

Harvest doesn't have a collection-time option to drop zero-valued series (something like `drop_if_zero: true`?), but you can do that at ingest via recording rules or `metric_relabel_configs`. You could also use this to only keep metrics for specific volumes, …
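As a concrete sketch of the ingest-side filtering mentioned above (assuming Prometheus scrapes the Harvest poller directly; the job name, target address, and volume pattern are placeholders), `metric_relabel_configs` can drop series before they are stored:

```yaml
scrape_configs:
  - job_name: harvest                      # placeholder job name
    static_configs:
      - targets: ["harvest-poller:12990"]  # placeholder poller address
    metric_relabel_configs:
      # Drop the high-cardinality metric entirely at ingest.
      - source_labels: [__name__]
        regex: qos_read_io_type
        action: drop
      # Or drop qos series only for volumes matching a pattern
      # (vol_tmp.* is a hypothetical naming convention).
      - source_labels: [__name__, volume]
        regex: qos_.*;vol_tmp.*
        action: drop
```

A `keep` action can invert this into an allowlist of volumes, but note that `keep` discards every non-matching series in the scrape job, so scope it carefully.

For the recording-rule route, here's a sketch that rolls per-volume series up to a coarser level (the `cluster`/`svm` labels are assumptions; adjust to the labels in your environment):

```yaml
groups:
  - name: harvest-qos-rollups
    rules:
      # Aggregate away the volume label, keeping per-io-type totals.
      - record: svm:qos_read_io_type:sum
        expr: sum by (cluster, svm, metric) (qos_read_io_type)
```

Keep in mind that recording rules only create new aggregated series; they don't remove the raw per-volume ones, so pair them with relabeling (or shorter retention) if the raw series must not be ingested at all.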