added the pci device#3
Conversation
WalkthroughThis pull request introduces four new cpuset collectors for cgroup v2 monitoring (cpuset.cpus, cpuset.cpus.effective, cpuset.mems, cpuset.mems.effective), adds a range list parser to expand comma-separated ranges into individual metric entries, disables memory.stat collection by default, and removes cpu.pressure and pids collectors from registration. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~22 minutes Poem
🚥 Pre-merge checks | ✅ 1 | ❌ 2❌ Failed checks (2 warnings)
✅ Passed checks (1 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Fix all issues with AI agents
In `@collector/collector.go`:
- Around line 278-283: The README "Enabled by default" section is missing the
four newly-registered cpuset collectors that are enabled in code; update the
documentation to include "cpuset.cpus", "cpuset.cpus.effective", "cpuset.mems",
and "cpuset.mems.effective" in the Enabled-by-default table/list to match the
registerCollector calls (see registrations for NewCPUSetCpusCollector,
NewCPUSetCpusEffectiveCollector, NewCPUSetMemsCollector,
NewCPUSetMemsEffectiveCollector).
In `@parsers/parsers.go`:
- Around line 33-36: RangeListCountParser currently hardcodes the entry label as
"_cpu_" which mislabels cpuset.mems metrics; add a new field EntryLabel string
to the RangeListCountParser struct and replace any hardcoded "_cpu_" key
construction in its Parse/Format methods with that EntryLabel (falling back to a
sensible default like "cpu" if empty). Update the callers/collectors that
instantiate RangeListCountParser (where cpuset.mems and cpuset.mems.effective
are created) to set EntryLabel to "mem" (or "mems"/appropriate label) so emitted
metric keys are correct; ensure tests or usages that expect the old key are
updated to use the new label field.
- Around line 145-170: The parser currently accepts negative or reversed ranges,
producing silent/incorrect metrics; update the range and single-value handling
in the block that parses bounds and val (where strconv.Atoi is used and metrics
are set with p.MetricPrefix and keys like metricName+"_cpu_"+strconv.Itoa(...))
to explicitly validate that start and end are non-negative and that start <= end
(log via p.Logger.Error with the input r and a clear message and continue if
validation fails), and similarly validate val is non-negative in the
single-value branch (log and continue on failure) so invalid cpuset content is
rejected instead of generating empty or incorrect metrics.
🧹 Nitpick comments (1)
collector/cpu.go (1)
50-119: Consider a small helper to reduce cpuset collector boilerplate.The four constructors are nearly identical; a helper reduces drift and simplifies future changes.
♻️ Example helper
--- a/collector/cpu.go +++ b/collector/cpu.go @@ +func newCPUSetCollector(file string, logger *slog.Logger, cgroups []string) (Collector, error) { + fileLogger := slog.With(logger, "file", file) + return &Cgroupv2FileCollector{ + gaugeVecs: make(map[string]*prometheus.GaugeVec), + counterVecs: make(map[string]*prometheus.CounterVec), + parser: &parsers.RangeListCountParser{ + MetricPrefix: sanitizeP8sName(file), + Logger: fileLogger, + }, + dirNames: cgroups, + fileName: file, + logger: fileLogger, + isCounter: func(metricName string) bool { return false }, + }, nil +} @@ func NewCPUSetCpusCollector(logger *slog.Logger, cgroups []string) (Collector, error) { - file := "cpuset.cpus" - fileLogger := slog.With(logger, "file", file) - - return &Cgroupv2FileCollector{ - gaugeVecs: make(map[string]*prometheus.GaugeVec), - counterVecs: make(map[string]*prometheus.CounterVec), - parser: &parsers.RangeListCountParser{ - MetricPrefix: sanitizeP8sName(file), - Logger: fileLogger, - }, - dirNames: cgroups, - fileName: file, - logger: fileLogger, - isCounter: func(metricName string) bool { return false }, - }, nil + return newCPUSetCollector("cpuset.cpus", logger, cgroups) } @@ func NewCPUSetCpusEffectiveCollector(logger *slog.Logger, cgroups []string) (Collector, error) { - file := "cpuset.cpus.effective" - fileLogger := slog.With(logger, "file", file) - return &Cgroupv2FileCollector{ ... } + return newCPUSetCollector("cpuset.cpus.effective", logger, cgroups) } @@ func NewCPUSetMemsCollector(logger *slog.Logger, cgroups []string) (Collector, error) { - file := "cpuset.mems" - fileLogger := slog.With(logger, "file", file) - return &Cgroupv2FileCollector{ ... } + return newCPUSetCollector("cpuset.mems", logger, cgroups) } @@ func NewCPUSetMemsEffectiveCollector(logger *slog.Logger, cgroups []string) (Collector, error) { - file := "cpuset.mems.effective" - fileLogger := slog.With(logger, "file", file) - return &Cgroupv2FileCollector{ ... } + return newCPUSetCollector("cpuset.mems.effective", logger, cgroups) }
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
collector/collector.gocollector/cpu.goparsers/parsers.go
🧰 Additional context used
🧬 Code graph analysis (1)
collector/collector.go (2)
collector/memory.go (1)
NewMemoryStatCollector(86-104)collector/cpu.go (5)
NewCPUSetCpusCollector(50-66)NewCPUSetCpusEffectiveCollector(68-84)NewCpuStatCollector(11-27)NewCPUSetMemsCollector(86-102)NewCPUSetMemsEffectiveCollector(104-120)
✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.
| registerCollector("memory.stat", defaultDisabled, NewMemoryStatCollector) | ||
| registerCollector("cpuset.cpus", defaultEnabled, NewCPUSetCpusCollector) | ||
| registerCollector("cpuset.cpus.effective", defaultEnabled, NewCPUSetCpusEffectiveCollector) | ||
| registerCollector("cpu.stat", defaultEnabled, NewCpuStatCollector) | ||
| registerCollector("cpu.pressure", defaultEnabled, NewCpuPressureCollector) | ||
| registerCollector("cpuset.mems", defaultEnabled, NewCPUSetMemsCollector) | ||
| registerCollector("cpuset.mems.effective", defaultEnabled, NewCPUSetMemsEffectiveCollector) |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
cd /tmp && git log --oneline --all | head -20Repository: asama-ai/cgroupv2_exporter
Length of output: 138
🏁 Script executed:
# First, let's understand the current state and look for git diff to see what actually changed
git diff HEAD~1 HEAD -- collector/collector.go | head -100Repository: asama-ai/cgroupv2_exporter
Length of output: 98
🏁 Script executed:
# Search for documentation references to these collectors
rg -n "memory\.stat|cpuset\.cpus|cpuset\.mems" -S README.md docs || echo "No matches found in docs"Repository: asama-ai/cgroupv2_exporter
Length of output: 184
🏁 Script executed:
# Look for changelog or release notes
fd -e md -e txt | xargs rg -l "memory\.stat|cpuset|behavior.*change|default.*disabled|default.*enabled" 2>/dev/null | head -20Repository: asama-ai/cgroupv2_exporter
Length of output: 79
🏁 Script executed:
# Examine the collector.go file to understand the defaults pattern
cat -n collector/collector.go | sed -n '260,300p'Repository: asama-ai/cgroupv2_exporter
Length of output: 1620
🏁 Script executed:
# Search for where defaultDisabled and defaultEnabled are defined
rg -n "defaultDisabled|defaultEnabled" collector/collector.go | head -30Repository: asama-ai/cgroupv2_exporter
Length of output: 1114
🏁 Script executed:
# Read README.md to see documentation context
cat -n README.md | head -60Repository: asama-ai/cgroupv2_exporter
Length of output: 1452
🏁 Script executed:
# Look for changelog or release notes
fd -e md -e txt -e rst | grep -iE "change|release|news|history" || echo "Checking all markdown files..."
fd -e md | xargs cat | grep -i "memory.stat\|cpuset" | head -20Repository: asama-ai/cgroupv2_exporter
Length of output: 115
🏁 Script executed:
# Search for how collectors are enabled/disabled and if there's CLI flag handling
rg -n "collector.*flag|enable.*collector|disable.*collector" -A 3 | head -50Repository: asama-ai/cgroupv2_exporter
Length of output: 1919
🏁 Script executed:
# Look for the registerCollector function to understand how defaults work
rg -n "func registerCollector" -A 20Repository: asama-ai/cgroupv2_exporter
Length of output: 1353
🏁 Script executed:
# Search for any documentation about default collector behavior
rg -n "defaultEnabled|defaultDisabled" collector/collector.go -B 2 -A 2 | head -40Repository: asama-ai/cgroupv2_exporter
Length of output: 1170
Documentation incomplete: add cpuset collectors to README's "Enabled by default" section.
The code correctly defaults memory.stat to disabled (which matches the README), but registers the new cpuset.cpus, cpuset.cpus.effective, cpuset.mems, and cpuset.mems.effective collectors as enabled by default. These four collectors are missing from the README's "Enabled by default" table and should be added.
🤖 Prompt for AI Agents
In `@collector/collector.go` around lines 278 - 283, The README "Enabled by
default" section is missing the four newly-registered cpuset collectors that are
enabled in code; update the documentation to include "cpuset.cpus",
"cpuset.cpus.effective", "cpuset.mems", and "cpuset.mems.effective" in the
Enabled-by-default table/list to match the registerCollector calls (see
registrations for NewCPUSetCpusCollector, NewCPUSetCpusEffectiveCollector,
NewCPUSetMemsCollector, NewCPUSetMemsEffectiveCollector).
| type RangeListCountParser struct { | ||
| MetricPrefix string | ||
| Logger *slog.Logger | ||
| } |
There was a problem hiding this comment.
Make the entry label configurable (cpuset.mems is currently mislabeled as cpu).
RangeListCountParser always emits keys with _cpu_, but the same parser is used by cpuset.mems and cpuset.mems.effective. That mislabels memory node metrics and can confuse dashboards/alerts.
🔧 Proposed fix (add configurable label and set it in collectors)
--- a/parsers/parsers.go
+++ b/parsers/parsers.go
@@
type RangeListCountParser struct {
MetricPrefix string
Logger *slog.Logger
+ EntryLabel string // e.g., "cpu" or "mem"
}
@@
- for i := start; i <= end; i++ {
- metricName := p.MetricPrefix // e.g., "cpuset_cpu_enabled"
- // Key includes CPU as pseudo-label
- metrics[metricName+"_cpu_"+strconv.Itoa(i)] = 1
- }
+ label := p.EntryLabel
+ if label == "" {
+ label = "cpu"
+ }
+ for i := start; i <= end; i++ {
+ metricName := p.MetricPrefix
+ metrics[metricName+"_"+label+"_"+strconv.Itoa(i)] = 1
+ }
@@
- metricName := p.MetricPrefix
- metrics[metricName+"_cpu_"+strconv.Itoa(val)] = 1
+ label := p.EntryLabel
+ if label == "" {
+ label = "cpu"
+ }
+ metricName := p.MetricPrefix
+ metrics[metricName+"_"+label+"_"+strconv.Itoa(val)] = 1--- a/collector/cpu.go
+++ b/collector/cpu.go
@@
parser: &parsers.RangeListCountParser{
MetricPrefix: sanitizeP8sName(file),
Logger: fileLogger,
+ EntryLabel: "cpu",
},
@@
parser: &parsers.RangeListCountParser{
MetricPrefix: sanitizeP8sName(file),
Logger: fileLogger,
+ EntryLabel: "cpu",
},
@@
parser: &parsers.RangeListCountParser{
MetricPrefix: sanitizeP8sName(file),
Logger: fileLogger,
+ EntryLabel: "mem",
},
@@
parser: &parsers.RangeListCountParser{
MetricPrefix: sanitizeP8sName(file),
Logger: fileLogger,
+ EntryLabel: "mem",
},Also applies to: 156-170
🤖 Prompt for AI Agents
In `@parsers/parsers.go` around lines 33 - 36, RangeListCountParser currently
hardcodes the entry label as "_cpu_" which mislabels cpuset.mems metrics; add a
new field EntryLabel string to the RangeListCountParser struct and replace any
hardcoded "_cpu_" key construction in its Parse/Format methods with that
EntryLabel (falling back to a sensible default like "cpu" if empty). Update the
callers/collectors that instantiate RangeListCountParser (where cpuset.mems and
cpuset.mems.effective are created) to set EntryLabel to "mem" (or
"mems"/appropriate label) so emitted metric keys are correct; ensure tests or
usages that expect the old key are updated to use the new label field.
| start, err := strconv.Atoi(bounds[0]) | ||
| if err != nil { | ||
| p.Logger.Error("invalid start in range", "input", r, "err", err) | ||
| continue | ||
| } | ||
| end, err := strconv.Atoi(bounds[1]) | ||
| if err != nil { | ||
| p.Logger.Error("invalid end in range", "input", r, "err", err) | ||
| continue | ||
| } | ||
|
|
||
| for i := start; i <= end; i++ { | ||
| metricName := p.MetricPrefix // e.g., "cpuset_cpu_enabled" | ||
| // Key includes CPU as pseudo-label | ||
| metrics[metricName+"_cpu_"+strconv.Itoa(i)] = 1 | ||
| } | ||
|
|
||
| } else { | ||
| val, err := strconv.Atoi(r) | ||
| if err != nil { | ||
| p.Logger.Error("invalid value", "input", r, "err", err) | ||
| continue | ||
| } | ||
| metricName := p.MetricPrefix | ||
| metrics[metricName+"_cpu_"+strconv.Itoa(val)] = 1 | ||
| } |
There was a problem hiding this comment.
Validate range bounds (negative or reversed ranges).
Negative indices and reversed ranges silently produce empty metrics. It’s safer to reject them explicitly so the collector surfaces invalid cpuset content.
🔍 Suggested guardrails
@@
start, err := strconv.Atoi(bounds[0])
if err != nil {
p.Logger.Error("invalid start in range", "input", r, "err", err)
continue
}
end, err := strconv.Atoi(bounds[1])
if err != nil {
p.Logger.Error("invalid end in range", "input", r, "err", err)
continue
}
+ if start < 0 || end < 0 || start > end {
+ p.Logger.Error("invalid range bounds", "input", r, "start", start, "end", end)
+ continue
+ }
@@
- val, err := strconv.Atoi(r)
+ val, err := strconv.Atoi(r)
if err != nil {
p.Logger.Error("invalid value", "input", r, "err", err)
continue
}
+ if val < 0 {
+ p.Logger.Error("invalid value (negative)", "input", r, "value", val)
+ continue
+ }🤖 Prompt for AI Agents
In `@parsers/parsers.go` around lines 145 - 170, The parser currently accepts
negative or reversed ranges, producing silent/incorrect metrics; update the
range and single-value handling in the block that parses bounds and val (where
strconv.Atoi is used and metrics are set with p.MetricPrefix and keys like
metricName+"_cpu_"+strconv.Itoa(...)) to explicitly validate that start and end
are non-negative and that start <= end (log via p.Logger.Error with the input r
and a clear message and continue if validation fails), and similarly validate
val is non-negative in the single-value branch (log and continue on failure) so
invalid cpuset content is rejected instead of generating empty or incorrect
metrics.
Summary by CodeRabbit
New Features
Changes
✏️ Tip: You can customize this high-level summary in your review settings.