Skip to content

controllers: use single error handling for all CRs#1964

Merged
AndrewChubatiuk merged 1 commit intomasterfrom
controllers-use-single-error-handling
Mar 17, 2026
Merged

controllers: use single error handling for all CRs#1964
AndrewChubatiuk merged 1 commit intomasterfrom
controllers-use-single-error-handling

Conversation

@AndrewChubatiuk
Copy link
Contributor

@AndrewChubatiuk AndrewChubatiuk commented Mar 13, 2026

Summary by cubic

Unifies error handling across all controllers and standardizes how status metadata is read from CRs. Adds graceful shutdown handling and better error propagation while keeping NotFound silent and requeuing on cancellations and conflicts.

  • Refactors

    • Removed handleReconcileErrWithoutStatus; all controllers now use a single handleReconcileErr.
    • Moved GetStatusMetadata from Status structs to parent CRs; updated helpers (waitForStatus, status updates) to read metadata from the CR.
    • Added GetStatus/DefaultStatusFields for VMAlertmanagerConfig, VMRule, VMUser, and scrape CRs (VMNodeScrape, VMPodScrape, VMProbe, VMScrapeConfig, VMServiceScrape, VMStaticScrape); standardized controllers to use value instances and pass pointers.
  • Bug Fixes

    • Requeue on context.Canceled unless caused by graceful shutdown (operator.ErrShutdown); keep NotFound silent and back off on conflicts.
    • Emit events only on real errors; consistent error metrics across all CRs.
    • Stop reconciliation on invalid specs via spec.ParsingError; VMRule and VMUser capture JSON parse errors during unmarshal.
    • Propagate errors from related-resource updates while continuing other items (VMAgent/VMSingle scrapes, alert/rule config updates).

Written for commit 551ba04. Summary will update on new commits.

Copy link
Contributor

@cubic-dev-ai cubic-dev-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

4 issues found across 49 files

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="api/operator/v1beta1/vmrule_types.go">

<violation number="1" location="api/operator/v1beta1/vmrule_types.go:249">
P1: Returning nil here hides VMRule JSON decode failures from admission; malformed rules can now be accepted and only fail later in reconciliation.</violation>
</file>

<file name="internal/controller/operator/vmrule_controller.go">

<violation number="1" location="internal/controller/operator/vmrule_controller.go:122">
P1: Don't only log configmap update failures here. A later successful iteration clears the shared named `err`, so this reconcile can return success after partially failing to update `VMAlert` configmaps.</violation>
</file>

<file name="internal/controller/operator/vmuser_controller.go">

<violation number="1" location="internal/controller/operator/vmuser_controller.go:131">
P1: Propagate this `CreateOrUpdateConfig` failure instead of only logging it; otherwise the reconcile returns success and controller-runtime will not retry the VMUser update.</violation>
</file>

<file name="internal/controller/operator/vmalertmanagerconfig_controller.go">

<violation number="1" location="internal/controller/operator/vmalertmanagerconfig_controller.go:75">
P2: Treat `ParsingError` as non-retryable here; returning it through `handleReconcileErr` will put malformed configs into a permanent requeue/event loop.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

@AndrewChubatiuk AndrewChubatiuk force-pushed the controllers-use-single-error-handling branch from a1f67e9 to 66678ab Compare March 13, 2026 13:59
Copy link
Contributor

@cubic-dev-ai cubic-dev-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1 issue found across 2 files (changes from recent commits).

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="internal/controller/operator/controllers.go">

<violation number="1" location="internal/controller/operator/controllers.go:133">
P2: Check `context.Cause(ctx)` here instead of `err`; `WithCancelCause` stores `ErrShutdown` on the context, so graceful shutdowns will usually still be requeued.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

@AndrewChubatiuk AndrewChubatiuk force-pushed the controllers-use-single-error-handling branch 2 times, most recently from 00bc895 to 694ce24 Compare March 13, 2026 20:54
@AndrewChubatiuk AndrewChubatiuk force-pushed the controllers-use-single-error-handling branch 2 times, most recently from 2987f8d to f10cba6 Compare March 17, 2026 11:14
resultErr = err
}
if object != nil && !reflect.ValueOf(object).IsNil() && object.GetNamespace() != "" {
if err != nil && object != nil && !reflect.ValueOf(object).IsNil() && object.GetNamespace() != "" {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not resultErr? Not sure if we want to emit a new event on every GET failure and parsing error

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed resultErr, initially it was added for a different reason

@AndrewChubatiuk AndrewChubatiuk force-pushed the controllers-use-single-error-handling branch from f10cba6 to 551ba04 Compare March 17, 2026 12:02
@AndrewChubatiuk AndrewChubatiuk merged commit 85d57d5 into master Mar 17, 2026
5 checks passed
@AndrewChubatiuk AndrewChubatiuk deleted the controllers-use-single-error-handling branch March 17, 2026 12:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants