Requeue zone update when context is cancelled #1965
2 issues found across 8 files
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="internal/controller/operator/controllers_test.go">
<violation number="1" location="internal/controller/operator/controllers_test.go:355">
P2: This test uses an unreachable `context.Canceled`+`ErrZone` error shape, so it does not verify the real requeue-on-zone-cancel path.</violation>
</file>
<file name="internal/controller/operator/factory/vmdistributed/zone.go">
<violation number="1" location="internal/controller/operator/factory/vmdistributed/zone.go:406">
P1: `WithCancelCause` alone is not enough here: `wait.PollUntilContextCancel` only returns `ctx.Err()`, so `ErrZone` never reaches the reconcile code and the new requeue path will not trigger.</violation>
</file>
998bb02 to 3e2db8b
…ing for vmagent queue to be drained
303cbe5 to d35cbc1
This solution most likely doesn't cover the case described in the issue, where VMCluster reconciliation returned context.Canceled for some reason and VMDistributed waits for its readiness forever.
contextCancelErrorsTotal.Inc()
var errZone *vmdistributed.ErrZone
if errors.As(err, &errZone) {
	return ctrl.Result{Requeue: true}, nil
Why not just ignore the cause from cmd/main.go and requeue for all other cases?
In that case the other causes are not needed.
Good idea, thanks
Also, let's keep only this cancelWithCause and drop the rest.
I mean all the WithCancelCause calls added in this PR. Let's drop all of them except the one that actually impacts reconcile behaviour. #1964 was initially made for a different purpose: it keeps a single function for reconcile error handling and processes all reconcile errors in that function.
7303054 to 62338e1
62338e1 to fa3e1b3
Attach a more detailed error every time we cancel the context. Requeue the request if the cancellation occurred during zone processing. This would prevent some zones from being left untouched, as otherwise the controller would restart from scratch.
Fixes #1962