fix: reduce noisy PodClique NotFound logs during cascade-delete #641
Open
AsadShahid04 wants to merge 1 commit into
During normal PCS cascade-delete, the PodClique controller receives reconcile requests for already-deleted PodCliques and emits one info-level "PodClique not found" log per deleted object, producing significant log noise at scale.

Lower the GetPodClique not-found log from Info to V(1) (debug) so expected cascade-delete reconciles are silent at the default log level.

Add a contextual Info log in getMinAvailableBreachedPCLQsNotInPCSG so that when a PodClique expected by a live PodCliqueSet replica is missing (unexpected while the parent is not deleting), the situation remains visible: the log records which PodCliques are absent and which replica index was skipped for MinAvailable evaluation.

Extend reconciler_test.go to cover GetPodClique with ignoreNotFound=false, confirming the error is propagated correctly.

Closes ai-dynamo#622

Signed-off-by: OpenClaw Agent <agent@openclaw.local>
Summary
Problem
After the cascade-delete change in #556, deleting a PodCliqueSet removes the PCS and its owned PodCliques. The PodClique controller then receives reconcile events for already-deleted PodCliques and logs an info-level "PodClique not found" message for each one. At scale (e.g. 5000-replica workloads) this produces thousands of expected but noisy log entries during normal deletion.
Solution
operator/internal/controller/utils/reconciler.go — Change the GetPodClique not-found branch from logger.Info to logger.V(1).Info. Normal operators run at verbosity 0, so this message disappears from default logs entirely while remaining accessible with -v=1.
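To illustrate the verbosity gating described above, here is a minimal, standard-library-only sketch. It is not the code from reconciler.go: `leveledLogger` is a hypothetical stand-in for a logr-style logger, `getPodClique` and its `store` map are invented for illustration, and the choice to emit the V(1) message regardless of `ignoreNotFound` is an assumption.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotFound stands in for a Kubernetes NotFound API error (hypothetical).
var errNotFound = errors.New("not found")

// leveledLogger is a tiny stand-in for a logr-style logger: a message logged
// at V(n) is recorded only when n is at or below the configured verbosity.
type leveledLogger struct {
	verbosity int       // configured threshold, e.g. 0 by default
	level     int       // level of this logger instance
	lines     *[]string // recorded output, shared across V() copies
}

func newLogger(verbosity int) leveledLogger {
	return leveledLogger{verbosity: verbosity, lines: &[]string{}}
}

// V returns a copy of the logger that logs at a higher (more verbose) level.
func (l leveledLogger) V(level int) leveledLogger {
	l.level += level
	return l
}

// Info records the message unless the logger's level exceeds the verbosity.
func (l leveledLogger) Info(msg string, kv ...any) {
	if l.level > l.verbosity {
		return
	}
	*l.lines = append(*l.lines, fmt.Sprintf("%s %v", msg, kv))
}

// getPodClique mimics a not-found branch: the message is logged at V(1), and
// the error is returned only when ignoreNotFound is false.
func getPodClique(logger leveledLogger, store map[string]bool, name string, ignoreNotFound bool) error {
	if store[name] {
		return nil
	}
	logger.V(1).Info("PodClique not found", "name", name)
	if ignoreNotFound {
		return nil
	}
	return errNotFound
}

func main() {
	for _, verbosity := range []int{0, 1} {
		logger := newLogger(verbosity)
		err := getPodClique(logger, map[string]bool{}, "pclq-a", false)
		fmt.Printf("verbosity=%d err=%v logged=%d\n", verbosity, err, len(*logger.lines))
	}
}
```

In this sketch the error is returned at both verbosity settings; only the log line is gated, so it is recorded at verbosity 1 and dropped at verbosity 0.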
operator/internal/controller/podcliqueset/components/podcliquesetreplica/gangterminate.go — In getMinAvailableBreachedPCLQsNotInPCSG, the function already collects the names of expected PodCliques that are missing. Previously it silently skipped the replica's MinAvailable evaluation. Now it first emits an Info log naming the missing PodCliques and the affected replica index. This is the correct level here: the PCS is alive and not deleting, so a missing PodClique is a transient but observable state (e.g. creation still in flight), not a guaranteed cascade-delete.
Testing
Closes #622