
fix: lower GetPodClique NotFound log to V(1) during expected cascade-delete #636

Open
sridhar-3009 wants to merge 1 commit into ai-dynamo:main from sridhar-3009:fix/noisy-podclique-notfound-logs


@sridhar-3009

Problem

Resolves #622.

During normal PodCliqueSet cascade-delete, the PodClique controller receives reconcile requests for already-deleted PodCliques and logs at info level:

{"level":"info","msg":"PodClique not found","objectKey":{"name":"...","namespace":"default"}}

At scale this produces one log line per deleted PodClique — pure noise with no actionable signal.

Fix

GetPodClique already accepts an ignoreNotFound bool parameter that lets the caller signal that a NotFound result is expected. This change lowers the log level in that branch from Info to V(1) (verbose/debug) and updates the log message to say that the NotFound occurred during cascade-delete.

Unexpected missing PodCliques (ignoreNotFound=false) still propagate as errors through ReconcileWithErrors, preserving visibility.
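
For readers without the diff at hand, the branching described above can be sketched roughly as follows. This is a minimal, stdlib-only illustration and not the actual helper: `store`, `leveledLogger`, `errNotFound`, and the `(bool, error)` return shape are all hypothetical stand-ins for the API client, the logr-style logger, the Kubernetes NotFound error, and the reconcile result that the real code passes through ReconcileWithErrors.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotFound is a hypothetical stand-in for a Kubernetes NotFound API error.
var errNotFound = errors.New("podclique not found")

// leveledLogger is a toy stand-in for a logr-style logger: a message is
// kept only if its level is at or below the configured verbosity.
type leveledLogger struct {
	verbosity int
	lines     []string
}

func (l *leveledLogger) logAt(level int, msg string) {
	if level <= l.verbosity {
		l.lines = append(l.lines, fmt.Sprintf("V(%d) %s", level, msg))
	}
}

// store is a hypothetical in-memory replacement for the API client.
type store map[string]struct{}

func (s store) get(name string) error {
	if _, ok := s[name]; !ok {
		return fmt.Errorf("get %q: %w", name, errNotFound)
	}
	return nil
}

// getPodClique mirrors the behaviour described in this PR under the
// assumptions above. It reports whether the object exists.
//   - NotFound with ignoreNotFound=true: logged at V(1), no error returned.
//   - NotFound with ignoreNotFound=false: the error is propagated.
func getPodClique(s store, lg *leveledLogger, name string, ignoreNotFound bool) (bool, error) {
	err := s.get(name)
	if err == nil {
		return true, nil
	}
	if errors.Is(err, errNotFound) && ignoreNotFound {
		lg.logAt(1, "PodClique not found, expected during cascade-delete: "+name)
		return false, nil
	}
	return false, err
}

func main() {
	s := store{"pclq-a": {}}
	lg := &leveledLogger{verbosity: 0} // default verbosity: V(1) lines are dropped

	// Expected NotFound: no error and nothing logged at default verbosity.
	found, err := getPodClique(s, lg, "pclq-gone", true)
	fmt.Println(found, err, len(lg.lines))

	// Unexpected NotFound: the error reaches the caller.
	_, err = getPodClique(s, lg, "pclq-gone", false)
	fmt.Println(errors.Is(err, errNotFound))
}
```

Raising `verbosity` to 1 in the sketch makes the expected-NotFound line reappear, which corresponds to enabling verbose logging on the real controller.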

Tests

Added TestGetPodClique_IgnoreNotFoundFalse to cover the error-propagation path — ignoreNotFound=false on a missing object must return an error result, not silently succeed.

Before / After

Before: Every deleted PodClique during a cascade emits one info log.

After: Those logs appear only at V(1) (hidden unless verbose logging is enabled). Unexpected missing PodCliques still surface as errors.

…delete

During a normal PodCliqueSet cascade-delete, the PodClique controller
receives reconcile requests for already-deleted PodCliques. The helper
was logging each at info level, producing one log line per deleted
PodClique at scale — significant noise with no actionable signal.

Change: when ignoreNotFound=true (the caller has signalled the NotFound
is expected), log at V(1) instead of Info. Unexpected missing PodCliques
— where ignoreNotFound=false — are still propagated as errors so the
owning reconciler can surface them.

Add TestGetPodClique_IgnoreNotFoundFalse to cover the error-propagation
path (ignoreNotFound=false on a missing object must not silently swallow
the error).

Closes ai-dynamo#622
@copy-pr-bot

copy-pr-bot Bot commented Jun 1, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

Development

Successfully merging this pull request may close these issues.

Reduce noisy PodClique NotFound logs during normal cascade-delete flow
