PR for the next 0.68.x release #1979

Open
AndrewChubatiuk wants to merge 1 commit into release-0.68 from release-0.68-next-release

AndrewChubatiuk commented Mar 17, 2026

PR for next 0.68.x release


Summary by cubic

Make PVC handling safer by waiting until claims are bound and fully resized before proceeding. Adds configurable wait intervals/timeouts and standardizes readiness waits across reconcile paths.

  • New Features

    • Added PVC readiness polling config: VM_PVC_WAIT_READY_INTERVAL, VM_PVC_WAIT_READY_TIMEOUT.
    • Added VM CR status polling config: VM_WAIT_READY_INTERVAL.
    • Documented in docs/env.md; existing VM_APPREADYTIMEOUT, VM_PODWAITREADYINTERVALCHECK, VM_PODWAITREADYTIMEOUT are now applied consistently.
  • Bug Fixes

    • Wait for PVC to be Bound, capacity >= requested, and not Resizing in both PVC reconcile and StatefulSet PVC expansion.
    • Added conflict-retry around PVC get/update to avoid resourceVersion races.
    • Refactored reconcile.Init to accept *config.BaseOperatorConf; Deployments/DaemonSets/StatefulSets now use unified appWaitReadyTimeout and podWaitReadyInterval.

Written for commit 1efa39b. Summary will update on new commits.

AndrewChubatiuk changed the base branch from master to release-0.68 on March 17, 2026, 13:37
AndrewChubatiuk changed the title from "Release 0.68 next release" to "PR for next 0.68.x release" on Mar 17, 2026
AndrewChubatiuk changed the title from "PR for next 0.68.x release" to "PR for the next 0.68.x release" on Mar 17, 2026
cubic-dev-ai bot left a comment

5 issues found across 15 files

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="docs/env.md">

<violation number="1" location="docs/env.md:237">
P3: Narrow this description: `VM_WAIT_READY_INTERVAL` does not apply to all VM CRs, only to the `waitForStatus` loop used for VMAgent, VMCluster, and VMAuth.</violation>
</file>

<file name="internal/controller/operator/factory/reconcile/statefulset_pvc_expand_test.go">

<violation number="1" location="internal/controller/operator/factory/reconcile/statefulset_pvc_expand_test.go:163">
P2: This helper switch makes PVC expansion tests auto-complete by mutating `Status.Capacity` during `Update`, so the resize/wait path is no longer tested realistically.</violation>
</file>

<file name="internal/controller/operator/factory/reconcile/statefulset_pvc_expand.go">

<violation number="1" location="internal/controller/operator/factory/reconcile/statefulset_pvc_expand.go:127">
P1: This wait uses the pre-update PVC size, so resized claims can be treated as ready before expansion has completed.</violation>
</file>

<file name="internal/controller/operator/factory/reconcile/pvc.go">

<violation number="1" location="internal/controller/operator/factory/reconcile/pvc.go:67">
P2: `waitForPVCReady` treats an unprovisioned PVC as ready by returning success when `status.capacity` is empty. Keep polling instead, otherwise new PVCs bypass the new readiness wait entirely.</violation>
</file>

<file name="internal/controller/operator/factory/k8stools/interceptors.go">

<violation number="1" location="internal/controller/operator/factory/k8stools/interceptors.go:49">
P2: Creating VMAuth/VMCluster/VMAgent no longer persists the mocked status, so tests that read them back after `Create` will see empty status.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

| VM_PODWAITREADYTIMEOUT: `80s` <a href="#variables-vm-podwaitreadytimeout" id="variables-vm-podwaitreadytimeout">#</a><br>Defines single pod deadline to wait for transition to ready state |
| VM_PVC_WAIT_READY_INTERVAL: `5s` <a href="#variables-vm-pvc-wait-ready-interval" id="variables-vm-pvc-wait-ready-interval">#</a><br>Defines poll interval for PVC ready check |
| VM_PVC_WAIT_READY_TIMEOUT: `80s` <a href="#variables-vm-pvc-wait-ready-timeout" id="variables-vm-pvc-wait-ready-timeout">#</a><br>Defines poll timeout for PVC ready check |
| VM_WAIT_READY_INTERVAL: `5s` <a href="#variables-vm-wait-ready-interval" id="variables-vm-wait-ready-interval">#</a><br>Defines poll interval for VM CRs |

cubic-dev-ai bot commented Mar 17, 2026

P3: Narrow this description: VM_WAIT_READY_INTERVAL does not apply to all VM CRs, only to the waitForStatus loop used for VMAgent, VMCluster, and VMAuth.

Suggested change
- | VM_WAIT_READY_INTERVAL: `5s` <a href="#variables-vm-wait-ready-interval" id="variables-vm-wait-ready-interval">#</a><br>Defines poll interval for VM CRs |
+ | VM_WAIT_READY_INTERVAL: `5s` <a href="#variables-vm-wait-ready-interval" id="variables-vm-wait-ready-interval">#</a><br>Defines poll interval for status checks of VMAgent, VMCluster and VMAuth CRs |

AndrewChubatiuk force-pushed the release-0.68-next-release branch from 6afecdc to 1efa39b on March 17, 2026, 14:13