K8SPS-686: fail backup if restore is running #1284
Conversation
Pull request overview
This PR introduces a Kubernetes Lease-based locking mechanism to prevent concurrency between restores and backups for the same PerconaServerMySQL cluster, ensuring backups fail when a restore is in progress and restores wait while backups are running.
Changes:
- Add `pkg/k8s` Lease helpers to acquire/release/query a "restore lock" using a `coordination.k8s.io/v1` Lease.
- Update restore and backup controllers to enforce mutual exclusion (restore blocks on running backups; backup fails when the restore lease is active).
- Extend unit test coverage and update RBAC manifests to allow Lease access.
Reviewed changes
Copilot reviewed 12 out of 12 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| pkg/k8s/lease.go | New Lease helper utilities for restore locking |
| pkg/k8s/lease_test.go | Unit tests for Lease helper behavior |
| pkg/controller/psrestore/controller.go | Acquire/Release restore lease; block restore on running backups |
| pkg/controller/psrestore/controller_test.go | Add test case for “running backup blocks restore” |
| pkg/controller/psbackup/controller.go | Fail backup creation when an active restore lease exists |
| pkg/controller/psbackup/controller_test.go | Add test case for “active restore lease blocks backup” |
| deploy/rbac.yaml | Grant operator Role permissions for Leases |
| deploy/cw-rbac.yaml | Grant CW Role permissions for Leases |
| deploy/cw-bundle.yaml | Bundle update for Lease RBAC |
| deploy/bundle.yaml | Bundle update for Lease RBAC |
| config/rbac/role.yaml | Controller role updated for Lease permissions |
| config/rbac/cluster/role.yaml | Cluster role updated for Lease permissions |
pkg/k8s/lease.go (outdated)
```go
	return errors.Wrap(err, "create lease")
}

result = lease.DeepCopy()
return nil
```
AcquireRestoreLease can race when multiple restores attempt to create the Lease at the same time: if cl.Create returns an AlreadyExists error, RetryOnConflict will not retry (it only retries conflicts), so the reconciler will fail even though it could recover by re-getting/updating the existing Lease. Consider handling apierrors.IsAlreadyExists(err) by re-fetching and continuing, or switching to controllerutil.CreateOrUpdate / a Patch-based create-or-update flow.
Suggested change:
```diff
-	return errors.Wrap(err, "create lease")
-}
-result = lease.DeepCopy()
-return nil
+	if !k8serrors.IsAlreadyExists(err) {
+		return errors.Wrap(err, "create lease")
+	}
+	lease = &coordv1.Lease{}
+	if err := cl.Get(ctx, nn, lease); err != nil {
+		return errors.Wrap(err, "get existing lease after create")
+	}
+} else {
+	result = lease.DeepCopy()
+	return nil
+}
```
```go
lease, err := k8s.AcquireRestoreLease(ctx, r.Client, cr)
if err != nil {
	return ctrl.Result{}, errors.Wrap(err, "acquire restore lease")
}
if lease.Spec.HolderIdentity != nil && *lease.Spec.HolderIdentity != cr.Name {
	status.State = apiv1.RestoreNew
	status.StateDesc = fmt.Sprintf("PerconaServerMySQLRestore %s is already running", *lease.Spec.HolderIdentity)
	log.Info("PerconaServerMySQLRestore is already running", "restore", *lease.Spec.HolderIdentity)
	return ctrl.Result{RequeueAfter: 5 * time.Second}, nil
}
```
The restore lock relies on LeaseDurationSeconds (30s) and only updates RenewTime when the restore reconciles. During long-running restores there may be long stretches without reconcile events (Job status often changes only at start and finish), so the lease can expire while the restore is still Running, and backups would no longer be blocked. To make the lock effective, either requeue periodically while the restore is Starting/Running to renew the lease (with RequeueAfter shorter than the lease duration), or use a substantially longer duration.
```go
func (r *PerconaServerMySQLRestoreReconciler) getRunningBackup(ctx context.Context, cr *apiv1.PerconaServerMySQLRestore) (*apiv1.PerconaServerMySQLBackup, error) {
	backups := new(apiv1.PerconaServerMySQLBackupList)
	if err := r.List(ctx, backups, client.InNamespace(cr.Namespace)); err != nil {
		return nil, errors.Wrap(err, "list backups")
	}

	for _, b := range backups.Items {
		if b.Spec.ClusterName != cr.Spec.ClusterName {
			continue
		}

		switch b.Status.State {
		case apiv1.BackupStarting, apiv1.BackupRunning:
			return &b, nil
		}
```
getRunningBackup currently lists all backups in the namespace and then filters in-memory. This is O(N) per reconcile and can become expensive as backup history grows. A field indexer for PerconaServerMySQLBackup spec.clusterName is already set up (see cmd/manager/main.go), so you can use client.MatchingFields{"spec.clusterName": cr.Spec.ClusterName} along with client.InNamespace(...) to query only backups for the target cluster.
pkg/k8s/lease.go (outdated)
```go
Spec: coordv1.LeaseSpec{
	HolderIdentity:       &cr.Name,
	LeaseDurationSeconds: ptr.To(restoreLeaseDurationSeconds),
	AcquireTime:          &metav1.MicroTime{Time: now},
	RenewTime:            &metav1.MicroTime{Time: now},
},
```
LeaseSpec.HolderIdentity is being set using &cr.Name. Using ptr.To(cr.Name) (or an explicit local variable) avoids taking the address of a field on another object and makes the intent clearer (and prevents accidental aliasing if the source struct is ever reused/mutated).
```go
if err := k8s.ReleaseRestoreLease(ctx, r.Client, cr); err != nil {
	log.Error(err, "failed to release restore lease")
}
```
Doesn't this mean we'll release the lease even when the restore doesn't finish in a single reconciliation?
ReleaseRestoreLease has a status check and releases the lease only for finished restores:
percona-server-mysql-operator/pkg/k8s/lease.go
Lines 86 to 89 in 84f2a5d
```go
}

defer func() {
	if err := k8s.ReleaseRestoreLease(ctx, r.Client, cr); err != nil {
```
We also need a finalizer to release the lease if the restore is deleted while in the Running state.
I don’t think we need a finalizer. It could cause problems. The lease remains active for 30 seconds. If the restore is deleted while in the running state, another restore will be started after that period.
Pull request overview
Copilot reviewed 12 out of 12 changed files in this pull request and generated 1 comment.
```go
	return ctrl.Result{}, nil
}

lease, err := k8s.GetLease(ctx, r.Client, cr.Namespace, cr.Spec.ClusterName)
```
Are the parameters correct? I think name and namespace are swapped. Also, shouldn't we pass naming.RestoreLeaseName?
```go
}

lease, err := k8s.GetLease(ctx, r.Client, cr.Namespace, cr.Spec.ClusterName)
if err != nil {
```
Should we ignore a NotFound error? In normal cases the restore lease won't be present, right?
```go
for _, b := range backups.Items {
	if b.Spec.ClusterName != cr.Spec.ClusterName {
		continue
	}
```
We have a field indexer on clusterName.
```go
	log.Info("PerconaServerMySQLRestore is already running", "restore", restore.Name)
	return ctrl.Result{RequeueAfter: 5 * time.Second}, nil
}
backup, err := r.getRunningBackup(ctx, cr)
```
I think we might still have a TOCTOU race: when you start a backup and restore at the same time, what happens if the restore lease is acquired between the BackupNew -> BackupStarting transition window?
Consider the following sequence:
- T1: backup starts -> no restore lease -> proceeds
- T2: restore starts, acquires lease, checks backup is still BackupNew -> proceeds
- T3: backup and restore are running at the same time
This might become more evident if the informer caches are not updated fast enough
```go
defer func() {
	switch cr.Status.State {
	case apiv1.RestoreNew, apiv1.RestoreStarting, apiv1.RestoreRunning:
```
Should we be checking status.State instead? That's the copy we modify during the course of the reconciliation; cr.Status is what was observed before reconciling.
Pull request overview
Copilot reviewed 14 out of 14 changed files in this pull request and generated 4 comments.
```go
lease, err := k8s.GetLease(ctx, r.Client, cr.Namespace, cr.Spec.ClusterName)
if err != nil {
	return rr, errors.Wrap(err, "get restore lease")
}
if k8s.IsLeaseActive(lease) {
	status.State = apiv1.BackupError
	status.StateDesc = fmt.Sprintf("backup cannot run while restore %s is in progress", *lease.Spec.HolderIdentity)
	return ctrl.Result{}, nil
}
```
```go
obj: []client.Object{
	&coordv1.Lease{
		ObjectMeta: metav1.ObjectMeta{
			Name: "restore-lock-" + cr.Spec.ClusterName,
```
pkg/k8s/lease.go (outdated)
```go
	"sigs.k8s.io/controller-runtime/pkg/controller/controllerutil"
)

const leaseDurationSeconds int32 = 30
```
Do we actually need renew logic for our use case? IIUC, we just want to use the Lease as a global lock: when a restore starts, acquire it; when it is done, release it. Why does it need to periodically send a heartbeat?
The stale-checking logic already covers the case where a restore completed but failed to release the lock.
I agree. With these changes we don't need the renew logic: 22ab993
Pull request overview
Copilot reviewed 15 out of 15 changed files in this pull request and generated 2 comments.
```go
for _, backup := range backupList.Items {
	switch backup.Status.State {
	case apiv1.BackupStarting, apiv1.BackupRunning:
		return &backup, nil
```
```go
if err := k8s.AcquireLease(ctx, r.Client, naming.RestoreLeaseName(cr.Spec.ClusterName), cr.Name, cr.Namespace, checkStale); err != nil {
	if errors.Is(err, k8s.ErrLeaseAlreadyHeld) || k8serrors.IsAlreadyExists(err) || k8serrors.IsConflict(err) {
		status.State = apiv1.RestoreNew
		status.StateDesc = fmt.Sprintf("PerconaServerMySQLRestore %s is already running", leaseHolder)
		log.Info("PerconaServerMySQLRestore is already running", "restore", leaseHolder)
```
commit: 355d80f
https://perconadev.atlassian.net/browse/K8SPS-686
DESCRIPTION
This PR prevents backup and restore from running at the same time for the same cluster.
The operator creates a Lease for an active restore and blocks backups while that lease is held. It also makes restores wait while a backup is still running.
CHECKLIST
Jira
- Does the Jira ticket have the proper statuses for documentation (Needs Doc) and QA (Needs QA)?
Tests
Config/Logging/Testability