K8SPS-683: Perform switchover during smart update by egegunes · Pull Request #1282 · percona/percona-server-mysql-operator

egegunes · 2026-04-07T06:19:51Z

CHANGE DESCRIPTION

Problem:
Short explanation of the problem.

Cause:
Short explanation of the root cause of the issue if applicable.

Solution:
Short explanation of the solution we are providing with this PR.

CHECKLIST

Jira

Is the Jira ticket created and referenced properly?
Does the Jira ticket have the proper statuses for documentation (Needs Doc) and QA (Needs QA)?
Does the Jira ticket link to the proper milestone (Fix Version field)?

Tests

Is an E2E test/test case added for the new feature/change?
Are unit tests added where appropriate?

Config/Logging/Testability

Are all needed new/changed options added to default YAML files?
Are all needed new/changed options added to the Helm Chart?
Did we add proper logging messages for operator actions?
Did we ensure compatibility with the previous version or cluster upgrade process?
Does the change support oldest and newest supported PS version?
Does the change support oldest and newest supported Kubernetes version?

Copilot

Pull request overview

This PR updates the SmartUpdate flow for PerconaServerMySQL StatefulSets to perform an explicit primary switchover before updating the current primary pod, aiming to reduce disruption during upgrades.

Changes:

Refactors SmartUpdate to update secondaries first, then perform switchover, then update the former primary.
Adds switchover helpers for async (Orchestrator) and GR (mysqlsh) clusters, including a wait-for-primary-change step.
Adds unit tests covering primary-candidate selection, revision-change detection, and switchover helpers.
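The ordering described in the first change item (secondaries first, switchover, then the former primary) can be sketched as below. This is a minimal illustration, not the operator's implementation: `updateOrder` and the pod names are hypothetical, and the switchover itself is only marked by a comment.

```go
package main

import "fmt"

// updateOrder returns pod names in the order a switchover-driven rolling
// update would touch them: every secondary first, the current primary last.
// Hypothetical helper for illustration; it is not part of the PR.
func updateOrder(pods []string, primary string) []string {
	order := make([]string, 0, len(pods))
	for _, p := range pods {
		if p != primary {
			order = append(order, p)
		}
	}
	// A switchover to an already-updated secondary would happen here,
	// before the former primary is restarted.
	for _, p := range pods {
		if p == primary {
			order = append(order, p)
		}
	}
	return order
}

func main() {
	pods := []string{"cluster1-mysql-0", "cluster1-mysql-1", "cluster1-mysql-2"}
	fmt.Println(updateOrder(pods, "cluster1-mysql-0"))
}
```

Keeping the primary last means at most one role change is needed per update, assuming the primary does not move for other reasons while secondaries restart.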

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

File	Description
pkg/controller/ps/upgrade.go	Implements switchover-driven SmartUpdate flow and replaces pod deletion waiting logic.
pkg/controller/ps/upgrade_test.go	Adds focused unit tests for candidate selection, stsChanged, and switchover helpers (GR/async).

Comments suppressed due to low confidence (1)

pkg/controller/ps/upgrade.go:281

deletePodAndWait returns an error if cli.Delete returns NotFound. During reconciliation (or after a previous attempt), the pod may already be deleted, and treating NotFound as fatal will unnecessarily fail SmartUpdate. Consider ignoring NotFound errors on delete (e.g., client.IgnoreNotFound) and continuing to the wait loop.

func deletePodAndWait(ctx context.Context, cli client.Client, pod *corev1.Pod, sts *appsv1.StatefulSet) error {
	err := cli.Delete(ctx, pod)
	if err != nil {
		return err
	}
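A minimal sketch of the suggestion above, assuming the intent is simply "an already-deleted pod is not an error". To stay self-contained it does not import controller-runtime: `errNotFound`, `errForbidden`, `ignoreNotFound`, `deleter` and `deletePod` are hypothetical stand-ins, with `ignoreNotFound` mirroring the idea behind `client.IgnoreNotFound`.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotFound stands in for a Kubernetes NotFound API error (assumption).
var errNotFound = errors.New("pod not found")

// errForbidden is a sample non-NotFound error used only for the demo.
var errForbidden = errors.New("forbidden")

// ignoreNotFound turns a NotFound error into nil and returns any other
// error unchanged, which is the behavior the review comment asks for.
func ignoreNotFound(err error) error {
	if errors.Is(err, errNotFound) {
		return nil
	}
	return err
}

// deleter is a hypothetical, minimal stand-in for the client's Delete call.
type deleter func(name string) error

// deletePod treats an already-deleted pod as success so that the caller
// can continue to its wait loop.
func deletePod(del deleter, name string) error {
	if err := ignoreNotFound(del(name)); err != nil {
		return fmt.Errorf("delete pod %s: %w", name, err)
	}
	return nil
}

func main() {
	alreadyGone := func(string) error { return errNotFound }
	denied := func(string) error { return errForbidden }
	fmt.Println(deletePod(alreadyGone, "cluster1-mysql-0") == nil) // true
	fmt.Println(deletePod(denied, "cluster1-mysql-0") == nil)      // false
}
```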

Copilot · 2026-04-07T12:08:34Z

pkg/controller/ps/upgrade.go

+			return err
+		}
+
+		if !strings.HasPrefix(primHost, target.Name) {


switchOverAndWait checks the new primary via strings.HasPrefix(primHost, target.Name). Pod names are of the form <cluster>-mysql-<idx> (e.g. ...-1, ...-10), so HasPrefix can match the wrong pod (...-1 is a prefix of ...-10). Parse the hostname (e.g. split on .) and compare the pod name segment for equality with target.Name instead.

Suggested change

-		if !strings.HasPrefix(primHost, target.Name) {
+		hostParts := strings.SplitN(primHost, ".", 2)
+		if len(hostParts) == 0 || hostParts[0] != target.Name {
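The prefix collision described in this comment is easy to reproduce in isolation. The sketch below is illustrative only: `isPodHost` is a hypothetical helper, the hostname is made up, and `strings.Cut` (Go 1.18+) is used in place of the `SplitN` call from the suggestion.

```go
package main

import (
	"fmt"
	"strings"
)

// isPodHost reports whether the first DNS label of host equals podName
// exactly, rather than merely starting with it.
func isPodHost(host, podName string) bool {
	label, _, _ := strings.Cut(host, ".")
	return label == podName
}

func main() {
	// Hypothetical primary hostname, for illustration only.
	host := "cluster1-mysql-10.cluster1-mysql.default"

	fmt.Println(strings.HasPrefix(host, "cluster1-mysql-1")) // true (false positive)
	fmt.Println(isPodHost(host, "cluster1-mysql-1"))          // false
	fmt.Println(isPodHost(host, "cluster1-mysql-10"))         // true
}
```

Comparing the whole label also works when the host is a bare pod name with no domain part, since `strings.Cut` then returns the input unchanged.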

Copilot · 2026-04-07T12:08:34Z

pkg/controller/ps/upgrade.go

+	orcPod, err := getReadyOrcPod(ctx, r.Client, cr)
+	if err != nil {
+		return errors.Wrap(err, "get ready orchestrator pod")
+	}


switchOverAsync always requires a ready Orchestrator pod. Async clusters can have orchestrator disabled when spec.unsafe.orchestrator is set (see PerconaServerMySQL.OrchestratorEnabled()), and in that configuration SmartUpdate will now fail with "get ready orchestrator pod" instead of progressing. Consider guarding this path (e.g. skip switchover / fall back to deleting the primary) when orchestrator isn’t enabled/available.

egegunes · Apr 8, 2026

If you disable Orchestrator, our CRD validations won't let you use SmartUpdate.

Copilot · 2026-04-07T12:08:35Z

pkg/controller/ps/upgrade.go

+func (r *PerconaServerMySQLReconciler) switchOverAndWait(
+	ctx context.Context,
+	cr *apiv1.PerconaServerMySQL,
+	primary *corev1.Pod, target *corev1.Pod,
+) error {
+	log := logf.FromContext(ctx)
+
+	log.Info("switchover", "current", primary.Name, "target", target.Name)
+


New switchover orchestration in switchOverAndWait (including the retry loop that polls getPrimaryHost and the GR primary-label reconciliation) is not covered by tests; current tests only cover switchOverGR/switchOverAsync in isolation. Adding a focused unit test for switchOverAndWait would help prevent regressions in the wait/verification logic.
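One way to make such a wait loop unit-testable is to inject the primary lookup as a function. The sketch below assumes that approach and is not taken from the PR: `primaryLookup`, `waitForPrimary`, `fakeLookup` and `runDemo` are hypothetical names, and the real loop may treat lookup errors, timeouts and GR label reconciliation differently.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// primaryLookup is a hypothetical seam standing in for getPrimaryHost.
type primaryLookup func(ctx context.Context) (string, error)

// waitForPrimary polls lookup until it reports want, the attempts run out,
// or the context is cancelled.
func waitForPrimary(ctx context.Context, lookup primaryLookup, want string, attempts int, interval time.Duration) error {
	for i := 0; i < attempts; i++ {
		host, err := lookup(ctx)
		if err != nil {
			return fmt.Errorf("get primary host: %w", err)
		}
		if host == want {
			return nil
		}
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-time.After(interval):
		}
	}
	return errors.New("primary did not change to target")
}

// fakeLookup reports old for the first n calls and next afterwards.
func fakeLookup(old, next string, n int) primaryLookup {
	calls := 0
	return func(context.Context) (string, error) {
		calls++
		if calls <= n {
			return old, nil
		}
		return next, nil
	}
}

// runDemo wires the fake into the wait loop with a short interval.
func runDemo(staleCalls, attempts int) error {
	lookup := fakeLookup("cluster1-mysql-0", "cluster1-mysql-1", staleCalls)
	return waitForPrimary(context.Background(), lookup, "cluster1-mysql-1", attempts, time.Millisecond)
}

func main() {
	fmt.Println(runDemo(2, 5) == nil) // primary changes on the third poll
	fmt.Println(runDemo(5, 3) == nil) // attempts exhausted first
}
```

With the lookup injected, a test can cover both the success path and the exhausted-retries path without a running cluster.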

JNKPercona · 2026-04-08T12:30:39Z

Test Name	Result	Time
async-ignore-annotations-8-4	passed	00:06:45
async-global-metadata-8-4	passed	00:14:45
async-upgrade-8-0	passed	00:12:58
async-upgrade-8-4	passed	00:13:07
auto-config-8-4	passed	00:26:27
config-8-4	passed	00:22:11
config-router-8-0	passed	00:07:30
config-router-8-4	passed	00:07:44
demand-backup-minio-8-0	passed	00:19:21
demand-backup-minio-8-4	passed	00:20:30
demand-backup-cloud-8-4	passed	00:22:56
demand-backup-retry-8-4	passed	00:17:13
demand-backup-incremental-8-0	passed	00:32:34
demand-backup-incremental-8-4	passed	00:34:57
async-data-at-rest-encryption-8-0	passed	00:13:27
async-data-at-rest-encryption-8-4	passed	00:13:43
gr-global-metadata-8-4	passed	00:14:27
gr-data-at-rest-encryption-8-0	passed	00:14:26
gr-data-at-rest-encryption-8-4	passed	00:14:26
gr-demand-backup-minio-8-4	passed	00:12:44
gr-demand-backup-cloud-8-4	passed	00:22:25
gr-demand-backup-haproxy-8-4	passed	00:10:10
gr-demand-backup-incremental-8-0	passed	00:35:32
gr-demand-backup-incremental-8-4	passed	00:35:46
gr-finalizer-8-4	passed	00:06:21
gr-haproxy-8-0	passed	00:05:06
gr-haproxy-8-4	passed	00:04:20
gr-ignore-annotations-8-4	passed	00:05:00
gr-init-deploy-8-0	passed	00:10:16
gr-init-deploy-8-4	passed	00:10:13
gr-one-pod-8-4	passed	00:05:52
gr-recreate-8-4	passed	00:16:51
gr-scaling-8-4	passed	00:08:20
gr-scheduled-backup-8-4	passed	00:16:29
gr-scheduled-backup-incremental-8-4	passed	00:23:17
gr-security-context-8-4	passed	00:10:07
gr-self-healing-8-4	passed	00:21:42
gr-tls-cert-manager-8-4	failure	00:07:20
gr-users-8-4	passed	00:05:27
gr-upgrade-8-0	failure	00:13:08
gr-upgrade-8-4	passed	00:09:47
haproxy-8-0	passed	00:08:20
haproxy-8-4	passed	00:08:17
init-deploy-8-0	passed	00:06:51
init-deploy-8-4	passed	00:05:42
limits-8-4	passed	00:07:26
monitoring-8-4	passed	00:14:35
one-pod-8-0	passed	00:05:50
one-pod-8-4	passed	00:05:34
operator-self-healing-8-4	passed	00:11:51
pvc-resize-8-4	passed	00:05:52
recreate-8-4	failure	00:00:43
scaling-8-4	passed	00:11:20
scheduled-backup-8-0	failure	00:02:14
scheduled-backup-8-4	failure	00:00:49
scheduled-backup-incremental-8-0	failure	00:00:50
scheduled-backup-incremental-8-4	failure	00:00:51
service-per-pod-8-4	passed	00:06:44
sidecars-8-4	passed	00:04:52
smart-update-8-4	failure	00:00:49
storage-8-4	failure	00:00:49
telemetry-8-4	failure	00:00:49
tls-cert-manager-8-4	failure	00:00:51
users-8-0	failure	00:00:50
users-8-4	passed	00:11:37
version-service-8-4	passed	00:19:56

Summary	Value
Tests Run	66/66
Job Duration	06:19:58
Total Test Time	12:50:37

commit: a578dcc
image: perconalab/percona-server-mysql-operator:PR-1282-a578dcca

K8SPS-683: Perform switchover during smart update

86a2f6b

egegunes added this to the v1.1.0 milestone Apr 7, 2026

Copilot AI review requested due to automatic review settings April 7, 2026 06:19

pull-request-size bot added the size/XL 500-999 lines label Apr 7, 2026

Merge branch 'main' into K8SPS-683

6db255f

Copilot started reviewing on behalf of egegunes April 7, 2026 06:20

Copilot AI reviewed Apr 7, 2026

egegunes added 3 commits April 7, 2026 11:23

address review comments

4ca49f7

fix rolling version upgrades

f4f66d2

Merge branch 'main' into K8SPS-683

b5a78d9

Copilot AI review requested due to automatic review settings April 7, 2026 12:02

Copilot started reviewing on behalf of egegunes April 7, 2026 12:03

Copilot AI reviewed Apr 7, 2026

address review comments

a578dcc

hors removed this from the v1.1.0 milestone Apr 15, 2026

egegunes wants to merge 6 commits into main from K8SPS-683


4 participants
