[BUGFIX] Sync resources to all pods instead of service by rickardsjp · Pull Request #412 · perses/perses-operator

rickardsjp · 2026-06-01T09:17:53Z

Description

For file-based storage with multiple replicas, push resources to each pod individually rather than through the ClusterIP Service. The operator now lists ready pods by label selector, creates an HTTP client per pod IP, and syncs to all of them with best-effort error aggregation.
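The best-effort aggregation described above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `podClient`, `apply`, and `syncToAll` are hypothetical names, and the real controllers use the Perses REST client rather than this stand-in. The idea is to keep going after a pod fails, so that healthy pods still receive the resource, and to report every failure at the end via `errors.Join`.

```go
package main

import (
	"errors"
	"fmt"
)

// podClient is a hypothetical stand-in for the per-pod Perses REST client.
type podClient struct {
	url  string
	fail bool // simulates a pod that rejects the request
}

// apply pretends to push one resource to the pod behind this client.
func (c podClient) apply(name string) error {
	if c.fail {
		return fmt.Errorf("pod %s: failed to apply %q", c.url, name)
	}
	return nil
}

// syncToAll pushes the resource to every client, continues after a failure,
// and returns all failures joined into one error (nil if every pod succeeded).
func syncToAll(clients []podClient, name string) error {
	var errs []error
	for _, c := range clients {
		if err := c.apply(name); err != nil {
			errs = append(errs, err)
		}
	}
	// errors.Join returns nil when errs is empty.
	return errors.Join(errs...)
}

func main() {
	clients := []podClient{
		{url: "http://10.0.0.1:8080"},
		{url: "http://10.0.0.2:8080", fail: true},
		{url: "http://10.0.0.3:8080"},
	}
	if err := syncToAll(clients, "my-dashboard"); err != nil {
		fmt.Println("requeue with error:", err)
	}
}
```

With this shape, a single unreachable pod causes a requeue without blocking the sync to the remaining pods.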

Closes: #409

Type of change

FEATURE (non-breaking change which adds functionality)
ENHANCEMENT (non-breaking change which improves existing functionality)
BUGFIX (non-breaking change which fixes an issue)
BREAKINGCHANGE (fix or feature that would cause existing functionality to not work as expected)
DOC (documentation only)
IGNORE (tooling, build system, CI, etc.)

Verification

Unit tests added/updated
Integration tests added/updated
E2E tests added/updated
Manual testing performed

Checklist

Pull request has a descriptive title and context useful to a reviewer
Code follows project conventions and passes linting
All commits have DCO signoffs

For file-based storage with multiple replicas, the operator must push resources to each pod individually rather than through the Service. This adds the factory method that lists ready pods and creates a client per pod IP.

Signed-off-by: Jeremy Rickards <jeremy.rickards@sap.com>

Refactor dashboard, datasource, and globaldatasource controllers to use CreateClientsForAllPods, pushing resources to each ready pod individually rather than through the round-robin Service.

Signed-off-by: Jeremy Rickards <jeremy.rickards@sap.com>

Signed-off-by: Jeremy Rickards <jeremy.rickards@sap.com>

Copilot

Pull request overview

This PR fixes multi-replica, file-based Perses deployments by syncing dashboards, datasources, and globaldatasources to each ready Perses pod directly, rather than sending requests through the ClusterIP Service (which load-balances each request to a single replica, so the other replicas never receive the resource).

Changes:

Add a CreateClientsForAllPods API in the Perses client factory to list ready pods and create one REST client per pod endpoint.
Update Dashboard/Datasource/GlobalDatasource controllers to sync/delete resources across all returned clients with best-effort error aggregation.
Expand controller RBAC markers to allow listing pods.
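The first change above (filter ready pods, then build one endpoint per pod) might look roughly like the sketch below. It is an illustration under assumptions: the `pod` struct and `podURLs` function are hypothetical and stand in for `corev1.Pod` and the factory code, and `net.JoinHostPort` is swapped in for plain string formatting so that an IPv6 pod IP gets bracketed correctly. Whether the operator needs to support IPv6 pod IPs is not stated in the PR.

```go
package main

import (
	"fmt"
	"net"
	"strconv"
)

// pod is a hypothetical, trimmed-down view of a Kubernetes pod.
type pod struct {
	name  string
	ip    string
	ready bool
}

// podURLs returns one base URL per ready pod that already has an IP.
// Pods that are not ready, or have no IP assigned yet, are skipped.
func podURLs(pods []pod, scheme string, port int, apiPrefix string) []string {
	var urls []string
	for _, p := range pods {
		if !p.ready || p.ip == "" {
			continue
		}
		// JoinHostPort wraps IPv6 literals in brackets, e.g. [fd00::1]:8080.
		hostPort := net.JoinHostPort(p.ip, strconv.Itoa(port))
		urls = append(urls, fmt.Sprintf("%s://%s%s", scheme, hostPort, apiPrefix))
	}
	return urls
}

func main() {
	pods := []pod{
		{name: "perses-0", ip: "10.0.0.1", ready: true},
		{name: "perses-1", ip: "10.0.0.2", ready: false},
		{name: "perses-2", ip: "fd00::1", ready: true},
	}
	for _, u := range podURLs(pods, "http", 8080, "/api") {
		fmt.Println(u)
	}
}
```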

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 7 comments.


File	Description
internal/perses/common/perses_client_factory.go	Adds multi-pod client creation by listing ready pods and building per-pod REST clients.
controllers/dashboards/persesdashboard_controller.go	Adds pod list/watch RBAC marker for dashboard controller.
controllers/dashboards/dashboard_controller.go	Syncs/deletes dashboards across all pod clients instead of a single service client.
controllers/datasources/persesdatasource_controller.go	Adds pod list/watch RBAC marker for datasource controller.
controllers/datasources/datasource_controller.go	Syncs/deletes datasources across all pod clients instead of a single service client.
controllers/globaldatasources/persesglobaldatasource_controller.go	Adds pod list/watch RBAC marker for global datasource controller.
controllers/globaldatasources/globaldatasource_controller.go	Syncs/deletes global datasources across all pod clients instead of a single service client.


+		if !isPodReady(pod) {
+			continue
+		}
+		urlStr := fmt.Sprintf("%s://%s:%d%s", httpProtocol, pod.Status.PodIP, containerPort, perses.Spec.Config.APIPrefix)


 import (
 	"context"
 	"flag"
 	"fmt"
 	"os"

+	corev1 "k8s.io/api/core/v1"
 	"sigs.k8s.io/controller-runtime/pkg/client"


+	if len(errs) > 0 {
+		return subreconciler.RequeueWithErrorAndReason(errors.Join(errs...), persescommon.ReasonBackendError)
+	}


 	if persescommon.HasSecretConfig(datasource.Spec.Client) {
-		_, reason, err := r.syncPersesSecret(ctx, persesClient, datasource)
+		_, _, err := r.syncPersesSecret(ctx, persesClient, datasource)
 		if err != nil {
 			dlog.WithError(err).Errorf("Failed to create datasource secret: %s", datasource.Name)
-			return subreconciler.RequeueWithErrorAndReason(err, reason)
+			return err
 		}


+	if len(errs) > 0 {
+		return subreconciler.RequeueWithErrorAndReason(errors.Join(errs...), persescommon.ReasonBackendError)
 	}


 	if persescommon.HasSecretConfig(globaldatasource.Spec.Client) {
-		_, reason, err := r.syncPersesGlobalSecret(ctx, persesClient, globaldatasource)
+		_, _, err := r.syncPersesGlobalSecret(ctx, persesClient, globaldatasource)
 		if err != nil {
 			gdlog.WithError(err).Errorf("Failed to create globaldatasource secret: %s", globaldatasource.Name)
-			return subreconciler.RequeueWithErrorAndReason(err, reason)
+			return err
 		}


+func (f *PersesClientFactoryWithConfig) CreateClientsForAllPods(ctx context.Context, k8sClient client.Reader, perses persesv1alpha2.Perses) ([]v1.ClientInterface, error) {
+	if perses.Spec.Config.Database.SQL != nil {
+		c, err := f.CreateClient(ctx, k8sClient, perses)
+		if err != nil {
+			return nil, err


slashpai · 2026-06-04T04:17:52Z

How would this work with TLS enabled? If the Perses TLS certs are issued for the Service DNS name, direct pod IP calls might fail validation, IIUC.

Also, if we scale up the replicas, a new replica won't get the updates until a reconciliation is triggered, right?

rickardsjp added 3 commits May 22, 2026 14:24

[BUGFIX] add pod list/watch RBAC for resource controllers

2048c75

Signed-off-by: Jeremy Rickards <jeremy.rickards@sap.com>

ibakshay requested review from Copilot and slashpai June 2, 2026 11:53

Copilot started reviewing on behalf of ibakshay June 2, 2026 11:53

Copilot AI reviewed Jun 2, 2026


slashpai mentioned this pull request Jun 4, 2026

[BUGFIX] Add e2e test for file-storage multi-replica dashboard sync #411

Open

13 tasks

rickardsjp wants to merge 3 commits into perses:main from rickardsjp:fix-multiple-replicas
