I believe SQL-backed instances are not affected since all pods share state through the external database, but I haven't tested it.
This affects PersesDashboard, PersesDatasource, and PersesGlobalDatasource equally.
The operator has the correct RBAC to list pods but does not currently use it in the sync path.
This is distinct from #286 ("New Perses instances do not receive existing dashboards/datasources"), which addressed multiple separate Perses instances (separate CRs) not receiving existing resources. That was a missing watch/trigger problem, fixed by the PersesAvailabilityPredicate. This issue is about multiple replicas within a single Perses instance: the reconciliation triggers correctly, but the HTTP push only reaches one pod because it goes through the Service.
Is there an existing issue for this?
What happened?
Description
When a Perses instance is configured with file-based storage (spec.config.database.file) and replicas > 1, dashboards, datasources, and global datasources are only pushed to a single pod instead of all replicas. This is because the operator syncs resources through the Kubernetes ClusterIP Service, which load-balances to one random pod. Since each pod has its own independent file-based storage (separate PVCs via the StatefulSet, or separate emptyDir volumes), the other pods never receive the resources.
This means users hitting the Perses UI through the Service will get inconsistent results depending on which pod they land on. Some requests return the dashboard, others return 404.
The same issue affects the delete path: deleting a CR only removes the resource from whichever pod the Service routes to.
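The behaviour described above can be illustrated with a small in-memory model. This is only a sketch, not the operator's code: `Pod`, `Service`, `push_via_service`, `delete_via_service`, and `push_to_all` are hypothetical names, and the random choice stands in for the Service's load balancing. It contrasts a push routed through the Service with a push sent to every pod, which is one conceivable direction given that the operator can already list pods.

```python
import random


class Pod:
    """A replica with its own private store (models per-pod file storage)."""

    def __init__(self, name):
        self.name = name
        self.dashboards = set()


class Service:
    """Stand-in for a ClusterIP Service: each request reaches one pod."""

    def __init__(self, pods, rng):
        self.pods = pods
        self.rng = rng

    def pick(self):
        return self.rng.choice(self.pods)


def push_via_service(service, dashboard):
    # One request through the Service lands on exactly one pod.
    service.pick().dashboards.add(dashboard)


def delete_via_service(service, dashboard):
    # The delete path has the same limitation: only one pod is touched.
    service.pick().dashboards.discard(dashboard)


def push_to_all(pods, dashboard):
    # Hypothetical alternative: address every pod individually.
    for pod in pods:
        pod.dashboards.add(dashboard)


def pods_with(pods, dashboard):
    """Names of the pods that currently hold the dashboard."""
    return [pod.name for pod in pods if dashboard in pod.dashboards]


if __name__ == "__main__":
    pods = [Pod("pod-0"), Pod("pod-1")]
    service = Service(pods, random.Random(0))

    push_via_service(service, "demo")
    print("after Service push:", pods_with(pods, "demo"))

    push_to_all(pods, "demo")
    print("after per-pod push:", pods_with(pods, "demo"))
```

In this model a single push through the Service always leaves the other replicas empty, while a later delete through the Service can remove the dashboard from at most one pod.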
Steps to Reproduce
Create a Perses instance with file-based storage and 2+ replicas:
Wait for both pods to be ready.
Create a PersesDashboard targeting that instance:
Port-forward to each pod individually and query the dashboards API:
Expected Result
Both pods return the dashboard.
Actual Result
Only one pod has the dashboard; the other returns an empty list (or errors because the project doesn't exist on that pod either).
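To make the comparison between expected and actual results explicit, the per-pod listings can be diffed. The helper below is a minimal sketch: `missing_by_pod` is a hypothetical name, and its input is assumed to be a mapping from pod name to the dashboard names returned by that pod when queried individually.

```python
def missing_by_pod(listings):
    """Return {pod: sorted dashboards present on another pod but not this one}."""
    # Union of every dashboard name seen on any pod.
    all_names = set()
    for names in listings.values():
        all_names.update(names)

    report = {}
    for pod, names in listings.items():
        missing = sorted(all_names - set(names))
        if missing:
            report[pod] = missing
    return report


if __name__ == "__main__":
    # Example input matching the actual result described above (names are illustrative).
    print(missing_by_pod({"pod-0": ["demo"], "pod-1": []}))
```

An empty report corresponds to the expected result, where every replica holds the same dashboards.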
Perses Operator Version
Kubernetes Version
Kubernetes Cluster Type
kind
How did you deploy Perses-Operator?
yaml manifests
Manifests
see above
Perses-operator log output
Anything else?