Monitors a Kopia backup repository for "physical" backups of PostgreSQL databases and restores them regularly within a Kubernetes cluster.
Warning
This is an internal project of BES.au, used in our data and analytics infrastructure, as well as for backup operations and testing purposes. As such, no guarantees are made about stability beyond our internal usage.
Generate the CRDs:
cargo run --bin gen-crds > crds.yaml
Apply both the CRDs and the operator:
kubectl apply -f crds.yaml
kubectl apply -f operator.yaml
Make a new namespace:
apiVersion: v1
kind: Namespace
metadata:
  name: pgro-example

Create a Secret containing the Kopia repository credentials:
apiVersion: v1
kind: Secret
metadata:
  namespace: pgro-example
  name: kopia-credentials
type: Opaque
stringData:
  bucket: example-bucket
  region: ap-southeast-2
  repositoryPassword: super-secret-repo-password-123
  accessKeyId: AKIAIOSFODNN7EXAMPLE
  secretAccessKey: wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY

Create a PostgreSQL Physical Replica instance:
apiVersion: pgro.bes.au/v1alpha1
kind: PostgresPhysicalReplica
metadata:
  namespace: pgro-example
  name: test
spec:
  kopiaSecretRef:
    name: kopia-credentials
  schedule: '0 */6 * * *'
  snapshotFilter:
    tags:
      area: postgres

This will restore the latest snapshot matching the filter, create a new PostgreSQL instance with the restored data, and then do that again every 6 hours.
There are two CRDs:
- PostgresPhysicalReplica, the main entry point
- PostgresPhysicalRestore, managed by the operator, represents a single restore operation and result
PostgresPhysicalReplica is the main user-facing resource. It defines a continuously refreshed replica of a PostgreSQL database restored from Kopia snapshots.
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| kopiaSecretRef | SecretReference | Yes | — | Reference to a Secret containing Kopia repository credentials (bucket, region, repositoryPassword, accessKeyId, secretAccessKey). |
| snapshotFilter | SnapshotFilter | No | — | Filter criteria to select which Kopia snapshot to restore. |
| schedule | string | Yes | — | Cron expression controlling how often new restores are triggered. |
| scheduleJitter | string | No | "10m" | Random jitter added to scheduled restores (friendly duration, e.g. "5m", "1h"). |
| minimumTtl | string | No | — | Don't restore a new snapshot within this duration of the last restore completing. |
| switchoverGracePeriod | string | No | "5m" | How long to wait before deleting the old restore after a switchover. |
| analyticsUsername | string | No | "analytics" | Username created for analytics connections. |
| storageClass | string | No | — | Kubernetes StorageClass for the restore PVCs. |
| storageSizeOverride | Quantity | No | — | Override dynamic sizing with a fixed PVC size. When absent, PVC size is calculated from snapshot size. |
| resources | ResourceRequirements | No | — | CPU/memory resource requirements for the PostgreSQL pods. |
| serviceAnnotations | map[string]string | No | — | Annotations applied to the Service. |
| podAnnotations | map[string]string | No | — | Annotations applied to the PostgreSQL pods. |
| affinity | Affinity | No | — | Pod scheduling affinity rules. |
| tolerations | []Toleration | No | [] | Pod tolerations. |
| readOnly | bool | No | true | Set the restored database to read-only mode. |
| postgresExtraConfig | string | No | — | Extra lines appended to postgresql.conf (e.g. shared_preload_libraries). |
| notifications | []NotificationConfig | No | [] | Notification targets called on restore events. |
| overlayDatabase | OverlayDatabaseConfig | No | — | Optional overlay database configuration (FDW-based persistent database via CNPG). |
Using the overlayDatabase requires the CloudNative-PG operator to be installed and configured.
Installing the CNPG cluster-level catalogs is optional but recommended.
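To illustrate how several of the optional fields in the table above fit together, here is a sketch of a fuller manifest. The field names come from the table. The values (durations, the StorageClass name, resource amounts, and the preloaded library) are assumptions picked for the example, not recommendations.

```yaml
apiVersion: pgro.bes.au/v1alpha1
kind: PostgresPhysicalReplica
metadata:
  namespace: pgro-example
  name: test
spec:
  kopiaSecretRef:
    name: kopia-credentials
  schedule: '0 */6 * * *'
  # Do not start a new restore within 4 hours of the last one completing.
  minimumTtl: 4h
  # Keep the old restore for 10 minutes after a switchover before deleting it.
  switchoverGracePeriod: 10m
  # Hypothetical StorageClass name; use one that exists in your cluster.
  storageClass: fast-ssd
  resources:
    requests:
      cpu: "1"
      memory: 2Gi
    limits:
      memory: 4Gi
  readOnly: true
  # Extra lines appended to postgresql.conf.
  # Assumption: the library is available in the PostgreSQL image in use.
  postgresExtraConfig: |
    shared_preload_libraries = 'pg_stat_statements'
```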
The cron expression is parsed using the cronexpr crate. It has two interesting features:
- you can append a timezone (we default to UTC): 20 15 * * * Pacific/Auckland;
- you can use H in any field to use an arbitrary quantity which is derived from the replica's identity, e.g. H 15 * * *.
Jitter is applied to the scheduled time after the cron expression is evaluated.
The jitter is a random duration between -time/2 and +time/2.
For example, 10m will result in a jitter between -5m and 5m.
When using H in the cron expression, you might want to set the jitter to zero to properly take advantage of the spread-but-stable behaviour.
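As a worked case of the jitter rule: with the default of 10m, a restore scheduled for 15:00 would start somewhere between 14:55 and 15:05. The fragment below sketches the spread-but-stable setup instead, with H in the minute field and jitter turned off. The value "0m" assumes the friendly-duration parser accepts a zero duration, which is not stated here.

```yaml
spec:
  # H resolves to a value derived from this replica's identity, so the
  # minute is arbitrary but tied to the replica.
  schedule: 'H 15 * * *'
  # Assumption: a zero duration disables jitter.
  scheduleJitter: "0m"
```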
SnapshotFilter fields:
| Field | Type | Required | Description |
|---|---|---|---|
| tags | map[string]string | No | Key-value tags that the snapshot must match. |
| hostPattern | string | No | Glob pattern for filtering snapshot hosts. |
| descriptionPattern | string | No | Glob pattern for filtering snapshot descriptions. |
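A filter using all three fields might look like the sketch below. The tag matches the earlier example. The host and description patterns are hypothetical and depend on how your Kopia snapshots are named. How the criteria combine when more than one is set is not stated here, so check the result against your repository.

```yaml
spec:
  snapshotFilter:
    tags:
      area: postgres
    # Hypothetical host naming scheme.
    hostPattern: 'db-prod-*'
    # Hypothetical description text.
    descriptionPattern: '*nightly*'
```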
NotificationConfig is a tagged union on the target field. Common fields:
| Field | Type | Required | Description |
|---|---|---|---|
| target | "webhook" \| "graphQL" | Yes | Notification target type. |
| url | string | Yes | URL to send the notification to. |
| headers | map[string]HeaderValue | No | HTTP headers. Values can be plain strings or { secretKeyRef: { name, key } }. |
| includePassword | bool | No | Include the database password in the notification payload. |
Additional fields for target: webhook:
| Field | Type | Default | Description |
|---|---|---|---|
| method | string | "POST" | HTTP method. |
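A webhook target with both kinds of header value could be written as in the sketch below. The URL, header names, and Secret name and key are hypothetical. Only the field names and the secretKeyRef shape come from the tables above.

```yaml
spec:
  notifications:
    - target: webhook
      # Hypothetical endpoint.
      url: https://hooks.example.com/pgro
      method: PUT
      headers:
        # Plain string value.
        X-Source: pgro
        # Value read from a Secret; name and key are hypothetical.
        Authorization:
          secretKeyRef:
            name: webhook-auth
            key: token
      # Leave the database password out of the payload.
      includePassword: false
```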
Additional fields for target: graphQL:
| Field | Type | Required | Description |
|---|---|---|---|
| mutation | string | Yes | GraphQL mutation string. |
| variablesTemplate | string | Yes | Template for the GraphQL variables payload. |
OverlayDatabaseConfig configures an overlay database backed by a CNPG Cluster that uses Foreign Data Wrappers to import schemas from the restored replica. This lets you persistently write data to other schemas in the overlay without interfacing between two databases.
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| postgresVersion | uint32 | No | Resolved from image catalog, or 17 | PostgreSQL major version for the CNPG cluster. |
| imageCatalog | ImageCatalogRef | No | ClusterImageCatalog | CNPG image catalog for PG version discovery and image resolution. |
| storageSizeOverride | Quantity | No | Auto-sized | Override for the overlay PVC size. Auto-sizing: 5Gi + ceil(snapshotSize / 10), ratchets up only. |
| storageClass | string | No | — | Storage class for the overlay database PVC. |
| resources | ResourceRequirements | No | — | Resource requirements for the overlay database pods. |
| affinity | Affinity | No | — | Pod affinity rules for the overlay database. |
| tolerations | []Toleration | No | [] | Tolerations for the overlay database pods. |
| serviceAnnotations | map[string]string | No | — | Annotations for the overlay database's -rw Service. |
| schemaMapping | map[string]string | No | All schemas | Schema import mapping. Key = remote schema, Value = local schema in overlay DB. If absent, all user schemas are imported at their original names. |
| importGenerated | bool | No | false | Include GENERATED column expressions when importing foreign schemas. Requires that all functions used in generated columns exist on the overlay database. |
ImageCatalogRef fields:
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| name | string | Yes | — | Name of the image catalog resource. |
| kind | string | No | "ClusterImageCatalog" | Kind of the image catalog (ClusterImageCatalog or ImageCatalog). |
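Putting the overlay fields together, a configuration might look like the sketch below. The catalog name and the schema names are hypothetical. The size arithmetic in the comment follows the auto-sizing formula from the OverlayDatabaseConfig table.

```yaml
spec:
  overlayDatabase:
    imageCatalog:
      # Hypothetical catalog name; it must exist in the cluster.
      name: postgresql
      kind: ClusterImageCatalog
    # Fixed size instead of auto-sizing. For comparison, auto-sizing for a
    # 200Gi snapshot would give 5Gi + ceil(200Gi / 10) = 25Gi.
    storageSizeOverride: 50Gi
    schemaMapping:
      # Key = remote schema, value = local schema in the overlay DB.
      public: replica_public
      # Hypothetical schema, imported under its original name.
      reporting: reporting
```

Because schemaMapping is set here, only the listed schemas are described by the mapping. Omitting it imports all user schemas at their original names, as noted in the table.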
Status fields:
| Field | Type | Description |
|---|---|---|
| phase | Pending \| Restoring \| Ready \| Failed | Current phase of the replica. |
| currentRestore | string | Name of the current PostgresPhysicalRestore resource. |
| previousRestore | string | Name of the previous restore (pending deletion after switchover). |
| serviceName | string | Name of the Kubernetes Service pointing to the active restore. |
| lastRestoreCompletedAt | Time | When the last restore completed. |
| nextScheduledRestore | Time | When the next scheduled restore will occur. |
| latestAvailableSnapshot | string | Snapshot ID of the latest available snapshot matching the filter. |
| connectionInfo | ConnectionInfo | Connection details (host, port, database, username, password secret). |
| queuePosition | uint32 | Position in the global restore queue. |
| notifications | []NotificationStatus | Status of each configured notification target. |
| conditions | []Condition | Standard Kubernetes conditions. |
| overlayClusterName | string | Name of the CNPG Cluster CR for the overlay database. |
| overlayFdwRestore | string | Name of the restore whose schemas are currently imported via FDW. |
| overlayStorageSize | Quantity | Current (possibly ratcheted) storage size of the overlay PVC. |
| overlayPostgresVersion | uint32 | Resolved PG major version used for the overlay cluster. |
PostgresPhysicalRestore is managed by the operator. Each resource represents a single restore operation from a Kopia snapshot. Users should not create these directly. Deleting a Restore drops the restored database and prompts the Replica to create a new Restore immediately.
Spec fields:
| Field | Type | Required | Description |
|---|---|---|---|
| replica | LocalObjectReference | Yes | Reference to the parent PostgresPhysicalReplica. |
| snapshot | string | Yes | Kopia snapshot ID to restore. |
| snapshotSize | Quantity | Yes | Size of the snapshot from Kopia metadata. |
| storageSize | Quantity | Yes | Calculated PVC size (snapshot size × 1.1). |
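For orientation, a Restore created by the operator for the earlier test replica might carry a spec like the sketch below. The resource name, snapshot ID, and sizes are made up for illustration, and the apiVersion is assumed to match the Replica's. Only the field names and the × 1.1 relationship come from the table above.

```yaml
# Created by the operator; shown for reading only, not for applying by hand.
apiVersion: pgro.bes.au/v1alpha1
kind: PostgresPhysicalRestore
metadata:
  namespace: pgro-example
  # Hypothetical name; the operator's naming scheme is not documented here.
  name: test-example
spec:
  replica:
    name: test
  # Hypothetical Kopia snapshot ID.
  snapshot: k0123456789abcdef
  snapshotSize: 100Gi
  # 100Gi × 1.1 = 110Gi
  storageSize: 110Gi
```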
Status fields:
| Field | Type | Description |
|---|---|---|
| phase | Pending \| Restoring \| Ready \| Switching \| Active \| Failed | Current phase of the restore. |
| postgresVersion | string | Detected PostgreSQL major version from the restored data. |
| createdAt | Time | When the restore resource was created. |
| restoredAt | Time | When the restore job completed. |
| activatedAt | Time | When the service switched to this restore. |
| restoreJob | JobStatus | Status of the Kubernetes Job performing the restore (name, phase, completedAt). |
| pvc | string | Name of the PVC holding the restored data. |
| deployment | string | Name of the Deployment running PostgreSQL on the restored data. |
| credentialsSecret | string | Shared credentials secret (owned by parent replica). |
| conditions | []Condition | Standard Kubernetes conditions. |