Draft
19 commits
b27ea85
docs: add enterprise disaster recovery guidance [EDU-789]
llewellyn-sl Apr 7, 2026
cb4a8ed
EDU-789: add platform disaster recovery docs
llewellyn-sl Apr 7, 2026
9545e39
EDU-789: add platform disaster recovery docs
llewellyn-sl Apr 7, 2026
ed19fe9
EDU-789: add platform disaster recovery docs
llewellyn-sl Apr 7, 2026
b1df8b7
EDU-789: add platform disaster recovery docs
llewellyn-sl Apr 7, 2026
7ce472b
EDU-789: add platform disaster recovery docs
llewellyn-sl Apr 7, 2026
655f3ad
EDU-789: add platform disaster recovery docs
llewellyn-sl Apr 7, 2026
9f6fd77
EDU-789: add platform disaster recovery docs
llewellyn-sl Apr 7, 2026
90805ac
EDU-789: add platform disaster recovery docs
llewellyn-sl Apr 7, 2026
3b2ccdd
EDU-789: add platform disaster recovery docs
llewellyn-sl Apr 7, 2026
2d6065a
EDU-789: add platform disaster recovery docs
llewellyn-sl Apr 7, 2026
f6d9f92
EDU-789: add platform disaster recovery docs
llewellyn-sl Apr 7, 2026
54c6581
EDU-789: add platform disaster recovery docs
llewellyn-sl Apr 7, 2026
102eb43
EDU-789: add platform disaster recovery docs
llewellyn-sl Apr 7, 2026
8859eb4
EDU-789: add platform disaster recovery docs
llewellyn-sl Apr 7, 2026
0e80304
Merge branch 'master' into EDU-789-docs-draft
justinegeffen Apr 16, 2026
c4fd00f
Merge branch 'master' into EDU-789-docs-draft
justinegeffen Apr 20, 2026
f0e55f1
Merge branch 'master' into EDU-789-docs-draft
justinegeffen Apr 21, 2026
492b814
Merge branch 'master' into EDU-789-docs-draft
justinegeffen Apr 30, 2026
3 changes: 2 additions & 1 deletion platform-enterprise_docs/enterprise-sidebar.json
@@ -21,7 +21,8 @@
 "items": [
 "enterprise/platform-helm",
 "enterprise/platform-kubernetes",
-"enterprise/platform-docker-compose"
+"enterprise/platform-docker-compose",
+"enterprise/disaster-recovery"
 ]
 },
 {
91 changes: 91 additions & 0 deletions platform-enterprise_docs/enterprise/disaster-recovery.md
@@ -0,0 +1,91 @@
---
title: "Platform disaster recovery"
description: Plan backup, restore, and recovery steps for Seqera Platform Enterprise deployments
date created: "2026-04-07"
tags: [installation, deployment, disaster recovery, backup, restore]
---

Use this guide to define a disaster recovery (DR) plan for Seqera Platform Enterprise before you need to restore service after an infrastructure loss or a region-level incident.

Seqera Platform does not create a DR plan for you. Your recovery procedure depends on the infrastructure that hosts Platform, your database and Redis services, your container registry access, and the backup capabilities offered by your cloud provider or platform team.

## What to protect

Back up and document the parts of your deployment that you will need to rebuild Platform:

- The Platform SQL database and its restore procedure.

Review comment (Contributor): Open – Do you also need backups for Redis?

Reply (Contributor): I think we only use it as a cache, but it would be good to confirm.

- Your Platform configuration, including `tower.env`, `tower.yml`, Helm values, Kubernetes manifests, or `docker-compose.yml`.
- Your `TOWER_CRYPTO_SECRETKEY` value and any rotation-related keys. Existing encrypted secrets in the Platform database cannot be decrypted without the correct key material.
- TLS certificates, identity provider settings, registry credentials, and any other secrets required to start Platform.
- The storage locations and infrastructure dependencies referenced by your Platform deployment, such as load balancers, DNS records, persistent volumes, and mirrored container images.

:::warning
Back up your Platform database before changing the crypto secret key or running key rotation. For more information, see [Configuration overview](./configuration/overview#secret-key-rotation).
:::

## Define recovery targets

Document the following targets with your operations team:

- Recovery point objective (RPO): how much recent Platform state you can afford to lose.
- Recovery time objective (RTO): how long Platform can remain unavailable.
- Recovery owner: who can restore the database, recreate infrastructure, and validate the application.

Your deployment model directly affects these targets:

- Kubernetes and Helm deployments are typically faster to rebuild on new infrastructure than Docker Compose deployments, especially when Platform runs with external managed database and Redis services, because the cluster itself then holds little state.
- Docker Compose deployments are single-instance by design. Restoring them normally requires application downtime while the host, configuration, and backing services are rebuilt.
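
To make the RPO target concrete, it can help to check it against your actual backup schedule. The following Python sketch is an illustration only: the function names, the sample timestamps, and the 4-hour target are hypothetical and are not part of Seqera Platform. It computes the worst-case data loss implied by a set of backup times and compares it with an RPO.

```python
from datetime import datetime, timedelta


def worst_case_data_loss(backup_times, now):
    """Return the longest window in which a failure would lose data.

    The window is the largest gap between consecutive backups, or the
    time elapsed since the most recent backup, whichever is longer.
    """
    if not backup_times:
        raise ValueError("no backups recorded")
    ordered = sorted(backup_times)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    gaps.append(now - ordered[-1])
    return max(gaps)


def meets_rpo(backup_times, now, rpo):
    """Return True when no gap between backups exceeds the RPO."""
    return worst_case_data_loss(backup_times, now) <= rpo


# Hypothetical schedule: a backup every 6 hours, checked against a 4-hour RPO.
backups = [datetime(2026, 4, 7, hour) for hour in (0, 6, 12)]
now = datetime(2026, 4, 7, 14)
print(worst_case_data_loss(backups, now))
print(meets_rpo(backups, now, timedelta(hours=4)))
```

In this example the schedule does not meet the target, which would suggest either more frequent backups or a relaxed RPO.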

## Recommended backup strategy

At minimum, maintain:

1. Regular database backups or snapshots for the SQL database used by Platform.
2. Version-controlled copies of your deployment manifests and configuration overrides.
3. A secure copy of the active crypto secret key and any required supporting secrets.
4. A written restore runbook that includes DNS, ingress, load balancer, and certificate steps.

For production environments, use the backup and replication features provided by your infrastructure:

- Managed SQL backups, snapshots, and cross-region replicas where required by your RPO and RTO.
- Backups for any persistent volumes or host-attached storage used by your deployment.
- Registry mirroring for Platform images if your environment cannot rely on direct access to `cr.seqera.io` during recovery.
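
A backup is only useful if it is complete. As a minimal sketch, assuming you copy your configuration files into a single backup directory (a layout this guide does not prescribe), the following Python function reports which expected files are missing or empty. The file names come from the configuration list in this guide; the function name and the directory layout are hypothetical.

```python
from pathlib import Path

# File names taken from this guide; adjust the tuple to your deployment model.
REQUIRED_CONFIG_FILES = ("tower.env", "tower.yml", "docker-compose.yml")


def missing_backup_items(backup_dir, required=REQUIRED_CONFIG_FILES):
    """Return the required files that are absent or empty in backup_dir."""
    root = Path(backup_dir)
    missing = []
    for name in required:
        candidate = root / name
        # An empty file usually means the copy step failed part-way.
        if not candidate.is_file() or candidate.stat().st_size == 0:
            missing.append(name)
    return missing
```

Calling `missing_backup_items("/path/to/backup")` returns an empty list when every file is present. For Helm or Kubernetes deployments, you would pass your own values and manifest file names through the `required` argument. This check does not cover the database backup or the crypto secret key, which need their own verification.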

## Recovery workflow

### Kubernetes or Helm deployments

1. Recreate or fail over the Kubernetes cluster and its supporting infrastructure.
2. Restore access to the SQL database, Redis service, secrets, ingress, and DNS records.
3. Restore the SQL database from the selected backup or snapshot.
4. Reapply your Helm values or Kubernetes manifests so that Platform starts against the restored database.
5. Confirm that Platform starts with the same crypto secret key used to encrypt the existing database contents.
6. Validate login, workspace access, and workflow launch behavior.
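
One way to approach the crypto secret key check in the workflow above is to record a fingerprint of the key at backup time and compare it during recovery. This is a sketch under stated assumptions, not a Platform feature: the fingerprint workflow and the function names are hypothetical, and it assumes a long, randomly generated key, because a fingerprint of a short or guessable key could be brute-forced.

```python
import hashlib
import hmac


def key_fingerprint(secret_key: str) -> str:
    """Return the SHA-256 hex digest of the key, for comparison purposes."""
    return hashlib.sha256(secret_key.encode("utf-8")).hexdigest()


def key_matches(secret_key: str, recorded_fingerprint: str) -> bool:
    """Compare the restored key against a recorded fingerprint in constant time."""
    return hmac.compare_digest(key_fingerprint(secret_key), recorded_fingerprint)


# Hypothetical values for illustration only; never hard-code a real key.
recorded = key_fingerprint("example-key-from-backup-time")
print(key_matches("example-key-from-backup-time", recorded))
print(key_matches("a-different-key", recorded))
```

In practice you would read the key from the restored `TOWER_CRYPTO_SECRETKEY` value and keep the recorded fingerprint in your restore runbook, stored separately from the key itself.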

### Docker Compose deployments

1. Provision a replacement host or recover the existing host.
2. Restore `tower.env`, `tower.yml`, `docker-compose.yml`, certificates, and secret material.
3. Restore or recreate the external SQL database and Redis service used by Platform.
4. Start Platform with `docker compose up` and allow migrations and startup checks to finish.
5. Validate login, workspace access, and workflow launch behavior before switching traffic back.

## Validation checklist

Test your DR plan on a schedule that matches your organization's risk requirements. During each exercise, confirm that you can:

- Restore the database from a recent backup.
- Start Platform with the correct crypto secret key and configuration.
- Reach the frontend through the expected DNS and TLS path.
- Log in and access organizations, workspaces, and compute environments.
- Launch a small workflow to verify end-to-end operation.

The [Test deployment](./testing) guide provides a simple post-recovery smoke test you can adapt for DR exercises.
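
The DNS and TLS item in the checklist above lends itself to automation. The following Python sketch uses only the standard library and is an illustration rather than an official test: the function names are hypothetical, and you supply the base URL and paths for your own deployment. It requests each path over HTTPS with certificate verification and records whether the response status is 200.

```python
import ssl
import urllib.request


def fetch_status(url, timeout=10):
    """Return the HTTP status code for url, verifying the TLS certificate."""
    context = ssl.create_default_context()
    with urllib.request.urlopen(url, timeout=timeout, context=context) as response:
        return response.status


def run_smoke_checks(base_url, paths, fetch=fetch_status):
    """Return a {path: passed} mapping for each path under base_url."""
    results = {}
    for path in paths:
        url = base_url.rstrip("/") + path
        try:
            results[path] = fetch(url) == 200
        except OSError:
            # DNS failures, refused connections, TLS errors, and HTTP errors
            # (urllib.error.URLError subclasses OSError) all count as failed.
            results[path] = False
    return results
```

For example, `run_smoke_checks("https://platform.example.com", ["/"])` (a placeholder hostname) returns `{"/": True}` when the frontend responds through the expected DNS and TLS path. Login, workspace access, and workflow launch still need a manual or API-driven check.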

## Related guides

- [Platform installation overview](./install-platform)
- [Platform: Helm](./platform-helm)
- [Platform: Kubernetes](./platform-kubernetes)
- [Platform: Docker Compose](./platform-docker-compose)
- [Test deployment](./testing)
2 changes: 2 additions & 0 deletions platform-enterprise_docs/enterprise/install-platform.md
@@ -18,6 +18,8 @@ Seqera Platform Enterprise can be deployed using Docker Compose, Kubernetes, or

See each deployment guide for detailed requirements.

For backup, restore, and recovery planning, see [Platform disaster recovery](./disaster-recovery).

## Prerequisites

:::info
@@ -119,3 +119,7 @@ Seqera Platform offers a service that optimizes pipeline resource requests. Refe
:::note
Studios is available from Seqera Platform v24.1. If you experience any problems during the deployment process please contact your account executive. Studios in Enterprise is not installed by default.
:::

## Disaster recovery planning

Docker Compose deployments are single-instance by design, so recovery normally requires service downtime while you restore the host, configuration, and backing services. For backup, restore, and validation guidance, see [Platform disaster recovery](./disaster-recovery).
4 changes: 4 additions & 0 deletions platform-enterprise_docs/enterprise/platform-helm.md
@@ -58,6 +58,10 @@ helm upgrade my-release oci://public.cr.seqera.io/charts/platform \
--values my-values.yaml
```

## Disaster recovery planning

Define your backup, restore, and validation procedure before promoting a Helm deployment to production. For DR guidance, including database backups, crypto key handling, and post-restore checks, see [Platform disaster recovery](./disaster-recovery).

## Uninstalling the Helm chart

To uninstall the Seqera Platform Enterprise Helm chart, run the following command, replacing `my-release` and `my-namespace` with your release name and namespace:
2 changes: 2 additions & 0 deletions platform-enterprise_docs/enterprise/platform-kubernetes.md
@@ -206,6 +206,8 @@ To configure Seqera Enterprise for high availability, note that:
- The `cron` service may only have a single instance
- The `groundswell` service may only have a single instance

For backup, restore, and validation planning, see [Platform disaster recovery](./disaster-recovery).

[aws-configure-ingress]: https://kubernetes-sigs.github.io/aws-load-balancer-controller/v2.2/guide/ingress/annotations/
[azure-configure-ingress]: https://docs.microsoft.com/en-us/azure/application-gateway/ingress-controller-annotations
[google-configure-ingress]: https://cloud.google.com/kubernetes-engine/docs/concepts/ingress
@@ -2,7 +2,7 @@
 title: "Production checklist"
 description: "A pre-production checklist for Seqera Platform."
 date created: "2025-07-03"
-last updated: "2026-03-25"
+last updated: "2026-04-07"
 tags: [production, checklist, deployment, limitations, retry]
 ---

@@ -83,6 +83,17 @@ Do not rotate credentials during active pipeline runs. Schedule rotations during

Use [Pipeline Secrets](../secrets/overview) to manage sensitive values such as API keys for third-party services. Secrets are injected at runtime and are not exposed in pipeline logs or configuration files.

## Disaster recovery planning

Teams often discover gaps in disaster recovery planning only when they are asked to prepare for an audit or simulation exercise. Before go-live:

- Define your recovery time objective (RTO) and recovery point objective (RPO).
- Decide whether your DR scenario assumes in-place recovery or full account recreation.
- Verify that you back up the Seqera database, deployment configuration, secrets, TLS assets, and external dependency configuration on a schedule that matches your RPO.
- Run at least one recovery drill in a non-production environment and record the real recovery time and manual steps required.

See [Platform disaster recovery](../enterprise/disaster-recovery) for a deployment-focused recovery planning guide.

## Compute environment permissions

Permissions within shared compute environments are a frequent source of unexpected behavior, particularly when multiple teams use the same workspace.