Production deployment — PostgreDataMigrationApp

Companion to AZURE_DEPLOY.md (the Dev guide). This file covers the Prod variant: Azure Database for PostgreSQL Flexible Server with HA + backups, private networking, Container Apps Job for managed scheduling, remote Terraform state, audit logging, and a Dev → Prod data migration path.

Pre-read: Make sure your Dev environment is working end-to-end first. The Prod IaC follows the same shape — provisioning unfamiliar production Azure resources is the wrong time to debug Docker.

What's different vs Dev

Layer	Dev	Prod
PG hosting	Container Apps + postgres:16-alpine + Azure Files	Azure Database for PostgreSQL Flexible Server (managed)
PG SLA	None (single container)	99.9% / 99.95% / 99.99% (HA Disabled / SameZone / ZoneRedundant)
Backups	None	PITR 7-35 days, optional geo-redundant
Networking	Public ACR + public KV	Private VNet + private endpoints for PG, ACR, KV
ACR SKU	Basic ($7/mo)	Premium ($70/mo) — content trust, retention policy, private link
Compute	ACI per-run	Container Apps Job in the same VNet (managed scheduling)
Image build	Docker on GitHub runner	ACR Tasks remote build (works with private ACR)
State	Local or single Azure blob	Remote state with state locking required
Audit	Container logs only	PG/KV/ACR diagnostic settings → Log Analytics (90 days)
CI/CD gate	None	GitHub `production` Environment with required reviewers
Destroy guard	One click	Typed `PROD-DESTROY-CONFIRM` + reviewer approval

Cost estimate (australiaeast)

Component	Default	With HA ZoneRedundant + geo-backups
PG Flexible Server (GP_Standard_D2s_v3, 32 GB)	~AUD 180/mo	~AUD 280/mo
PG backups (14 days)	~AUD 5/mo	~AUD 7/mo (geo)
ACR Premium	~AUD 70/mo	same
VNet + private endpoints (3)	~AUD 25/mo	same
Key Vault Premium	~AUD 1/mo	same
Log Analytics (90 d, ~10 GB/mo)	~AUD 15/mo	same
Container Apps env + Job (idle)	~AUD 5/mo	same
Total	~AUD 300/mo	~AUD 400/mo

Cost dials in variables.tf:

pg_sku — drop to B1ms for non-production proofs (~AUD 25/mo, no SLA)
ha_mode — Disabled saves ~AUD 100/mo; raise only when you're ready
geo_redundant_backups — adds ~30% to backup cost, enables cross-region restore
backup_retention_days — 7 minimum, 35 maximum, linear cost increase

One-time setup

1. Create the Terraform state storage

Production state MUST live in remote Azure storage with state locking.

# Replace REGION and PROD_SUB with your values.
REGION="australiaeast"
SUFFIX=$(openssl rand -hex 3)

az group create --name rg-tfstate --location $REGION
az storage account create \
    --name "sttfstateprod$SUFFIX" \
    --resource-group rg-tfstate \
    --location $REGION \
    --sku Standard_LRS \
    --encryption-services blob \
    --min-tls-version TLS1_2
az storage container create \
    --account-name "sttfstateprod$SUFFIX" \
    --name tfstate \
    --auth-mode login

echo "Now uncomment the backend block in infra/terraform-prod/main.tf"
echo "Set storage_account_name = 'sttfstateprod$SUFFIX'"

Edit infra/terraform-prod/main.tf and uncomment the backend "azurerm" block, replacing the placeholder values.

2. Create a Prod-only service principal with OIDC

Use a SEPARATE SP from Dev. Scope it tightly.

APP_NAME="sp-te-github-actions-prod"
REPO="YOUR_GITHUB_ORG/Migration-using-ai"

APP_ID=$(az ad app create --display-name "$APP_NAME" --query appId -o tsv)
az ad sp create --id "$APP_ID"

# Grant Contributor scoped to a future RG (replace name after first apply
# OR grant on subscription temporarily, then narrow scope).
SUB_ID=$(az account show --query id -o tsv)
az role assignment create \
    --assignee "$APP_ID" \
    --role Contributor \
    --scope "/subscriptions/$SUB_ID"

# Required additional role to assign roles to the runner identity:
az role assignment create \
    --assignee "$APP_ID" \
    --role "User Access Administrator" \
    --scope "/subscriptions/$SUB_ID"

# Federated credentials for both main branch and PR validation
az ad app federated-credential create \
    --id "$APP_ID" \
    --parameters "{
        \"name\": \"github-main\",
        \"issuer\": \"https://token.actions.githubusercontent.com\",
        \"subject\": \"repo:$REPO:environment:production\",
        \"audiences\": [\"api://AzureADTokenExchange\"]
    }"

echo "AZURE_PROD_CLIENT_ID=$APP_ID"
echo "AZURE_TENANT_ID=$(az account show --query tenantId -o tsv)"
echo "AZURE_PROD_SUBSCRIPTION_ID=$SUB_ID"

Note the federated credential uses environment:production (not a branch) — this binds the SP to GitHub Environment approval.

3. Configure GitHub Environment + secrets

Repo -> Settings -> Environments -> New environment: name production.
Add Required reviewers (at least one other person).
Add Deployment branch rule: main only.
In environment secrets, add:
- AZURE_PROD_CLIENT_ID
- AZURE_PROD_SUBSCRIPTION_ID (Tenant ID stays in repo-level secrets — same as Dev.)

Deploy

Step 1 — Provision infrastructure

GitHub -> Actions -> Azure infra — terraform apply (Prod).
Click Run workflow -> action: plan -> Run. Wait for an approving reviewer.
Review the plan output. Re-run with action: apply once reviewed.
~10-15 minutes (PG Flex provisioning + private endpoints are slow).

Step 2 — First migration run

GitHub -> Actions -> Azure migration — build, push, run (Prod).
Click Run workflow:
- action: deploy (start with just the schema)
- change_ticket: e.g. CHG-2026-001 (logged in run summary)
Wait for reviewer approval.
The workflow:
- Builds the image via az acr build (remote build inside Azure VNet)
- Updates the Container Apps Job image tag
- Triggers a single execution of the Job
- Streams logs from Log Analytics
- Exits 0 only if the Job status is Succeeded

Step 3 — Verify

# Get the RG name
RG=$(az group list --tag environment=prod \
       --query "[?starts_with(name, 'rg-te-prod-')].name | [0]" -o tsv)

# Check job execution history
az containerapp job execution list \
    --resource-group "$RG" \
    --name job-migration-prod \
    -o table

# PG sanity check from any VM in the VNet:
psql "host=<pg_fqdn> port=5432 user=pgadmin dbname=te_prod sslmode=require"
\dt te_prod.*

Connecting to private resources

PG, ACR, and KV are all private-endpoint-only. To reach them from your laptop:

Quickest — Azure Cloud Shell (already in the tenant, can reach private endpoints with peering)
Standard — deploy a small VM in snet-endpoints, SSH in, then psql / az from there
Best for teams — Azure Bastion to a Linux jump-box in the VNet
Persistent — Site-to-site VPN or ExpressRoute (use existing corporate connectivity)

Add the jump-box to main.tf if you want it as code; the VNet outputs the subnet IDs you need.

Dev → Prod data migration

When you've validated everything in Dev and want to push the schema + some seed data to Prod for the first time:

Option A — Schema only (recommended)

# 1. Dump just the schema from Dev (no data)
pg_dump --schema-only --no-owner --no-privileges \
        "host=<dev_pg> user=postgres dbname=te_dev" \
        > te_schema.sql

# 2. Apply to Prod from a VM inside the Prod VNet
psql "host=<prod_pg> user=pgadmin dbname=te_prod sslmode=require" \
     -v ON_ERROR_STOP=1 -f te_schema.sql

# 3. Trigger the prod migration workflow with action=load to populate
#    via the framework's normal load_input_data.sql path.

Option B — Schema + reference data

# 1. Dump schema + only the reference tables (not transactional data)
pg_dump --schema-only "..." > schema.sql
pg_dump --data-only --table=te_dev.organisations \
        --table=te_dev.programs \
        --table=te_dev.classifications \
        "..." > reference.sql

# 2. Rename te_dev schema references to te_prod in both files
sed -i 's/te_dev/te_prod/g' schema.sql reference.sql

# 3. Apply both, in order
psql "..." -v ON_ERROR_STOP=1 -f schema.sql
psql "..." -v ON_ERROR_STOP=1 -f reference.sql

Option C — Full Dev clone (rarely the right move)

For early prod proof-of-concept only. Real prod should have its own data.

pg_dump "host=<dev_pg> ..." > full_dev.sql
sed -i 's/te_dev/te_prod/g' full_dev.sql
psql "host=<prod_pg> ..." -f full_dev.sql

Backup and restore

PG Flex backups run automatically (PITR). To restore:

# 1. Take note of the point in time you want to restore to
# 2. Create a new server from the backup
az postgres flexible-server restore \
    --resource-group "$RG" \
    --name "psql-te-prod-restored-$(date +%Y%m%d)" \
    --source-server psql-te-prod-<suffix> \
    --restore-time "2026-06-15T14:30:00Z"

# 3. Update DNS / connection strings to point at the new server
# 4. (Optional) Delete the original server once you're sure

If geo_redundant_backups = true, you can also --restore-time in a paired region for region-failure DR.

Rollback

The cleanest rollback for a bad migration:

Code-level — re-run the workflow with the previous image tag:

Action: deploy
Image tag: <previous-7-char-sha>
Change ticket: ROLLBACK-CHG-<original>

Data-level — PITR restore to a moment before the change.
Infra-level — terraform apply on the previous commit reverts IaC changes. Use terraform refresh first to capture drift.

For schema changes, the framework's ON CONFLICT DO NOTHING discipline means re-running an older deploy is generally safe. New columns added by the bad migration will linger until you DROP them explicitly.

Monitoring

Prod ships logs/metrics to Log Analytics for:

PG queries: AzureDiagnostics | where Category == "PostgreSQLLogs"
PG sessions: AzureDiagnostics | where Category == "PostgreSQLFlexSessions"
KV access: AzureDiagnostics | where Category == "AuditEvent"
ACR pulls/pushes: ContainerRegistryRepositoryEvents | order by TimeGenerated desc
Job runs: ContainerAppConsoleLogs_CL | where ContainerJobName_s == "job-migration-prod"

Suggested alerts (add via azurerm_monitor_metric_alert):

PG CPU > 80% for 15 minutes
PG storage > 85%
PG failed connections > 10/min
Migration job failed
KV secret read failure

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Production deployment — PostgreDataMigrationApp

What's different vs Dev

Cost estimate (australiaeast)

One-time setup

1. Create the Terraform state storage

2. Create a Prod-only service principal with OIDC

3. Configure GitHub Environment + secrets

Deploy

Step 1 — Provision infrastructure

Step 2 — First migration run

Step 3 — Verify

Connecting to private resources

Dev → Prod data migration

Option A — Schema only (recommended)

Option B — Schema + reference data

Option C — Full Dev clone (rarely the right move)

Backup and restore

Rollback

Monitoring

Production hardening checklist

See also

FilesExpand file tree

PROD_DEPLOY.md

Latest commit

History

PROD_DEPLOY.md

File metadata and controls

Production deployment — PostgreDataMigrationApp

What's different vs Dev

Cost estimate (australiaeast)

One-time setup

1. Create the Terraform state storage

2. Create a Prod-only service principal with OIDC

3. Configure GitHub Environment + secrets

Deploy

Step 1 — Provision infrastructure

Step 2 — First migration run

Step 3 — Verify

Connecting to private resources

Dev → Prod data migration

Option A — Schema only (recommended)

Option B — Schema + reference data

Option C — Full Dev clone (rarely the right move)

Backup and restore

Rollback

Monitoring

Production hardening checklist

See also