Archives private Azure Container Registry (ACR) images into docker load-compatible .tar.gz files and uploads them to Azure Blob Storage Archive tier.
This project is designed to run as an Azure Container Apps Job triggered by Azure Queue Storage backlog. Each execution processes exactly one image.
This repo assumes all of the following already exist:
-
An Azure Container Apps managed environment in the target resource group.
-
A storage account that will be used for:
- the queue of work items,
- the blob container that stores archives and manifests,
- the Azure Files share used as scratch space.
-
An Azure Queue Storage queue for work items.
-
An Azure Blob container for archive output.
-
An Azure Files share mounted into the ACA Job as scratch storage.
-
A source Azure Container Registry containing the images to archive.
-
A deployable worker image published to a registry that ACA can pull from.
This worker also assumes:
- Queue messages are JSON with explicit destination naming.
- Each message includes a fully qualified image reference with an explicit tag.
- The output artifact must remain
docker loadcompatible. - Scratch storage is required because the current implementation writes a temporary uncompressed
docker-archivetar before gzip compression.
For each queue message, the worker:
- Logs into Azure using managed identity.
- Claims exactly one queue message using a visibility timeout.
- Reads the source image reference and output blob name from the message.
- Authenticates to ACR using
az acr login --expose-token. - Inspects the image with
skopeoto capture digest metadata. - Copies the image to
docker-archiveformat. - Compresses the archive with
pigz. - Uploads the
.tar.gzto Blob Storage. - Uploads a sidecar manifest JSON.
- Deletes the queue message only after successful uploads.
If the worker crashes before deleting the queue message, Azure Queue Storage makes the message visible again after the visibility timeout expires.
{
"image": "transitionmonitordockerregistry.azurecr.io/hello-pacta:latest",
"blobContainer": "images",
"blobBaseName": "hello-pacta--latest.tar.gz"
}Notes:
-
imagemust include an explicit tag. -
blobBaseNameintentionally uses--instead of:. -
This message produces:
images/hello-pacta--latest.tar.gzimages/hello-pacta--latest.tar.gz.json
Example sidecar manifest:
{
"image": "transitionmonitordockerregistry.azurecr.io/hello-pacta:latest",
"archiveBlob": "images/hello-pacta--latest.tar.gz",
"createdAtUtc": "2026-04-08T12:34:56Z",
"mediaType": "docker-archive+gzip",
"compression": {
"format": "gzip",
"level": 1
},
"source": {
"registry": "transitionmonitordockerregistry.azurecr.io",
"repository": "hello-pacta",
"tag": "latest",
"digest": "sha256:..."
}
}STORAGE_ACCOUNT=<storage-account-name>
SCRATCH_DIR=/mnt/scratch
PIGZ_LEVEL=1
AZURE_STORAGE_AUTH_MODE=loginQUEUE_ACCOUNT=<storage-account-name>
QUEUE_NAME=<queue-name>
QUEUE_VISIBILITY_TIMEOUT_SECONDS=3600AZURE_CLIENT_ID=<user-assigned-managed-identity-client-id>If AZURE_CLIENT_ID is omitted, the worker uses the job's system-assigned managed identity.
This project supports two authentication models for ACR:
-
Assign the ACA Job identity:
AcrPullon the source ACR
-
The worker uses:
az acr login --expose-token
If you are using an ACR token + password, you do not need RBAC on the registry.
Instead, pass credentials to the worker and use them with skopeo.
Required environment variables:
ACR_USERNAME=<token-name>
ACR_PASSWORD=<token-password>And the worker should use:
--src-creds "$ACR_USERNAME:$ACR_PASSWORD"Notes:
- This bypasses
az acr login --expose-token - Useful when you want scoped access without granting RBAC
- Recommended for tightly controlled, read-only access
Regardless of ACR auth method, storage access still uses managed identity:
Required roles on the storage account:
Storage Queue Data ContributorStorage Blob Data Contributor
Azure Container Apps supports mounting Azure Files into the environment, but the storage registration currently requires a storage account key. Microsoft explicitly states that Container Apps does not support identity-based access to Azure file shares for this mount path. If key access is disabled on your storage account, the Azure Files mount cannot be configured the normal ACA way.
That means your current storage settings create a blocker for the scratch mount approach.
Example:
docker build -t ghcr.io/rmi/workflow-acr-cold-archive:pr-2 .
docker push ghcr.io/rmi/workflow-acr-cold-archive:pr-2You need:
- queue:
docker-transfer - blob container:
images - Azure Files share:
transfer-scratch
The deployment template needs these values:
containerAppsEnvironmentResourceIdimagestorageAccountNamequeueNamefileShareName- image registry settings if the worker image is private
Your current values are:
- environment:
pacta-test - image:
ghcr.io/rmi/workflow-acr-cold-archive:pr-2 - storage account:
pactadockerimages - queue:
docker-transfer - fileshare:
transfer-scratch
After the ACA Job exists, grant its identity access to:
- source ACR:
AcrPull - storage account:
Storage Queue Data Contributor - storage account:
Storage Blob Data Contributor
Example:
{
"image": "transitionmonitordockerregistry.azurecr.io/hello-pacta:latest",
"blobContainer": "images",
"blobBaseName": "hello-pacta--latest.tar.gz"
}Confirm that:
- one execution claims one message,
- the worker uploads
.tar.gz, - the worker uploads
.json, - the queue message is deleted only after success.
Local file-mode testing is still supported with:
export STORAGE_ACCOUNT=<storage-account-name>
export QUEUE_MESSAGE_FILE=examples/message.json
./archive_worker.shBut cloud-first testing is the preferred path for this project.
docker load < hello-pacta--latest.tar.gzFor this job, start with a system-assigned managed identity on the ACA Job.
Reasons:
- It is the fastest path to get running.
- ACA Jobs support system-assigned and user-assigned identities.
- The same identity can be used for queue scaling and for the worker's Azure authentication.
- You do not need to create a separate identity resource unless you want to reuse the identity across multiple jobs or apps.
Use a user-assigned identity later only if you want shared lifecycle or reuse across multiple resources.
Because your storage account currently has key access disabled, you will need to resolve scratch storage before the ACA deployment can mount Azure Files. The simplest options are:
- use a different storage account for the Azure Files scratch share where key-based mount is allowed, or
- use another execution platform for the scratch-heavy step.
Given ACA's current storage mount behavior, option 1 is the cleanest path if you want to keep ACA Jobs.
az deployment group create \
--resource-group RMI-SP-PACTA-WEU-PAT-DEV \
--template-file aca-job-deploy.bicep \
--parameters \
containerAppsEnvironmentResourceId='/subscriptions/feef729b-4584-44af-a0f9-4827075512f9/resourceGroups/RMI-SP-PACTA-WEU-PAT-DEV/providers/Microsoft.App/managedEnvironments/pacta-test' \
image='ghcr.io/rmi/workflow-acr-cold-archive:pr-2' \
useSystemAssignedIdentity=false \
userAssignedIdentityResourceId='/subscriptions/feef729b-4584-44af-a0f9-4827075512f9/resourceGroups/RMI-SP-PACTA-WEU-PAT-DEV/providers/Microsoft.ManagedIdentity/userAssignedIdentities/docker-transfer' \
storageAccountName='pactadockerimages' \
fileShareName='docker-transfer-scratch' \
storageAccountKey="$STORAGE_KEY" \
queueName='docker-transfer'./prepare_acr_queue_messages.sh \
--acr-name transitionmonitordockerregistry \
--storage-account pactadockerimages \
--queue-name docker-transfer \
--blob-container images \
--output-file messages.jsonl