charts: set runAsUser/runAsGroup=65532 on container securityContext (6 components) (AROSLSRE-929) #5374
…6 components)

All six ARO-HCP-owned charts that declare runAsNonRoot: true without runAsUser are exposed to the same failure mode that hit kube-applier on 2026-05-24 (CreateContainerConfigError: container has runAsNonRoot and image will run as root, raised when the image OCI config User field is empty on the registry side).

Each Dockerfile already sets USER 65532:65532. Making runAsUser and runAsGroup explicit in the chart removes the dependency on registry-side image metadata being preserved through the buildkit / multi-stage pipeline. Pod admission becomes deterministic regardless of build artefacts.

Components updated:
- admin
- backend
- frontend (via deploy/templates/_helpers.tpl)
- mgmt-agent
- sessiongate
- tooling/aro-hcp-exporter

Fixtures regenerated via UPDATE=true go test in tooling/helmtest.

Refs
- https://issues.redhat.com/browse/AROSLSRE-929 (Story)
- https://issues.redhat.com/browse/AROSLSRE-930 (admin)
- https://issues.redhat.com/browse/AROSLSRE-931 (backend)
- https://issues.redhat.com/browse/AROSLSRE-932 (frontend)
- https://issues.redhat.com/browse/AROSLSRE-933 (mgmt-agent)
- https://issues.redhat.com/browse/AROSLSRE-934 (sessiongate)
- https://issues.redhat.com/browse/AROSLSRE-935 (aro-hcp-exporter)
Pull request overview
This PR hardens ARO-HCP-owned Helm charts by making the non-root UID/GID explicit in container securityContext, preventing kubelet startup failures when an image’s OCI config User field is missing.
Changes:
- Add `runAsUser: 65532` and `runAsGroup: 65532` to container `securityContext` in 6 charts that already had `runAsNonRoot: true`.
- Regenerate Helm template fixtures/testdata to reflect the new rendered securityContext fields.
Reviewed changes
Copilot reviewed 17 out of 17 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| admin/deploy/templates/admin.deployment.yaml | Set container runAsUser/runAsGroup to 65532 alongside runAsNonRoot. |
| admin/zz_fixture_TestHelmTemplate_dev_westus3_svc_1_admin_api.yaml | Regenerated rendered fixture with explicit UID/GID. |
| admin/testdata/zz_fixture_TestHelmTemplate_admin_api_mise_enabled.yaml | Regenerated rendered testdata fixture with explicit UID/GID. |
| backend/deploy/templates/backend.deployment.yaml | Set container runAsUser/runAsGroup to 65532 alongside runAsNonRoot. |
| backend/zz_fixture_TestHelmTemplate_dev_westus3_svc_1_aro_hcp_backend_dev.yaml | Regenerated rendered fixture with explicit UID/GID. |
| backend/testdata/zz_fixture_TestHelmTemplate_backend_mi_mock_and_arm_perms_mgr_identities_unset.yaml | Regenerated rendered testdata fixture with explicit UID/GID. |
| backend/testdata/zz_fixture_TestHelmTemplate_backend_clstr_scoped_identities_role_set_name_public.yaml | Regenerated rendered testdata fixture with explicit UID/GID. |
| frontend/deploy/templates/_helpers.tpl | Add explicit UID/GID to the frontend deployment helper’s container securityContext. |
| frontend/zz_fixture_TestHelmTemplate_dev_westus3_svc_1_aro_hcp_frontend_dev.yaml | Regenerated rendered fixture (both containers show explicit UID/GID). |
| frontend/testdata/zz_fixture_TestHelmTemplate_frontend_mise_enabled.yaml | Regenerated rendered testdata fixture (both containers show explicit UID/GID). |
| frontend/testdata/zz_fixture_TestHelmTemplate_frontend_connect_socket.yaml | Regenerated rendered testdata fixture (both containers show explicit UID/GID). |
| mgmt-agent/deploy/templates/deployment.yaml | Set container runAsUser/runAsGroup to 65532 alongside runAsNonRoot. |
| mgmt-agent/zz_fixture_TestHelmTemplate_dev_westus3_mgmt_1_mgmt_agent.yaml | Regenerated rendered fixture with explicit UID/GID. |
| sessiongate/deploy/templates/deployment.yaml | Set container runAsUser/runAsGroup to 65532 alongside runAsNonRoot. |
| sessiongate/zz_fixture_TestHelmTemplate_dev_westus3_svc_1_sessiongate.yaml | Regenerated rendered fixture with explicit UID/GID. |
| tooling/aro-hcp-exporter/deploy/templates/deployment.yaml | Set container runAsUser/runAsGroup to 65532 alongside runAsNonRoot. |
| dev-infrastructure/zz_fixture_TestHelmTemplate_dev_westus3_svc_1_aro_hcp_exporter.yaml | Regenerated rendered fixture with explicit UID/GID. |

/test e2e-parallel
tuxerrante
left a comment
LGTM — clean, well-scoped security hardening. A few confirmations and a minor follow-up note:
UID 65532 is correct. All 6 Dockerfiles consistently set USER 65532:65532 (the distroless nonroot convention). This is distinct from UID 65534 (nobody, the traditional POSIX system account) — both are valid non-root UIDs but 65532 is the de facto standard for application workloads in distroless/Chainguard base images. Chart values match.
Volume ownership — no concern. Checked all 6 deployments:
- `mgmt-agent`, `sessiongate`, `aro-hcp-exporter`: no volumes mounted at all.
- `backend`: CSI secret store (`readOnly: true` at volume level) + ConfigMap (`readOnly: true` on volumeMount). Both effectively read-only.
- `admin`, `frontend`: only writable mount is `mdsd-asa-run-vol` (hostPath `/var/run/mdsd`), used for connecting to an existing Unix domain socket, not file creation.

No `fsGroup` or ownership fixup needed.
Minor follow-up (not blocking): backend's backend-service-key-vault volumeMount is missing an explicit readOnly: true on the mount spec — the CSI volume definition enforces it at the driver level, but adding it to the volumeMount would be defense-in-depth. Pre-existing, not introduced by this PR.
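The follow-up suggested above could look roughly like the fragment below. This is a sketch only: the volume name `backend-service-key-vault` comes from the review, while the container name and `mountPath` are placeholders and not values taken from the chart.

```yaml
# Hypothetical excerpt of the backend container spec.
# Only the volume name is from the review; everything else is illustrative.
containers:
  - name: example                     # placeholder container name
    volumeMounts:
      - name: backend-service-key-vault
        mountPath: /path/to/secrets   # placeholder mount path
        readOnly: true                # explicit read-only at the mount level
```

Setting `readOnly: true` on the mount does not change what the CSI driver enforces; it only makes the intent visible in the pod spec itself.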
Cross-repo note: ARO-RP has the same runAsNonRoot: true without runAsUser anti-pattern in 2 Gatekeeper static resource YAMLs (pkg/operator/controllers/guardrails/staticresources/gk_*_deployment.yaml). Lower risk since these track upstream Gatekeeper, but worth a preventive follow-up.
[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: raelga, tuxerrante
Addendum to the cross-repo note in my review above: after deeper investigation, no fix is needed in ARO-RP. The 2 Gatekeeper static resource YAMLs with the same |
What
Add `runAsUser: 65532` and `runAsGroup: 65532` to the container `securityContext` of every ARO-HCP-owned chart that currently sets `runAsNonRoot: true` without an explicit user.

Why

On 2026-05-24, the `kube-applier` deployment failed for 72+ minutes on `int-uksouth-mgmt-1` with `CreateContainerConfigError: container has runAsNonRoot and image will run as root` (fix shipped via #5373, AROSLSRE-926). Root cause was the same anti-pattern across the repo: chart says `runAsNonRoot: true`, no `runAsUser`, so kubelet validates against the image OCI config `User` field. When that field is empty on the registry (as it was for `arohcpsvcint.azurecr.io/kube-applier@sha256:9378a76…`, likely a buildkit / multi-stage cache artefact), the pod is rejected.

Audit (AROSLSRE-929 / closed AROSLSRE-927) identified 6 ARO-HCP-owned charts with the same anti-pattern. Each Dockerfile already sets `USER 65532:65532` so the fix is purely additive in the chart and matches established behavior. Making the UID/GID explicit in the pod spec removes the dependency on registry-side metadata being preserved through future buildkit changes.

Components covered

| UID:GID | Chart template |
|---|---|
| `65532:65532` | `admin/deploy/templates/admin.deployment.yaml` |
| `65532:65532` | `backend/deploy/templates/backend.deployment.yaml` |
| `65532:65532` | `frontend/deploy/templates/_helpers.tpl` (define `frontend.deployment`) |
| `65532:65532` | `mgmt-agent/deploy/templates/deployment.yaml` |
| `65532:65532` | `sessiongate/deploy/templates/deployment.yaml` |
| `65532:65532` | `tooling/aro-hcp-exporter/deploy/templates/deployment.yaml` |

The diff is identical in every chart:
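As an illustration of the change described above (a sketch, not the literal diff from this PR), the container-level `securityContext` in each template would end up looking roughly like the following. The container name and the surrounding Deployment structure are placeholders; only `runAsNonRoot`, `runAsUser`, and `runAsGroup` come from the description.

```yaml
# Hypothetical excerpt of a Deployment pod template.
# Only the three securityContext fields are taken from the PR description.
spec:
  template:
    spec:
      containers:
        - name: example            # placeholder container name
          securityContext:
            runAsNonRoot: true     # already present before this change
            runAsUser: 65532       # added: explicit UID, matches Dockerfile USER
            runAsGroup: 65532      # added: explicit GID, matches Dockerfile USER
```

With `runAsUser` set to a non-zero value, the kubelet can check the non-root policy against the pod spec instead of relying on the image's `User` metadata.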
Out of scope
- `kube-applier`: already fixed in #5373 ("kube-applier: set runAsUser/runAsGroup=65532 on container securityContext (AROSLSRE-926)").
- `acrpull`, `observability/prometheus`, `route-monitor-operator`: already declare `runAsUser` in their charts.
- `open-cluster-management` upstream, not patched here.
- `frontend/deploy/templates/frontend.secret-refresher.yaml`: busybox sidecar, no `securityContext` at all (separate hardening item).
- `image-sync/oc-mirror` Dockerfile has no `USER`; only used in CI pipeline shell, not k8s-deployed.
- `User` field disappearing on registry push (buildkit / multi-stage cache). Chart-side fix here is the safest backstop regardless.

Validation

- `cd tooling/helmtest && UPDATE=true go test ./...`: fixtures regenerated for all 6 components (and downstream frontend fixtures that include the helper template)
- `cd tooling/helmtest && go test ./...`: green without UPDATE flag
- `frontend/zz_fixture_TestHelmTemplate_dev_westus3_svc_1_aro_hcp_frontend_dev.yaml`: both containers in the rendered Deployment now carry `runAsUser: 65532` / `runAsGroup: 65532`

Refs