fix: pin hypershift and clusters-service images to avoid e2e breakage (AROSLSRE-919, AROSLSRE-944) #5371
Pull request overview
This PR updates ARO-HCP component image digests while preventing the automated image bumper from advancing the HyperShift operator image to a known-bad “latest” build (domain shadowing validation regression), by pinning the HyperShift source tag in the image-updater configuration.
Changes:
- Pin `tooling/image-updater/config.yaml` HyperShift `source.tag` to a specific commit SHA to stop auto-bumps from `latest`.
- Bump multiple component image digests in `config/config.yaml` and regenerate the rendered dev WestUS3 configs.
- Refresh ACM/MCE helm chart artifacts/fixtures to align with updated ACM bundle digests and operand image digests.
Reviewed changes
Copilot reviewed 26 out of 26 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| tooling/image-updater/config.yaml | Pins HyperShift image-updater tag to avoid the broken latest image selection. |
| config/config.yaml | Updates default component image digests (non-HyperShift) used across deployments. |
| config/rendered/dev/prow/westus3.yaml | Regenerated rendered config with updated digests for prow dev env. |
| config/rendered/dev/pers/westus3.yaml | Regenerated rendered config with updated digests for pers dev env. |
| config/rendered/dev/perf/westus3.yaml | Regenerated rendered config with updated digests for perf dev env. |
| config/rendered/dev/dev/westus3.yaml | Regenerated rendered config with updated digests for dev env. |
| config/rendered/dev/cspr/westus3.yaml | Regenerated rendered config with updated digests for cspr dev env. |
| config/rendered/dev/ci01/westus3.yaml | Regenerated rendered config with updated digests for ci01 dev env. |
| acm/zz_fixture_TestHelmTemplate_dev_westus3_mgmt_1_mce.yaml | Updated generated helm template fixture output. |
| acm/zz_fixture_TestHelmTemplate_dev_westus3_mgmt_1_mce_crds.yaml | Updated generated CRD fixture output (primarily formatting/line wrapping). |
| acm/zz_fixture_TestHelmTemplate_dev_westus3_mgmt_1_mce_config.yaml | Updated generated policy/CRD fixture output (formatting + content alignment). |
| acm/deploy/helm/multicluster-engine/templates/multicluster-engine-operator.deployment.yaml | Updates operand image digests in the MCE operator deployment template. |
| acm/deploy/helm/multicluster-engine/Chart.yaml | Updates chart sources digest reference for the MCE bundle. |
| acm/deploy/helm/multicluster-engine-crds/templates/multiclusterengines.multicluster.openshift.io.customresourcedefinition.yaml | Updates CRD manifest formatting/content from refreshed bundle. |
| acm/deploy/helm/multicluster-engine-crds/Chart.yaml | Updates chart sources digest reference for the MCE bundle (CRDs chart). |
| acm/deploy/helm/multicluster-engine-config/charts/policy/values.yaml | Updates policy chart image override digests. |
| acm/deploy/helm/multicluster-engine-config/charts/policy/crds/policy.open-cluster-management.io_policysets.yaml | Updates CRD content/formatting (document start + wrapped descriptions). |
| acm/deploy/helm/multicluster-engine-config/charts/policy/crds/policy.open-cluster-management.io_policyautomations.yaml | Updates CRD content/formatting (document start + wrapped descriptions). |
| acm/deploy/helm/multicluster-engine-config/charts/policy/crds/policy.open-cluster-management.io_policies.yaml | Updates CRD content/formatting (document start + wrapped descriptions). |
| acm/deploy/helm/multicluster-engine-config/charts/policy/crds/policy.open-cluster-management.io_placementbindings.yaml | Updates CRD content/formatting (document start + wrapped descriptions). |
| acm/deploy/helm/multicluster-engine-config/charts/policy/crds/apps.open-cluster-management.io_placementrules_crd_v1.yaml | Updates CRD content/formatting (wrapped descriptions). |
| acm/deploy/helm/multicluster-engine-config/charts/policy/crds/agent.open-cluster-management.io_klusterletaddonconfigs_crd.yaml | Adds YAML doc start and adjusts wrapped description formatting. |
| acm/deploy/helm/multicluster-engine-config/charts/policy/charts/grc/templates/grc-policy-addon-role.yaml | Minor formatting-only change (added blank line). |
| acm/deploy/helm/multicluster-engine-config/charts/policy/charts/grc/templates/grc-policy-addon-clusterrole.yaml | Minor formatting-only change (added blank line). |
| acm/deploy/helm/multicluster-engine-config/charts/policy/charts/cluster-lifecycle/templates/klusterlet-addon-role.yaml | Formatting changes to RBAC manifest (including trailing whitespace that needs fixing). |
| acm/deploy/helm/multicluster-engine-config/charts/policy/charts/cluster-lifecycle/templates/klusterlet-addon-role_binding.yaml | Minor formatting-only change (added blank line). |
Force-pushed: 9a1aa99 to 81abe21
/retest
Force-pushed: 81abe21 to 73d75e7
/lgtm This PR contains only image digest bumps copied from the automated bumper PR #5170 (by The only non-digest change is pinning the HyperShift operator tag in
@raelga: you cannot LGTM your own PR.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
/test ci/prow/e2e-parallel
Force-pushed: 73d75e7 to 4384867
/label lgtm This PR contains only image digest bumps copied from the automated bumper PR #5170 (by The only non-digest change is pinning the HyperShift operator tag in
@raelga: The label(s)
/lgtm
…on (AROSLSRE-919)
The automated image bumper PR #5170 has been failing e2e for 2+ weeks. The new HyperShift image (d24af10) includes commit be263214, which adds a webhook that rejects Azure HostedClusters when service hostnames shadow the cluster base domain, breaking all cluster creation. This PR applies all image bumps from PR #5170 except the HyperShift operator and shared-ingress images, which remain pinned to the last known working versions. The image-updater config is also updated to pin the hypershift tag to prevent future auto-bumps past the broken version.
…-944)
The latest CS image (84b200b) includes swift-nic annotation changes that require HyperShift CPO overrides not yet deployed, breaking cluster creation for 4.23+ and 5.0. Pin CS tag to dbb022a (last known working version) in the image-updater config to prevent auto-bumps past the broken version.
Force-pushed: 36b278b to 677342f
The CS digest in config.yaml was bumped to b8a87db (from PR #5170 image bumps) while the image-updater tag was pinned to dbb022a. This mismatch would cause the image-updater to revert the digest on next run. Align both to dbb022a (the last known working CS version).
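The mismatch described in this commit message can be caught before the image-updater runs. Below is a minimal sketch of such a check. It does not reflect the real schema of either config file: the `PinnedImage` model, its field names, and the idea that the deployed digest can be traced back to a source commit are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PinnedImage:
    """One component as seen from both config files (hypothetical model)."""
    name: str
    updater_tag: str      # tag pinned in the image-updater config
    deployed_commit: str  # ASSUMPTION: commit the digest in config.yaml was built from


def find_mismatches(images):
    """Return the names of components whose deployed commit differs from the pinned tag.

    Values are compared by prefix so that a short SHA such as "dbb022a"
    matches its full 40-character form.
    """
    mismatched = []
    for image in images:
        tag, commit = image.updater_tag, image.deployed_commit
        if not (tag.startswith(commit) or commit.startswith(tag)):
            mismatched.append(image.name)
    return mismatched


# The state this commit fixes: tag pinned to dbb022a, digest bumped to b8a87db.
before = [PinnedImage("clusters-service", "dbb022a3dd3f0533ae1c8eebd4e6929ba1ca1ede", "b8a87db")]
print(find_mismatches(before))  # ['clusters-service']
```

After aligning both values to dbb022a, the same call returns an empty list, so the updater has nothing to revert.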
Force-pushed: 6c2d315 to 259491c
Force-pushed: 259491c to a267a86
Update all non-pinned image digests to latest from the automated bumper PR #5170. HyperShift operator and Clusters Service remain pinned to the versions currently on main to avoid the domain shadowing rejection (AROSLSRE-919) and swift-nic breakage (AROSLSRE-944) respectively.
Force-pushed: a267a86 to 9bfee61
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: inbharajmani, raelga, sclarkso
https://redhat.atlassian.net/browse/AROSLSRE-919
https://redhat.atlassian.net/browse/AROSLSRE-944
What
Applies all image digest bumps from PR #5170 except the HyperShift operator and Clusters Service images, which remain pinned to their current working versions.
Also pins both images in `tooling/image-updater/config.yaml` to prevent the automated bumper from picking up the broken versions:
- HyperShift operator: `latest` to `cf2b91fbc02ebe5d3d66515cb2ea3e097290ac13`
- Clusters Service: `latest` to `dbb022a3dd3f0533ae1c8eebd4e6929ba1ca1ede`
The hypershift-shared-ingress image is not pinned; it uses `tagPattern` and is a separate component unaffected by either issue.
Why
HyperShift (AROSLSRE-919): PR #5170 has been failing e2e for 2+ weeks. Root cause: the new HyperShift operator image (commit d24af10) includes be263214 which adds a webhook validation that rejects Azure HostedClusters when service hostnames shadow the cluster base domain.
Kusto logs show the klusterlet-agent repeatedly failing with:
This blocks all cluster creation (43/74 tests fail with timeout). Meanwhile, PR #5368 (code-only, same base SHA) passes e2e — confirming the environment is healthy and the issue is the image bump.
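To make the failure mode concrete, here is a rough sketch of what a domain shadowing check could look like. It is not the HyperShift webhook's implementation. The function name and the rule itself (a hostname "shadows" the base domain when it equals it or is a subdomain of it) are assumptions; the actual validation in be263214 may differ.

```python
def shadows_base_domain(hostname: str, base_domain: str) -> bool:
    """Return True if hostname equals base_domain or is a subdomain of it.

    ASSUMPTION: this is one plausible reading of "shadowing"; the real
    webhook may apply a different rule.
    """
    # Normalize case and drop a trailing root dot before comparing.
    host = hostname.strip().rstrip(".").lower()
    base = base_domain.strip().rstrip(".").lower()
    if not host or not base:
        return False
    # Match on a label boundary so "notexample.test" does not match "example.test".
    return host == base or host.endswith("." + base)


# Hypothetical hostnames, for illustration only.
print(shadows_base_domain("api.example.test", "example.test"))     # True
print(shadows_base_domain("api.notexample.test", "example.test"))  # False
```

Under this reading, any service hostname placed beneath the cluster base domain would be rejected at admission, which is consistent with cluster creation failing across the board rather than in isolated tests.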
Clusters Service (AROSLSRE-944): The CS bump PR #5348 (commit 84b200b) includes swift-nic annotation changes (ARO-27209) that require HyperShift CPO overrides (openshift/hypershift#8552) not yet backported and deployed. Without the CPO fix, Kubernetes rejects router pods because limits are required for non-overcommittable resources (aro.openshift.io/swift-nic).
Testing
- `make verify-yamlfmt`: passes
- `make -C config detect-change`: passes (no drift)
- `make -C acm helm-charts`: ACM charts regenerated from new bundle digests
Special notes for your reviewer
- HyperShift pin: restore `tag: "latest"` once the domain shadowing validation is fixed. Tracked in AROSLSRE-921.
- Clusters Service pin: restore `tag: "latest"` once HyperShift CPO swift-nic overrides are backported to 4.20+ and deployed to CSPR. Tracked in AROSLSRE-946.
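For reviewers less familiar with the Clusters Service failure described under Why: Kubernetes does not allow extended resources to be overcommitted, so a container that requests one must also set an equal limit. The sketch below mimics that rule in simplified form. The function names are illustrative, the real API server validation covers more cases (for example hugepages), and the `"1"` quantity is a made-up value.

```python
def is_extended_resource(name: str) -> bool:
    """Simplified test: domain-qualified names outside kubernetes.io/ are extended."""
    return "/" in name and not name.startswith("kubernetes.io/")


def validate_container_resources(requests: dict, limits: dict) -> list:
    """Return validation errors for extended resources in one container spec."""
    errors = []
    for name, requested in requests.items():
        if not is_extended_resource(name):
            continue  # cpu, memory, etc. may be overcommitted
        if name not in limits:
            errors.append(f"{name}: limit must be set for non-overcommittable resource")
        elif limits[name] != requested:
            errors.append(f"{name}: request {requested} must equal limit {limits[name]}")
    return errors


# A router pod that requests swift-nic without a limit (hypothetical quantities).
errors = validate_container_resources(
    requests={"cpu": "100m", "aro.openshift.io/swift-nic": "1"},
    limits={},
)
print(errors)
```

This is why the CPO overrides matter here: until they set a matching limit for `aro.openshift.io/swift-nic`, router pods carrying only the request are rejected.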