Fix ClusterIssuer priority to prevent selfsigned being tried first #145

Merged: treddy08 merged 1 commit into main from rhacs-fix-issuer-priority-alphabetical-treddy on May 3, 2026

treddy08 (Contributor) commented May 3, 2026

Problem

When ocp4_workload_rhacs_enable_reencrypt_route: true is set, the role filters out Google Trust Services ClusterIssuers, but the priority logic then tries selfsigned before the remaining ACME issuers.

Current Behavior (Broken):

Filtered issuers: ['acme-bifrost-production-ddns-fallback', 'selfsigned']
Priority logic: non-fallback first, fallback last
Result: ['selfsigned', 'acme-bifrost-production-ddns-fallback']

Impact:

  • ❌ Tries selfsigned ClusterIssuer FIRST
  • ❌ Creates self-signed certificate for Central route
  • ❌ Browser shows certificate warnings/errors
  • ❌ Central route cert is broken (untrusted)
  • ✅ Reencrypt route works (has proper CA chain)

Root Cause:

The old priority logic used:

reject('search', 'fallback') + select('search', 'fallback')

This prioritized ALL non-fallback issuers (including selfsigned) before fallback ACME issuers.
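
The ordering produced by the old expression can be reproduced outside Ansible. The following is a minimal sketch in Python, assuming that Ansible's `search` test behaves like `re.search` on each name; the function name `old_priority` is hypothetical and exists only for illustration.

```python
import re


def old_priority(names):
    """Model of the old Jinja2 expression: non-fallback names first, fallback names last."""
    # reject('search', 'fallback'): keep names that do NOT contain 'fallback'
    non_fallback = [n for n in names if not re.search("fallback", n)]
    # select('search', 'fallback'): keep names that DO contain 'fallback'
    fallback = [n for n in names if re.search("fallback", n)]
    return non_fallback + fallback


# Issuers left after Google Trust Services is filtered out (from the PR description)
filtered = ["acme-bifrost-production-ddns-fallback", "selfsigned"]
print(old_priority(filtered))
# ['selfsigned', 'acme-bifrost-production-ddns-fallback']
```

Because selfsigned does not contain "fallback", it lands in the first group and is tried ahead of the only remaining ACME issuer.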

Solution

Replace complex fallback-based prioritization with simple alphabetical sorting.

New Behavior (Fixed):

Filtered issuers: ['acme-bifrost-production-ddns-fallback', 'selfsigned']
Alphabetical sort: ['acme-bifrost-production-ddns-fallback', 'selfsigned']

Result:

  • ✅ Tries ACME issuer (ZeroSSL) FIRST
  • ✅ Creates trusted certificate signed by public CA
  • ✅ No browser warnings
  • ✅ Both central route and reencrypt route work correctly
  • ✅ Selfsigned used only as last resort fallback

Why Alphabetical Works

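The ACME ClusterIssuer names used here sort ahead of selfsigned (starts with 's'):

  • acme-* (starts with 'a')
  • letsencrypt-* (starts with 'l')

Both come alphabetically before selfsigned. This does not hold for every possible name: zerossl (starts with 'z') sorts after selfsigned and would be tried last, so the approach relies on ACME issuers having names that sort before selfsigned.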
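
Whether a given issuer name is tried before selfsigned under alphabetical ordering can be checked directly. This is a minimal sketch in Python, assuming the comparison is case-insensitive (the default for Jinja2's `sort` filter); the helper name `sorts_before_selfsigned` is hypothetical.

```python
def sorts_before_selfsigned(name, selfsigned="selfsigned"):
    """Return True if `name` would be tried before the self-signed issuer."""
    # Jinja2's sort filter ignores case by default, so compare lowercased names.
    return name.lower() < selfsigned.lower()


for name in ["acme-bifrost-production-ddns-fallback", "letsencrypt-prod", "zerossl"]:
    print(name, sorts_before_selfsigned(name))
# acme-bifrost-production-ddns-fallback True
# letsencrypt-prod True
# zerossl False
```

Names beginning with 'a' or 'l' pass the check, while a name beginning with 'z' does not.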

Test Scenarios

Scenario 1: reencrypt=false (all issuers kept)

Result: ['acme-bifrost-production-ddns', 'acme-bifrost-production-ddns-fallback', 'selfsigned']
Tries: Google → ZeroSSL → selfsigned

Scenario 2: reencrypt=true (Google filtered out)

Result: ['acme-bifrost-production-ddns-fallback', 'selfsigned']
Tries: ZeroSSL → selfsigned

Scenario 3: Multiple ACME providers

Result: ['letsencrypt-prod', 'letsencrypt-staging', 'selfsigned', 'zerossl']
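Tries: letsencrypt-prod → letsencrypt-staging → selfsigned → zerossl (zerossl sorts after selfsigned, so only the letsencrypt-* issuers are tried before it)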
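
The three scenarios can be replayed with a small harness. This is a sketch in Python rather than the role itself: `new_priority` is a hypothetical stand-in for the `sort` filter, and the unsorted input order is an assumption, since the order in which the cluster returns ready issuers is not stated here.

```python
def new_priority(names):
    """Model of the new expression: case-insensitive alphabetical sort of issuer names."""
    return sorted(names, key=str.lower)


# Input lists use the issuer names from the scenarios; their initial order is assumed.
scenarios = {
    "reencrypt=false": ["selfsigned", "acme-bifrost-production-ddns-fallback", "acme-bifrost-production-ddns"],
    "reencrypt=true": ["selfsigned", "acme-bifrost-production-ddns-fallback"],
    "multiple ACME": ["zerossl", "selfsigned", "letsencrypt-staging", "letsencrypt-prod"],
}

for label, ready in scenarios.items():
    order = new_priority(ready)
    # Anything listed here would only be attempted after the self-signed issuer.
    tried_after = order[order.index("selfsigned") + 1:]
    print(label, order, "tried after selfsigned:", tried_after)
```

In the first two scenarios nothing follows selfsigned; in the third, zerossl does.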

Changes

File: roles/ocp4_workload_rhacs/tasks/certificate.yml

Before:

- name: Build list of ClusterIssuers to try (fallback issuers last)
  ansible.builtin.set_fact:
    _cluster_issuers_to_try: >-
      {{
        (_ready_cluster_issuers | map(attribute='metadata.name') | list | reject('search', 'fallback') | list)
        + (_ready_cluster_issuers | map(attribute='metadata.name') | list | select('search', 'fallback') | list)
      }}

After:

- name: Build list of ClusterIssuers to try (alphabetical order)
  ansible.builtin.set_fact:
    _cluster_issuers_to_try: >-
      {{
        _ready_cluster_issuers | map(attribute='metadata.name') | list | sort
      }}

Benefits

  1. Simpler - 1 line instead of 2 complex filter chains
  2. Predictable - Alphabetical order is easy to understand
  3. Fixes the bug - ACME issuers tried before selfsigned
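  4. Not tied to "fallback" naming - Works for any ACME issuer whose name sorts before selfsigned
  5. Covers common naming - Works with acme-* and letsencrypt-* issuers as-is; an issuer whose name sorts after selfsigned (e.g. zerossl) would be tried last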

Testing

Tested on cluster cluster-mhn4g with:

  • acme-bifrost-production-ddns (Google Trust Services)
  • acme-bifrost-production-ddns-fallback (ZeroSSL)
  • selfsigned

Verified alphabetical sorting produces correct priority in all scenarios.

The previous implementation used complex logic to prioritize non-fallback
issuers first, which had the unintended consequence of trying 'selfsigned'
before ACME issuers when Google Trust Services was filtered out.

Issue:
- When reencrypt route is enabled, Google CA is filtered out
- Remaining issuers: ['acme-bifrost-production-ddns-fallback', 'selfsigned']
- Old logic: reject('fallback') + select('fallback')
  - Result: ['selfsigned', 'acme-bifrost-production-ddns-fallback']
  - Problem: Tries selfsigned FIRST, creates untrusted certificate
  - Impact: Browser cert warnings, broken standard central route

Solution:
- Use simple alphabetical sorting
- ACME issuers starting with 'a' or 'l' naturally come before 'selfsigned' (a name starting with 'z', such as zerossl, would not)
- Result: ['acme-bifrost-production-ddns-fallback', 'selfsigned']
- Benefit: Tries trusted ACME CA first, selfsigned as last resort

Test scenarios:
1. reencrypt=false: ['acme-bifrost-production-ddns',
   'acme-bifrost-production-ddns-fallback', 'selfsigned']
   → Tries Google → ZeroSSL → selfsigned ✓

2. reencrypt=true: ['acme-bifrost-production-ddns-fallback', 'selfsigned']
   → Tries ZeroSSL → selfsigned ✓

3. Multiple ACME: ['letsencrypt-prod', 'letsencrypt-staging', 'selfsigned']
   → All ACME tried before selfsigned ✓

treddy08 force-pushed the rhacs-fix-issuer-priority-alphabetical-treddy branch from a4f8239 to 01259bf on May 3, 2026 02:17
treddy08 merged commit db6a7ec into main on May 3, 2026
1 check passed
treddy08 deleted the rhacs-fix-issuer-priority-alphabetical-treddy branch on May 3, 2026 02:20