
Document: pre-26.06.1 token re-issue path for cluster upgrades #457

@xe-nvdk


Phase A (#451) introduced cluster-wide token replication. Tokens created on a pre-26.06.1 node remain valid only on that node after upgrade — they don't auto-migrate into the Raft FSM.

This is intentional (auto-migration adds a one-shot code path with its own failure modes — see Phase A planning doc) but it does mean operators upgrading a multi-node Arc Enterprise cluster need a clean re-issue workflow.

Current state (as of v26.06.1 release)

  • Release notes section "No migration of pre-existing tokens" calls out the policy.
  • docs.basekick.net configuration/authentication.md covers the divergence-detection error log + the SQL cleanup for ID collisions.
  • Both the Compose and Helm chart documentation describe the leader-only-banner behaviour.

What's NOT yet documented as a single runbook

A step-by-step upgrade procedure:

  1. Pre-upgrade: enumerate active tokens across all nodes (operators may need to inspect each node's auth.db separately, because pre-26.06.1 nodes don't share token state).
  2. Roll the cluster forward to v26.06.1.
  3. Confirm the arc_cluster_auth_apply_create_total counter is incrementing in sync on every node (proves the FSM replication path is alive).
  4. Re-issue each previously-active token via POST /api/v1/auth/tokens from any node (the new tokens are cluster-wide).
  5. Distribute the new token values to downstream consumers (CI secrets, SDKs, dashboards).
  6. Revoke the old per-node tokens via POST /api/v1/auth/tokens/:id/revoke (the revoke also replicates cluster-wide).
  7. Optional: drop the diverging pre-26.06.1 AUTOINCREMENT rows from local auth.db files to clean up the divergence-detection error log noise.
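
As a starting point for the runbook, steps 3 and 6 could be scripted roughly as below. This is a minimal sketch, not Arc's tooling. It assumes each node serves Prometheus text-format metrics at `/metrics` and that the API accepts an `Authorization: Bearer` admin token, neither of which is confirmed in this issue. The helper names are hypothetical. Only the counter name and the revoke path come from the steps above.

```python
import re
import urllib.request

COUNTER = "arc_cluster_auth_apply_create_total"  # counter named in step 3


def fetch_metrics(base_url: str, timeout: float = 5.0) -> str:
    # Assumption: metrics are served at /metrics in Prometheus text format.
    url = f"{base_url.rstrip('/')}/metrics"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def read_counter(metrics_text: str, name: str = COUNTER) -> float:
    """Sum every sample of `name` (with or without labels) in the output."""
    pattern = re.compile(r"^" + re.escape(name) + r"(\{[^}]*\})?\s+(\S+)")
    total, found = 0.0, False
    for line in metrics_text.splitlines():
        match = pattern.match(line)  # "# HELP"/"# TYPE" lines never match
        if match:
            total += float(match.group(2))
            found = True
    if not found:
        raise KeyError(f"{name} not present in metrics output")
    return total


def advanced_in_sync(before: dict, after: dict) -> bool:
    """True when every node's counter rose by the same positive amount.

    Comparing deltas between two snapshots avoids false alarms when
    nodes restarted at different times and their counters were reset.
    """
    if not before or before.keys() != after.keys():
        return False
    deltas = {after[node] - before[node] for node in before}
    return len(deltas) == 1 and deltas.pop() > 0


def revoke_request(base_url: str, token_id, admin_token: str):
    """Build (but do not send) the step 6 revoke call."""
    url = f"{base_url.rstrip('/')}/api/v1/auth/tokens/{token_id}/revoke"
    return urllib.request.Request(
        url,
        data=b"",
        method="POST",
        # Assumption: bearer-token auth; adjust to the real scheme.
        headers={"Authorization": f"Bearer {admin_token}"},
    )
```

One way to use it: snapshot `read_counter(fetch_metrics(node))` for every node, create a single throwaway token through any node, snapshot again, and require `advanced_in_sync(before, after)` before starting the re-issue in step 4. Replication is asynchronous, so allow a short delay before taking the second snapshot.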

Where to put this

Likely a new "Upgrading to v26.06.1" page under docs.basekick.net operations/ or installation/upgrades/. Could also be a release-notes addendum once we have customer feedback on what they actually do.

Out of scope

Auto-migration tooling — explicitly deferred. The risk/reward isn't worth it for what's likely a once-per-cluster operation.

Related: Phase A memory note.
