Refactor replication FSM: per-workload tables, idempotent transitions, integrated invalid state #727

@UtkarshBhatthere

Parent: #726

Background

The replication state machine in ceph/replication.go (3 states, 7 events) is shared across RBD and CephFS workloads via a single transition table. Several inconsistencies follow from that:

  • replication_invalid is declared but unreachable — no state config, no edges. CephFS GetResourceState returns it (ceph/replication_cephfs.go:119), after which every event hits the unhandled-trigger handler.
  • CephFS does not support configure, promote, or demote (not implemented upstream). The handlers stub them with errors (ceph/replication_cephfs.go:203,283,289), but the FSM still advertises those events as legal transitions, so it accepts the trigger and the handler then fails. The error surfaces from the handler rather than from the FSM, which is misleading.
  • Self-transitions are wired asymmetrically: disabled state has both OnEntryFrom(disable) and InternalTransition(disable) — the OnEntryFrom is dead in this position. enabled has OnEntryFrom(enable) with no InternalTransition, so re-enable on enabled hits the unhandled handler. RBD handlers contain idempotency guards (ceph/replication_rbd.go:324-326,380-381) that the FSM rejects before they can run.
  • Workload-level promote and demote (URL PUT /ops/replication/{wl}, no resource ID) currently go through the per-resource FSM with an empty resource. GetResourceState runs against a zero-valued struct, the FSM is seeded with whatever state that produces, the transition succeeds, and the handler iterates the real pools and silently filters them. Net effect: promote on a cluster with zero enabled mirrors returns 200 with nothing done.
  • The FSM is per-request and per-resource, but several handlers operate site-wide. The model claims more uniformity than it delivers.

Proposed changes

1. Per-workload FSM construction

Move FSM construction off the package-level GetReplicationStateMachine and onto each handler via a new interface method:

type ReplicationHandlerInterface interface {
    // ... existing methods
    GetStateMachine(initialState ReplicationState) *stateless.StateMachine
}

Shared scaffolding (logger callback, unhandled-trigger callback, type registration loop) is factored into a private newBaseFsm helper. Each workload then wires only the events it actually supports.

  • RBD: enable, disable, configure, list, status.
  • CephFS: enable, disable, list, status.

Promote and demote are removed from the FSM entirely (see section 4).

The CephFS-specific stubbed handlers (ConfigureHandler, PromoteHandler, DemoteHandler) are deleted from the interface or made optional. Unsupported events on a workload hit unhandledTransitionHandler and return a clean "operation X not permitted" error rather than reaching a stub.
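The per-workload event sets can be illustrated without the stateless library. The following is a minimal sketch using plain maps; the names Event, supported, and checkEvent are hypothetical and only model the intended behavior (an unsupported event is rejected up front with a "not permitted" error instead of reaching a stub handler).

```go
package main

import "fmt"

// Event is a hypothetical stand-in for the FSM trigger type.
type Event string

const (
	Enable    Event = "enable"
	Disable   Event = "disable"
	Configure Event = "configure"
	List      Event = "list"
	Status    Event = "status"
	Promote   Event = "promote"
)

// supported lists, per workload, only the events named in the proposal.
// Promote and demote are absent for both workloads because they leave the FSM.
var supported = map[string]map[Event]bool{
	"rbd":    {Enable: true, Disable: true, Configure: true, List: true, Status: true},
	"cephfs": {Enable: true, Disable: true, List: true, Status: true},
}

// checkEvent plays the role of the unhandled-trigger callback in this sketch:
// anything outside the workload's set is rejected before a handler runs.
func checkEvent(workload string, ev Event) error {
	if !supported[workload][ev] {
		return fmt.Errorf("operation %s not permitted for %s", ev, workload)
	}
	return nil
}

func main() {
	fmt.Println(checkEvent("rbd", Configure))
	fmt.Println(checkEvent("cephfs", Configure))
}
```

In the real change the same split would live in each handler's GetStateMachine, with the rejection coming from the shared unhandled-trigger callback.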

2. Integrate invalid state

replication_invalid becomes a recoverable state with two outbound transitions and two read-only internal transitions:

From     Event    To
invalid  enable   enabled
invalid  disable  disabled
invalid  list     invalid
invalid  status   invalid

Configure / promote / demote on invalid remain unhandled: the operator must enable or disable first to recover.
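As a rough model of the invalid-state rows, the sketch below encodes the four permitted events in a map and treats everything else as unhandled. State, Event, invalidEdges, and nextFromInvalid are hypothetical names for illustration. The actual wiring would use the FSM library's transition configuration.

```go
package main

import (
	"errors"
	"fmt"
)

type State string
type Event string

const (
	Disabled State = "disabled"
	Enabled  State = "enabled"
	Invalid  State = "invalid"
)

var errNotPermitted = errors.New("operation not permitted")

// invalidEdges holds the two recovery transitions and the two read-only
// events that keep the resource in the invalid state.
var invalidEdges = map[Event]State{
	"enable":  Enabled,
	"disable": Disabled,
	"list":    Invalid,
	"status":  Invalid,
}

// nextFromInvalid returns the destination state for an event fired while
// invalid, or an error (and no state change) for any other event.
func nextFromInvalid(ev Event) (State, error) {
	next, ok := invalidEdges[ev]
	if !ok {
		return Invalid, fmt.Errorf("%w: %s on %s", errNotPermitted, ev, Invalid)
	}
	return next, nil
}

func main() {
	for _, ev := range []Event{"enable", "status", "configure"} {
		next, err := nextFromInvalid(ev)
		fmt.Println(ev, next, errors.Is(err, errNotPermitted))
	}
}
```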

CephFS path that returns StateInvalidReplication (ceph/replication_cephfs.go:119) needs verification during local testing: confirm the disable handler tolerates the underlying condition that triggers invalid in the first place.

3. Idempotent self-loops on enabled and disabled

Replace the dead OnEntryFrom(self) wiring with explicit InternalTransition self-loops on both states:

  • disabled --disable--> disabled (InternalTransition): re-disable runs cleanup. Handlers must tolerate the "nothing to disable" case.
  • enabled --enable--> enabled (InternalTransition): re-enable is idempotent. RBD handlers already check PoolInfo.Mode and return nil when target mode is in place.

Rationale: Ceph itself models mirroring as converge-to-target. The FSM should match. Re-enable is not an error; re-disable is the operator's recovery tool when prior disable left orphan state.
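The converge-to-target behavior the self-loops rely on can be sketched with an in-memory stand-in. The pool type and its applied counter are hypothetical. The mode check loosely mirrors the PoolInfo.Mode guard mentioned above, but this is not the RBD handler code.

```go
package main

import "fmt"

// pool is a hypothetical stand-in for per-pool mirroring state.
type pool struct {
	mode    string // "" means mirroring is off
	applied int    // number of times a change was actually applied
}

// enable converges to the target mode and is a no-op when already there.
func (p *pool) enable(target string) error {
	if p.mode == target {
		return nil // already at target: re-enable is not an error
	}
	p.mode = target
	p.applied++
	return nil
}

// disable converges to "off" and tolerates having nothing to disable.
func (p *pool) disable() error {
	if p.mode == "" {
		return nil
	}
	p.mode = ""
	p.applied++
	return nil
}

func main() {
	p := &pool{}
	_ = p.enable("pool")
	_ = p.enable("pool") // self-loop: handler runs, nothing changes
	fmt.Println(p.mode, p.applied)
}
```

With handlers shaped like this, routing enable on enabled (or disable on disabled) to the handler through an internal transition is safe, because the second call changes nothing.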

4. Site-wide actions bypass the FSM

Workload-level promote and demote (URL PUT /ops/replication/{wl}) are aggregates over many resources, not lifecycle transitions on a single resource. They should not run through the per-resource FSM.

In api/ops_replication.go::handleReplicationRequest, branch before PreFill:

event := req.GetWorkloadRequestType()
if isWorkloadAction(event) {
    return runWorkloadAction(ctx, s, rh, req, event)
}
// existing per-resource FSM path

runWorkloadAction dispatches directly to handler-side promote/demote logic (today's handleSiteOp), which enumerates pools, applies the action where applicable, and returns an aggregate result body:

{ "promoted": 3, "skipped": 1, "errors": [] }

The operator sees explicit counts, so a run that promoted 0 of 0 pools can no longer be mistaken for a successful promotion.

Promote/demote handler methods stay on the RBD handler. CephFS handler does not implement them; runWorkloadAction returns a clear not-supported error when called against CephFS.
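One possible shape for the aggregate path is sketched below. aggregateResult, poolState, promoteAll, and errNotSupported are hypothetical names. The JSON field names follow the example body above, and the per-pool promote call is passed in as a function so the sketch stays self-contained.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// aggregateResult mirrors the example response body.
type aggregateResult struct {
	Promoted int      `json:"promoted"`
	Skipped  int      `json:"skipped"`
	Errors   []string `json:"errors"`
}

// poolState is a hypothetical view of one pool's mirroring status.
type poolState struct {
	name    string
	enabled bool
}

var errNotSupported = errors.New("workload action not supported")

// promoteAll applies promote to every mirroring-enabled pool and counts
// what happened. Only the "rbd" workload is accepted in this sketch.
func promoteAll(workload string, pools []poolState, promote func(name string) error) (aggregateResult, error) {
	res := aggregateResult{Errors: []string{}} // non-nil so JSON shows [] rather than null
	if workload != "rbd" {
		return res, fmt.Errorf("%w: promote on %s", errNotSupported, workload)
	}
	for _, p := range pools {
		if !p.enabled {
			res.Skipped++
			continue
		}
		if err := promote(p.name); err != nil {
			res.Errors = append(res.Errors, p.name+": "+err.Error())
			continue
		}
		res.Promoted++
	}
	return res, nil
}

func main() {
	// Zero pools: the body still states explicitly that nothing was promoted.
	res, _ := promoteAll("rbd", nil, func(string) error { return nil })
	out, _ := json.Marshal(res)
	fmt.Println(string(out)) // {"promoted":0,"skipped":0,"errors":[]}
}
```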

5. Idempotent enable/disable steps

EnableHandler and DisableHandler today perform multi-step Ceph operations with no rollback path. Partial failures leave a half-configured cluster.

Convert each step to an ensureX(ctx) shape: probe the current Ceph state, skip if it is already at target, apply otherwise. The step audit and per-step probe design happen during implementation. This pairs with section 3: enable converges forward, disable converges backward, and a retry after failure picks up where the previous run stopped.
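The probe-then-apply shape might look roughly like the following. The step type, ensureAll, and the in-memory simulateRetry harness are all hypothetical. The real steps and their probes are still to be audited, so the step names here are deliberately generic.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// step is a hypothetical ensureX unit: probe reports whether the target
// state already holds, apply moves the system towards it.
type step struct {
	name  string
	probe func(ctx context.Context) (bool, error)
	apply func(ctx context.Context) error
}

// ensureAll runs steps in order, skipping any whose probe says "done" and
// stopping at the first failure. No rollback is attempted.
func ensureAll(ctx context.Context, steps []step) error {
	for _, s := range steps {
		done, err := s.probe(ctx)
		if err != nil {
			return fmt.Errorf("probe %s: %w", s.name, err)
		}
		if done {
			continue
		}
		if err := s.apply(ctx); err != nil {
			return fmt.Errorf("apply %s: %w", s.name, err)
		}
	}
	return nil
}

// simulateRetry runs two passes over three in-memory steps; the second
// step fails once, so the first pass stops partway and the retry resumes.
func simulateRetry() (first, second error, applied map[string]int) {
	state := map[string]bool{}
	applied = map[string]int{}
	failNext := true
	mk := func(name string, flaky bool) step {
		return step{
			name:  name,
			probe: func(context.Context) (bool, error) { return state[name], nil },
			apply: func(context.Context) error {
				if flaky && failNext {
					failNext = false
					return errors.New("transient failure")
				}
				state[name] = true
				applied[name]++
				return nil
			},
		}
	}
	steps := []step{mk("step-a", false), mk("step-b", true), mk("step-c", false)}
	ctx := context.Background()
	first = ensureAll(ctx, steps)
	second = ensureAll(ctx, steps)
	return first, second, applied
}

func main() {
	first, second, applied := simulateRetry()
	fmt.Println(first, second, applied)
}
```

In the simulated run, the retry does not re-apply step-a, since its probe already reports the target state. That is the property that makes a rollback path unnecessary.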

Out of scope

Acceptance

  • All replication tests pass with new per-workload FSM construction.
  • Re-enable on enabled and re-disable on disabled return success and reach handler logic.
  • CephFS configure / promote / demote return a uniform FSM "not permitted" error rather than a handler-stub error.
  • Promote on a cluster with zero enabled RBD mirrors returns an explicit aggregate result body, not a silent 200.
  • replication_invalid state can recover via enable or disable.
  • Local testing covers the conditions that drive CephFS into the invalid state.