fix FSM lock-expiry failure path and make retry increments atomic #40

@serrrfirat

Problem

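Lock-expiry handling can attempt a transition the FSM does not allow (claimed to failed), and it increments retry_count in a separate step from the state change, so the counter and the state can drift apart.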

Evidence

  • src/types.ts:561: the valid transitions listed for claimed do not include failed.
  • src/job-fsm.ts:417: on the max-retry path, a claimed job can be transitioned to failed.
  • src/job-fsm.ts:434: retry_count is incremented before the transition is attempted.
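
To make the failure mode concrete, here is a minimal reproduction sketch. It is not the code in src/job-fsm.ts: the state list, the transition table, the retryCount >= maxRetries rule, and the names transition and expireLockBuggy are all assumptions chosen to mirror the three observations above.

```typescript
// Hypothetical reproduction of the reported ordering problem.
type JobState = "pending" | "claimed" | "running" | "failed";

interface Job {
  state: JobState;
  retryCount: number;
  maxRetries: number;
}

// Assumed table mirroring the report: claimed has no edge to failed.
const VALID_TRANSITIONS: Record<JobState, readonly JobState[]> = {
  pending: ["claimed"],
  claimed: ["running", "pending"],
  running: ["pending", "failed"],
  failed: [],
};

function transition(job: Job, to: JobState): void {
  if (!VALID_TRANSITIONS[job.state].includes(to)) {
    throw new Error(`invalid transition ${job.state} -> ${to}`);
  }
  job.state = to;
}

// Buggy ordering: the counter is bumped before the transition is validated.
function expireLockBuggy(job: Job): void {
  job.retryCount += 1;
  const target: JobState =
    job.retryCount >= job.maxRetries ? "failed" : "pending";
  transition(job, target); // throws for claimed -> failed
}

const job: Job = { state: "claimed", retryCount: 2, maxRetries: 3 };
try {
  expireLockBuggy(job);
} catch (err) {
  console.log((err as Error).message);
}
console.log(job.state, job.retryCount);
```

In this sketch the transition is rejected, yet the job is left in claimed with retryCount already at 3, which is the inconsistency the issue describes.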

Proposed Fix

  • Route timeout terminalization through a transition path the FSM allows.
  • Update retry_count and the state together, in a single transaction or under one CAS guard.
  • Add tests for expired jobs in the claimed and running states whose retry_count is at or near the maximum.
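
One possible shape for the first two bullets, as a sketch only. It assumes the fix widens the transition table so that claimed may go to failed (routing through an intermediate state would be an equally valid choice), and it uses an in-memory store with a version field as a stand-in for whatever transaction or CAS primitive the project actually has. JobStore, compareAndSwap, expireLock, and the retryCount >= maxRetries rule are hypothetical.

```typescript
type JobState = "pending" | "claimed" | "running" | "failed";

interface Job {
  id: string;
  state: JobState;
  retryCount: number;
  maxRetries: number;
  version: number; // assumed optimistic-concurrency token
}

// Assumed table; the claimed -> failed edge is the hypothetical fix.
const VALID_TRANSITIONS: Record<JobState, readonly JobState[]> = {
  pending: ["claimed"],
  claimed: ["running", "pending", "failed"],
  running: ["pending", "failed"],
  failed: [],
};

class JobStore {
  private jobs = new Map<string, Job>();

  put(job: Job): void {
    this.jobs.set(job.id, { ...job });
  }

  get(id: string): Job | undefined {
    const job = this.jobs.get(id);
    return job ? { ...job } : undefined;
  }

  // Applies state and retryCount together, only if nobody else wrote first.
  compareAndSwap(
    id: string,
    expectedVersion: number,
    patch: Pick<Job, "state" | "retryCount">,
  ): boolean {
    const current = this.jobs.get(id);
    if (!current || current.version !== expectedVersion) return false;
    this.jobs.set(id, { ...current, ...patch, version: expectedVersion + 1 });
    return true;
  }
}

// Returns the new state, or null if the job was not expirable or the CAS lost.
function expireLock(store: JobStore, id: string): JobState | null {
  const job = store.get(id);
  if (!job || (job.state !== "claimed" && job.state !== "running")) return null;

  // Compute both values first; nothing is written yet.
  const retryCount = job.retryCount + 1;
  const target: JobState = retryCount >= job.maxRetries ? "failed" : "pending";

  // Validate before writing, so a rejected transition leaves the job untouched.
  if (!VALID_TRANSITIONS[job.state].includes(target)) {
    throw new Error(`invalid transition ${job.state} -> ${target}`);
  }

  const swapped = store.compareAndSwap(id, job.version, { state: target, retryCount });
  return swapped ? target : null;
}
```

The point of the ordering is that the counter and the state are written in one step or not at all. With a SQL backend the same idea would typically be a single UPDATE guarded by the expected state or version.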

Acceptance Criteria

  • No invalid transition attempts during lock-expiry cleanup.
  • Retry counters remain consistent with state transitions.
  • Expired jobs reliably reach either pending (for retry) or the terminal failed state.
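
The first two criteria could be checked in tests with an invariant helper along these lines. This is a sketch under assumptions: the snapshot shape, the transition table (including a claimed to failed edge), and the name findViolations are hypothetical and not taken from the repository.

```typescript
type JobState = "pending" | "claimed" | "running" | "failed";

interface Snapshot {
  state: JobState;
  retryCount: number;
}

type TransitionTable = Record<JobState, readonly JobState[]>;

// Assumed post-fix table; substitute the project's real one.
const ASSUMED_TABLE: TransitionTable = {
  pending: ["claimed"],
  claimed: ["running", "pending", "failed"],
  running: ["pending", "failed"],
  failed: [],
};

// Walks consecutive snapshots of one job and reports inconsistencies.
function findViolations(
  history: readonly Snapshot[],
  table: TransitionTable,
): string[] {
  const problems: string[] = [];
  for (let i = 1; i < history.length; i++) {
    const prev = history[i - 1];
    const cur = history[i];
    if (cur.state === prev.state) {
      // The counter must not move unless the state moves with it.
      if (cur.retryCount !== prev.retryCount) {
        problems.push(`step ${i}: retryCount changed without a state change`);
      }
    } else if (!table[prev.state].includes(cur.state)) {
      problems.push(`step ${i}: invalid transition ${prev.state} -> ${cur.state}`);
    } else if (cur.retryCount < prev.retryCount) {
      problems.push(`step ${i}: retryCount decreased`);
    }
  }
  return problems;
}
```

A test would record a snapshot after each cleanup pass and assert that findViolations returns an empty list.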
