Phase 1 · SAO human-in-the-loop review gates — approval workflow and intervention design #52

@Luis85


Meta

type: DesignDecision
stage: draft
maturity: L1
created: 2026-05-10
inputs:
  - "Luis85/specorator specs/specorator-agent-orchestrator/design.md — manual dispatch trigger"
  - "Luis85/specorator#168 — Fleet Dashboard / Review Orchestrator Pipeline"
  - "OpenAI harness engineering — guardrails critical at every stage including HITL"
  - "#45 — feedback sensors L6 (human review gate)"
related: ["#43", "#45", "#49", "#50", "#42"]

Purpose. Design how humans intervene in, approve, and override SAO agent runs — beyond the existing manual dispatch trigger — as a first-class harness concern.


Context

HITL (Human-in-the-Loop) is a core Specorator principle: the runtime emits artifact proposals; humans review, edit, and accept or reject. The SAO must extend this principle into automated agent runs.

The current SAO design doc includes manual dispatch as a trigger but doesn't specify:

  • How humans approve or reject an agent's output before stage advancement
  • Whether auto-advance can be toggled per-feature or per-stage
  • How humans can abort a running agent without corrupting the worktree
  • How the fleet dashboard (specorator#168) surfaces pending-review runs

HITL interaction patterns

| Pattern | Description | V1? |
| --- | --- | --- |
| Auto-advance mode | On all sensors pass → automatically merge and advance | Optional (default off) |
| Review gate mode | On all sensors pass → notify human; hold in pending-review | Default for V1 |
| Human abort | Kill the agent process mid-run; preserve worktree for inspection | Yes |
| Human override | Advance stage manually, bypassing sensor evaluation | Yes |
| Human reject | Discard agent output; optionally adjust template before retry | Yes |
| Human retry | Re-dispatch with same or modified template after rejection | Yes |

State extension

The five-state taxonomy from #43 (unclaimed → claimed → running → retry-queued → released) needs a pending-review state for V1 HITL:

running ──[sensors pass]──→ pending-review ──[approved]──→ merge + advance
                                           └──[rejected]──→ retry-queued or released
                        └──[sensors fail]──→ retry-queued

Decision needed: is pending-review a sixth orchestration state, or a sub-state of released?
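The transitions in the diagram above can be sketched as a small pure function. This is only an illustration of the "sixth state" option with the review gate on. The names RunState, RunEvent, and transition are hypothetical, and treating an approved run as ending in released after merge + advance is an assumption the issue does not confirm.

```typescript
// Hypothetical names; none of these types appear in the SAO design doc.
type RunState =
  | "unclaimed" | "claimed" | "running"
  | "pending-review" | "retry-queued" | "released";

type RunEvent =
  | { kind: "sensors-pass" }
  | { kind: "sensors-fail" }
  | { kind: "approved" }
  | { kind: "rejected"; retry: boolean };

// Returns the next state, or null when the event is not valid in the current state.
function transition(state: RunState, event: RunEvent): RunState | null {
  if (state === "running") {
    if (event.kind === "sensors-pass") return "pending-review";
    if (event.kind === "sensors-fail") return "retry-queued";
    return null;
  }
  if (state === "pending-review") {
    // Assumption: "merge + advance" completes and the run ends in released.
    if (event.kind === "approved") return "released";
    // Rejection goes to retry-queued or released, as in the diagram.
    if (event.kind === "rejected") return event.retry ? "retry-queued" : "released";
    return null;
  }
  // Transitions for the other states are defined in #43 and omitted here.
  return null;
}
```

If pending-review were instead modeled as a sub-state of released, the same function could return released plus a separate review flag; the sketch does not try to settle that decision.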


OrchestratorPort HITL verbs

interface OrchestratorPort {
  // ... existing dispatch verbs
  approveRun(runId: string): Promise<Result<void, OrchestratorError>>;
  rejectRun(runId: string, reason?: string): Promise<Result<void, OrchestratorError>>;
  abortRun(runId: string): Promise<Result<void, OrchestratorError>>;
  overrideAdvance(featureSlug: string, stage: string): Promise<Result<void, OrchestratorError>>;
}
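A minimal in-memory sketch of how approveRun and rejectRun might guard on state. The issue does not define Result or OrchestratorError, so the shapes below are assumptions, as are RunRecord, decide, createReviewGate, and the error codes RUN_NOT_FOUND and INVALID_STATE. Releasing a rejected run (rather than re-queuing it) is also an assumption made to keep the example short.

```typescript
// Assumed shapes: the issue references these types without defining them.
type Result<T, E> = { ok: true; value: T } | { ok: false; error: E };
type OrchestratorError = { code: "RUN_NOT_FOUND" | "INVALID_STATE"; message: string };

// Hypothetical run record kept by the orchestrator.
interface RunRecord {
  state: "running" | "pending-review" | "retry-queued" | "released";
  reasonCode?: "HUMAN_REJECTED";
  note?: string; // free-text reason supplied by the reviewer
}

// Applies a review decision; only runs in pending-review may be decided.
function decide(
  runs: Map<string, RunRecord>,
  runId: string,
  decision: "approved" | "rejected",
  note?: string,
): Result<void, OrchestratorError> {
  const run = runs.get(runId);
  if (run === undefined) {
    return { ok: false, error: { code: "RUN_NOT_FOUND", message: `unknown run ${runId}` } };
  }
  if (run.state !== "pending-review") {
    return {
      ok: false,
      error: { code: "INVALID_STATE", message: `run ${runId} is ${run.state}, not pending-review` },
    };
  }
  run.state = "released"; // assumption: both decisions end in released in this sketch
  if (decision === "rejected") {
    run.reasonCode = "HUMAN_REJECTED";
    run.note = note;
  }
  return { ok: true, value: undefined };
}

// Wraps decide in the async verb signatures from the interface above.
function createReviewGate(runs: Map<string, RunRecord>) {
  return {
    approveRun: async (runId: string) => decide(runs, runId, "approved"),
    rejectRun: async (runId: string, reason?: string) => decide(runs, runId, "rejected", reason),
  };
}
```

Returning an INVALID_STATE error instead of throwing keeps the verbs consistent with the Result-returning signatures, so the dashboard can show why an approve or reject action was refused.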

Fleet dashboard integration

The fleet dashboard (specorator#168) is the primary surface for reviewing pending runs. Requirements:

  • pending-review indicator visible in the feature matrix
  • Run diff/artifact preview accessible from the fleet row
  • Approve / reject / abort actions available inline
  • Notification when a run enters pending-review (plugin notification or status bar)
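As a rough illustration of the first requirement, the feature matrix could derive its pending-review indicator by grouping runs per feature. RunSummary and pendingReviewByFeature are hypothetical names, and the field layout is an assumption; the dashboard's real data model is defined in specorator#168, not here.

```typescript
// Hypothetical row shape; the real fleet dashboard model may differ.
interface RunSummary {
  runId: string;
  featureSlug: string;
  stage: string;
  state: string;
}

// Groups runs that are waiting for review by feature, so each matrix row
// can show an indicator (and a count) when its list is non-empty.
function pendingReviewByFeature(runs: RunSummary[]): Map<string, RunSummary[]> {
  const byFeature = new Map<string, RunSummary[]>();
  for (const run of runs) {
    if (run.state !== "pending-review") continue;
    const list = byFeature.get(run.featureSlug) ?? [];
    list.push(run);
    byFeature.set(run.featureSlug, list);
  }
  return byFeature;
}
```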

Auto-advance toggle

interface AgentOrchestratorSettings {
  autoAdvance: boolean;                // global default; default false
  perStageAutoAdvance?: Record<string, boolean>;  // stage-level override
}

When autoAdvance is false (the default), every run whose sensors all pass creates a pending-review entry. When it is true, successful runs merge and advance automatically. An entry in perStageAutoAdvance overrides the global value for that stage.
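The lookup order for the toggle might be resolved as below: a stage-level entry wins, otherwise the global default applies. The helper name shouldAutoAdvance and the stage names used in the usage note are hypothetical.

```typescript
interface AgentOrchestratorSettings {
  autoAdvance: boolean;                           // global default; default false
  perStageAutoAdvance?: Record<string, boolean>;  // stage-level override
}

// Hypothetical helper: decides whether a run that passed all sensors
// may merge and advance without waiting in pending-review.
function shouldAutoAdvance(settings: AgentOrchestratorSettings, stage: string): boolean {
  const overrides = settings.perStageAutoAdvance;
  // Own-property check so inherited keys such as "constructor" are never read as overrides.
  if (overrides !== undefined && Object.prototype.hasOwnProperty.call(overrides, stage)) {
    return overrides[stage];
  }
  return settings.autoAdvance;
}
```

For example, with autoAdvance set to false and perStageAutoAdvance set to { design: true }, a run in a stage named design would advance automatically while every other stage would still wait for review.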


Abort safety

To abort a running agent process, the orchestrator must:

  1. Send SIGTERM to the Claude CLI subprocess (graceful)
  2. Wait up to abortTimeoutMs for process exit
  3. Send SIGKILL if timeout exceeded
  4. Preserve the worktree for human inspection (do not auto-delete)
  5. Transition to released with reason ABORTED_BY_USER
  6. Emit orchestration.aborted event (→ Phase 1 · Failure-event taxonomy — which *.failed events ship in V1 #21)
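Steps 1 to 3 above can be sketched with Node's child_process module. This is a sketch under assumptions, not the SAO implementation: abortProcess, AbortOutcome, and abortDemo are hypothetical names, and the demo spawns a throwaway Node process rather than the Claude CLI. Preserving the worktree, transitioning to released, and emitting orchestration.aborted (steps 4 to 6) are left to the caller. The sketch also does not handle a child that failed to spawn.

```typescript
import { spawn, ChildProcess } from "node:child_process";

// Hypothetical result shape; `escalated` records whether SIGKILL was needed.
type AbortOutcome = { escalated: boolean; reason: "ABORTED_BY_USER" };

function abortProcess(child: ChildProcess, abortTimeoutMs: number): Promise<AbortOutcome> {
  return new Promise((resolve) => {
    // If the process has already exited, there is nothing to signal.
    if (child.exitCode !== null || child.signalCode !== null) {
      resolve({ escalated: false, reason: "ABORTED_BY_USER" });
      return;
    }
    let escalated = false;
    // Step 3: force-kill if the process is still alive after the timeout.
    const timer = setTimeout(() => {
      escalated = true;
      child.kill("SIGKILL");
    }, abortTimeoutMs);
    // Step 2: wait for the process to exit.
    child.once("exit", () => {
      clearTimeout(timer);
      resolve({ escalated, reason: "ABORTED_BY_USER" });
    });
    // Step 1: ask for a graceful shutdown.
    child.kill("SIGTERM");
  });
}

// Usage example: spawn a long-running stand-in process and abort it.
function abortDemo(): Promise<AbortOutcome> {
  const child = spawn(process.execPath, ["-e", "setInterval(() => {}, 1000)"], {
    stdio: "ignore",
  });
  return abortProcess(child, 2000);
}
```

The worktree is never touched here, which matches step 4: the function only signals the process, so whatever the agent wrote remains on disk for inspection.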

Acceptance

  • HITL interaction patterns ratified for V1 (which patterns ship, which defer)
  • pending-review state decision: sixth state or sub-state of released
  • OrchestratorPort HITL verbs specified
  • Auto-advance toggle design ratified (default off; per-stage configurable)
  • Abort safety protocol specified (SIGTERM → wait → SIGKILL)
  • ABORTED_BY_USER and HUMAN_REJECTED reason codes added to released taxonomy
  • Fleet dashboard (specorator#168) pending-review integration documented
  • orchestration.aborted event requirement forwarded to #21 (Phase 1 · Failure-event taxonomy: which *.failed events ship in V1)

Metadata

Labels: roadmap:architecture (Phase 1: ratified architecture proposal, data model, and design decisions before code.)
