Kickoff orchestrator: .design/*.pipeline.json runs array grows monotonically; entries never reconciled to completed #614

@maxine-at-forecast

The kickoff orchestrator appends to the runs array of the per-project pipeline-state JSON (e.g. .design/forecast-decode.pipeline.json) on launch but never reconciles those entries on completion. Across kickoffs the array fills with stale "status": "running" rows that don't correspond to any live agent.

What we observed in forecast-bio/decode

Each launch appended:

{
  "agent_id": "pending",
  "worktree": "pending",
  "issue_id": 1,
  "started_at": "2026-05-12T20:22:43.929777+00:00",
  "status": "running"
}

By the time we noticed (2026-05-21) the array had 14 stale rows spanning launches from 2026-05-12 through 2026-05-16, including:

  • Several from kickoffs that landed merged PRs (qKOS-OFH5, qKOS-oJUH, V45R, liKe-coQ8) — these had real agent_ids in their .kickoff-metadata.json, but the pipeline.json entries stayed at "agent_id": "pending".
  • Several from aborted launches that never produced any commits.

The commits that look like reconciliation attempts in the repo history (e.g. 9293393 "Update pipeline state after qKOS-OFH5 kickoff completion", 57e5a10 "Update pipeline state JSON after qKOS-oJUH kickoff completion") don't actually update the previous launch entry — they just append another row with the same "agent_id": "pending", "status": "running" shape. So the existing rot stays and a new stale row joins it.

Repro from forecast-bio/decode (reproduces on every kickoff in the project):

git log --oneline -- .design/forecast-decode.pipeline.json
# fd37e07 Foundation: workspace + 9 self-contained crates from first kickoff
# 57e5a10 Update pipeline state JSON after qKOS-oJUH kickoff completion
# 9293393 Update pipeline state after qKOS-OFH5 kickoff completion

git show 9293393 -- .design/forecast-decode.pipeline.json | head -25
# Diff is +7 lines (one new "pending" entry) — no mutation of the existing rows.

We cleared the accumulated rot in forecast-bio/decode@5eaeb8e (set runs: []).
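For anyone who wants to check their own project before clearing the array, here is a minimal sketch of how the suspect rows could be counted. It assumes the file has a top-level runs array with the row shape shown above. The function names (load_runs, suspect_rows) are ours for illustration and are not part of the orchestrator. "Suspect" here is only a heuristic (still "running" with a placeholder agent_id), since the file alone can't say whether an agent is live.

```python
import json
from pathlib import Path

# Placeholder value observed in the stale rows above.
PLACEHOLDER = "pending"


def load_runs(path):
    """Return the runs array from a pipeline-state JSON file (empty list if absent)."""
    state = json.loads(Path(path).read_text(encoding="utf-8"))
    return state.get("runs", [])


def suspect_rows(runs):
    """Return rows still marked running whose agent_id was never filled in.

    Heuristic only: this does not check whether any agent is actually live.
    """
    return [
        run
        for run in runs
        if run.get("status") == "running" and run.get("agent_id") == PLACEHOLDER
    ]
```

Usage would be something like `len(suspect_rows(load_runs(".design/forecast-decode.pipeline.json")))`, run from the repo root.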

Suggested fix surface

Two reasonable directions, your call which is cleaner upstream:

  1. Reconcile on completion. When a kickoff finishes, the orchestrator updates the matching launch row in place: fill in the real agent_id + worktree, set status: "completed" | "failed" | "timed_out", and add completed_at. Matching could key on (issue_id, started_at), or the launch could write a row identifier that the completion hook reads back.

  2. Drop the runs array. If it isn't load-bearing for the orchestrator itself (we couldn't find anything in the decode workspace that reads it), just stop writing to it. Kickoff state lives in .crosslink/issues.db + the worktree's .kickoff-status sentinel + the merged-PR commit anyway; the JSON runs array seems to be informational only.
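To make option 1 concrete, here is a rough sketch of what an in-place reconcile could look like. It assumes matching on (issue_id, started_at) and the row shape shown earlier. The function name reconcile_run and its signature are hypothetical, not the orchestrator's actual API, and how the completion hook learns started_at is left open.

```python
from datetime import datetime, timezone

# Terminal statuses proposed in option 1.
TERMINAL_STATUSES = {"completed", "failed", "timed_out"}


def reconcile_run(state, issue_id, started_at, agent_id, worktree, status,
                  completed_at=None):
    """Update the matching launch row in place instead of appending a new one.

    Returns True if a row was updated, False if no row matched
    (the caller decides whether a missing row is an error).
    """
    if status not in TERMINAL_STATUSES:
        raise ValueError(f"not a terminal status: {status!r}")
    if completed_at is None:
        completed_at = datetime.now(timezone.utc).isoformat()

    for run in state.get("runs", []):
        # Assumed match key: (issue_id, started_at) as written at launch.
        if run.get("issue_id") == issue_id and run.get("started_at") == started_at:
            run.update(
                agent_id=agent_id,
                worktree=worktree,
                status=status,
                completed_at=completed_at,
            )
            return True
    return False
```

The point is that the array length doesn't change on completion: the existing row is mutated, and an unmatched completion is reported to the caller rather than papered over with another "pending" row.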

Happy to send the PR once the direction is clear. A related question is whether pipeline.json itself should be a .gitignored local file or stay tracked; right now it's tracked, which is why the rot is committed to history. Let us know if there's an architectural preference.

Filed from forecast-bio/decode#3 (upstream contributions backlog).
