From c4366b9a185e3b0b6a6e9facc521d09ffc8d9000 Mon Sep 17 00:00:00 2001 From: Huiyan Wan Date: Fri, 29 May 2026 18:20:27 +0100 Subject: [PATCH] feat(multi-issue): add Phase 0 backlog triage + edit-only-agents note (v1.2.0) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Drawn from a real run of this skill against a 36-issue, partly-stale, no-CI backlog where the skill's "pre-validated issues" precondition didn't hold. - **Phase 0 (conditional): triage a large or possibly-stale backlog.** New section + a "When to use" pointer. Triggers when the cluster is >15, the user flags staleness, or issues are drift-vs-mockup filed before recent redesigns. Pins triage to origin/main, one read-only investigator per issue, a verdict schema BIASED AGAINST DISMISSAL (dismissal requires PR# + file:line — the error asymmetry: a wrong "still-valid" is caught at impl, a wrong "already-fixed" silently drops real work), adversarial verification of ONLY the dismissals, and a user-ratified cut-line (never auto-close). Standalone skill referenced: stale-backlog-triage-adversarial-verify. - **Parallel implementers — keep git in the controller.** Phase A/B note: for concurrent file-disjoint PRs, agents edit + test only in isolated worktrees; controller owns all git. Sidesteps the subagent-git traps (wrong-worktree cd, stranded branch refs, reports-complete-but-PR-unmerged). - Bumps overnight-multi-issue-implementation to 1.2.0 across SKILL.md frontmatter + plugin.json + marketplace.json (also fixes the pre-existing 1.1.0 SKILL.md vs 1.0.0 manifest drift). Validator: `validate_plugins.py` → OK, 0 warnings. Co-Authored-By: Claude Opus 4.8 (1M context) --- .claude-plugin/marketplace.json | 2 +- .../.claude-plugin/plugin.json | 2 +- .../SKILL.md | 40 ++++++++++++++++++- 3 files changed, 40 insertions(+), 4 deletions(-) diff --git a/.claude-plugin/marketplace.json b/.claude-plugin/marketplace.json index 88861ca..75974ad 100644 --- a/.claude-plugin/marketplace.json +++ b/.claude-plugin/marketplace.json @@ -28,7 +28,7 @@ "name": "overnight-multi-issue-implementation", "source": "./plugins/overnight-multi-issue-implementation", "description": "Overnight autonomous workflow that takes a cluster of related GitHub issues (typically a P1 review-panel finding set) and ships them to merged stacked PRs by morning. Builds on subagent-driven-development with overnight-specific discipline: stacked PRs (so PR2 doesn't wait on a human PR1-merge mid-night), pre-flight tracker-id audit (concurrent sessions on main steal IDs), final PR-level code review before proposing merge, review findings preserved as PR comments before squash. Sister to overnight-review-client-delivery and overnight-insight-discovery.", - "version": "1.0.0" + "version": "1.2.0" }, { "name": "large-redesign-parallel-branch-collision-audit", diff --git a/plugins/overnight-multi-issue-implementation/.claude-plugin/plugin.json b/plugins/overnight-multi-issue-implementation/.claude-plugin/plugin.json index d849974..1fd2636 100644 --- a/plugins/overnight-multi-issue-implementation/.claude-plugin/plugin.json +++ b/plugins/overnight-multi-issue-implementation/.claude-plugin/plugin.json @@ -1,7 +1,7 @@ { "name": "overnight-multi-issue-implementation", "description": "Overnight autonomous workflow that takes a cluster of related GitHub issues (typically a P1 review-panel finding set) and ships them to merged stacked PRs by morning. Builds on subagent-driven-development with overnight-specific discipline: stacked PRs (so PR2 doesn't wait on a human PR1-merge mid-night), pre-flight tracker-id audit (concurrent sessions on main steal IDs), final PR-level code review before proposing merge, review findings preserved as PR comments before squash. Sister to overnight-review-client-delivery and overnight-insight-discovery.", - "version": "1.0.0", + "version": "1.2.0", "author": { "name": "wan-huiyan" }, "repository": "https://github.com/wan-huiyan/overnight-workflows", "license": "MIT", diff --git a/plugins/overnight-multi-issue-implementation/SKILL.md b/plugins/overnight-multi-issue-implementation/SKILL.md index 0d6a923..9f0724f 100644 --- a/plugins/overnight-multi-issue-implementation/SKILL.md +++ b/plugins/overnight-multi-issue-implementation/SKILL.md @@ -26,7 +26,7 @@ description: | client-delivery`), or generating insights from data (use `overnight- insight-discovery`). author: wan-huiyan + Claude Code -version: 1.1.0 +version: 1.2.0 date: 2026-05-29 --- @@ -53,7 +53,10 @@ All of these conditions: 3. **Human availability**: user is going to sleep / away for 6-10 hours; not available to merge PR1 between phases. 4. **Issue quality**: each issue has clear acceptance criteria (so per-issue - tasks are implementable + testable independently). + tasks are implementable + testable independently). **These preconditions + assume _pre-validated_ issues. If the cluster is large (>15) or the user + flags that some may be outdated/already-done, run Phase 0 triage first** + (below) — it produces the validated, scoped set this workflow then implements. 5. **Codebase familiarity**: agent has implementation context (CLAUDE.md, existing tests, conventions) — this isn't for greenfield bootstrapping. @@ -92,6 +95,20 @@ git checkout -b feat/task-N+1 ... Bucket the plan's tasks by file-overlap up front: file-disjoint sets may parallelize; same-file sets become an ordered sequence with `reset --hard origin/main` after every merge. Skipping the resync is how task N+1 silently reverts task N (see `stale-base-pr-silently-reverts-upstream-content`). +## Phase 0 (conditional): triage a large or possibly-stale backlog + +The "When to use" preconditions assume **pre-validated** issues. That breaks for a common real case: a backlog of 15+ issues filed over weeks, where intervening PRs may have resolved items **without** closing the issue, the issues describe gaps against a spec/mockup that has since moved, or the user explicitly says *"some might be outdated."* Implementing such a list naively re-does merged work — and a naive "close the done ones" pass naively **silently drops a still-live issue**, which leaves no trace. + +So when **the cluster is >15 issues, OR the user flags possible staleness, OR the issues are "drift / missing-feature vs a mockup" filed before recent redesign PRs** — run a read-only triage pass FIRST and bring a cut-line to the user before any implementation. Standalone skill: **`stale-backlog-triage-adversarial-verify`** (the essentials are inlined here so this plugin is self-contained). + +1. **Pin triage to real `origin/main`**, not a stale/feature worktree: `git fetch && git checkout -B triage-base origin/main`; assert `HEAD == origin/main`. Agents grep whatever is checked out — a stale tree yields wrong verdicts. +2. **One read-only investigator per issue** (fan out). Each reads the issue + comments, searches BOTH commit messages AND merged-PR **titles/bodies** (`gh pr list --search " in:title,body"` — partial status hides in titles like *"(#190 partial / #193)"*), then verifies against live code with its own eyes. For drift issues, **inject the mockup/spec path** + require a per-bullet status (a "page X lacks Y" issue is un-triageable without the target). +3. **Bias the verdict schema AGAINST dismissal — load-bearing.** Error costs are asymmetric: a wrong `still_valid` is caught at implementation; a wrong `already_fixed`/`outdated_superseded` silently drops real work. A dismissal verdict may be returned ONLY if the agent cites **both a merged PR# AND a concrete `file:line` in the current tree**; otherwise default to `still_valid`. Never round "probably done" up to "fixed." +4. **Adversarially verify ONLY the dismissals.** Pipe each `already_fixed`/`outdated` verdict to a second agent told to REFUTE it (re-check the cited PR + file:line; hunt for any unmet acceptance criterion — one fixed bullet ≠ issue fixed). Flip refuted dismissals back to `still_valid`/`partially_done`. Spend verification budget here, not on the keeps (keeps verify themselves when implemented). +5. **Present a cut-line; never auto-close.** Output buckets (implement-now / defer / close-as-outdated) and let the user ratify scope. The surviving "implement-now" set is the input to Phase A/B below — and if it is still >15, propose **waves** rather than forcing it into one run's PR shape. + +Worked example (2026-05-29, 36 issues): 36 investigators + adversarial-verify-on-dismissals. The adversarial pass flipped one wrongly-dismissed issue (the file was deleted, but 3 capabilities silently didn't migrate) and confirmed one genuinely done; only **1/36** was actually done. Cut-line → user picked a security/bugs wave → shipped 3 PRs (no merged work re-done, no live issue dropped). + ## Phases ```dot @@ -128,6 +145,18 @@ digraph overnight { Per task: **implementer subagent → spec-compliance reviewer → code-quality reviewer → mark complete**. Standard `subagent-driven-development` protocol. +**Parallel implementers — keep git in the controller.** When you fan implementers +out across file-disjoint PRs *concurrently* (the plan-driven variant), have each +agent **edit + run tests only, in its own isolated worktree — no git commands**; +the controller does all add/commit/push/PR/merge. Subagents are error-prone at git +mechanics (wrong-worktree `cd`, stranded branch refs, "reports complete but PR +unmerged" — see `subagent-bash-cd-wrong-worktree`, +`subagent-driven-branch-ref-froze-stranded-commits`, +`subagent-reports-complete-but-pr-unmerged`). Edit-only agents + controller-owned +git keeps the parallelism while sidestepping all of those traps; the agent's +deliverable is its edited worktree + a structured report (files changed, tests run, +baseline-diff), which the controller stages into per-issue commits. + For overnight throughput, calibrate review intensity **per-task** using the 3-tier rubric below (this generalizes the previous "two pragmatic deviations" version into a formal framework — see companion plugin `subagent-review-tier-calibration-for-overnight-pr-chains` for the standalone skill). ### Tier 1 — Full two-stage (strict `subagent-driven-development`) @@ -450,6 +479,13 @@ By morning the user should have: ## References / sister skills +- `stale-backlog-triage-adversarial-verify` — **Phase 0**: triage a large or + possibly-stale backlog against live `origin/main` with a verdict schema biased + against dismissal + adversarial verification of every dismissal, producing the + validated/scoped set this workflow implements. +- `subagent-bash-cd-wrong-worktree` / `subagent-driven-branch-ref-froze-stranded-commits` + / `subagent-reports-complete-but-pr-unmerged` — why parallel implementers should + be edit-only with controller-owned git. - `superpowers:subagent-driven-development` — the per-task protocol this skill builds on (implementer + spec-reviewer + code-quality-reviewer per task).