
feat: /debug sub-agent escalation from /qa + recommendations in /review and /ship (v0.6.5.0) #192

Merged
garrytan merged 8 commits into garrytan/better-process from garrytan/debug-subagent-escalation on Mar 18, 2026

Conversation

@garrytan
Owner

Summary

  • /qa now escalates stubborn bugs to /debug automatically: when a bug resists 2+ fix attempts (each reverted due to regressions), it spawns a debug sub-agent with a structured bug brief (symptoms, repro, failed fixes, files). Results appear in the QA report's Debug Escalation Summary.
  • /review recommends /debug for pre-existing bugs found in the base branch (informational, no Agent spawning).
  • /ship detects reverted fix(qa): commits in branch history and suggests /debug (informational, doesn't block shipping).
  • /debug gains browse access for visual bug reproduction and fix verification.
  • P2 TODO: worktree-based parallel debug sub-agents for future parallelism.
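
The escalation rule in the first bullet can be sketched as follows. This is only an illustration in TypeScript, not the skill's implementation (the PR changes template prose, not application code). The type names, field names, and the `shouldEscalate` and `buildBugBrief` helpers are hypothetical; only the threshold of two reverted attempts and the four brief sections come from the summary above.

```typescript
// Hypothetical record of one fix attempt made during the /qa fix loop.
interface FixAttempt {
  description: string;
  reverted: boolean; // true when the fix was rolled back because it caused a regression
}

// Hypothetical shape for a bug tracked by /qa; field names are illustrative.
interface BugRecord {
  id: string;
  symptoms: string;
  reproSteps: string[];
  files: string[];
  attempts: FixAttempt[];
}

// A bug qualifies for escalation once at least `threshold` attempts were reverted.
function shouldEscalate(bug: BugRecord, threshold = 2): boolean {
  return bug.attempts.filter((a) => a.reverted).length >= threshold;
}

// Render the brief sections named in the summary: symptoms, repro, failed fixes, files.
function buildBugBrief(bug: BugRecord): string {
  const failed = bug.attempts.filter((a) => a.reverted);
  return [
    `Bug: ${bug.id}`,
    `Symptoms: ${bug.symptoms}`,
    "Repro:",
    ...bug.reproSteps.map((step, i) => `  ${i + 1}. ${step}`),
    "Failed fixes:",
    ...failed.map((a) => `  - ${a.description}`),
    "Files:",
    ...bug.files.map((f) => `  - ${f}`),
  ].join("\n");
}
```

Keeping the trigger (`shouldEscalate`) separate from the rendering (`buildBugBrief`) makes it possible to check the brief's required fields deterministically, which is the kind of check the prompt-level E2E test describes.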

Architecture

Sequential sub-agent: /qa finishes its fix loop (8a-8f), then spawns debug agents one at a time for reverted-twice bugs. The working tree is clean between investigations: when an investigation returns BLOCKED, git checkout . discards its debug artifacts.
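
The sequential flow described above could look roughly like this as control flow. It is a sketch under assumptions: the real behavior lives in skill template prose, `investigate` and `discardChanges` are hypothetical stand-ins for spawning a debug sub-agent and running `git checkout .`, and the `DONE` and `FAILED` status names are invented for illustration (only `BLOCKED` appears in this PR). Callbacks are synchronous for brevity.

```typescript
// Only BLOCKED is named in the PR; DONE and FAILED are hypothetical labels.
type DebugStatus = "DONE" | "BLOCKED" | "FAILED";

interface DebugResult {
  bugId: string;
  status: DebugStatus;
}

function escalateSequentially(
  bugIds: string[],
  investigate: (bugId: string) => DebugStatus, // stand-in for one debug sub-agent run
  discardChanges: () => void, // stand-in for `git checkout .`
): DebugResult[] {
  const results: DebugResult[] = [];
  // One investigation at a time: the next starts only after the previous one returns.
  for (const bugId of bugIds) {
    let status: DebugStatus;
    try {
      status = investigate(bugId);
    } catch {
      // An agent failure is recorded rather than aborting the remaining investigations.
      status = "FAILED";
    }
    if (status === "BLOCKED") {
      // Discard debug artifacts so the next investigation starts from a clean tree.
      discardChanges();
    }
    results.push({ bugId, status });
  }
  return results;
}
```

Entries that end as `BLOCKED` or `FAILED` would then be listed as deferred in the report instead of stopping the run.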

Test Coverage

  • 11 new validation assertions (Phase 8g, structured handoff, result handlers, report summary, Step 5.7, ship detection, browse setup)
  • LLM judge: escalation prompt quality eval
  • E2E: prompt-level deterministic test + full flow stub
  • Tests: 335 → 335 (+0 new files; assertions added to existing test files)
  • Test Coverage Audit: All new changes are template prose — no application code paths to audit.

Pre-Landing Review

No issues found.

Design Review

No frontend files changed — design review skipped.

TODOS

  • Added: Worktree-based parallel debug sub-agents (P2)

Test plan

  • All skill validation tests pass (335 tests, 0 failures)
  • Template regeneration clean (bun run gen:skill-docs)
  • CEO Review: CLEAR (selective expansion, 0 unresolved)
  • Eng Review: CLEAR (full review, 0 critical gaps)

🤖 Generated with Claude Code

garrytan and others added 8 commits March 18, 2026 11:08
Debug skill can now use the browse binary to visually reproduce bugs,
take screenshots as evidence, and verify fixes. This makes /debug
effective for web app bugs when spawned as a sub-agent from /qa.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When QA fix attempts fail twice on the same bug (reverted due to
regressions), /qa now spawns a /debug sub-agent with a structured
bug brief including symptoms, repro steps, failed fix details, and
file paths. Results are reported in Phase 10's debug escalation summary.

Sequential execution: one debug investigation at a time, working tree
cleaned between investigations. Graceful degradation on all failure
modes (BLOCKED, agent failure → deferred in report).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When /review finds what appears to be a pre-existing bug in the base
branch (not introduced by the PR's diff), it now classifies it as
INFORMATIONAL and recommends running /debug for systematic root-cause
investigation. No Agent spawning — /review's scope stays on the diff.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
During pre-landing review, /ship now checks for reverted fix(qa):
commits in the branch history and recommends /debug for systematic
investigation. Informational only — does not block shipping.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
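
One way to picture the reverted fix(qa): check is the sketch below. How /ship actually performs the detection is not stated in this PR. The sketch assumes the reverts were created with `git revert`, whose default subject line is `Revert "<original subject>"`, and that commit subjects were collected beforehand (for example with `git log --format=%s`). The function name `findRevertedQaFixes` is hypothetical.

```typescript
// Given commit subject lines from the branch history, return the fix(qa): subjects
// that were later reverted. Assumes git's default revert subject format.
function findRevertedQaFixes(subjects: string[]): string[] {
  const revertPattern = /^Revert "(fix\(qa\):.*)"$/;
  const reverted: string[] = [];
  for (const subject of subjects) {
    const match = revertPattern.exec(subject);
    if (match) {
      reverted.push(match[1]); // the original fix(qa): subject inside the quotes
    }
  }
  return reverted;
}
```

A non-empty result would prompt an informational /debug recommendation. Per the commit message above, it would not block shipping.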
Skill validation: 11 new assertions covering Phase 8g trigger, structured
handoff fields, agent result handlers, debug escalation summary, Step 5.7
recommendation, ship reverted QA detection, and debug browse setup.

LLM judge: evaluates Phase 8g template quality — structured brief format,
result handling, working tree cleanup, sequential processing.

E2E: prompt-level deterministic test (verifies escalation prompt has all
required fields) + full flow stub (fixture TODO for planted regression).

Touchfile entries for diff-based test selection.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When /qa hits multiple stubborn bugs, parallel debug agents in
isolated git worktrees could investigate simultaneously. Deferred
from the sequential debug escalation PR as a follow-up.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…detection

Two new E2E tests:
- review-pre-existing-bug: plants SQL injection in base branch, verifies
  Step 5.7 classifies as INFORMATIONAL and recommends /debug
- ship-reverted-qa-commits: creates branch with reverted fix(qa): commits,
  verifies /ship detects them and recommends /debug

Also fixes qa-debug-prompt-logic to use correct workingDirectory, and
ensures test repo init uses -b main for portability.

All 4 debug-related evals pass: $0.34 total, 94s.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@garrytan garrytan merged commit 94c1530 into garrytan/better-process Mar 18, 2026