Stop hook: skip read-only turns and run CI poll asynchronously #17
Open
gnguralnick wants to merge 1 commit into
Two related sources of noise in the stop-hook orchestrator make the
lead-proxy / multi-worker pattern painful:
1. Synchronous CI poll. The orchestrator waits up to ci.timeout (default
600s) for CI checks to complete. While it waits, the agent's tmux
slot is occupied and cannot process queued user messages.
Fix: launch the CI poll detached (nohup + disown + stdio redirection)
and return immediately. The orchestrator records the SHA + PID it
launched the poll for, and surfaces the poll's outcome to the *next*
turn that lands on the same commit. If HEAD has moved, a stale poll
is terminated and a fresh one is launched. Removes the 'pending'
reset from ensure-pr so prior-turn results survive the next ensure-pr.
2. Read-only turns trigger the full pipeline. A turn that only used
Read/Glob/Grep/LS produces no code-affecting work, but the orchestrator
still runs fetch+merge, push, ensure-pr, and review gates.
Fix: parse transcript_path from the hook's stdin JSON, walk the
transcript with a small Python helper to enumerate tool_use names
used since the last human user turn, and exit 0 immediately if all
of them are in {Read, Glob, Grep, LS} (or empty). Gated on the new
stop_hook.skip_readonly_turns config knob (default true). Falls
through to the normal pipeline if the transcript is missing or
unparseable.
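
The read-only check described above could be sketched roughly as follows. This is a minimal illustration, not the helper shipped in this PR: it assumes the transcript is a JSONL file whose entries have a `type` of `user` or `assistant` and a `message.content` list of blocks, and that tool results come back as `user` entries containing `tool_result` blocks. The function names are hypothetical.

```python
import json

# Tools that cannot change the working tree (the set named in the commit message).
READ_ONLY_TOOLS = {"Read", "Glob", "Grep", "LS"}


def _is_human_turn(entry):
    """True for a user entry typed by a human rather than a tool_result echo.

    Assumption: tool results are recorded as 'user' entries whose content
    list contains a block of type 'tool_result'.
    """
    if entry.get("type") != "user":
        return False
    content = entry.get("message", {}).get("content")
    if isinstance(content, str):
        return True
    if isinstance(content, list):
        return not any(
            isinstance(b, dict) and b.get("type") == "tool_result" for b in content
        )
    return False


def tools_since_last_human_turn(lines):
    """Collect tool_use names that appear after the last human user turn."""
    names = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        entry = json.loads(line)  # raises ValueError on an unparseable line
        if _is_human_turn(entry):
            names = []  # a new human turn resets the window
            continue
        if entry.get("type") == "assistant":
            for block in entry.get("message", {}).get("content", []):
                if isinstance(block, dict) and block.get("type") == "tool_use":
                    names.append(block.get("name"))
    return names


def should_skip_gates(transcript_path):
    """Return True only when every tool used this turn is read-only (or none was used)."""
    try:
        with open(transcript_path, encoding="utf-8") as fh:
            names = tools_since_last_human_turn(fh)
    except (OSError, ValueError, AttributeError, TypeError):
        return False  # missing or unparseable transcript: run the full pipeline
    return all(name in READ_ONLY_TOOLS for name in names)
```

Returning `False` on any read or parse error mirrors the fall-through behaviour described above: when in doubt, the full pipeline still runs.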
Summary
Two stop-hook noise sources that make the lead-proxy / multi-worker pattern painful:

1. Synchronous CI poll blocks the message slot for up to 10 minutes.

   The orchestrator currently waits on `poll_pr_checks.sh` (timeout 600s, interval 15s) in parallel with the gates. While it waits, the agent's tmux slot is occupied and cannot process queued user messages -- in the lead-proxy pattern, that turns 30s gate-approval iterations into 10-minute ones. The only workaround so far has been to manually `kill <pid>` the poll, which silently fails the hook.

   Fix: make the poll fire-and-forget. The orchestrator now spawns the poll detached via `nohup` + `disown` + stdio redirection, records the SHA + PID it was launched for, and surfaces the poll's outcome to the next turn that lands on the same commit. If HEAD has moved between turns, the stale poll is terminated and a fresh one is started. The `pending` reset previously written by `ensure-pr` is removed so that the prior-turn `pr_status` survives across ensure-pr calls.

2. Stop hook fires even on Read-only turns.

   In a recent session, a stop-hook ran the full pipeline after a turn that contained ~21 tool calls -- all `Read`. Read-only turns produce no code-affecting work; the pipeline (fetch+merge, push, ensure-pr, review gates) is wasted churn.

   Fix: parse `transcript_path` from the hook's stdin JSON, enumerate `tool_use` names since the last human user turn via a small Python helper, and exit 0 if they're all in `{Read, Glob, Grep, LS}` (or empty). Gated on a new `stop_hook.skip_readonly_turns` config knob (default `true`). Falls through to the full pipeline if the transcript is missing or unparseable, so this is safe.
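
The fire-and-forget poll logic can be illustrated with a small sketch. The orchestrator itself uses shell (`nohup` + `disown`); the Python below swaps in `subprocess.Popen(..., start_new_session=True)` as a rough equivalent, and the function names and action labels are hypothetical, chosen only to show how the recorded SHA and status drive the decision on each turn.

```python
import subprocess


def decide_poll_action(head_sha, recorded_sha, recorded_status):
    """Decide what to do with the CI poll at the start of a stop-hook run.

    Returns one of:
      'launch'  - no poll has been recorded yet; start one
      'restart' - HEAD moved since the recorded poll; terminate it and start fresh
      'pending' - a poll for this commit has not reported yet; return immediately
      'report'  - a poll for this commit finished; surface its outcome
    """
    if recorded_sha is None:
        return "launch"
    if recorded_sha != head_sha:
        return "restart"
    if recorded_status == "pending":
        return "pending"
    return "report"


def launch_detached(cmd):
    """Start cmd in its own session with stdio detached and return its PID.

    start_new_session=True calls setsid() in the child (POSIX only), so the
    process is not tied to the caller's terminal, much like nohup + disown.
    The caller does not wait on the child.
    """
    proc = subprocess.Popen(
        cmd,
        stdin=subprocess.DEVNULL,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        start_new_session=True,
    )
    return proc.pid
```

Keeping the decision in a pure function makes each branch easy to exercise without a real CI poll; only `launch_detached` touches the process table.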
Test plan
Manually exercised each branch with a constructed transcript and a sandboxed copy
of the orchestrator:
immediately with `Read-only turn (...); skipping all gates` log entry.
branch and proceeds into the rest of the pipeline.
reparented to PID 1, state files written (`pr_status_pid`, `pr_status_sha`, `pr_status=pending`).
with the original error text, no new poll launched.
`pr_status_pid` (only if the file still matches its own PID).