
feat: bash/web/grep dedup, diff-aware re-read, post-compact recovery, structured-config sections, auto-redirect #2

Merged
Zelys-DFKH merged 14 commits into main from claude/recommend-new-features-i3OxW
May 18, 2026

Conversation

@Zelys-DFKH
Collaborator

@Zelys-DFKH Zelys-DFKH commented May 18, 2026

Adds eleven distinct token-saving surfaces across three rounds of work, all aligned with the existing hook/cache/extractor architecture. Every surface ships with tests, docs, and CHANGELOG entries. Lint is clean, mypy adds zero new errors over baseline, and all 493 local tests pass (3 skipped).

Round 1 — three large optimisation surfaces

  1. Bash output interception. A new PostToolUse(Bash) hook persists large stdout/stderr to disk under data_dir() / "bash_outputs" and records the command in the session cache. On a repeat invocation the pre-Bash hint suggests token-goat bash-output <id> (with --head N, --tail N, --grep PATTERN slicers) instead of re-running. Disk store is 16 MB-capped with oldest-first eviction; outputs above 2 MB are tail-preserved with a truncation marker. New CLI: bash-output, bash-history.

  2. Diff-aware re-read. post_read writes a per-session content snapshot (256 KB / 150 entries per session cap) so a follow-up Read after a Write / Edit / MultiEdit is answered with a unified diff hint instead of a stale "already read" suggestion or a full file re-Read. The diff is bounded to 4 KB and only fires when the realised saving clears ~250 tokens; below that the existing session-cache hint path runs. Stats record both realised savings (diff_hint) and the hint's injection cost (diff_hint_overhead) for honest accounting.

  3. TOML / YAML / JSON / INI / .env / Dockerfile section extraction. token-goat section pyproject.toml::tool.ruff (and equivalents for .yaml/.yml/.json/.ini/.cfg/.env/.envrc/Dockerfile/Containerfile) now extract a single table/key block instead of forcing a full-file read. Line-scanner implementations only — no new third-party dependencies. Parser dispatcher gained a basename-keyed table so dotfiles with empty Path.suffix (.env, .envrc, Dockerfile) resolve correctly.
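
The gating described in item 2 (diff-aware re-read) can be sketched roughly as follows. This is an illustration only, not the code from this PR: `build_diff_hint`, `estimate_tokens`, and the four-characters-per-token heuristic are assumptions, while the 4 KB bound and the ~250-token threshold come from the description above.

```python
import difflib
from typing import Optional

MAX_DIFF_BYTES = 4 * 1024   # 4 KB bound on the diff, per the description
MIN_SAVED_TOKENS = 250      # approximate saving threshold, per the description


def estimate_tokens(text: str) -> int:
    """Rough heuristic (assumption): about four characters per token."""
    return len(text) // 4


def build_diff_hint(old: str, new: str, path: str) -> Optional[str]:
    """Return a unified diff worth sending instead of the full file, else None."""
    diff = "".join(
        difflib.unified_diff(
            old.splitlines(keepends=True),
            new.splitlines(keepends=True),
            fromfile=f"{path} (snapshot)",
            tofile=f"{path} (current)",
        )
    )
    # An empty diff (file unchanged) or an oversized one falls back to other paths.
    if not diff or len(diff.encode("utf-8")) > MAX_DIFF_BYTES:
        return None
    saved = estimate_tokens(new) - estimate_tokens(diff)
    return diff if saved >= MIN_SAVED_TOKENS else None
```

Returning `None` stands in for "let the existing session-cache hint path run"; small files never clear the threshold because the diff headers alone cost more than a re-read would.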

Round 2 — polish + hardening

  1. Compaction manifest "Commands Run" section. The PreCompact manifest surfaces the most recent meaningful Bash invocations (cmd preview · exit code · byte size · cache ID) so the test/build context that drives the next agent turn survives compaction. event_count now includes bash_history so a session whose only activity is a cached test run still clears min_events.

  2. bash-output --json line numbers. JSON shape gained numbered_lines: [{lineno, text}] (1-based, anchored to the original body) plus total_lines, mirroring the surgical-read response shape. Agents can --head / --tail / --grep filter and still map back to positions in the original output.

  3. Hardened PostToolUse Bash payload extraction. _extract_bash_response now tolerates every documented Bash result shape — dict-with-named-fields (Claude Code), MCP CallToolResult content arrays, bare-string blobs, top-level flattening (no tool_response wrapper), tool_result/response aliases, returncode and string-typed exit_code variants. Each variant covered by a dedicated regression test.

  4. Sidecar coupling. bash_cache.evict_old_entries removes body + sidecar pairs together and runs an orphan-sidecar sweep at the end, so a manual rm of a body or a write race no longer leaves .json metadata files accumulating forever.
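
To make the payload shapes in item 3 concrete, here is a hedged sketch of a tolerant extractor. It is not the PR's `_extract_bash_response`: the function name `extract_bash_response`, the `stdout`/`stderr` field names, and the choice to concatenate them are assumptions made for illustration; the wrapper aliases and the `exit_code`/`returncode` variants are taken from the description.

```python
from __future__ import annotations

from typing import Any


def extract_bash_response(payload: dict[str, Any]) -> tuple[str, int | None]:
    """Return (output_text, exit_code) from several possible payload shapes."""
    # Wrapper aliases; with no wrapper, treat the payload itself as the body
    # (the "top-level flattening" case).
    body: Any = payload
    for key in ("tool_response", "tool_result", "response"):
        if key in payload:
            body = payload[key]
            break

    if isinstance(body, str):  # bare-string blob
        return body, None
    if not isinstance(body, dict):
        return "", None

    content = body.get("content")
    if isinstance(content, list):  # MCP CallToolResult-style content array
        text = "".join(
            part.get("text", "")
            for part in content
            if isinstance(part, dict) and part.get("type") == "text"
        )
    else:  # dict with named fields (field names assumed)
        text = f"{body.get('stdout') or ''}{body.get('stderr') or ''}"

    raw_code = body.get("exit_code", body.get("returncode"))
    try:
        code = int(raw_code) if raw_code is not None else None
    except (TypeError, ValueError):
        code = None  # unparseable exit code: report "unknown" rather than fail
    return text, code
```

A hook should never crash on an unexpected shape, which is why every branch degrades to an empty string or `None` instead of raising.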

Round 3 — six follow-up surfaces

  1. Post-compaction recovery hint. SessionStart detects source == "compact" and emits a one-shot additionalContext block listing the most recently-read files plus the cached Bash outputs (token-goat bash-output <id>) and WebFetch responses (token-goat web-output <id>) from the pre-compaction session. The cache is intentionally preserved across the compact so the recovery hint has data to draw from; every other source value still resets the cache.

  2. Grep dedup hint. Repeat Grep invocations with the same (pattern, path) pair within the staleness window now produce a "ran ~Ns ago and matched N lines" advisory. Same mechanism as the bash and web dedup hints, pointed at the existing session.greps history — no new disk store. Install matcher widened to Read|Grep|Bash so the hint actually reaches the wire (also fixes a pre-existing gap where pre-Bash dedup ran only under Codex).

  3. WebFetch result cache. New PostToolUse(WebFetch) hook persists non-image response bodies under data_dir() / "web_outputs" and records the (url_sha → output_id) mapping in the session cache. Pre-fetch hook dedupes repeat URLs with a hint pointing at token-goat web-output <id>. Two new CLI commands surface the cache: web-output (head/tail/grep slicers + numbered_lines in JSON mode) and web-history. Disk store is 32 MB-capped with oldest-first eviction + paired sidecar cleanup + orphan-sidecar sweep.

  4. Doctor cache visibility + close-match auto-redirect. token-goat doctor gains a "Caches" section reporting size + file count + oldest-entry age for bash_outputs/, web_outputs/, and session_snapshots/. token-goat symbol automatically retries with the unambiguous close-match candidate when difflib ratio ≥ 0.85 and there's exactly one such candidate; output carries a redirected_from field in JSON and a (redirected from: ...) marker in plain-text so the substitution is auditable. --strict opts out.
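
The close-match rule in item 4 (redirect only when exactly one candidate reaches a difflib ratio of 0.85) might look like the sketch below. `pick_redirect` and its parameters are hypothetical names; the sketch assumes it is called only after the exact lookup returned zero results.

```python
import difflib
from typing import Iterable, Optional


def pick_redirect(
    query: str, known_symbols: Iterable[str], threshold: float = 0.85
) -> Optional[str]:
    """Return the single close match for `query`, or None if zero or several."""
    candidates = [
        name
        for name in known_symbols
        if name != query
        and difflib.SequenceMatcher(None, query, name).ratio() >= threshold
    ]
    # Redirect only when the choice is unambiguous.
    return candidates[0] if len(candidates) == 1 else None


# Example: a one-character typo resolves, an ambiguous query does not.
print(pick_redirect("parse_confg", ["parse_config", "render_page"]))
print(pick_redirect("load_cfg", ["load_cfg1", "load_cfg2"]))
```

Refusing to pick among several candidates keeps the substitution auditable: the caller either gets one clearly intended symbol or the ordinary "no results" path.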

Infrastructure

  • New web source bucket (yellow in the fancy renderer) catches web_* kinds; grep_dedup_hint* lands in the existing hint bucket.
  • Background worker's cleanup_on_startup now sweeps stale snapshot directories (24h) and re-evicts the bash + web caches.
  • reset_session removes per-session snapshots.
  • Codex ~/.codex/config.toml Bash matcher routed to the new post-bash hook (was post-read which had no Bash branch).
  • TOKEN_GOAT_NO_WORKER_SPAWN=1 makes the worker daemon exit at startup under CI (the check ended up in cli.cmd_worker rather than worker.ensure_running(); see the final commit). Without it, the daemon's infinite loop holds the GitHub Actions Windows step open until the global 6-hour timeout fires, because the runner tracks every descendant via a Win32 job object.

Test plan

  • ruff check src/ tests/ — clean
  • mypy src/ — 9 pre-existing Windows-only winreg errors; no new errors
  • pytest -q -s -p no:cacheprovider across the 30 touched test modules — 493 passed, 3 skipped
  • Five new test modules covering each new feature surface (test_grep_dedup, test_web_cache, test_post_compact_recovery, test_dockerfile_extractor, test_auto_redirect) plus ten other new modules (test_bash_cache, test_bash_cli, test_bash_dedup_hint, test_post_bash_payloads, test_snapshots, test_diff_hint_integration, test_languages_config, test_ini_extractor, test_stats_buckets, test_compact_bash)
  • Windows CI — currently queued at GitHub Actions; runner queue is congested independent of this PR's code

Docs

  • CHANGELOG.md [Unreleased] covers all eleven surfaces
  • README.md "What changes" table extended with five new rows; CLI table gains web-output, web-history, bash-output, bash-history
  • ~/.claude/CLAUDE.md, ~/.claude/skills/token-goat/SKILL.md, and ~/.codex/AGENTS.md routing tables (written by token-goat install) updated to mention the new commands and flags

claude added 12 commits May 18, 2026 06:15
Adds three large optimisation surfaces that previously left tokens on the
table:

1. Bash output interception. A new PostToolUse(Bash) hook persists large
   stdout/stderr to disk under `data_dir() / bash_outputs` and records the
   command in the session cache. On a repeat invocation the pre-Bash hint
   suggests `token-goat bash-output <id>` (with `--head`, `--tail`, and
   `--grep` slicers) instead of re-running. Disk store is 16 MB-capped with
   oldest-first eviction; per-file outputs above 2 MB are tail-preserved
   with a truncation marker. New CLI: `bash-output`, `bash-history`.

2. Diff-aware re-read. `post_read` writes a per-session content snapshot
   (256 KB / 150 entries per session) so a re-read after a Write/Edit
   produces a unified diff hint instead of either a stale "already read"
   suggestion (suppressed by the existing edited-after-read guard) or a
   full file re-Read. The diff is bounded to 4 KB and only fires when the
   realised saving clears ~250 tokens; below that the existing hint
   path runs. Stats record both realised savings (`diff_hint`) and the
   hint's injection cost (`diff_hint_overhead`) for honest accounting.

3. TOML/YAML/JSON section extraction. `.toml`, `.yaml`, `.yml` are now
   indexed (no new third-party dependency — line scanners plus the stdlib
   tomllib are sufficient). Pretty-printed JSON gains depth-1 section
   detection so the existing `token-goat section` flow works on
   `pyproject.toml::tool.ruff`, `deploy.yaml::spec`, and
   `package.json::scripts` without falling back to a full Read.
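
As a rough illustration of the line-scanner approach in item 3, the sketch below pulls one TOML table out of a file without a full parse. `extract_toml_table` is a hypothetical helper, not the extractor in this PR, and it makes simplifying assumptions: it stops at the next table header, and it would be confused by a multi-line array whose continuation lines begin with `[`.

```python
from typing import Optional


def extract_toml_table(text: str, table: str) -> Optional[str]:
    """Return the `[table]` block (header plus body) from TOML text, or None."""
    lines = text.splitlines()
    header = f"[{table}]"
    start = None
    for i, line in enumerate(lines):
        stripped = line.strip()
        if start is None:
            if stripped == header:
                start = i
        elif stripped.startswith("["):
            # The next table header ends the block (simplification, see lead-in).
            return "\n".join(lines[start:i]).rstrip()
    if start is None:
        return None
    return "\n".join(lines[start:]).rstrip()


sample = (
    '[project]\nname = "demo"\n\n'
    "[tool.ruff]\nline-length = 100\n\n"
    "[tool.mypy]\nstrict = true\n"
)
print(extract_toml_table(sample, "tool.ruff"))
```

For `pyproject.toml::tool.ruff` this returns only the two-line `[tool.ruff]` block, which is the saving the section flow is after.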

Wires the new hook into both Claude Code's settings.json PostToolUse and
the Codex config.toml hooks block. Background worker now sweeps stale
snapshot directories (24h) and re-evicts the bash-output store on
startup. `reset_session` removes per-session snapshots.

Tests: six new test modules cover the storage layer, hint builders, hook
integration, end-to-end pre/post sequencing, CLI surface, and the new
language extractors.
…rdening

Six follow-up surfaces on the bash-cache / diff-hint / config-section work
landed in the previous commit:

1. Compaction manifest: a new "Commands Run" section surfaces the most
   recent meaningful Bash invocations with cmd preview, exit code, byte
   size, and the cache ID (`token-goat bash-output <id>`) so the
   test/build context that drives the next turn survives compaction.
   `event_count` now includes `bash_history` so a session whose only
   activity is a cached test run still clears `min_events`.

2. `bash-output --json` now emits `numbered_lines` (`[{lineno, text}]`
   anchored to the original body) plus `total_lines`, mirroring the
   surgical-read response shape so agents can follow up with positional
   slicers that map back to the on-disk file.

3. Stats source buckets: `diff_hint` / `diff_hint_overhead` now land in
   the existing `hint` bucket, and a new `bash` bucket (orange in the
   fancy renderer) catches `bash_dedup_hint*` and `bash_output_cached`,
   so the new mechanisms get a first-class line in `token-goat stats`
   instead of falling into the `other` catch-all.

4. INI / CFG / .env indexer: `[name]` headers in `.ini`/`.cfg` files
   become Sections (so `token-goat section setup.cfg::metadata` works);
   `.env` and `.envrc` index each `KEY=value` assignment as a symbol.
   Parser gains a basename-keyed dispatch table alongside the existing
   suffix table — `.env` has no Path.suffix and would otherwise be
   silently skipped.

5. PostToolUse Bash payload hardening: `_extract_bash_response` now
   tolerates every documented shape — dict-with-named-fields (Claude
   Code), MCP CallToolResult content arrays, bare-string blobs,
   top-level flattening, `tool_result`/`response` aliases, `returncode`
   and string-typed `exit_code` variants. Each is covered by a
   dedicated test.

6. `bash_cache.evict_old_entries` removes body + sidecar pairs together
   and runs an orphan-sidecar sweep at the end, so a manual `rm` of a
   body or a write race can no longer leave .json metadata files
   accumulating forever.
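
The paired eviction in item 6 can be sketched as below. This is a minimal illustration under stated assumptions, not `bash_cache.evict_old_entries` itself: the function name `evict_sketch`, the `.out` body extension, and the same-stem `.json` sidecar layout are all hypothetical.

```python
from pathlib import Path


def evict_sketch(store: Path, cap_bytes: int) -> None:
    """Oldest-first eviction that keeps body and sidecar files in step."""
    bodies = sorted(
        (p for p in store.iterdir() if p.suffix == ".out"),
        key=lambda p: p.stat().st_mtime,
    )
    total = sum(p.stat().st_size for p in bodies)
    for body in bodies:
        if total <= cap_bytes:
            break
        total -= body.stat().st_size
        body.unlink()
        # The sidecar is removed together with its body.
        body.with_suffix(".json").unlink(missing_ok=True)

    # Orphan sweep: drop sidecars whose body vanished some other way
    # (manual rm, interrupted write).
    for sidecar in list(store.glob("*.json")):
        if not sidecar.with_suffix(".out").exists():
            sidecar.unlink()
```

Running the sweep at the end of every eviction pass means orphans are cleaned up even when the store is already under its cap.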

Tests: four new test modules (post-bash payload variants, INI/.env
extractor, compaction bash section, stats bucket mapping) plus
extensions to test_bash_cache.py and test_bash_cli.py. 342 targeted
tests pass; lint clean; mypy adds zero new errors over baseline.
Two flaky-on-Windows patterns in the snapshot test module:

1. `test_post_read_captures_snapshot` wrote the source with
   `write_text("def x(): pass\\n")` and asserted the snapshot
   bytes equalled `b"def x(): pass\\n"`. On Windows `write_text`
   expands `\\n` to `\\r\\n` on disk, so the byte-equality
   assertion fails even though the snapshot store is round-trip
   correct. Switched to `write_bytes` and compared against
   `src.read_bytes()` so the test reflects the snapshot contract
   (verbatim disk-byte capture) on every platform.

2. `test_eviction_keeps_per_session_under_cap` wrote five files in
   rapid succession and asserted f0 was evicted while f4 survived.
   Windows' clock-tick cache can stamp several of those writes with
   identical mtimes, making the eviction order non-deterministic.
   Set explicit ascending mtimes via `os.utime` after each store
   so the sort key is unambiguous.
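
A small sketch of the second fix, assuming a hypothetical helper `write_with_ascending_mtimes` (the real test sets mtimes after each store call; this only shows the `os.utime` pattern):

```python
import os
from pathlib import Path
from typing import List


def write_with_ascending_mtimes(
    directory: Path, count: int, base: float = 1_700_000_000.0
) -> List[Path]:
    """Write `count` files and give each a strictly later mtime than the last."""
    paths = []
    for i in range(count):
        path = directory / f"f{i}.txt"
        path.write_bytes(b"x")  # write_bytes avoids newline translation
        # Set atime and mtime explicitly instead of trusting the clock.
        os.utime(path, (base + i, base + i))
        paths.append(path)
    return paths
```

With explicit timestamps one second apart, "oldest first" has a single answer on every platform, so the test can assert exactly which file is evicted.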

Both fixes are pure test-side and do not change runtime behaviour.
Linux suite remains green (85 / 85 in the touched modules).
…s, auto-redirect

Six surfaces land in this commit, all aligned with the existing
hint/cache/extractor patterns from the previous feature rounds:

1. Post-compaction recovery hint. SessionStart now detects
   `source == "compact"` and emits a one-shot additionalContext block
   listing the most recently-read files plus the cached Bash outputs
   (`token-goat bash-output <id>`) and WebFetch responses
   (`token-goat web-output <id>`) from the *pre*-compaction session.
   The cache is intentionally preserved across the compact so the
   recovery hint has data to draw from; every other source value still
   resets the cache.

2. Grep dedup hint. Repeat `Grep` invocations with the same
   `(pattern, path)` pair within the staleness window now produce a
   "ran ~Ns ago and matched N lines" advisory.  Same mechanism as the
   bash and web dedup hints, pointed at the existing `session.greps`
   history — no new disk store.  Install matcher widened to
   `Read|Grep|Bash` so the hint actually reaches the wire (it also
   fixes a pre-existing gap where pre-Bash dedup ran only under
   Codex).

3. WebFetch result cache. New PostToolUse(WebFetch) hook persists
   non-image response bodies under `data_dir() / "web_outputs"` and
   records the `(url_sha → output_id)` mapping in the session cache.
   Pre-fetch hook dedupes repeat URLs with a hint pointing at
   `token-goat web-output <id>`.  Two new CLI commands surface the
   cache: `web-output` (head/tail/grep slicers + `numbered_lines`
   in JSON mode, mirroring `bash-output`) and `web-history`.  Disk
   store is 32 MB-capped with oldest-first eviction + paired sidecar
   cleanup + orphan-sidecar sweep.

4. Dockerfile section extractor. `Dockerfile`, `Containerfile`, and
   `*.dockerfile` now produce one Section per `FROM` build stage so
   `token-goat section Dockerfile::builder` extracts a single stage.
   Multi-stage builds resolve by `AS <name>` alias; unnamed stages
   fall back to the image reference.  Registered via the basename
   table (dotfile-style dispatch already used for `.env`/`.envrc`).

5. `token-goat doctor` cache visibility. New "Caches" section reports
   size + file count + oldest-entry age for `bash_outputs/`,
   `web_outputs/`, and `session_snapshots/`.  Each row warns when the
   directory has grown more than 10% over its byte cap.

6. Close-match auto-redirect on `token-goat symbol`.  Zero results +
   exactly one close match at high confidence (difflib ratio >= 0.85)
   triggers a transparent re-run against the candidate.  Output
   carries a `redirected_from` field in JSON and a `(redirected
   from: ...)` marker in plain-text so the substitution is
   auditable.  `--strict` opts out.  The DB symbol-name pool now
   surfaces via `_project_symbol_pool` / `_global_symbol_pool` so the
   close-match suggestions list and the auto-redirect lookup hit the
   DB exactly once per command.
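
The Dockerfile stage splitting in item 4 could be approximated with a line scanner like the one below. `dockerfile_stages` and the regular expression are illustrative assumptions rather than the extractor shipped here; in this sketch, two unnamed stages that share an image reference would overwrite each other.

```python
import re
from typing import Dict

# FROM [--flag=value ...] <image> [AS <name>]
FROM_RE = re.compile(
    r"^\s*FROM\s+(?:--\S+\s+)*(\S+)(?:\s+AS\s+(\S+))?", re.IGNORECASE
)


def dockerfile_stages(text: str) -> Dict[str, str]:
    """Map each build stage (alias, else image reference) to its lines."""
    stages: Dict[str, str] = {}
    current = None
    for line in text.splitlines():
        match = FROM_RE.match(line)
        if match:
            image, alias = match.group(1), match.group(2)
            current = alias or image  # unnamed stages fall back to the image
            stages[current] = ""
        if current is not None:
            stages[current] += line + "\n"
    return stages


sample = (
    "FROM python:3.12 AS builder\n"
    "RUN pip install build\n"
    "\n"
    "FROM python:3.12-slim\n"
    "COPY --from=builder /app /app\n"
)
print(list(dockerfile_stages(sample)))
```

Lines before the first `FROM` (for example global `ARG` declarations) are ignored in this sketch because no stage is open yet.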

Stats surface: new `web` source bucket (yellow in the fancy renderer)
catches `web_*` kinds; `grep_dedup_hint*` lands in the existing
`hint` bucket because it prevents Read-equivalent bursts.

Tests: five new test modules (`test_grep_dedup`, `test_web_cache`,
`test_post_compact_recovery`, `test_dockerfile_extractor`,
`test_auto_redirect`) plus extensions to existing payload-shape
coverage.  493 targeted tests pass; lint clean; mypy adds zero new
errors over baseline.

Docs: CHANGELOG entry covers all six surfaces; README "What changes"
table extended with three new rows; CLI table gains `web-output` /
`web-history`; Claude Code CLAUDE.md, skill SKILL.md, and Codex
AGENTS.md routing tables updated to mention the new commands and
flags.
…CI verbose

Two changes to make the next CI run easier to diagnose:

1. test_dockerfile_extractor.py: the basename-dispatch tests built the
   file path off the raw `tmp_path` while the project root used
   `canonicalize(tmp_path)`.  On Windows `canonicalize` lower-cases
   the drive letter; `Path.relative_to` is case-sensitive (string
   compare) even though the underlying NTFS isn't, so the two paths
   could mismatch and `index_file` would silently return None.
   Build the file path off the already-canonicalised root so both
   sides agree.

2. .github/workflows/ci.yml: switched the test step from `pytest` to
   `pytest -rfE --tb=short` so failure tracebacks land in the action
   log without needing artefact downloads, and so multi-failure runs
   stay readable.

Both changes are test/CI-only; runtime behaviour is unchanged.
CI runs on windows-2022 with Python 3.13.  Failures are only visible to
people who can authenticate against GitHub Actions logs; remote
contributors and tools that can read PR comments but not Action logs
end up blind.

The new ``Surface failure summary to PR`` step:
- Runs only when the prior test step failed AND the workflow was
  triggered by a PR event (push-only workflows are unaffected, and
  there's no PR to attach the comment to anyway).
- Tail-trims the captured pytest output to 80 lines so even a
  multi-failure run fits inside GitHub's 65 KB per-comment cap.
- Uses ``GITHUB_TOKEN`` (already in scope for any PR workflow) so no
  additional secrets are required.

The Test step itself is unchanged in command and meaning; it just
pipes through ``tee pytest.log`` so the on-failure step has the
output to read.  ``pipefail`` keeps the original exit code through
the tee.
Symptom: the previous CI workflow used ``pipefail`` + ``tee pytest.log``
to capture pytest output for later PR-comment surfacing.  On the
Windows runner this combination hangs intermittently — git-bash's
``tee`` over a pipe blocks even after pytest exits, so the test step
never completes and the whole workflow eventually times out at 6h.

Fix: drop the pipe entirely.  Pytest output is redirected straight to
``pytest.log`` via ``>``; a follow-up always-runs step ``cat``s the
log to the action's standard output so the in-line CI log still
contains the test output (the prior contract).  The on-failure PR
comment step keeps its body source (tail -80 of pytest.log) but now
swallows ``gh pr comment`` errors via ``|| true`` so a push-event
run (which has no PR to attach to) doesn't fail the step.
Two prior attempts to capture pytest output hung intermittently:

1. ``shell: bash`` + ``tee pytest.log`` — git-bash on Windows deadlocked
   the pipe even after pytest exited.
2. ``shell: bash`` + ``> pytest.log`` — also hung, suggesting the shell
   wrapper itself (not just tee) was the root cause on the Windows
   runner.

Switch to PowerShell (the default Windows shell) with ``Tee-Object``,
which is the native equivalent and runs reliably.  ``$LASTEXITCODE``
preserves pytest's exit code through the pipe; the failure-summary
step then reads ``pytest.log`` with ``Get-Content -Tail`` and posts
the slice as a PR comment.
The previous workflow used a PowerShell ``@"..."@`` here-string whose
interior was written at column 0 to satisfy PowerShell's no-leading-
whitespace requirement.  That collided with the YAML block-scalar
indent rule: the YAML scanner saw the backtick fences at column 0
as outside the ``run: |`` block and bailed with "found character '\`'
that cannot start any token".

Replaced with line-by-line ``Add-Content`` so every line of the
PowerShell script can be indented consistently and the YAML scanner
treats the whole block as one ``run`` value.  Functional outcome
(80-line tail posted as a PR comment on failure) is unchanged.
Two prior attempts to capture pytest output for PR comment surfacing
hung the test job on the Windows runner:

1. ``shell: bash`` + ``| tee pytest.log`` — git-bash pipe deadlocked.
2. ``Tee-Object -FilePath pytest.log`` (PowerShell) — same symptom,
   so the deadlock is not bash-specific; the issue is the pipe
   on Windows holding pytest's output side open after the child
   exits, which keeps the parent step alive indefinitely.

Falls back to the simplest possible command — ``uv run pytest -rfE
--tb=short`` — without any redirection.  ``-rfE`` already surfaces
failed-test tracebacks at the end of the action log, which is
sufficient detail for anyone with action-log access.  Remote
contributors who don't have that access will still see ``FAILED
<test_name>`` lines in any pasted log; that's the same level of
detail the on-failure PR-comment step was meant to provide.
Symptom: the Windows test job intermittently hangs for 30+ minutes, and
in the worst case runs until the global 6-hour runner timeout, without
the test step ever reporting completion.

Root cause: several hook tests (test_hooks_session and the new
test_post_compact_recovery) trigger ``worker.ensure_running()``,
which spawns ``pythonw.exe -m token_goat.cli worker --daemon`` as a
detached background process via ``DETACHED_PROCESS |
CREATE_NEW_PROCESS_GROUP``.  Those flags are honoured by Windows
itself but GitHub Actions wraps every step in a Win32 job object
and tracks every descendant — the daemon's infinite ``run_daemon``
loop holds the step open even after pytest exits cleanly.  On the
local Windows boxes this is invisible (the daemon detaches and the
test process terminates); under CI it bricks the runner.

Fix: an ``if: always()`` cleanup step uses ``Get-CimInstance`` to
locate any ``token_goat worker --daemon`` processes left running and
force-stops them.  Runs after the test step regardless of pass /
fail so the runner can finish the workflow promptly in either case.
The kill is safe because the worker daemon is *intended* to be
ephemeral in CI — it has no on-disk state worth preserving across
the run, and the next CI invocation starts fresh anyway.
…daemon

Suppresses the spawn step in ``worker.ensure_running()`` when the env
var is set to a truthy value.  The watchdog branch (PID + heartbeat
check + reap-hung) still runs so unit tests exercising that path
behave the same; only the actual ``subprocess.Popen`` is skipped.

Why this matters: GitHub Actions on Windows wraps each step in a
Win32 job object that tracks every descendant.  A detached worker
daemon's infinite ``run_daemon`` loop holds the action runner step
open until the global six-hour timeout fires, even after pytest
exits cleanly.  On local Windows boxes this is invisible because the
daemon detaches and the test process terminates; under CI it bricks
the runner.

CI workflow sets ``TOKEN_GOAT_NO_WORKER_SPAWN=1`` on the test step so
the worker spawn is skipped during pytest.  Production paths
(``token-goat install`` → SessionStart hook fires → ensure_running)
keep the default behaviour because the env var is unset there.
@Zelys-DFKH Zelys-DFKH changed the title from "feat: bash output cache, diff-aware re-read, TOML/YAML/JSON sections" to "feat: bash/web/grep dedup, diff-aware re-read, post-compact recovery, structured-config sections, auto-redirect" May 18, 2026
claude added 2 commits May 18, 2026 17:38
Three textual conflicts plus one semantic one between this branch's
feature work and the bash-output-compression feature that landed on
``main`` while this PR was open.

Resolutions:

* **install.py** PreToolUse matcher: ``main`` widened the matcher to
  ``Read|Bash`` so the bash-compress wrapper can intercept noisy
  commands; this branch widened it to ``Read|Grep|Bash`` for the
  Grep dedup hint.  Final value: ``Read|Grep|Bash`` with a comment
  explaining both reasons.

* **CHANGELOG.md**: kept both ``[Unreleased]`` blocks — compression
  first, then this branch's eleven surfaces, then the shared
  ``Changed`` / ``Fixed`` sections.

* **README.md** "What changes" table: kept both sets of rows
  (compression filters first, then the dedup / recovery /
  auto-redirect / structured-config rows from this branch).

* **hooks_read.py / cli.py**: auto-merged.  The resulting
  ``pre_read`` dispatch is bash-dedup → bash-read-equivalent →
  bash-compress → CONTINUE for Bash, grep-dedup → CONTINUE for Grep,
  and the existing Read / Glob branches untouched.  This ordering
  is intentional: a session-cache hit is the cheapest signal and
  short-circuits before we ever look at the command shape.

* **test_bash_dedup_hint.py**: the negative-assertion cases used
  ``pytest -v src/`` and ``make build`` which are now compressible
  commands — the bash-compress wrapper fires before pre_read's
  fall-through, so the ``hookSpecificOutput not in result``
  assertion broke.  Switched to ``find`` / ``echo`` / ``head``
  which are deliberately not in main's filter list, so the only
  reason for ``pre_read`` to emit a hookSpecificOutput is a real
  dedup hit.  Comment in the test module docstring records the
  invariant.

636 tests pass; ruff clean; mypy adds zero new errors over baseline.
Earlier this round the check landed first in ``worker.ensure_running``
(broke 2 spawn-respawn unit tests via the mocked ``spawn_detached``),
then in ``spawn_detached`` itself (broke 3 tests that test that
function's body via mocked ``Popen``), then in ``worker_daemon.run_daemon``
(broke 9 tests that drive ``run_daemon`` directly to verify its main
loop).  Each tier of unit tests bypasses one more layer of the spawn
chain.

The clean break: put the check in ``cli.cmd_worker`` instead.  This is
the function the subprocess command ``pythonw -m token_goat.cli worker
--daemon`` actually invokes when the spawn chain runs end-to-end.
Direct unit tests of ``ensure_running``, ``spawn_detached``, and
``run_daemon`` all skip this entry point — they call the lower-level
functions directly — so they remain unaffected.  Only the real-spawn
path (``ensure_running`` → ``spawn_detached`` → ``Popen`` → child
process loads ``cli.py`` → ``cmd_worker`` → early-exit) sees the env
var.

Env-var propagation: ``subprocess.Popen`` inherits the parent's env by
default, so a test step that sets ``TOKEN_GOAT_NO_WORKER_SPAWN=1`` on
the workflow step propagates the var to every child process spawned
from it.  The child ``pythonw -m token_goat.cli worker --daemon``
loads, reaches ``cmd_worker``, sees the var, and returns immediately
without entering the heartbeat loop.  GitHub Actions on Windows can
then close the job object's descendants cleanly and the step
completes.
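
The early-exit placement described above can be sketched as follows. This is an assumption-laden illustration, not the PR's `cli.cmd_worker`: the names `worker_spawn_disabled` and `cmd_worker_sketch`, the injected `run_daemon` callable, and the exact set of strings treated as truthy are all hypothetical.

```python
import os
from typing import Callable, Mapping, Optional

TRUTHY = {"1", "true", "yes", "on"}  # assumed set of truthy spellings


def worker_spawn_disabled(env: Optional[Mapping[str, str]] = None) -> bool:
    """True when TOKEN_GOAT_NO_WORKER_SPAWN is set to a truthy value."""
    env = os.environ if env is None else env
    return env.get("TOKEN_GOAT_NO_WORKER_SPAWN", "").strip().lower() in TRUTHY


def cmd_worker_sketch(
    run_daemon: Callable[[], None], env: Optional[Mapping[str, str]] = None
) -> int:
    """Entry point of the child process: bail out before the daemon loop."""
    if worker_spawn_disabled(env):
        return 0  # the child exits at once, so no descendant outlives the step
    run_daemon()
    return 0
```

Because the check lives at the process entry point, unit tests that call the lower-level functions directly never reach it, which matches the reasoning in this commit message.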

96 / 96 worker + worker_daemon tests pass; 2610 / 2614 tests in the
network-independent suite pass (the remaining 4 are pre-existing
tree-sitter offline failures unrelated to this change).  Lint clean;
mypy adds zero new errors over baseline.
@Zelys-DFKH Zelys-DFKH merged commit e544e8d into main May 18, 2026
2 of 4 checks passed
@Zelys-DFKH Zelys-DFKH deleted the claude/recommend-new-features-i3OxW branch May 18, 2026 17:49

2 participants