Skip to content

bash output compression: 12 per-tool filters, failures-first, ~80-97% smaller#3

Merged
Zelys-DFKH merged 2 commits into
mainfrom
claude/compress-command-output-p30fI
May 18, 2026
Merged

bash output compression: 12 per-tool filters, failures-first, ~80-97% smaller#3
Zelys-DFKH merged 2 commits into
mainfrom
claude/compress-command-output-p30fI

Conversation

@Zelys-DFKH
Copy link
Copy Markdown
Collaborator

PreToolUse hook on Bash detects compressible commands and rewrites them to
flow through token-goat compress, which runs the original through the
system shell, captures stdout + stderr, applies the per-tool filter, and
emits a failures-first compressed view that strips progress bars, dedupes
warnings, groups linter issues by rule, and keeps every error block
verbatim. Built-in filters cover pytest, jest/vitest, cargo, npm/pnpm/yarn,
docker/buildah/podman, kubectl/helm, aws, ruff/eslint/mypy/pyright/tsc,
git, make/ninja/gradle/mvn/go, terraform, and pip. Per-stream capture is
capped at 32 MiB, total output at 1000 lines / 64 KiB; the wrapper
preserves the original exit code and kills the entire process group on
timeout. Configurable via [bash_compress] in config.toml or disabled
entirely with TOKEN_GOAT_BASH_COMPRESS=0.

Also fixes three pre-existing bugs:

  • paths.open_log_file claimed to return FileHandler but returned a bare
    StreamHandler on POSIX, breaking the worker isinstance check.
  • Two test_canonicalize_* tests asserted Windows-shell normalisation
    invariants that only hold on Windows; now skipped on POSIX.

claude added 2 commits May 18, 2026 06:43
… smaller

PreToolUse hook on Bash detects compressible commands and rewrites them to
flow through `token-goat compress`, which runs the original through the
system shell, captures stdout + stderr, applies the per-tool filter, and
emits a failures-first compressed view that strips progress bars, dedupes
warnings, groups linter issues by rule, and keeps every error block
verbatim. Built-in filters cover pytest, jest/vitest, cargo, npm/pnpm/yarn,
docker/buildah/podman, kubectl/helm, aws, ruff/eslint/mypy/pyright/tsc,
git, make/ninja/gradle/mvn/go, terraform, and pip. Per-stream capture is
capped at 32 MiB, total output at 1000 lines / 64 KiB; the wrapper
preserves the original exit code and kills the entire process group on
timeout. Configurable via `[bash_compress]` in config.toml or disabled
entirely with `TOKEN_GOAT_BASH_COMPRESS=0`.

Also fixes three pre-existing bugs:
- paths.open_log_file claimed to return FileHandler but returned a bare
  StreamHandler on POSIX, breaking the worker isinstance check.
- Two test_canonicalize_* tests asserted Windows-shell normalisation
  invariants that only hold on Windows; now skipped on POSIX.
…ashes

Windows CI was failing on os.killpg / os.getpgid / signal.SIGKILL attribute
checks because mypy treats them as POSIX-only. Extracted the kill body to
_posix_kill_tree which looks the attributes up with getattr so the symbol
lookups don't fire on Windows.

Also went through new files and replaced inline em-dashes with regular
punctuation, and tightened a few of the more marketing-flavored phrases
in the changelog and docstrings.
@Zelys-DFKH Zelys-DFKH merged commit 1e96258 into main May 18, 2026
4 checks passed
@Zelys-DFKH Zelys-DFKH deleted the claude/compress-command-output-p30fI branch May 18, 2026 16:18
Zelys-DFKH added a commit that referenced this pull request May 25, 2026
…xecutor

- #2: _is_git_repo cheap stat check (accepts .git dir OR worktree
  .git file) before each call to _get_uncommitted_changes /
  _get_git_diff_stat_summary. Process-local cache. Saves ~60-100 ms
  per non-git-cwd manifest build; zero cost on git repos.
- #3: ThreadPoolExecutor removed — two git ops now sequential. Saves
  ~3-8 ms of executor overhead per manifest. Existing per-call TTL
  caches still warm on second invocation.

New TestIsGitRepo (4) + TestNonGitShortCircuit (4) tests; existing
git-stat tests patched with _is_git_repo mocks.
Zelys-DFKH added a commit that referenced this pull request May 25, 2026
…ts/cold sections

- Item #3: _format_grep_entry emits bare (N) instead of (N results);
  ~1 token saved per grep entry, semantics unchanged.
- Item #4: _format_glob_entry uses the ASCII g: marker instead of the
  multibyte 📂 emoji and drops the scope path when it equals the session cwd.
- Item #5: _get_session_commits lines no longer prepend - ; the section
  is already inside a bulleted block, so the prefix was pure overhead.
- Item #10: small (<1KB) untruncated bash outputs render (e=N) without the
  byte count; truncated/large outputs keep the full metadata.
- Item #11: the Cold Outputs header collapses from a full H3 + parenthesised
  recall hint into the bold-label **Cold:** evict, recall via … form.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants