fix: pentest round-4 — 3 walker coverage gaps (sec 66-68)#26
Merged
Conversation
gawk's `-i inplace` (and `--include=inplace`) loads the inplace
extension library and rewrites every FILE positional in place via
temp-file + rename — same write semantics as `sed -i` / `perl -i`.
The existing inplace walker covered sed/perl/ruby but not awk, so
`awk -i inplace '{print}' /etc/foo` slipped through and silently
truncated/rewrote /etc/foo.
Round-4 pentest discovered the gap. Plain `awk PROG file` is
read-only and stays ALLOWED; the new walker engages only when the
inplace library is actually loaded:
awk -i inplace 'PROG' file (separate-arg form)
awk -iinplace 'PROG' file (attached short form)
awk --include=inplace 'PROG' file (long form)
gawk -i inplace 'PROG' file (canonical binary)
awk -i inplace -f script.awk file (-f → no positional PROG)
Mirrors the perl/ruby walker: skips first positional as PROG when
no `-f` / `-E` / `--source=` is present, consumes flag/value pairs
(-F / -v / -f / -i / -l / -E / --field-separator / --assign /
--file / --include / --load / --exec / --source), then validates
each remaining FILE via is_write_permitted.
True-negatives that must stay ALLOWED:
- `awk '{print}' /etc/passwd` (read-only)
- `awk -i ordchr '{print}' /etc/passwd` (different library)
- `awk -i inplace '{print}' $PROJECT/file.txt` (inside project)
Tests: 6 reproducers + 3 true-negatives in
tests/test_bypass_reproducers_pentest_b.sh sec 66.
1167/1167 pass.
7-Zip (`7z`, `7za`, `7zr`, `7zz`) lacked the boundary check that
`tar -C` and `unzip -d` already enforce. Two write paths slipped
through:
(a) `7z x -o<dir>` and `7z e -o<dir>` extract into <dir>;
<dir> is ATTACHED to the flag (no space — 7z spec). The
existing extract_option_values helper only matches separated
`-o VAL` / `--output=VAL`, so the attached form bypassed
every walker entirely.
(b) `7z a <archive>`, `7z u <archive>`, `7z d <archive>`,
`7z rn <archive>` create / modify <archive>; the path was
never validated.
Round-4 pentest discovered the gap. Plain `7z l` / `7z t` are
read-only and stay ALLOWED. Extraction without `-o<dir>` writes
into cwd; cwd is already bounded by the project-boundary check
on EFFECTIVE_CWD, so no extra walker is needed there.
Fix: in detectors/write_targets.sh, after the unzip/cpio walkers,
add a 7z walker that:
- Pass 1: scan CMD_TOKENS_SCAN for `-o<dir>` attached form
(`-o?*`, i.e. token starting with `-o` and longer than 2
chars), validate <dir> via is_write_permitted.
- Pass 2: locate the verb (first non-flag positional after the
binary). For write-mode verbs (a / u / d / rn), validate the
next non-flag positional as the ARCHIVE write target.
True-negatives that must stay ALLOWED:
- `7z l foo.7z` / `7z t foo.7z` (read-only)
- `7z x -o$PROJECT/extracted foo.7z` (attached, in-project)
- `7z a $PROJECT/foo.7z file1` (in-project archive)
Tests: 8 reproducers + 4 true-negatives in
tests/test_bypass_reproducers_pentest_b.sh sec 67.
1179/1179 pass.
`trap CMD SIG` registers CMD as a shell handler that fires on the named signal (EXIT, INT, TERM, ...). CMD is arbitrary shell code stored on argv and executed later — same un-inspectable semantics as `bash -c CMD` and `eval`. The existing block_nested_shell_and_eval walker only matched the explicit `(bash|sh|...) -c` form, so `trap "rm /etc/foo" EXIT` slipped through and silently armed a deferred destructive op. Cannot inline-inspect CMD safely: it can chain (`echo a; rm b`), expand (`$(...)`, `$VAR`), or call out to a script. Same fail-closed rule as `bash -c` and `eval`. Read-only forms that must stay ALLOWED: trap (list current traps) trap -l (list signal names) trap -p (print traps in re-input form) trap - SIG (reset to default — positional is `-`) trap '' SIG (ignore — positional is empty after strip_quotes) Round-4 pentest discovered the gap. Fix: extend block_nested_shell_and_eval. command_name_is "trap" anchors to the verb position so `git commit -m "fix trap handling"` doesn't false-positive on the substring. Walker consumes `--`, `-l`, `-p` (the latter two flip read-only mode), then inspects the first positional. Block iff it's non-empty AND not the literal `-`. Tests: 6 reproducers + 5 true-negatives in tests/test_bypass_reproducers_pentest_b.sh sec 68. 1190/1190 pass.
Codex round-4 review of PR #26 found three argument-form gaps in the sec-66 awk inplace walker: (a) `gawk --inplace` / `gawk --inplace=...` — long form not recognised by the inplace marker case. (b) `awk --include inplace` (split, no `=`) — pre-scan only checked the attached `--include=inplace` form, so the space-separated form bypassed the trigger. (c) `mawk -i inplace` / `nawk -i inplace` — trigger regex anchored on `(awk|gawk)` only, while the adjacent awk inline-code detector (block_interpreter_inline_code) uses `g?awk|mawk|nawk`. Misalignment is a fail-open seam. Mainline gawk doesn't ship `--inplace` (only `-i inplace`), and mainline mawk/nawk don't ship the inplace extension at all. Both inclusions are fail-closed: if a fork or future build accepts those forms, the boundary check still holds. Fix: - Trigger regex: `(awk|gawk)` → `(awk|gawk|mawk|nawk)`. - Pre-scan `--include` next-token like `-i` next-token (set `awk_inplace=1` when next equals literal `inplace`). - Add `--inplace` / `--inplace=*` to the inplace marker case. True-negatives that must stay ALLOWED: - `nawk '{print}' /etc/passwd` (read-only) - `awk --include ordchr '{print}' /etc/passwd` (different lib) - `gawk --inplace '{print}' $PROJECT/file.txt` (in-project) Tests: 5 reproducers + 3 true-negatives in new tests/test_bypass_reproducers_pentest_c.sh sec 69. test_guard.sh sources the new file. 1198/1198 pass.
Codex round-4 review of PR #26 found that sec 67 only validated the attached form `-o<dir>` (token matching `-o?*`, len > 2). The binary in fact also accepts the space-separated form `-o <dir>` on most builds, so `7z x archive.7z -o /tmp/dir` slipped past every walker. A fail-closed boundary check should not depend on documented vs. accepted CLI surface — extend pass 1 to also handle the split form: when a token equals exactly `-o`, look at the next token and validate it as the extraction destination via the same `expand_path` / `is_write_permitted` pipeline. Tests: 3 reproducers + 2 true-negatives in tests/test_bypass_reproducers_pentest_c.sh sec 70. 1203/1203 pass.
Codex round-4 review of PR #26 found that `-w[<path>]` selects 7z's temp/work directory and 7z creates / writes intermediate files there during archive operations. An outside-project value is a real boundary violation, analogous to `-o<dir>` — sec 67 didn't validate it at all. Both attached (`-w/tmp/work`) and split (`-w /tmp/work`) forms are covered. Bare `-w` with no value tells 7z to use the system default — read-only with respect to the project boundary, left ALLOWED. Fix: add Pass-1b to the 7z detector after the `-o<dir>` pass. Same shape as Pass-1: when token equals exactly `-w`, look at the next token and validate via `is_write_permitted`; otherwise strip the `-w` prefix from `-w?*` tokens. Tests: 3 reproducers + 2 true-negatives in tests/test_bypass_reproducers_pentest_c.sh sec 71. 1208/1208 pass.
Codex round-4 review of PR #26 found two thin gaps in the 7z detector that surfaced together: (a) `7zzs` is the 7-Zip 24+ static-link binary; sec 67's trigger regex was `(7z|7za|7zr|7zz)` and missed it. Every pass (extract dest, working-dir, archive verb) was skipped entirely on `7zzs`. (b) The archive-target pass treated every `-*` token as a flag (skip). After a POSIX `--`, the next token — even if it starts with `-` — is a positional operand. Real-world exploitability is narrow (absolute paths can't start with `-`), but fail-closed alignment with the sed -i / tar walkers (which already track `seen_dashdash`) is the right call. Fix: - Trigger regex: `(7z|7za|7zr|7zz)` → `(7z|7za|7zr|7zz|7zzs)`. - Archive-target pass: track `zai_seen_dashdash`; once `--` is encountered, advance past it and stop the `-*` skip — validate the very next token as ARCHIVE. Tests: 4 reproducers + 3 true-negatives in tests/test_bypass_reproducers_pentest_c.sh sec 72. 1215/1215 pass.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Round-4 pentest discovered 3 walker coverage gaps in the project-boundary guard. Each is fixed with a dedicated commit (TDD flow: failing reproducer first, patch, regression green).
awk -i inplace 'PROG' /etc/foo— gawk in-place edit (rewrites FILE positionals via temp+rename, same assed -i)inplace.sh7z x -o/etc foo.7z(attached-o<dir>) +7z a /etc/foo.7z file(write-mode verbsa/u/d/rn)write_targets.shtrap "rm /etc/foo" EXIT— deferred shell handler (un-inspectable CMD on argv, same asbash -c)shell_exec_walkers.shTest plan
All files stay under 500 lines (largest: write_targets.sh 415).
Generated with Claude Code.