feat(security): SEC009 language-aware dispatch (PR #12)
Merged
Round 9 validation surfaced that SEC009's `last == "md5" | "sha1"` matcher only fired on Python `hashlib.md5(...)` and never on Go, Java, C#, PHP, or Node: exactly the same multi-language coverage gap PR #10 fixed for SEC010.

Per-language matchers, mirroring PR #10's architecture:

- **Python**: `hashlib.md5` / `hashlib.sha1`. Receiver-anchored, excludes `Crypto.Hash.MD5.new` from PyCryptodome (out of scope for now).
- **Node / JS**: `crypto.createHash('md5'|'sha1'|'sha-1')`. The algorithm lives in the first string arg; new `first_arg_is_weak_alg_string()` helper inspects the literal.
- **Go**: `md5.Sum` / `md5.New` / `sha1.Sum` / `sha1.New` with Layer 1 import resolution against `crypto/md5` and `crypto/sha1`. Round 9's `h := md5.Sum([]byte(password))` case.
- **Java / Kotlin**: `MessageDigest.getInstance("MD5")` (string arg), Apache Commons `DigestUtils.md5Hex` / `sha1Hex`. Round 9's `MessageDigest.getInstance("MD5")` case.
- **C#**: `MD5.Create()` / `SHA1.Create()` / `MD5CryptoServiceProvider` / `MD5Managed` and SHA1 equivalents. Receiver-anchored to avoid matching `SHA1024` / `SomeMD5Field`.
- **PHP**: global `md5(...)` / `sha1(...)`, plus `hash('md5', ...)` / `hash('sha1', ...)` (string arg). `hash('sha256', ...)` stays silent.

Also fixes `enclosing_security_context` with the same improvements PR #11 applied to `enclosing_token_context`:

- Multi-language assignment node kinds (Go `short_var_declaration`, Java `local_variable_declaration`, etc.).
- Function-name needle check at function-shape level. Round 9 case: `func HashPassword(password string)` calls `md5.Sum` via local `h :=`; the function name carries the `password` needle even though the local assignment doesn't.
- Walks past inner blocks (don't break at for / if body, only at function shape).

Tests: 8 new.

- Node createHash md5 (positive) / sha256 (negative)
- Go md5.Sum + crypto/md5 import
- Java MessageDigest.getInstance MD5 (positive) / SHA-256 (negative)
- PHP md5 (positive) / hash sha256 (negative)
- C# MD5.Create

164 / 164 tests pass.

Live MCP confirmation: Round 9's starting-go/auth.go and starting-java/Auth.java now both fire all three planted findings (SEC009, SEC010, SEC012), up from 2 and 1 respectively.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
wei9072 added a commit that referenced this pull request on May 7, 2026
Adds the empirical validation work that drove PRs #6 to #12 into the repo as a reproducibility archive under `experiments/`.

Contents:

- `comparison-report.md` (1199 lines, rolling Round 1 → Round 9 analysis). Documents the three Aegis ROI mechanisms surfaced from data:
  1. Rule-hit → fix (brownfield Plan B: 0/3 → 3/3 across 3 models)
  2. Structural guardrail (cycle / public_symbol_removed: 0/14 hits, dead weight on clean architectures)
  3. Anti-paralysis ritual (weak models complete tasks they would otherwise abandon)
- 4 starting-code fixtures (Python brownfield, Go brownfield, Java brownfield, Python multi-module).
- 11 prompt files. Each task has paired `-a.txt` (no Aegis) and `-b.txt` (with Aegis MCP + REQUIRED-workflow ritual instruction).
- 52 round directories: per-model deliverables + `run.log` for codex-driven rounds. Naming: `<model>-<task>-<a|b>`.
- `aegis_validate.py`: Python wrapper around aegis-mcp stdio JSON-RPC, used by agents to run validation after each write.
- 3 eval scripts that diff each round's deliverables against the planted SEC bugs.

Excluded via `.gitignore` and rsync filter (would have been ~970MB of bloat): venv / .venv / __pycache__ / .pytest_cache / .toolchain (Go toolchain copies codex downloads) / compiled binaries / git metadata of nested repos. Final archive: 17MB.

Direct lineage from this archive into Aegis code:

| Round | Surfaced | Fixed in |
|---|---|---|
| Round 8 codex | SEC010 FP on `secrets.choice` | PR #9 |
| Round 9 Go / Java | SEC009 multi-language coverage = 0 | PR #12 |
| Round 9 Java | SEC010 inner-block `break` hid production case | PR #11 |

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
wei9072 added a commit that referenced this pull request on May 7, 2026
#14)

- Main README.md and README.zh-TW.md: new "Experiments archive" / 「實驗資料」 section linking to experiments/README.md
- experiments/README.md: replaces the prose-only intro with four analysis charts:
  1. Round structure overview (52 dirs, 26 paired, 11 models)
  2. Plan B brownfield 0/3 → 3/3 fix-rate matrix with per-model remaining-bug counts (real numbers from re-validating the archive against current rule library)
  3. Plan C multi-module task-completion matrix surfacing the anti-paralysis ROI mechanism (g54mini-mc-a abandoned vs g54mini-mc-b completed same task)
  4. Mermaid flowchart of the three Aegis ROI mechanisms (rule-hit → fix; structural guardrail; anti-paralysis ritual)
- Plus a "Direct lineage" chart connecting each experiment finding to the PR that fixed it (Round 8 → PR #9; Round 9 → PR #11/#12).

Also re-imports `experiments/31flashlite-amb-b/` as plain files. The previous commit accidentally captured it as a gitlink because codex had created a nested `.git/` inside the round dir during the agent run. Removed the nested `.git/` and re-added the 6 actual deliverable files.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
wei9072 added a commit that referenced this pull request on May 7, 2026
Closes the audit gaps identified after PRs #10 / #12. The audit across SEC003-008:

| Rule | Coverage before | Action this PR |
|---|---|---|
| SEC003 TLS off | text-level Python/Node/Go/.NET | unchanged (decent already) |
| SEC004 shell | Python `shell=True`+interp only | **language-aware dispatch** |
| SEC005 SQL concat | text+string-literal Python/Java | unchanged (decent) |
| SEC006 CORS | text-level cross-language | unchanged |
| SEC007 JWT | `name.contains("jwt")` Python only | **language-aware dispatch** |
| SEC008 deser | Python/Node/Java idioms | unchanged (decent) |

## SEC004 expansion

Per-language shell-running idioms; requires interpolation in arg.

- **Python**: subprocess.run/Popen with `shell=True` + interp
- **Node.js**: `child_process.exec` / `execSync` with interp (always shells out, no `shell:true` gate; `execFile` is the safe one)
- **PHP**: global `shell_exec` / `exec` / `passthru` / `system` / `proc_open` with interp
- **Java**: `Runtime.getRuntime().exec(String)` overload with concat; the String[] overload is safe and excluded
- **Go**: `exec.Command("sh"|"bash"|"/bin/sh"|"/bin/bash", "-c", ...)` with interp. Bare `exec.Command("ls", arg)` (argv-style) is excluded: no shell metachar interpretation

`text_has_interp` extended with PHP `.` concat (gated on `$` to avoid floating-point literals).

## SEC007 expansion

Per-language JWT decode without verification:

- **Python**: `jwt.decode(...)` without algorithms/key/verify kwarg (existing behaviour)
- **Node.js**: `jsonwebtoken.decode()` always returns unverified claims, so it is flagged unconditionally; `verify(token, secret, opts)` is the safe API. `verify()` with `verify: false` opt also flagged.
- **Java / Kotlin**: Auth0 lib's `JWT.decode(token)` returns unverified DecodedJWT; safe path is `JWT.require(...).build().verify(token)`.
- **PHP**: firebase/php-jwt's `JWT::decode($token, $key)` requires explicit algorithm list. Flagged unless one of `'HS256'` / `'RS256'` / `'ES256'` / `'EdDSA'` appears in call text.
Algorithm-`none` detection extended with JWT-spec literal `"alg": "none"` shape.

`check_jwt_unsafe` now takes `&ParsedFile` so language identity is available. This prevents PHP `JWT::decode` from being misclassified as Java (the old `name.contains("JWT")` check was language-blind).

## Infrastructure changes

1. **`call_name` extended for PHP scoped/member calls.** Previously only handled Java's `method_invocation`; now also composes `Class::method` from `scoped_call_expression` and `$obj->method` from `member_call_expression`.
2. **`leaf_method_name(name)` helper**: splits on `.` / `::` / `->` so `JWT::decode`'s leaf is `decode`, not the whole string.
3. **walk dispatch** extended with `scoped_call_expression` and `member_call_expression` node kinds.

## Tests

10 new (5 SEC004 multi-lang + 5 SEC007 multi-lang). 174 → **177** total tests passing.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
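For readers who want to see what the `leaf_method_name(name)` helper described above amounts to, here is a minimal sketch. It is not the code from this commit: the body, the `SEPARATORS` constant, and the choice to operate on a plain `&str` are assumptions made for illustration. Only the function name and the three separators come from the commit message.

```rust
/// Sketch only: returns the last segment of a composed call name.
/// The separator set (`.`, `::`, `->`) follows the commit message;
/// the implementation itself is an assumption, not the project's code.
fn leaf_method_name(name: &str) -> &str {
    // Hypothetical constant, introduced for this example.
    const SEPARATORS: [&str; 3] = ["::", "->", "."];
    let mut start = 0;
    for sep in SEPARATORS.iter() {
        // Keep the position just past the right-most separator of any kind.
        if let Some(pos) = name.rfind(*sep) {
            start = start.max(pos + sep.len());
        }
    }
    &name[start..]
}
```

With this shape, `JWT::decode` and `$obj->decode` both reduce to `decode`, and a bare name such as `md5` is returned unchanged.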
Summary
PR #10 fixed multi-language coverage for SEC010 (weak RNG); Round 9
validation revealed SEC009 (weak hash) had the same coverage
gap. The old matcher (`last == "md5" | "sha1" | ...`) only
fired on Python `hashlib.md5(...)`. Real-world Go and Java auth
code went uncaught:
```go
import "crypto/md5"

func HashPassword(password string) string {
    h := md5.Sum([]byte(password)) // <-- silent before PR #12
    ...
}
```
```java
import java.security.MessageDigest;

public static String hashPassword(String password) throws Exception {
    MessageDigest md = MessageDigest.getInstance("MD5"); // <-- silent before
    ...
}
```
This PR mirrors PR #10's per-language dispatch architecture for
SEC009.
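A rough sketch of what per-language dispatch can look like follows. It is an illustration under stated assumptions, not the code in this PR: the `Lang` enum, the `Call` struct, and `is_weak_hash_call` are hypothetical names, the Go arm skips the import resolution against `crypto/md5` / `crypto/sha1` that the PR performs, and the C# arm lists only the two `Create()` forms.

```rust
/// Hypothetical language tag; the real parser's type is not shown in this PR text.
#[derive(Clone, Copy)]
enum Lang { Python, JavaScript, Go, Java, CSharp, Php }

/// Hypothetical summary of a call site.
struct Call<'a> {
    name: &'a str,                      // composed callee, e.g. "md5.Sum"
    first_string_arg: Option<&'a str>,  // unquoted first string literal, if any
}

fn is_weak_alg(alg: &str) -> bool {
    matches!(alg.to_ascii_lowercase().as_str(), "md5" | "sha1" | "sha-1")
}

/// Each language gets its own matcher instead of one language-blind name check.
fn is_weak_hash_call(lang: Lang, call: &Call) -> bool {
    let weak_first_arg = call.first_string_arg.map_or(false, is_weak_alg);
    match lang {
        Lang::Python => matches!(call.name, "hashlib.md5" | "hashlib.sha1"),
        Lang::JavaScript => call.name == "crypto.createHash" && weak_first_arg,
        // Import resolution is omitted in this sketch.
        Lang::Go => matches!(call.name, "md5.Sum" | "md5.New" | "sha1.Sum" | "sha1.New"),
        Lang::Java => {
            (call.name == "MessageDigest.getInstance" && weak_first_arg)
                || matches!(call.name, "DigestUtils.md5Hex" | "DigestUtils.sha1Hex")
        }
        Lang::CSharp => matches!(call.name, "MD5.Create" | "SHA1.Create"),
        Lang::Php => {
            matches!(call.name, "md5" | "sha1") || (call.name == "hash" && weak_first_arg)
        }
    }
}
```

The point of the shape is that a name like `md5.Sum` only counts in the Go arm, and string-argument APIs such as `MessageDigest.getInstance` are judged by the algorithm literal rather than by the method name.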
Per-language matchers
New helper `first_arg_is_weak_alg_string()` inspects the call's
first string-literal argument and matches against `md5` / `sha1` /
`sha-1` (case-insensitive). Used by Node, Java, and PHP.
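The helper's behaviour can be sketched as below. This is an assumption-laden illustration: the real helper presumably inspects syntax-tree nodes, whereas this version takes the source text of each argument as a `&[&str]`, and `string_literal_value` is a hypothetical name introduced here.

```rust
/// Hypothetical helper: strips one matching pair of quotes from a literal's
/// source text, or returns None when the argument is not a plain string literal.
fn string_literal_value(arg_src: &str) -> Option<&str> {
    let s = arg_src.trim();
    let first = s.chars().next()?;
    if s.len() >= 2 && (first == '"' || first == '\'') && s.ends_with(first) {
        Some(&s[1..s.len() - 1])
    } else {
        None
    }
}

/// Sketch of the check described above: true only when the first argument
/// is a string literal naming md5 / sha1 / sha-1, compared case-insensitively.
fn first_arg_is_weak_alg_string(args_src: &[&str]) -> bool {
    args_src
        .first()
        .copied()
        .and_then(string_literal_value)
        .map_or(false, |alg| {
            let alg = alg.to_ascii_lowercase();
            alg == "md5" || alg == "sha1" || alg == "sha-1"
        })
}
```

Under these assumptions, `hash('sha256', ...)` stays silent because the literal does not match, and a non-literal first argument such as a variable name is never flagged.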
enclosing_security_context fix (PR #11 parallel)
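Same bugs PR #11 fixed in `enclosing_token_context`:
- Multi-language assignment node kinds (Go
`short_var_declaration`, Java `local_variable_declaration`, etc.)
- Function-name needle check at function-shape level. Round 9
production case: `func HashPassword(password string)` carries the
`password` needle through its name even when the local
assignment uses generic `h := ...`.
- Walks past inner blocks (no break at a `for` / `if` body; only a
function shape ends the search).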
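The walk described in this section can be sketched on a toy tree. Everything here is hypothetical scaffolding: the `Kind` enum, the `Node` struct with parent indices, and the `NEEDLES` list stand in for the real syntax-tree types and needle set, which this PR text does not show (only the `password` needle is mentioned).

```rust
#[derive(Clone, Copy)]
enum Kind {
    Function,   // function-shaped node: the only kind that ends the walk
    Assignment, // e.g. Go short_var_declaration, Java local_variable_declaration
    Block,      // for / if bodies: walked past
    Call,
}

struct Node<'a> {
    kind: Kind,
    name: &'a str,         // function or assigned-variable name, empty otherwise
    parent: Option<usize>, // index into the node slice
}

// Hypothetical needle list; only `password` appears in the PR text.
const NEEDLES: [&str; 3] = ["password", "passwd", "secret"];

fn has_needle(name: &str) -> bool {
    let lower = name.to_ascii_lowercase();
    NEEDLES.iter().any(|n| lower.contains(*n))
}

/// Walks up from `start`; returns the first enclosing name carrying a needle.
fn enclosing_security_context<'a>(nodes: &[Node<'a>], start: usize) -> Option<&'a str> {
    let mut current = nodes[start].parent;
    while let Some(idx) = current {
        let node = &nodes[idx];
        match node.kind {
            Kind::Assignment if has_needle(node.name) => return Some(node.name),
            // Function shape: check its name, then stop either way.
            Kind::Function => {
                return if has_needle(node.name) { Some(node.name) } else { None };
            }
            _ => {} // inner blocks and needle-free assignments: keep walking
        }
        current = node.parent;
    }
    None
}
```

For the Round 9 Go case, a call nested under `h := ...` inside `HashPassword` resolves to the function name, while the same call inside a function with no needle in its name yields no context.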
Tests
8 new SEC009 tests covering each language's positive and negative
cases. Plus the `md5_for_etag_does_not_block` Python negative is
preserved (etag context must stay silent).
`cargo test --workspace`: 164 / 164 pass (was 156; +8 new).
Live MCP confirmation
Round 9's starting-go/auth.go before this PR:
```
FINDINGS: 19 total (security=2)
[security] SEC012 @line 17 ...
[security] SEC010 @line 26 ...
```
After PR #12:
```
FINDINGS: 20 total (security=3)
[security] SEC009 @line 11 weak hash ... (`password`)
[security] SEC012 @line 17 ...
[security] SEC010 @line 26 ...
```
Round 9's starting-java/Auth.java similarly goes from 1 → 3 SEC
findings.
Test plan
🤖 Generated with Claude Code