feat(security): SEC009 language-aware dispatch (PR #12) by wei9072 · Pull Request #12 · wei9072/aegis

wei9072 · 2026-05-06T04:06:47Z

Summary

PR #10 fixed multi-language coverage for SEC010 (weak RNG); Round 9
validation revealed SEC009 (weak hash) had the same coverage
gap. The old matcher (`last == "md5" | "sha1" | ...`) only
fired on Python `hashlib.md5(...)`. Real-world Go and Java auth
code went uncaught:

```go
import "crypto/md5"

func HashPassword(password string) string {
	h := md5.Sum([]byte(password)) // <-- silent before PR #12
	...
}
```

```java
import java.security.MessageDigest;

public static String hashPassword(String password) throws Exception {
    MessageDigest md = MessageDigest.getInstance("MD5"); // <-- silent before PR #12
    ...
}
```

This PR mirrors PR #10's per-language dispatch architecture for
SEC009.

Per-language matchers

| Language | Match shape | Notes |
|---|---|---|
| Python | `hashlib.md5` / `hashlib.sha1` | unchanged |
| Node / JS | `crypto.createHash('md5'\|'sha1')` | algo in first string arg |
| Go | `md5.Sum` / `md5.New` / `sha1.Sum` / `sha1.New` | Layer 1 import resolution distinguishes `crypto/md5` from a same-named identifier imported from elsewhere |
| Java / Kotlin | `MessageDigest.getInstance("MD5"\|"SHA-1"\|"SHA1")` + Apache `DigestUtils.md5Hex` / `sha1Hex` | algo in string arg |
| C# | `MD5.Create` / `SHA1.Create` / `MD5CryptoServiceProvider` / `MD5Managed` | receiver-anchored to avoid `SHA1024` collisions |
| PHP | global `md5` / `sha1` + `hash('md5'\|'sha1', ...)` | algo in string arg for `hash()` |
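
The receiver-anchored rows of the table can be sketched as a small dispatch function. This is a minimal illustration, not the code in this PR: the `Lang` enum and `is_weak_hash_call` are hypothetical names, the callee is assumed to arrive as a dotted string, and the Go arm skips the Layer 1 import resolution that the real matcher relies on.

```rust
/// Languages from the table whose matchers anchor on the receiver
/// (hypothetical enum, for illustration only).
#[derive(Clone, Copy, Debug, PartialEq)]
enum Lang {
    Python,
    Go,
    CSharp,
    Php,
}

/// Returns true when `callee` (a dotted call name such as "hashlib.md5")
/// names a weak-hash call in `lang`.
fn is_weak_hash_call(lang: Lang, callee: &str) -> bool {
    // Split "receiver.method" at the last dot; a bare call has an empty receiver.
    let (receiver, method) = match callee.rsplit_once('.') {
        Some((r, m)) => (r, m),
        None => ("", callee),
    };
    match lang {
        // Exact receiver match, so "Crypto.Hash.MD5.new" is not flagged.
        Lang::Python => receiver == "hashlib" && matches!(method, "md5" | "sha1"),
        // Assumption: the package name is trusted here; the real rule also
        // checks that it resolves to crypto/md5 or crypto/sha1.
        Lang::Go => matches!(receiver, "md5" | "sha1") && matches!(method, "Sum" | "New"),
        // Exact receiver match keeps "SHA1024.Create" silent.
        Lang::CSharp => matches!(receiver, "MD5" | "SHA1") && method == "Create",
        // PHP's md5 / sha1 are global functions with no receiver.
        Lang::Php => receiver.is_empty() && matches!(method, "md5" | "sha1"),
    }
}
```

Under these assumptions, `is_weak_hash_call(Lang::Go, "md5.Sum")` is true, while `is_weak_hash_call(Lang::CSharp, "SHA1024.Create")` is false because the receiver is compared exactly rather than by substring.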

New helper `first_arg_is_weak_alg_string()` inspects the call's
first string-literal argument and matches against `md5` / `sha1` /
`sha-1` (case-insensitive). Used by Node, Java, and PHP.
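
A rough sketch of the check this helper performs is shown below. It works on the raw text of a call, which is an assumption made to keep the example self-contained; the actual helper presumably inspects syntax-tree nodes, and its real signature is not shown in this PR description.

```rust
/// Sketch only: does the first argument of `call_text` look like a
/// weak-hash algorithm string literal (md5 / sha1 / sha-1, any case)?
fn first_arg_is_weak_alg_string(call_text: &str) -> bool {
    // Everything after the first opening parenthesis is the argument list.
    let Some((_, args)) = call_text.split_once('(') else {
        return false;
    };
    let mut chars = args.trim_start().chars();
    // The first argument must open with a quote; identifiers and
    // expressions are ignored.
    let quote = match chars.next() {
        Some(q @ ('"' | '\'')) => q,
        _ => return false,
    };
    // Take the literal's contents up to the matching closing quote.
    let rest = chars.as_str();
    let Some(end) = rest.find(quote) else {
        return false;
    };
    let literal = rest[..end].to_ascii_lowercase();
    matches!(literal.as_str(), "md5" | "sha1" | "sha-1")
}
```

With this sketch, `crypto.createHash('md5')` and `MessageDigest.getInstance("SHA-1")` match, while `hash('sha256', $data)` and `crypto.createHash(algo)` stay silent.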

enclosing_security_context fix (PR #11 parallel)

Same bugs PR #11 fixed in `enclosing_token_context`:

- Multi-language assignment node kinds added (Go
  `short_var_declaration`, Java `local_variable_declaration`, etc.)
- Function-name needle check at function-shape level. Round 9
  production case: `func HashPassword(password string)` carries the
  `password` needle through its name even when the local
  assignment uses generic `h := ...`.
- Walks past inner blocks (loop bodies / if bodies don't terminate
  the search).
Tests

8 new SEC009 tests: positive and negative cases for Node, Java, and
PHP, plus positive cases for Go and C#. The existing
`md5_for_etag_does_not_block` Python negative is preserved (etag
context must stay silent).

`cargo test --workspace`: 164 / 164 pass (was 156; +8 new).

Live MCP confirmation

Round 9's starting-go/auth.go before this PR:
```
FINDINGS: 19 total (security=2)
[security] SEC012 @line 17 ...
[security] SEC010 @line 26 ...
```

After PR #12:
```
FINDINGS: 20 total (security=3)
[security] SEC009 @line 11 weak hash ... (`password`)
[security] SEC012 @line 17 ...
[security] SEC010 @line 26 ...
```

Round 9's starting-java/Auth.java similarly goes from 1 → 3 SEC
findings.

Test plan

- `cargo test --workspace`: 164 / 164 pass
- Live MCP fires SEC009 on Round 9 Go + Java starting code
- Existing SEC009 etag-context negative still passes
- Regression tests from PR #9 (fix(security): SEC010 FP on secrets.choice / os.urandom), PR #10 (feat(aegis-core): Layer 1 per-file fact cache + SEC010 multi-language dispatch), and PR #11 (fix(security): SEC010 reads function name + walks past inner blocks) still pass
- CI green on push

🤖 Generated with Claude Code

Round 9 validation surfaced that SEC009's `last == "md5" | "sha1"` matcher only fired on Python `hashlib.md5(...)` and never on Go, Java, C#, PHP, or Node: exactly the same multi-language coverage gap PR #10 fixed for SEC010.

Per-language matchers, mirroring PR #10's architecture:

- **Python**: `hashlib.md5` / `hashlib.sha1`. Receiver-anchored, excludes `Crypto.Hash.MD5.new` from PyCryptodome (out of scope for now).
- **Node / JS**: `crypto.createHash('md5'|'sha1'|'sha-1')`. The algorithm lives in the first string arg; new `first_arg_is_weak_alg_string()` helper inspects the literal.
- **Go**: `md5.Sum` / `md5.New` / `sha1.Sum` / `sha1.New` with Layer 1 import resolution against `crypto/md5` and `crypto/sha1`. Round 9's `h := md5.Sum([]byte(password))` case.
- **Java / Kotlin**: `MessageDigest.getInstance("MD5")` (string arg), Apache Commons `DigestUtils.md5Hex` / `sha1Hex`. Round 9's `MessageDigest.getInstance("MD5")` case.
- **C#**: `MD5.Create()` / `SHA1.Create()` / `MD5CryptoServiceProvider` / `MD5Managed` and SHA1 equivalents. Receiver-anchored to avoid matching `SHA1024` / `SomeMD5Field`.
- **PHP**: global `md5(...)` / `sha1(...)`, plus `hash('md5', ...)` / `hash('sha1', ...)` (string arg). `hash('sha256', ...)` stays silent.

Also fixes `enclosing_security_context` with the same improvements PR #11 applied to `enclosing_token_context`:

- Multi-language assignment node kinds (Go `short_var_declaration`, Java `local_variable_declaration`, etc.).
- Function-name needle check at function-shape level. Round 9 case: `func HashPassword(password string)` calls `md5.Sum` via local `h :=`; the function name carries the `password` needle even though the local assignment doesn't.
- Walks past inner blocks (don't break at for / if body, only at function shape).

Tests: 8 new.

- Node `createHash` md5 (positive) / sha256 (negative)
- Go `md5.Sum` + `crypto/md5` import
- Java `MessageDigest.getInstance` MD5 (positive) / SHA-256 (negative)
- PHP `md5` (positive) / `hash` sha256 (negative)
- C# `MD5.Create`

164 / 164 tests pass.

Live MCP confirmation: Round 9's starting-go/auth.go and starting-java/Auth.java now both fire all three planted findings (SEC009, SEC010, SEC012) instead of 2 / 1 respectively.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Adds the empirical validation work that drove PRs #6 through #12 into the repo as a reproducibility archive under `experiments/`.

Contents:

- `comparison-report.md` (1199 lines, rolling Round 1 → Round 9 analysis). Documents the three Aegis ROI mechanisms surfaced from data:
  1. Rule-hit → fix (brownfield Plan B: 0/3 → 3/3 across 3 models)
  2. Structural guardrail (cycle / public_symbol_removed: 0/14 hits, dead weight on clean architectures)
  3. Anti-paralysis ritual (weak models complete tasks they would otherwise abandon)
- 4 starting-code fixtures (Python brownfield, Go brownfield, Java brownfield, Python multi-module).
- 11 prompt files. Each task has paired `-a.txt` (no Aegis) and `-b.txt` (with Aegis MCP + REQUIRED-workflow ritual instruction).
- 52 round directories: per-model deliverables + `run.log` for codex-driven rounds. Naming: `<model>-<task>-<a|b>`.
- `aegis_validate.py`: Python wrapper around aegis-mcp stdio JSON-RPC, used by agents to run validation after each write.
- 3 eval scripts that diff each round's deliverables against the planted SEC bugs.

Excluded via `.gitignore` and rsync filter (would have been ~970MB of bloat): venv / .venv / __pycache__ / .pytest_cache / .toolchain (Go toolchain copies codex downloads) / compiled binaries / git metadata of nested repos. Final archive: 17MB.

Direct lineage from this archive into Aegis code:

| Round | Surfaced | Fixed in |
|---|---|---|
| Round 8 codex | SEC010 FP on `secrets.choice` | PR #9 |
| Round 9 Go / Java | SEC009 multi-language coverage = 0 | PR #12 |
| Round 9 Java | SEC010 inner-block `break` hid production case | PR #11 |

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>

#14)

- Main README.md and README.zh-TW.md: new "Experiments archive" / 「實驗資料」 section linking to experiments/README.md
- experiments/README.md: replaces the prose-only intro with four analysis charts:
  1. Round structure overview (52 dirs, 26 paired, 11 models)
  2. Plan B brownfield 0/3 → 3/3 fix-rate matrix with per-model remaining-bug counts (real numbers from re-validating the archive against current rule library)
  3. Plan C multi-module task-completion matrix surfacing the anti-paralysis ROI mechanism (g54mini-mc-a abandoned vs g54mini-mc-b completed same task)
  4. Mermaid flowchart of the three Aegis ROI mechanisms (rule-hit → fix; structural guardrail; anti-paralysis ritual)
- Plus a "Direct lineage" chart connecting each experiment finding to the PR that fixed it (Round 8 → PR #9; Round 9 → PR #11/#12).

Also re-imports `experiments/31flashlite-amb-b/` as plain files. The previous commit accidentally captured it as a gitlink because codex had created a nested `.git/` inside the round dir during the agent run. Removed the nested `.git/` and re-added the 6 actual deliverable files.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>

Closes the audit gaps identified after PRs #10 / #12. The audit across SEC003-008:

| Rule | Coverage before | Action this PR |
|---|---|---|
| SEC003 TLS off | text-level Python/Node/Go/.NET | unchanged (decent already) |
| SEC004 shell | Python `shell=True`+interp only | **language-aware dispatch** |
| SEC005 SQL concat | text+string-literal Python/Java | unchanged (decent) |
| SEC006 CORS | text-level cross-language | unchanged |
| SEC007 JWT | `name.contains("jwt")` Python only | **language-aware dispatch** |
| SEC008 deser | Python/Node/Java idioms | unchanged (decent) |

## SEC004 expansion

Per-language shell-running idioms; requires interpolation in arg.

- **Python**: subprocess.run/Popen with `shell=True` + interp
- **Node.js**: `child_process.exec` / `execSync` with interp (always shells out, no `shell:true` gate; `execFile` is the safe one)
- **PHP**: global `shell_exec` / `exec` / `passthru` / `system` / `proc_open` with interp
- **Java**: `Runtime.getRuntime().exec(String)` overload with concat; the String[] overload is safe and excluded
- **Go**: `exec.Command("sh"|"bash"|"/bin/sh"|"/bin/bash", "-c", ...)` with interp. Bare `exec.Command("ls", arg)` (argv-style) is excluded: no shell metachar interpretation

`text_has_interp` extended with PHP `.` concat (gated on `$` to avoid floating-point literals).

## SEC007 expansion

Per-language JWT decode without verification:

- **Python**: `jwt.decode(...)` without algorithms/key/verify kwarg (existing behaviour)
- **Node.js**: `jsonwebtoken.decode()` always returns unverified claims, so it is flagged unconditionally; `verify(token, secret, opts)` is the safe API. `verify()` with `verify: false` opt also flagged.
- **Java / Kotlin**: Auth0 lib's `JWT.decode(token)` returns unverified DecodedJWT; safe path is `JWT.require(...).build().verify(token)`.
- **PHP**: firebase/php-jwt's `JWT::decode($token, $key)` requires explicit algorithm list. Flagged unless one of `'HS256'` / `'RS256'` / `'ES256'` / `'EdDSA'` appears in call text.

Algorithm-`none` detection extended with JWT-spec literal `"alg": "none"` shape.

`check_jwt_unsafe` now takes `&ParsedFile` so language identity is available, which prevents PHP `JWT::decode` from being misclassified as Java (the old `name.contains("JWT")` check was language-blind).

## Infrastructure changes

1. **`call_name` extended for PHP scoped/member calls.** Previously only handled Java's `method_invocation`; now also composes `Class::method` from `scoped_call_expression` and `$obj->method` from `member_call_expression`.
2. **`leaf_method_name(name)` helper**: splits on `.` / `::` / `->` so `JWT::decode`'s leaf is `decode`, not the whole string.
3. **walk dispatch** extended with `scoped_call_expression` and `member_call_expression` node kinds.

## Tests

10 new (5 SEC004 multi-lang + 5 SEC007 multi-lang). 174 → **177** total tests passing.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
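
The `leaf_method_name(name)` helper is described only by its behaviour: split on `.`, `::`, and `->`, and keep the last segment. A minimal sketch consistent with that description could look like the following; the signature and body are assumptions, not the code merged in that PR.

```rust
/// Sketch only: return the final segment of a composed call name,
/// treating ".", "::" and "->" as separators.
fn leaf_method_name(name: &str) -> &str {
    let mut leaf = name;
    // Applying the separators one after another leaves the text after the
    // last separator of any kind, because each step only keeps a tail.
    for separator in ["::", "->", "."] {
        if let Some((_, tail)) = leaf.rsplit_once(separator) {
            leaf = tail;
        }
    }
    leaf
}
```

For example, `leaf_method_name("JWT::decode")` and `leaf_method_name("jwt.decode")` both yield `decode`, and a bare name such as `md5` is returned unchanged.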

wei9072 merged commit 956c62e into main May 6, 2026
1 check passed

wei9072 deleted the feat/sec009-multilang branch May 6, 2026 04:07

wei9072 mentioned this pull request May 7, 2026

docs(experiments): archive 9 rounds of A/B agent experiments #13

Merged

5 tasks

wei9072 mentioned this pull request May 7, 2026

docs: link experiments archive + add analysis charts #14

Merged

5 tasks

This was referenced May 7, 2026

feat(aegis-core): Layer 2 Go import alias resolution (PR #17) #17

Merged

feat(security): SEC004 + SEC007 multi-language dispatch (PR #18) #18

Merged

wei9072 merged 1 commit into main from feat/sec009-multilang
