feat(security): SEC009 language-aware dispatch (PR #12) #12

Merged
wei9072 merged 1 commit into main from feat/sec009-multilang on May 6, 2026
Conversation

wei9072 (Owner) commented May 6, 2026

Summary

PR #10 fixed multi-language coverage for SEC010 (weak RNG); Round 9
validation revealed SEC009 (weak hash) had the same coverage
gap. The old matcher (`last == "md5" | "sha1" | ...`) only
fired on Python `hashlib.md5(...)`. Real-world Go and Java auth
code went uncaught:

```go
import "crypto/md5"

func HashPassword(password string) string {
	h := md5.Sum([]byte(password)) // <-- silent before PR #12
	...
}
```

```java
import java.security.MessageDigest;

public static String hashPassword(String password) throws Exception {
    MessageDigest md = MessageDigest.getInstance("MD5"); // <-- silent before
    ...
}
```

This PR mirrors PR #10's per-language dispatch architecture for
SEC009.

Per-language matchers

| Language | Match shape | Notes |
|---|---|---|
| Python | `hashlib.md5` / `hashlib.sha1` | unchanged |
| Node / JS | `crypto.createHash('md5'\|'sha1')` | algo in first string arg |
| Go | `md5.Sum` / `md5.New` / `sha1.X` | Layer 1 resolves `crypto/md5` vs same name from elsewhere |
| Java / Kotlin | `MessageDigest.getInstance("MD5"\|"SHA-1"\|"SHA1")` + Apache `DigestUtils.md5Hex`/`sha1Hex` | algo in string arg |
| C# | `MD5.Create` / `SHA1.Create` / `MD5CryptoServiceProvider` / `MD5Managed` | receiver-anchored to avoid `SHA1024` collisions |
| PHP | global `md5`/`sha1` + `hash('md5'\|'sha1', ...)` | algo in string arg for `hash()` |
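
As a rough illustration of the per-language dispatch idea, the sketch below matches a receiver-qualified call name against a per-language allow-list. The `Lang` enum and `is_weak_hash_call` function are hypothetical names invented for this example, not Aegis's actual types. It covers only the languages whose match is on the call name itself, and it leaves out Go's Layer 1 import resolution.

```rust
/// Languages this sketch distinguishes (hypothetical; the real rule's
/// language type may look different).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Lang {
    Python,
    Go,
    CSharp,
    Php,
}

/// Returns true when `call_name` (receiver-qualified, e.g. "hashlib.md5")
/// names a weak-hash API in `lang`. The whole qualified name must match
/// exactly, so names such as "SHA1024.Create" never fire.
pub fn is_weak_hash_call(lang: Lang, call_name: &str) -> bool {
    match lang {
        Lang::Python => matches!(call_name, "hashlib.md5" | "hashlib.sha1"),
        // Import resolution against crypto/md5 and crypto/sha1 is omitted here.
        Lang::Go => matches!(call_name, "md5.Sum" | "md5.New" | "sha1.Sum" | "sha1.New"),
        Lang::CSharp => matches!(call_name, "MD5.Create" | "SHA1.Create"),
        Lang::Php => matches!(call_name, "md5" | "sha1"),
    }
}
```

Keying the match on the language first means a Go-shaped name such as `md5.Sum` cannot fire in a Python file, which is the kind of cross-language confusion a single language-blind matcher invites.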

New helper `first_arg_is_weak_alg_string()` inspects the call's
first string-literal argument and matches against `md5` / `sha1` /
`sha-1` (case-insensitive). Used by Node, Java, and PHP.
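
A minimal sketch of what such a helper could look like, assuming it receives the call's source text rather than a syntax-tree node (the real helper most likely works on parsed nodes, and its signature is not shown in this PR):

```rust
/// Text-level sketch: true when the first argument of `call_text` is a
/// string literal naming a weak digest algorithm (case-insensitive).
pub fn first_arg_is_weak_alg_string(call_text: &str) -> bool {
    // Everything after the first '(' is treated as the argument list.
    let Some((_, args)) = call_text.split_once('(') else {
        return false;
    };
    let mut chars = args.trim_start().chars();
    // The first argument must start with a quote, otherwise it is not a literal.
    let quote = match chars.next() {
        Some(q @ ('\'' | '"')) => q,
        _ => return false,
    };
    let rest = chars.as_str();
    // The literal ends at the next matching quote character.
    let Some(end) = rest.find(quote) else {
        return false;
    };
    let literal = rest[..end].to_ascii_lowercase();
    matches!(literal.as_str(), "md5" | "sha1" | "sha-1")
}
```

Under these assumptions, `crypto.createHash('md5')` and `MessageDigest.getInstance("SHA-1")` match, while `hash('sha256', $data)` and a non-literal argument such as `createHash(algo)` do not.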

enclosing_security_context fix (PR #11 parallel)

This PR fixes the same bugs in `enclosing_security_context` that PR #11 fixed in `enclosing_token_context`:

  • Multi-language assignment node kinds added (Go
    `short_var_declaration`, Java `local_variable_declaration`, etc.)
  • Function-name needle check at function-shape level. Round 9
    production case: `func HashPassword(password string)` carries the
    `password` needle through its name even when the local
    assignment uses generic `h := ...`.
  • Walks past inner blocks (loop bodies / if bodies don't terminate
    the search).
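
The three fixes above can be sketched as a walk over a call's ancestors, innermost first. This is an illustration only: the `Ancestor` struct, the needle list, and the function signature are assumptions made for the example, and the real function presumably walks syntax-tree nodes directly.

```rust
/// One enclosing syntax node: its kind and, where applicable, the name it
/// binds (assignment target or function name). Hypothetical shape.
pub struct Ancestor<'a> {
    pub kind: &'a str,
    pub name: Option<&'a str>,
}

// Illustrative needle list; only `password` is confirmed by the PR text.
const NEEDLES: &[&str] = &["password", "secret"];
const ASSIGNMENT_KINDS: &[&str] =
    &["assignment", "short_var_declaration", "local_variable_declaration"];
const FUNCTION_KINDS: &[&str] =
    &["function_definition", "function_declaration", "method_declaration"];

fn find_needle(name: &str) -> Option<&'static str> {
    let lower = name.to_ascii_lowercase();
    NEEDLES.iter().copied().find(|n| lower.contains(*n))
}

/// Walks outward from the call. Assignment targets and the enclosing
/// function's name are checked for needles; blocks and loop or if bodies
/// are skipped; the walk stops at the first function-shaped node.
pub fn enclosing_security_context(ancestors: &[Ancestor<'_>]) -> Option<String> {
    for node in ancestors {
        let is_assignment = ASSIGNMENT_KINDS.contains(&node.kind);
        let is_function = FUNCTION_KINDS.contains(&node.kind);
        if is_assignment || is_function {
            if let Some(needle) = node.name.and_then(find_needle) {
                return Some(needle.to_string());
            }
        }
        if is_function {
            return None; // never look past the enclosing function
        }
    }
    None
}
```

For the Round 9 Go case, the ancestors would be a `short_var_declaration` named `h`, a `block`, and a `function_declaration` named `HashPassword`; the walk skips the first two and finds `password` in the function name.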

Tests

8 new SEC009 tests cover positive and negative cases across the newly
supported languages. The existing `md5_for_etag_does_not_block` Python
negative is preserved (etag context must stay silent).

`cargo test --workspace`: 164 / 164 pass (was 156; +8 new).

Live MCP confirmation

Round 9's starting-go/auth.go before this PR:
```
FINDINGS: 19 total (security=2)
[security] SEC012 @line 17 ...
[security] SEC010 @line 26 ...
```

After PR #12:
```
FINDINGS: 20 total (security=3)
[security] SEC009 @line 11 weak hash ... (`password`)
[security] SEC012 @line 17 ...
[security] SEC010 @line 26 ...
```

Round 9's starting-java/Auth.java similarly goes from 1 → 3 SEC
findings.

Test plan

🤖 Generated with Claude Code

Round 9 validation surfaced that SEC009's `last == "md5" | "sha1"`
matcher only fired on Python `hashlib.md5(...)` and never on Go,
Java, C#, PHP, or Node — exactly the same multi-language coverage
gap PR #10 fixed for SEC010.

Per-language matchers, mirroring PR #10's architecture:

- **Python**: `hashlib.md5` / `hashlib.sha1`. Receiver-anchored,
  excludes `Crypto.Hash.MD5.new` from PyCryptodome (out of scope
  for now).
- **Node / JS**: `crypto.createHash('md5'|'sha1'|'sha-1')`. The
  algorithm lives in the first string arg; new
  `first_arg_is_weak_alg_string()` helper inspects the literal.
- **Go**: `md5.Sum` / `md5.New` / `sha1.Sum` / `sha1.New` with
  Layer 1 import resolution against `crypto/md5` and `crypto/sha1`.
  Round 9's `h := md5.Sum([]byte(password))` case.
- **Java / Kotlin**: `MessageDigest.getInstance("MD5")` (string
  arg), Apache Commons `DigestUtils.md5Hex` / `sha1Hex`. Round 9's
  `MessageDigest.getInstance("MD5")` case.
- **C#**: `MD5.Create()` / `SHA1.Create()` / `MD5CryptoServiceProvider`
  / `MD5Managed` and SHA1 equivalents. Receiver-anchored to avoid
  matching `SHA1024` / `SomeMD5Field`.
- **PHP**: global `md5(...)` / `sha1(...)`, plus `hash('md5', ...)`
  / `hash('sha1', ...)` (string arg). `hash('sha256', ...)` stays
  silent.

Also fixes `enclosing_security_context` with the same improvements
PR #11 applied to `enclosing_token_context`:

- Multi-language assignment node kinds (Go `short_var_declaration`,
  Java `local_variable_declaration`, etc.).
- Function-name needle check at function-shape level. Round 9 case:
  `func HashPassword(password string)` calls `md5.Sum` via local
  `h :=`; the function name carries the `password` needle even
  though the local assignment doesn't.
- Walks past inner blocks (don't break at for / if body, only at
  function shape).

Tests: 8 new — Node createHash md5 (positive) / sha256 (negative),
Go md5.Sum + crypto/md5 import, Java MessageDigest.getInstance MD5
(positive) / SHA-256 (negative), PHP md5 (positive) / hash sha256
(negative), C# MD5.Create.

164 / 164 tests pass.

Live MCP confirmation: Round 9's starting-go/auth.go and
starting-java/Auth.java now both fire all three planted findings
(SEC009, SEC010, SEC012) instead of 2 / 1 respectively.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@wei9072 wei9072 merged commit 956c62e into main May 6, 2026
1 check passed
@wei9072 wei9072 deleted the feat/sec009-multilang branch May 6, 2026 04:07
wei9072 added a commit that referenced this pull request May 7, 2026
Adds the empirical validation work that drove PRs #6 to #12 into
the repo as a reproducibility archive under `experiments/`.

Contents:

- `comparison-report.md` (1199 lines, rolling Round 1 → Round 9
  analysis). Documents the three Aegis ROI mechanisms surfaced
  from data:
    1. Rule-hit → fix (brownfield Plan B: 0/3 → 3/3 across 3 models)
    2. Structural guardrail (cycle / public_symbol_removed —
       0/14 hits, dead weight on clean architectures)
    3. Anti-paralysis ritual (weak models complete tasks they
       would otherwise abandon)
- 4 starting-code fixtures (Python brownfield, Go brownfield,
  Java brownfield, Python multi-module).
- 11 prompt files. Each task has paired `-a.txt` (no Aegis) and
  `-b.txt` (with Aegis MCP + REQUIRED-workflow ritual instruction).
- 52 round directories: per-model deliverables + `run.log` for
  codex-driven rounds. Naming: <model>-<task>-<a|b>.
- `aegis_validate.py` — Python wrapper around aegis-mcp stdio
  JSON-RPC, used by agents to run validation after each write.
- 3 eval scripts that diff each round's deliverables against the
  planted SEC bugs.

Excluded via `.gitignore` and rsync filter (would have been ~970MB
of bloat): venv / .venv / __pycache__ / .pytest_cache / .toolchain
(Go toolchain copies codex downloads) / compiled binaries / git
metadata of nested repos. Final archive: 17MB.

Direct lineage from this archive into Aegis code:

| Round | Surfaced | Fixed in |
|---|---|---|
| Round 8 codex | SEC010 FP on `secrets.choice` | PR #9 |
| Round 9 Go / Java | SEC009 multi-language coverage = 0 | PR #12 |
| Round 9 Java | SEC010 inner-block `break` hid production case | PR #11 |

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
wei9072 added a commit that referenced this pull request May 7, 2026
#14)

- Main README.md and README.zh-TW.md: new "Experiments archive" /
  「實驗資料」 section linking to experiments/README.md
- experiments/README.md: replaces the prose-only intro with four
  analysis charts:
    1. Round structure overview (52 dirs, 26 paired, 11 models)
    2. Plan B brownfield 0/3 → 3/3 fix-rate matrix with per-model
       remaining-bug counts (real numbers from re-validating the
       archive against current rule library)
    3. Plan C multi-module task-completion matrix surfacing the
       anti-paralysis ROI mechanism (g54mini-mc-a abandoned vs
       g54mini-mc-b completed same task)
    4. Mermaid flowchart of the three Aegis ROI mechanisms
       (rule-hit → fix; structural guardrail; anti-paralysis ritual)
- Plus a "Direct lineage" chart connecting each experiment finding
  to the PR that fixed it (Round 8 → PR #9; Round 9 → PR #11/#12).

Also re-imports `experiments/31flashlite-amb-b/` as plain files —
the previous commit accidentally captured it as a gitlink because
codex had created a nested `.git/` inside the round dir during the
agent run. Removed the nested `.git/` and re-added the 6 actual
deliverable files.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
wei9072 added a commit that referenced this pull request May 7, 2026
Closes the audit gaps identified after PRs #10 / #12. The audit
across SEC003-008:

| Rule | Coverage before | Action this PR |
|---|---|---|
| SEC003 TLS off  | text-level Python/Node/Go/.NET | unchanged (decent already) |
| SEC004 shell    | Python `shell=True`+interp only | **language-aware dispatch** |
| SEC005 SQL concat | text+string-literal Python/Java | unchanged (decent) |
| SEC006 CORS     | text-level cross-language | unchanged |
| SEC007 JWT      | `name.contains("jwt")` Python only | **language-aware dispatch** |
| SEC008 deser    | Python/Node/Java idioms | unchanged (decent) |

## SEC004 expansion

Per-language shell-running idioms; requires interpolation in arg.

- **Python**: subprocess.run/Popen with `shell=True` + interp
- **Node.js**: `child_process.exec` / `execSync` with interp (always
  shells out, no `shell:true` gate; `execFile` is the safe one)
- **PHP**: global `shell_exec` / `exec` / `passthru` / `system` /
  `proc_open` with interp
- **Java**: `Runtime.getRuntime().exec(String)` overload with
  concat — String[] overload safe and excluded
- **Go**: `exec.Command("sh"|"bash"|"/bin/sh"|"/bin/bash", "-c", ...)`
  with interp. Bare `exec.Command("ls", arg)` (argv-style) excluded
  — no shell metachar interpretation
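
To make the Go distinction concrete, here is a small sketch of the shell gate alone, assuming the leading literal arguments of `exec.Command` have already been extracted as unquoted strings. The function name is hypothetical, and the interpolation check that the rule also requires is left out.

```rust
/// True when the first two literal arguments of a Go `exec.Command(...)`
/// call invoke a shell with `-c`. Argv-style calls such as
/// `exec.Command("ls", arg)` return false.
pub fn go_command_runs_shell(args: &[&str]) -> bool {
    const SHELLS: &[&str] = &["sh", "bash", "/bin/sh", "/bin/bash"];
    matches!(args, [shell, flag, ..] if SHELLS.contains(shell) && *flag == "-c")
}
```

A finding would then be raised only when this gate passes and the command string passed after `-c` contains interpolation.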

`text_has_interp` extended with PHP `.` concat (gated on `$` to
avoid floating-point literals).

## SEC007 expansion

Per-language JWT decode without verification:

- **Python**: `jwt.decode(...)` without algorithms/key/verify kwarg
  (existing behaviour)
- **Node.js**: `jsonwebtoken.decode()` always returns unverified
  claims — flag unconditionally; `verify(token, secret, opts)` is
  the safe API. `verify()` with `verify: false` opt also flagged.
- **Java / Kotlin**: Auth0 lib's `JWT.decode(token)` returns
  unverified DecodedJWT; safe path is
  `JWT.require(...).build().verify(token)`.
- **PHP**: firebase/php-jwt's `JWT::decode($token, $key)` requires
  explicit algorithm list. Flagged unless one of `'HS256'`/`'RS256'`/
  `'ES256'`/`'EdDSA'` appears in call text.
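
The PHP check described above is text-level, so it can be sketched in a few lines. The function name is hypothetical, and matching the bare algorithm names (so that either quote style is accepted) is an assumption of this sketch rather than something the commit states.

```rust
/// True when `call_text` is a `JWT::decode` call that names none of the
/// expected algorithms anywhere in its text.
pub fn php_jwt_decode_lacks_algorithm(call_text: &str) -> bool {
    const ALGS: &[&str] = &["HS256", "RS256", "ES256", "EdDSA"];
    call_text.contains("JWT::decode") && !ALGS.iter().any(|alg| call_text.contains(*alg))
}
```

With this shape, `JWT::decode($token, $key)` is flagged, while `JWT::decode($token, $key, ['HS256'])` and unrelated calls such as `JWT::encode(...)` stay silent.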

Algorithm-`none` detection extended with JWT-spec literal
`"alg": "none"` shape.

`check_jwt_unsafe` now takes `&ParsedFile` so language identity
is available — prevents PHP `JWT::decode` from being misclassified
as Java (the old `name.contains("JWT")` check was language-blind).

## Infrastructure changes

1. **`call_name` extended for PHP scoped/member calls.** Previously
   only handled Java's `method_invocation`; now also composes
   `Class::method` from `scoped_call_expression` and
   `$obj->method` from `member_call_expression`.
2. **`leaf_method_name(name)` helper** — splits on `.` / `::` /
   `->` so `JWT::decode`'s leaf is `decode`, not the whole string.
3. **walk dispatch** extended with `scoped_call_expression` and
   `member_call_expression` node kinds.
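
A possible shape for the leaf-name helper, written as a sketch since the commit does not show its body: find the last occurrence of any separator and return what follows it.

```rust
/// Returns the final segment of a composed call name, splitting on
/// `.`, `::`, and `->`. A name with no separator is returned unchanged.
pub fn leaf_method_name(name: &str) -> &str {
    const SEPARATORS: [&str; 3] = ["::", "->", "."];
    SEPARATORS
        .iter()
        // Byte offset just past the last occurrence of each separator.
        .filter_map(|sep| name.rfind(*sep).map(|idx| idx + sep.len()))
        // The right-most separator decides where the leaf starts.
        .max()
        .map_or(name, |start| &name[start..])
}
```

So `JWT::decode` yields `decode`, `$obj->method` yields `method`, and `hashlib.md5` yields `md5`, which lets per-language checks compare against the method name alone.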

## Tests

10 new (5 SEC004 multi-lang + 5 SEC007 multi-lang). 174 → **177**
total tests passing.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>