fix(security): SEC010 FP on secrets.choice / os.urandom by wei9072 · Pull Request #9 · wei9072/aegis

wei9072 · 2026-05-06T02:07:16Z

Summary

Round 8 of the codex agent experiment (gpt-5.4-mini × Plan A
ambiguous spec) surfaced a false positive that PR #6's SEC010
expansion introduced (or made far more visible): the SECURE choice,
`secrets.choice`, was being flagged as weak RNG.

The bug:

import secrets, string
ALPHANUMERIC = string.ascii_letters + string.digits
def make_code():
    return "".join(secrets.choice(ALPHANUMERIC) for _ in range(8))

`secrets.choice` shares the function name `choice` with
`random.choice`. The match was on the bare last segment, so both
fired. Combined with PR #6's opaque-id heuristic (which detects
`string.ascii_letters` in the enclosing function), every URL
shortener using `secrets` got flagged as if it were using
`random`. Active danger: the agent might "fix" secure code into
something worse.

Fix: add a known-secure module allowlist at the top of
`check_weak_random_for_token`:

Python: `secrets.`, `os.urandom`
Node / browser: `crypto.`, `window.crypto.`, `self.crypto.`,
`subtle.`
Go: `crypto/`, `rand.read`, `rand.int` (for crypto/rand)
Other: `nacl.`, `tls.`

If the call name contains any of these, return early — even if the
last segment matches a weak-RNG name.

Test plan

`cargo test --workspace` — 132 / 132 pass (was 130; +2 new
regression tests)
Live MCP run on the FP shape: previously SEC010 fires,
now silent
Existing positive tests still fire (`random.choice` / Math.random
for token still flagged)
CI green on push

🤖 Generated with Claude Code

Round 8 codex experiment surfaced an FP that PR #6's SEC010 expansion introduced (or made far more visible): `secrets.choice` on a character pool was getting flagged as weak RNG, even though `secrets.choice` is the SECURE choice — the same `choice` function name shared with `random.choice` was matching unconditionally. The shape that fired falsely: import secrets, string ALPHANUMERIC = string.ascii_letters + string.digits def make_code(): return "".join(secrets.choice(ALPHANUMERIC) for _ in range(8)) Receiver module is `secrets`, not `random` — should pass. Fix: at the top of `check_weak_random_for_token`, check the full call name against an allowlist of known-cryptographic RNG modules (`secrets.`, `os.urandom`, `crypto.`, `nacl.`, `subtle.`, `rand.read`/`rand.int` for Go crypto/rand, `window.crypto.`, `self.crypto.`). Match anywhere in the name to handle dotted / slashed forms. Tests: 2 new — `secrets.choice` over ascii pool (was the FP from the experiment), `os.urandom` for token (different shape, same class). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

… dispatch (#10) * feat(aegis-core): ast::imports — per-file Import struct + extractor Introduces the first per-file Layer 1 fact-extraction module. Every language adapter declares an `import_query` capturing module names as `@import`; until now signals/imports_local.rs and workspace.rs each ran that query independently. This pulls the common pattern into one place so future per-file fact derivation (security receiver resolution, public symbols, etc.) has the same home. `extract_imports(parsed) -> Vec<Import>` runs the adapter query once and returns normalized module strings with line numbers. Pure function; caching is the next commit's job (`ParsedFile.imports()` lazy cache). 5 tests cover Python plain/from-imports, Go quoted module paths, JS/TS quote stripping, line numbering (1-indexed), and the contract that callers must hold a ParsedFile (unparseable extensions can't reach this code path). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * feat(aegis-core): ParsedFile.imports() lazy cache + resolve_receiver() Adds an `OnceCell<Vec<Import>>` to ParsedFile so the import query runs at most once per file. Single-threaded by design — ParsedFile is owned by gather_findings and not shared across threads, so std::cell::OnceCell suffices. New API: - `ParsedFile::imports() -> &[Import]` — lazy cache populated on first call. - `ParsedFile::resolve_receiver(name) -> Option<&Import>` — best- effort receiver-to-import lookup. Today: last-segment of module path or full module name. Alias-aware resolution (Go `import myrand "math/rand"`) is intentionally deferred — needs language-specific AST shapes. 5 new tests cover cache population, cache stability across calls (identical slice pointer), Python module-name resolution, Go last-segment path resolution (`math/rand` → `rand`), and the empty-string boundary. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * refactor(aegis-core): migrate consumers to ParsedFile.imports() cache Both signals/imports_local.rs and workspace.rs::summarize_file used to run their own copy of: let query = Query::new(lang, adapter.import_query()).ok()?; let mut qc = QueryCursor::new(); for m in qc.matches(&query, parsed.root_node(), src) { for cap in m.captures { ... adapter.normalize_import(text) ... } } That work now lives in ParsedFile.imports() and runs at most once per file. Both call sites collapse to a single iterator over `parsed.imports()`. Behavioural delta: none. Both consumers were already de-duplicating on the module string (HashSet in workspace.rs, line-counting in imports_local.rs); the cache returns a `Vec<Import>` with each distinct (module, line) pair, so consumer behaviour is preserved. Drops three unused tree-sitter type imports from workspace.rs (Parser, Query, QueryCursor) — they were only used by the now-gone inline import-extraction. The pre-existing unused `ParsedFile` import in findings.rs is separate scope; left alone. 142 / 142 tests pass. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * feat(security): SEC010 language-aware dispatch over Layer 1 imports Replaces the polyglot needle list (`choice` | `randint` | ... | `Math.random`) with per-language matchers. Each language's matcher hard-codes the receiver path, so safe APIs that share function names (`secrets.choice` shares `choice`, `SecureRandom.nextInt` shares `nextInt`, `crypto/rand.Read` shares `Read`) never match. Per-language coverage: - **Python**: `random.X` only (`X` ∈ choice/choices/randint/sample/ uniform/randrange/shuffle/random). Excludes `secrets.X`, `os.urandom`, `np.random.X`, `random.SystemRandom().X`. - **JS / TS**: `Math.random` only. - **Go**: `math/rand` method set, with import-resolution via `parsed.resolve_receiver()` to disambiguate from `crypto/rand`. `import "math/rand"` → `rand.Intn` flags; `import "crypto/rand"` → `rand.Read` doesn't (and `Intn`/`Int31`/etc. don't even exist there). Falls back to method-name conservative default when no matching import is in the file. - **Java / Kotlin**: `nextInt`/`nextLong`/etc. on `Random` (or unqualified). `SecureRandom`-prefixed receivers excluded by receiver-path filter. - **C#**: `Next`/`NextDouble`/`NextBytes`. Excludes `RandomNumberGenerator` and any `Cryptography`-namespace receiver. - **PHP**: global `rand`/`mt_rand`/`array_rand`. `random_int` and `random_bytes` are CSPRNG and intentionally absent. - **Rust**: `gen`/`gen_range`/`gen_bool` from the rand crate. `OsRng` excluded by receiver-path filter. Drops PR #9's `safe_module_prefixes` whitelist hack — now redundant. Each language's matcher hard-codes the receiver path, so `secrets.choice` never reaches the SEC010 path in the first place because `is_python_weak_rng` requires `first == "random"`. Threads `&ParsedFile` down through `walk()` so language identity and import facts are available at the SEC010 call site. Other SEC checks unchanged. Existing 142 tests pass without modification — the dispatch preserves semantics for everything that was already flagging. Per-language regression tests added in the next commit. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * test(security): SEC010 per-language regression coverage 10 new tests covering the dispatch matrix from PR #10 step 4: Python: - random.SystemRandom() CSPRNG-backed → does NOT fire - numpy np.random.choice (statistical sampling) → does NOT fire JS / TS: - Math.random for token (chained .toString[2] form) → fires - window.crypto.getRandomValues → does NOT fire Go (with Layer 1 import resolution): - import "math/rand" + rand.Intn → fires - import "crypto/rand" + rand.Read → does NOT fire (same literal `rand.Read` text, different module per import resolution) Java: - new Random().nextInt(...) literal → fires - new SecureRandom().nextInt(...) literal → does NOT fire Note: variable-stored (`Random r = new Random(); r.nextInt`) is a known FN — without dataflow we can't recover the class identity from the receiver `r`. Worth less than the FP risk on `SecureRandom r; r.nextInt`. C# / PHP: - new Random().Next(...) → fires - PHP rand($min, $max) → fires - PHP random_int($min, $max) (CSPRNG) → does NOT fire Also fixes call-name extraction for Java's `method_invocation`, which lacks a `function` field — composes `object.name` from the two child fields so the receiver path is visible to the matcher. And extends `enclosing_token_context` to recognize Go (var_spec / short_var_declaration), Java (local_variable_declaration), and JS augmented assignment kinds, so the token-context detection works across the languages now in scope. 153 / 153 tests pass. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>

Adds the empirical validation work that drove PRs #6 — #12 into the repo as a reproducibility archive under `experiments/`. Contents: - `comparison-report.md` (1199 lines, rolling Round 1 → Round 9 analysis). Documents the three Aegis ROI mechanisms surfaced from data: 1. Rule-hit → fix (brownfield Plan B: 0/3 → 3/3 across 3 models) 2. Structural guardrail (cycle / public_symbol_removed — 0/14 hits, dead weight on clean architectures) 3. Anti-paralysis ritual (weak models complete tasks they would otherwise abandon) - 4 starting-code fixtures (Python brownfield, Go brownfield, Java brownfield, Python multi-module). - 11 prompt files. Each task has paired `-a.txt` (no Aegis) and `-b.txt` (with Aegis MCP + REQUIRED-workflow ritual instruction). - 52 round directories: per-model deliverables + `run.log` for codex-driven rounds. Naming: <model>-<task>-<a|b>. - `aegis_validate.py` — Python wrapper around aegis-mcp stdio JSON-RPC, used by agents to run validation after each write. - 3 eval scripts that diff each round's deliverables against the planted SEC bugs. Excluded via `.gitignore` and rsync filter (would have been ~970MB of bloat): venv / .venv / __pycache__ / .pytest_cache / .toolchain (Go toolchain copies codex downloads) / compiled binaries / git metadata of nested repos. Final archive: 17MB. Direct lineage from this archive into Aegis code: | Round | Surfaced | Fixed in | |---|---|---| | Round 8 codex | SEC010 FP on `secrets.choice` | PR #9 | | Round 9 Go / Java | SEC009 multi-language coverage = 0 | PR #12 | | Round 9 Java | SEC010 inner-block `break` hid production case | PR #11 | Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>

#14) - Main README.md and README.zh-TW.md: new "Experiments archive" / 「實驗資料」 section linking to experiments/README.md - experiments/README.md: replaces the prose-only intro with four analysis charts: 1. Round structure overview (52 dirs, 26 paired, 11 models) 2. Plan B brownfield 0/3 → 3/3 fix-rate matrix with per-model remaining-bug counts (real numbers from re-validating the archive against current rule library) 3. Plan C multi-module task-completion matrix surfacing the anti-paralysis ROI mechanism (g54mini-mc-a abandoned vs g54mini-mc-b completed same task) 4. Mermaid flowchart of the three Aegis ROI mechanisms (rule-hit → fix; structural guardrail; anti-paralysis ritual) - Plus a "Direct lineage" chart connecting each experiment finding to the PR that fixed it (Round 8 → PR #9; Round 9 → PR #11/#12). Also re-imports `experiments/31flashlite-amb-b/` as plain files — the previous commit accidentally captured it as a gitlink because codex had created a nested `.git/` inside the round dir during the agent run. Removed the nested `.git/` and re-added the 6 actual deliverable files. Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>

wei9072 merged commit 005e170 into main May 6, 2026
1 check passed

wei9072 deleted the fix/sec010-secrets-fp branch May 6, 2026 02:08

wei9072 mentioned this pull request May 6, 2026

feat(aegis-core): Layer 1 per-file fact cache + SEC010 multi-language dispatch #10

Merged

7 tasks

This was referenced May 6, 2026

fix(security): SEC010 reads function name + walks past inner blocks #11

Merged

feat(security): SEC009 language-aware dispatch (PR #12) #12

Merged

docs(experiments): archive 9 rounds of A/B agent experiments #13

Merged

wei9072 mentioned this pull request May 7, 2026

docs: link experiments archive + add analysis charts #14

Merged

5 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(security): SEC010 FP on secrets.choice / os.urandom#9

fix(security): SEC010 FP on secrets.choice / os.urandom#9
wei9072 merged 1 commit into
mainfrom
fix/sec010-secrets-fp

wei9072 commented May 6, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

wei9072 commented May 6, 2026

Summary

Test plan

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant