fix(security): SEC010 FP on secrets.choice / os.urandom#9
Merged
Conversation
Round 8 codex experiment surfaced an FP that PR #6's SEC010 expansion introduced (or made far more visible): `secrets.choice` on a character pool was getting flagged as weak RNG, even though `secrets.choice` is the SECURE choice — the same `choice` function name shared with `random.choice` was matching unconditionally. The shape that fired falsely: import secrets, string ALPHANUMERIC = string.ascii_letters + string.digits def make_code(): return "".join(secrets.choice(ALPHANUMERIC) for _ in range(8)) Receiver module is `secrets`, not `random` — should pass. Fix: at the top of `check_weak_random_for_token`, check the full call name against an allowlist of known-cryptographic RNG modules (`secrets.`, `os.urandom`, `crypto.`, `nacl.`, `subtle.`, `rand.read`/`rand.int` for Go crypto/rand, `window.crypto.`, `self.crypto.`). Match anywhere in the name to handle dotted / slashed forms. Tests: 2 new — `secrets.choice` over ascii pool (was the FP from the experiment), `os.urandom` for token (different shape, same class). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
7 tasks
wei9072
added a commit
that referenced
this pull request
May 6, 2026
… dispatch (#10) * feat(aegis-core): ast::imports — per-file Import struct + extractor Introduces the first per-file Layer 1 fact-extraction module. Every language adapter declares an `import_query` capturing module names as `@import`; until now signals/imports_local.rs and workspace.rs each ran that query independently. This pulls the common pattern into one place so future per-file fact derivation (security receiver resolution, public symbols, etc.) has the same home. `extract_imports(parsed) -> Vec<Import>` runs the adapter query once and returns normalized module strings with line numbers. Pure function; caching is the next commit's job (`ParsedFile.imports()` lazy cache). 5 tests cover Python plain/from-imports, Go quoted module paths, JS/TS quote stripping, line numbering (1-indexed), and the contract that callers must hold a ParsedFile (unparseable extensions can't reach this code path). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * feat(aegis-core): ParsedFile.imports() lazy cache + resolve_receiver() Adds an `OnceCell<Vec<Import>>` to ParsedFile so the import query runs at most once per file. Single-threaded by design — ParsedFile is owned by gather_findings and not shared across threads, so std::cell::OnceCell suffices. New API: - `ParsedFile::imports() -> &[Import]` — lazy cache populated on first call. - `ParsedFile::resolve_receiver(name) -> Option<&Import>` — best- effort receiver-to-import lookup. Today: last-segment of module path or full module name. Alias-aware resolution (Go `import myrand "math/rand"`) is intentionally deferred — needs language-specific AST shapes. 5 new tests cover cache population, cache stability across calls (identical slice pointer), Python module-name resolution, Go last-segment path resolution (`math/rand` → `rand`), and the empty-string boundary. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * refactor(aegis-core): migrate consumers to ParsedFile.imports() cache Both signals/imports_local.rs and workspace.rs::summarize_file used to run their own copy of: let query = Query::new(lang, adapter.import_query()).ok()?; let mut qc = QueryCursor::new(); for m in qc.matches(&query, parsed.root_node(), src) { for cap in m.captures { ... adapter.normalize_import(text) ... } } That work now lives in ParsedFile.imports() and runs at most once per file. Both call sites collapse to a single iterator over `parsed.imports()`. Behavioural delta: none. Both consumers were already de-duplicating on the module string (HashSet in workspace.rs, line-counting in imports_local.rs); the cache returns a `Vec<Import>` with each distinct (module, line) pair, so consumer behaviour is preserved. Drops three unused tree-sitter type imports from workspace.rs (Parser, Query, QueryCursor) — they were only used by the now-gone inline import-extraction. The pre-existing unused `ParsedFile` import in findings.rs is separate scope; left alone. 142 / 142 tests pass. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * feat(security): SEC010 language-aware dispatch over Layer 1 imports Replaces the polyglot needle list (`choice` | `randint` | ... | `Math.random`) with per-language matchers. Each language's matcher hard-codes the receiver path, so safe APIs that share function names (`secrets.choice` shares `choice`, `SecureRandom.nextInt` shares `nextInt`, `crypto/rand.Read` shares `Read`) never match. Per-language coverage: - **Python**: `random.X` only (`X` ∈ choice/choices/randint/sample/ uniform/randrange/shuffle/random). Excludes `secrets.X`, `os.urandom`, `np.random.X`, `random.SystemRandom().X`. - **JS / TS**: `Math.random` only. - **Go**: `math/rand` method set, with import-resolution via `parsed.resolve_receiver()` to disambiguate from `crypto/rand`. `import "math/rand"` → `rand.Intn` flags; `import "crypto/rand"` → `rand.Read` doesn't (and `Intn`/`Int31`/etc. don't even exist there). Falls back to method-name conservative default when no matching import is in the file. - **Java / Kotlin**: `nextInt`/`nextLong`/etc. on `Random` (or unqualified). `SecureRandom`-prefixed receivers excluded by receiver-path filter. - **C#**: `Next`/`NextDouble`/`NextBytes`. Excludes `RandomNumberGenerator` and any `Cryptography`-namespace receiver. - **PHP**: global `rand`/`mt_rand`/`array_rand`. `random_int` and `random_bytes` are CSPRNG and intentionally absent. - **Rust**: `gen`/`gen_range`/`gen_bool` from the rand crate. `OsRng` excluded by receiver-path filter. Drops PR #9's `safe_module_prefixes` whitelist hack — now redundant. Each language's matcher hard-codes the receiver path, so `secrets.choice` never reaches the SEC010 path in the first place because `is_python_weak_rng` requires `first == "random"`. Threads `&ParsedFile` down through `walk()` so language identity and import facts are available at the SEC010 call site. Other SEC checks unchanged. Existing 142 tests pass without modification — the dispatch preserves semantics for everything that was already flagging. Per-language regression tests added in the next commit. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * test(security): SEC010 per-language regression coverage 10 new tests covering the dispatch matrix from PR #10 step 4: Python: - random.SystemRandom() CSPRNG-backed → does NOT fire - numpy np.random.choice (statistical sampling) → does NOT fire JS / TS: - Math.random for token (chained .toString[2] form) → fires - window.crypto.getRandomValues → does NOT fire Go (with Layer 1 import resolution): - import "math/rand" + rand.Intn → fires - import "crypto/rand" + rand.Read → does NOT fire (same literal `rand.Read` text, different module per import resolution) Java: - new Random().nextInt(...) literal → fires - new SecureRandom().nextInt(...) literal → does NOT fire Note: variable-stored (`Random r = new Random(); r.nextInt`) is a known FN — without dataflow we can't recover the class identity from the receiver `r`. Worth less than the FP risk on `SecureRandom r; r.nextInt`. C# / PHP: - new Random().Next(...) → fires - PHP rand($min, $max) → fires - PHP random_int($min, $max) (CSPRNG) → does NOT fire Also fixes call-name extraction for Java's `method_invocation`, which lacks a `function` field — composes `object.name` from the two child fields so the receiver path is visible to the matcher. And extends `enclosing_token_context` to recognize Go (var_spec / short_var_declaration), Java (local_variable_declaration), and JS augmented assignment kinds, so the token-context detection works across the languages now in scope. 153 / 153 tests pass. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
This was referenced May 6, 2026
wei9072
added a commit
that referenced
this pull request
May 7, 2026
Adds the empirical validation work that drove PRs #6 — #12 into the repo as a reproducibility archive under `experiments/`. Contents: - `comparison-report.md` (1199 lines, rolling Round 1 → Round 9 analysis). Documents the three Aegis ROI mechanisms surfaced from data: 1. Rule-hit → fix (brownfield Plan B: 0/3 → 3/3 across 3 models) 2. Structural guardrail (cycle / public_symbol_removed — 0/14 hits, dead weight on clean architectures) 3. Anti-paralysis ritual (weak models complete tasks they would otherwise abandon) - 4 starting-code fixtures (Python brownfield, Go brownfield, Java brownfield, Python multi-module). - 11 prompt files. Each task has paired `-a.txt` (no Aegis) and `-b.txt` (with Aegis MCP + REQUIRED-workflow ritual instruction). - 52 round directories: per-model deliverables + `run.log` for codex-driven rounds. Naming: <model>-<task>-<a|b>. - `aegis_validate.py` — Python wrapper around aegis-mcp stdio JSON-RPC, used by agents to run validation after each write. - 3 eval scripts that diff each round's deliverables against the planted SEC bugs. Excluded via `.gitignore` and rsync filter (would have been ~970MB of bloat): venv / .venv / __pycache__ / .pytest_cache / .toolchain (Go toolchain copies codex downloads) / compiled binaries / git metadata of nested repos. Final archive: 17MB. Direct lineage from this archive into Aegis code: | Round | Surfaced | Fixed in | |---|---|---| | Round 8 codex | SEC010 FP on `secrets.choice` | PR #9 | | Round 9 Go / Java | SEC009 multi-language coverage = 0 | PR #12 | | Round 9 Java | SEC010 inner-block `break` hid production case | PR #11 | Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
5 tasks
wei9072
added a commit
that referenced
this pull request
May 7, 2026
#14) - Main README.md and README.zh-TW.md: new "Experiments archive" / 「實驗資料」 section linking to experiments/README.md - experiments/README.md: replaces the prose-only intro with four analysis charts: 1. Round structure overview (52 dirs, 26 paired, 11 models) 2. Plan B brownfield 0/3 → 3/3 fix-rate matrix with per-model remaining-bug counts (real numbers from re-validating the archive against current rule library) 3. Plan C multi-module task-completion matrix surfacing the anti-paralysis ROI mechanism (g54mini-mc-a abandoned vs g54mini-mc-b completed same task) 4. Mermaid flowchart of the three Aegis ROI mechanisms (rule-hit → fix; structural guardrail; anti-paralysis ritual) - Plus a "Direct lineage" chart connecting each experiment finding to the PR that fixed it (Round 8 → PR #9; Round 9 → PR #11/#12). Also re-imports `experiments/31flashlite-amb-b/` as plain files — the previous commit accidentally captured it as a gitlink because codex had created a nested `.git/` inside the round dir during the agent run. Removed the nested `.git/` and re-added the 6 actual deliverable files. Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Round 8 of the codex agent experiment (gpt-5.4-mini × Plan A
ambiguous spec) surfaced a false positive that PR #6's SEC010
expansion introduced (or made far more visible): the SECURE choice,
`secrets.choice`, was being flagged as weak RNG.
The bug:
`secrets.choice` shares the function name `choice` with
`random.choice`. The match was on the bare last segment, so both
fired. Combined with PR #6's opaque-id heuristic (which detects
`string.ascii_letters` in the enclosing function), every URL
shortener using `secrets` got flagged as if it were using
`random`. Active danger: the agent might "fix" secure code into
something worse.
Fix: add a known-secure module allowlist at the top of
`check_weak_random_for_token`:
`subtle.`
If the call name contains any of these, return early — even if the
last segment matches a weak-RNG name.
Test plan
regression tests)
now silent
for token still flagged)
🤖 Generated with Claude Code