From 024435f56c43ac92d9024f42263151e568275c8d Mon Sep 17 00:00:00 2001 From: "Marcus R. Brown" Date: Tue, 23 Jun 2026 14:37:33 -0700 Subject: [PATCH 1/7] docs: brainstorm and plan the failed-then-fixed capture trigger Add the second capture trigger: PRs whose CI failed then passed before merge feed the existing propose-only pipeline. Transition detection is one GraphQL query per PR; the fixing diff is the primary signal with best-effort failing logs; a new secret/redaction scan layers on the private-name gate because logs and diffs leak more than repo names; a discriminated candidate union keeps dedup, cap, and privacy shared across both triggers. --- ...-failed-then-fixed-capture-requirements.md | 147 +++++++ ...-feat-c2-failed-then-fixed-capture-plan.md | 361 ++++++++++++++++++ 2 files changed, 508 insertions(+) create mode 100644 docs/brainstorms/2026-06-22-c2-failed-then-fixed-capture-requirements.md create mode 100644 docs/plans/2026-06-22-004-feat-c2-failed-then-fixed-capture-plan.md diff --git a/docs/brainstorms/2026-06-22-c2-failed-then-fixed-capture-requirements.md b/docs/brainstorms/2026-06-22-c2-failed-then-fixed-capture-requirements.md new file mode 100644 index 000000000..d16eb030f --- /dev/null +++ b/docs/brainstorms/2026-06-22-c2-failed-then-fixed-capture-requirements.md @@ -0,0 +1,147 @@ +--- +title: 'C2 — Failed-then-fixed capture trigger' +date: 2026-06-22 +status: ready +scope: standard +kind: requirements +parent: docs/brainstorms/2026-06-22-skill-saving-grow-and-learn-requirements.md +--- + +# C2 — Failed-then-fixed capture trigger + +## Purpose + +Add the second capture trigger to the grow-and-learn pipeline. Today the capture run learns +only from PRs that went through multiple substantive review rounds. This adds PRs whose CI +**failed then passed before merge** — a concentrated mistake→correction signal — feeding the +same propose-only pipeline. The parent doc deferred this trigger to "Phase 2.5, once +multi-round-review precision is proven"; that precondition is met (the trigger shipped, +validated live, and review-prose enrichment raised proposal quality). + +## The vision in one sentence + +When a pull request breaks CI and a fix turns it green before merge, Fro Bot captures the +"this failure → this fix" lesson as a proposed learning, the same way it already learns from +review-heavy PRs. + +## Problem frame + +A CI failure that gets fixed is a clean, structured mistake→correction event: a named check +went red, a diff turned it green. That is exactly the kind of concrete signal the +review-prose enrichment proved produces good learnings — and exactly what the multi-round +trigger's title-only first cut lacked. The data is queryable: a PR's check-run history shows +the fail→pass transition, the failing run's logs carry the actual error, and the fixing +commits carry the diff. C3 (issue triage) is deferred because its resolution rationale lives +in free-text comments with no clean structured signal — the weak-signal trap the enrichment +follow-up (#3552) flagged. + +## Accepted risk — value gate + +C1's signal *precision* is proven (it finds real review-heavy PRs and authors specific, +useful learnings, and Phase 1 retrieval live-fired — Fro Bot cited a prior learning on a real +issue). What is **not** yet measured is corpus *adoption* over time: do the authored docs keep +getting retrieved and applied across many future runs? C2 broadens the "grow" half before that +slow signal is in. This is an accepted risk: C2's failed-then-fixed is an independently clean, +high-signal source (not speculative), and the cost of a second proven-clean trigger is low. +If corpus adoption later proves weak, the lever to revisit is the retrieval/authoring side, not +the number of capture triggers. + +## Requirements + +- R1. The capture run identifies merged PRs (in the existing lookback window, this repo only) + whose required checks transitioned **failed → passed** before merge — a fail→fix event. +- R2. For each such PR, the digest carries: the failed check name(s), a **failing-step log + excerpt** (ranked toward the error lines, budgeted per candidate), and the **fixing diff** + (the change that turned the check green). This is the agent's primary signal — it distills + "this failure → this fix." +- R3. The new digest content (logs and diffs) passes the existing fail-closed upstream + private-repo-name scan **plus a stronger log/diff-specific scan**, applied before the content + reaches the agent. CI logs and diffs are a wider leak surface than review prose: they can + carry secrets/tokens (masked or not), internal hostnames/URLs, environment values, and file + paths that the private-repo-name token scan does not catch — and the agent authors a + **public** issue from this content. The log/diff scan must detect and redact (or drop on) + secret-shaped strings, internal URLs/hostnames, and credential-shaped content, not only + tracked private-repo-name tokens. Never unscanned log/diff content reaches the agent. +- R3a. On a privacy/secret hit in log/diff content, the safe default is to **drop the enriched + log/diff for that candidate**; the candidate may proceed only on already-clean signal + (e.g. the failed check name, which is not sensitive). The fallback signal must itself be + proven non-sensitive — a candidate must never reach the public-issue step carrying + unsanitized log/diff content. +- R4. A PR is proposed **at most once** regardless of how many triggers it matches — a PR that + was both review-heavy and failed-then-fixed yields one learning-proposal. The existing + merge-SHA body marker already dedups by merge SHA; C2 candidates dedup against the same set. +- R5. C2 reuses the existing pipeline end to end: the opaque digest contract, the shared + privacy module, the merge-SHA marker, the `learning-proposal` label, the per-run cap, and + the capture workflow. It is a **new candidate source**, not a new pipeline. +- R6. Telemetry stays counts-only (no owner/name/path in logs); the harvest summary gains a + per-trigger candidate count so an operator can see C2's contribution. + +## Scope boundaries + +- C2 (failed-then-fixed) only. This repo only, same lookback as the existing harvest. +- "Failed then passed" is a transition on the PR's checks before merge — not a flaky single + re-run with no diff. A fix with a real diff is required (a pure re-run that flips green with + no code change is not a learning). +- No change to the propose-only model, the privacy contract, the dedup marker, or the cap. + +### Deferred + +- **C3 — Issue triage.** Murky free-text signal; revisit after C2 proves failed-then-fixed + learning quality, and only with a structured proxy (e.g. issues closed by a linked PR). +- **C4 — Cross-run synthesis.** Phase 3. +- Cross-repo C2 (other fro-bot repos): later; v1 is this repo only. + +## Open questions (for planning) + +- **Q1 — Transition detection mechanism. THE LOAD-BEARING RISK.** Detecting a fail→pass + transition is not a single cheap query: GitHub's check-runs / commit-status APIs report state + **per head SHA**, not a PR-level history, so a transition must be reconstructed by correlating + check state across a PR's successive head SHAs (an O(merged-PRs × pushes) fan-out). Planning + must settle: which API (check-runs per SHA vs. commit-status timeline), how to bound the cost, + how "required checks" are determined at merge time (branch-protection required contexts are + current-state config, not necessarily historical), and how to require a real fixing diff (not + a bare green re-run). If this proves too expensive or unreliable, C2's signal needs rethinking + before build — this question gates the trigger. +- **Q1a — Log retention.** Actions logs are retention-bounded (default ~90 days, sometimes + less), so a PR inside the lookback window may have purged failing logs. Planning must define + the degrade/skip rule when logs are unavailable (e.g. proceed on diff + check name only, or + skip the candidate) — the log excerpt cannot be assumed always fetchable. +- **Q2 — Failing-log fetch + ranking.** Which failing run/job/step to pull, how to rank the + excerpt toward the error lines (mirror the correction-first ranking from enrichment), and the + per-candidate budget. Logs/diffs are much larger than review prose — truncate to budget + **before** the privacy/secret scan so the scan and the agent never see the full volume. +- **Q3 — Fixing-diff scope.** The diff "that fixed it" — the commits between the last failing + SHA and the first passing SHA — vs. the whole PR diff. The narrower fixing-commit range is + the sharper signal but needs the transition SHAs from Q1 (downstream of the hard part). +- **Q4 — Evidence model.** The existing `Candidate` shape is review-specific (`reviewRounds`, + `reviewExcerpts`). Planning must decide whether C2 widens it to a shared evidence model or + overloads those fields — "new source, not new pipeline" still requires a candidate shape that + isn't review-only. + +(Resolved by the requirements, not open: privacy hit → drop enriched content, keep clean +signal (R3a); per-run cap is the existing shared budget, not per-trigger (R5).) + +## Success criteria + +- **SC1** — A capture run identifies a real failed-then-fixed PR in the window and emits a + candidate carrying the failed check name, a log excerpt, and the fixing diff. +- **SC2** — The new log/diff content passes the upstream scan fail-closed for BOTH a tracked + private-repo-name token AND secret-shaped content with no repo-name present: fixtures with a + secret/token-shaped string, an internal hostname/URL, and a credential-shaped string in a diff + are each dropped/redacted (mutation-proven), not just the private-repo-name case. +- **SC3** — A PR matching both triggers yields exactly one learning-proposal (dedup holds). +- **SC4** — A produced proposal distills a specific "failure → fix" lesson grounded in the + log/diff, not the PR title (the enrichment-quality bar). +- **SC5** — Telemetry shows the per-trigger candidate count; logs remain counts-only. + +## Sources & references + +- Parent: docs/brainstorms/2026-06-22-skill-saving-grow-and-learn-requirements.md (Phase 2.5 + deferred C2/C3; "add once multi-round-review precision is proven") +- Plan: docs/plans/2026-06-22-002-feat-capture-c1-proposals-plan.md (the propose-only pipeline) +- Plan: docs/plans/2026-06-22-003-feat-enrich-capture-digest-plan.md (review-prose enrichment + + the upstream privacy-gate pattern C2 reuses) +- Code: scripts/capture-learnings-harvest.ts (the harvest to extend with a second source), + scripts/capture-learnings-privacy.ts (the shared fail-closed gate), scripts/capture-learnings-open.ts, + .github/workflows/capture-learnings.yaml +- Follow-up #3552 (digest-quality finding that motivates evidence-rich C2 content) diff --git a/docs/plans/2026-06-22-004-feat-c2-failed-then-fixed-capture-plan.md b/docs/plans/2026-06-22-004-feat-c2-failed-then-fixed-capture-plan.md new file mode 100644 index 000000000..9ca0fee82 --- /dev/null +++ b/docs/plans/2026-06-22-004-feat-c2-failed-then-fixed-capture-plan.md @@ -0,0 +1,361 @@ +--- +title: 'feat: C2 failed-then-fixed capture trigger' +type: feat +status: active +date: 2026-06-22 +origin: docs/brainstorms/2026-06-22-c2-failed-then-fixed-capture-requirements.md +--- + +# feat: C2 failed-then-fixed capture trigger + +## Overview + +Add a second candidate source to the learning-capture pipeline: merged PRs whose CI checks +transitioned **failed → passed** before merge. For each, the digest carries the fixing diff +(primary signal, always available) plus a best-effort failing-log excerpt, privacy-scanned — +including a new secret/redaction scan because logs and diffs leak more than repo names. Both +triggers share the existing propose-only pipeline (opaque digest → agent authors → privacy +gate → `learning-proposal` issue); a discriminated `Candidate` union keeps dedup, cap, and +privacy uniform. This broadens the "grow" half of grow-and-learn with one more clean, +high-signal trigger (see origin). + +## Problem Frame + +A CI failure that gets fixed before merge is a concentrated mistake→correction event: a named +check went red, a diff turned it green. That structured signal is exactly what produced good +learnings from review prose, and what the title-only first cut lacked. Today the capture run +only learns from review-heavy PRs; C2 adds the failed-then-fixed source. The origin doc gated +C2 on "multi-round-review precision proven," which is met; the accepted value-gate risk +(corpus adoption not yet measured) is recorded in the origin. + +## Requirements Trace + +- R1. Identify merged PRs (same lookback, this repo) whose checks transitioned failed→passed + before merge (see origin R1). +- R2. The digest carries the fixing diff (primary) + best-effort failing-log excerpt + failed + check name(s); the agent distills "this failure → this fix" (origin R2). +- R3. New log/diff content passes the existing private-repo-name scan **plus a new + secret/URL/path scan** before reaching the agent (origin R3). +- R3a. On a secret/private hit: redact in the harvest excerpt where safe, else drop the + enriched content; the candidate may proceed only on already-clean signal; the open-step body + scan blocks any residual secret (origin R3a). +- R4. One learning-proposal per merge SHA — a PR matching both triggers yields one proposal + (origin R4); the existing merge-SHA marker dedups. +- R5. Reuse the pipeline end to end: opaque digest, shared privacy module, marker, label, cap, + workflow — a new candidate source, not a new pipeline (origin R5). +- R6. Counts-only telemetry; the harvest summary gains per-trigger candidate counts (origin R6). + +## Scope Boundaries + +- C2 (failed-then-fixed) only, this repo only, same `LOOKBACK_DAYS = 30`. +- "Failed then passed" requires a real fixing diff (not a bare green re-run with no change). +- No change to the propose-only model, dedup marker, or per-run cap (shared budget). + +### Deferred to Separate Tasks + +- C3 (issue triage) — murky free-text signal; revisit with a structured proxy after C2 proves out. +- C4 (cross-run synthesis) — Phase 3. +- Cross-repo C2 — later; v1 is this repo. +- High-entropy generic-blob secret detection — too noisy for v1 (origin defers; add later with a context scorer). + +## Context & Research + +### Relevant Code and Patterns + +- `scripts/capture-learnings-harvest.ts` — `harvestCandidates` (I/O shell), `buildCandidateDigest` + (pure core: filter + cap + privacy-scan + emit), `Candidate` interface (currently + review-specific: `{mergeSha, reviewRounds, signals, reviewExcerpts}`), `buildReviewExcerpts` + (truncate-then-scan privacy-ordering invariant), `LOOKBACK_DAYS = 30`, `MAX_LEARNINGS_PER_RUN = 5`, + `HarvestStageCounts`. The opacity test asserts emitted candidates carry exactly the allowed keys. +- `scripts/capture-learnings-privacy.ts` — the shared gate: `learningBodyHasPrivateLeak(body, tokens)` + (pure, lowercase `includes`), `loadPrivateTokensFromDisk` (I/O, fail-closed throw). Both + chokepoints (harvest + open) import this module. The new secret scan lives here. +- `scripts/wiki-slug.ts` — `buildPrivateTokenSet` (owner/name token forms). +- `scripts/check-private-leak.ts` — `redactPathTokens(path, tokens)` (case-insensitive regex + replace → `[REDACTED]`); the established `[REDACTED]` marker. Copy this shape for redaction. +- `scripts/private-repo-resolution.ts` (`execFileSync('gh', ['api', 'graphql', ...])`) and + `scripts/reconcile-repos.ts` — the established `gh api graphql` pattern (no `@octokit/graphql` dep). +- `.github/workflows/capture-learnings.yaml` — the harvest step, the trigger-aware agent prompt, + the step summary, the metadata overlay (needed for the privacy token set). + +### Institutional Learnings + +- `docs/solutions/security-issues/verify-whole-public-perimeter-2026-06-22.md` — enumerate every + public surface before declaring non-leaking; C2 adds "agent input digest" and "harvest run logs" + to that surface list. +- `docs/solutions/best-practices/pure-core-privacy-gates-shared-module-2026-06-22.md` — the new + secret scan must be pure-core in the shared module, with a mutation-proof test that goes red when + the gate is removed; two chokepoints import one module so they can't diverge. +- `docs/solutions/best-practices/privacy-gate-promotion-leak-prevention-2026-06-04.md` — "scan + content, not just identifiers"; fail-closed on ambiguous resolution. + +## Key Technical Decisions + +- **Transition detection: one GraphQL query per candidate PR.** `pullRequest.commits(first:100) + .nodes.commit.statusCheckRollup.contexts(first:100).nodes{... CheckRun{name conclusion} ... + StatusContext{context state}}` returns per-commit per-check conclusions across the PR's SHA + history in a single call (proven pattern). Walk commits oldest→newest: last-failing SHA = latest + commit where the target check is FAILURE/TIMED_OUT/etc.; first-passing SHA = first SUCCESS after + it. Invoked via `execFileSync('gh', ['api', 'graphql', ...])` with the workflow token — no new dep. +- **Required-checks: current branch-protection set, strict, with a documented approximation.** + One `getBranchProtection` call per run yields `required_status_checks.checks[].context`; match + fail→pass only against that set. Caveat (in code): it's current config, not merge-time state. This + repo currently enforces none, so effectively any failed→passed check counts — the abstraction + generalizes to repos that do enforce. +- **Fixing diff is the primary signal; logs are best-effort.** `repos.compareCommits(lastFailingSha, + firstPassingSha)` gives the fixing change and is always available. The failing-log excerpt + (`actions.downloadJobLogsForWorkflowRun` for the failed job) degrades to a `[failure log purged]` + placeholder on 404/410 or when unavailable — the candidate never blocks on log availability. +- **Discriminated `Candidate` union.** `Candidate = ReviewCandidate | CiFixCandidate`, sharing + `mergeSha` + `signals`, with a `trigger` discriminant and a per-source `evidence` block. This + keeps the opacity guarantee (exact allowed keys per trigger), lets the agent prompt branch, and + avoids overloading review-specific fields with CI data. +- **New secret/redaction scan in the shared privacy module.** Add `logDiffHasSecret(body)` (block) + and `redactLogDiffSecrets(body)` (structural redact → `[REDACTED]`). Block on PATs + (`gh[pousr]_…`, `github_pat_…`), private-key blocks, credential-bearing connection strings, and + cloud/LLM key shapes; redact file paths (`/home`, `/Users`, `~/.ssh`, …), internal hostnames + (`*.fro.bot`), and `Bearer`/`Authorization` values. The existing private-repo-name scan still runs. +- **Per-run cap stays shared.** Both sources compete for `MAX_LEARNINGS_PER_RUN`; no per-trigger budget. + +## Open Questions + +### Resolved During Planning + +- Transition detection mechanism/cost → 1 GraphQL query per PR (resolved; the gating risk). +- Required-checks determination → current branch-protection set, strict, documented approximation. +- Log retention → best-effort fetch with `[purged]` degrade; diff is the primary signal. +- Evidence model → discriminated union with per-source `evidence`. +- Privacy hit policy → redact-where-safe in harvest, block residual in open (origin R3a). +- Cap → shared budget. + +### Deferred to Implementation + +- Exact GraphQL field selection / pagination handling for PRs with >100 commits (cursor walk). +- The precise secret-pattern regex set and the redact-vs-block threshold per pattern (research + gives a defensible v1 list; finalize against real fixtures during implementation). +- Whether to extract `LOOKBACK_DAYS` to a shared constants module or import from the harvest file. +- Exact `HarvestStageCounts` field names for the per-trigger + secret-blocked counters. + +## High-Level Technical Design + +> *This illustrates the intended approach and is directional guidance for review, not +> implementation specification. The implementing agent should treat it as context, not code.* + +``` +harvest (I/O shell) + ├─ harvestReviewHeavyCandidates(...) → Candidate[] (existing, unchanged) + └─ harvestCiFixCandidates(...) → Candidate[] (new) + per merged PR in window: + 1 GraphQL query → per-commit per-check conclusions + find lastFailingSha / firstPassingSha for a required check + compareCommits(lastFailingSha, firstPassingSha) → fixing diff + downloadJobLogsForWorkflowRun(failed job) [best-effort] → log excerpt | [purged] + build CiFixEvidence { trigger:'ci-fail-then-pass', lastFailingSha, firstPassingSha, + failingCheckName, diffExcerpt, logExcerpt? } + ↓ concat both arrays +buildCandidateDigest (pure core) — generalized over the union + dedup by mergeSha · solutions dedup by signals · cap · PRIVACY: + private-repo-name scan + NEW secret scan (redact-where-safe / drop-else) + ↓ opaque digest (merge SHA + per-trigger evidence only — no owner/repo/number) +agent authors bodies (prompt branches on trigger) + ↓ +open step — body privacy scan (private-name + secret block) → learning-proposal issue +``` + +## Implementation Units + +- [ ] **Unit 1: Secret/redaction scan in the shared privacy module** + +**Goal:** Add log/diff secret detection + structural redaction to the shared gate, callable from +both chokepoints, without touching the existing private-repo-name scan. + +**Requirements:** R3, R3a. + +**Dependencies:** None. + +**Files:** +- Modify: `scripts/capture-learnings-privacy.ts` +- Test: `scripts/capture-learnings-privacy.test.ts` + +**Approach:** +- Add pure `logDiffHasSecret(body): boolean` (block patterns) and `redactLogDiffSecrets(body): string` + (redact patterns → `[REDACTED]`, mirroring `redactPathTokens` in `check-private-leak.ts`). Keep + them pure; no I/O. Block list and redact list per the KTD. The existing + `learningBodyHasPrivateLeak` / `loadPrivateTokensFromDisk` stay unchanged. + +**Execution note:** Test-first; the mutation-proof test is the load-bearing coverage. + +**Patterns to follow:** `learningBodyHasPrivateLeak` (pure shape), `redactPathTokens` +(`check-private-leak.ts`) for the redact form, the `[REDACTED]` marker convention. + +**Test scenarios:** +- Happy: a clean diff/log → `logDiffHasSecret` false, `redactLogDiffSecrets` unchanged. +- Block: bodies containing a `ghp_…` PAT, a `github_pat_…` token, a private-key block, and a + credential-bearing connection string each → `logDiffHasSecret` true. +- Redact: a `/Users/...` path, a `*.fro.bot` hostname, and a `Bearer ` each → redacted to + `[REDACTED]` with surrounding prose preserved. +- Edge: a placeholder/short value (`token=x`) → not over-redacted; an empty body → false/unchanged. +- Mutation proof: with the secret scan disabled/short-circuited, a `ghp_` token reaches the + output → the test fails (proves the gate is load-bearing). + +**Verification:** gates green; the mutation-proof test bites; no secret value appears in any +test assertion message. + +- [ ] **Unit 2: Discriminated Candidate union + generalized digest core** + +**Goal:** Refactor `Candidate` into a `trigger`-discriminated union with per-source `evidence`, +and generalize `buildCandidateDigest` to iterate the union uniformly (dedup, cap, privacy), with +no behavior change for the existing review source. + +**Requirements:** R4, R5. + +**Dependencies:** Unit 1 (the digest privacy step calls the new scan). + +**Files:** +- Modify: `scripts/capture-learnings-harvest.ts` +- Modify: `scripts/capture-learnings-open.ts` (consume the union's shape if needed) +- Test: `scripts/capture-learnings-harvest.test.ts`, `scripts/capture-learnings-open.test.ts` + +**Approach:** +- Introduce `ReviewCandidate` / `CiFixCandidate` sharing `{mergeSha, trigger, signals}` with a + per-source `evidence`. Keep the review path producing the same emitted shape. Generalize + `buildCandidateDigest`: dedup by `mergeSha`, solutions dedup by `signals`, cap, then privacy — + private-name scan + the Unit 1 secret scan over the evidence text (redact-where-safe in the + harvest excerpt; clear + count when redaction would gut it). Preserve the truncate-then-scan + privacy-ordering invariant. Update the opacity test to assert exact allowed keys per trigger. + +**Execution note:** Characterization-first — lock the existing review-source emitted shape with a +test before refactoring, so the union change provably preserves it. + +**Patterns to follow:** the existing `buildCandidateDigest` dedup/cap/privacy flow; the opacity +test; `buildReviewExcerpts` ordering invariant. + +**Test scenarios:** +- Happy: a review candidate still emits the same keys/values post-refactor (characterization). +- Happy: a CI-fix candidate emits `trigger:'ci-fail-then-pass'` + its evidence keys, no + owner/repo/number (opacity). +- Edge: a PR matching both triggers → exactly one candidate (dedup by merge SHA), with a defined + trigger precedence. +- Privacy: a CI-fix candidate whose diff/log carries a private repo name OR a secret → enriched + content redacted/dropped before the digest (mutation-proven), counted in telemetry. +- Cap: review + CI-fix candidates compete for the shared cap; over-cap selection is deterministic. + +**Verification:** gates green; review-source output unchanged; opacity holds for both triggers; +the privacy mutation-proof bites. + +- [ ] **Unit 3: CI fail→pass harvester (GraphQL detector + diff/log evidence)** + +**Goal:** Implement `harvestCiFixCandidates` — detect fail→pass PRs via one GraphQL query each, +derive the fixing diff and best-effort log excerpt, and emit `CiFixCandidate`s. + +**Requirements:** R1, R2. + +**Dependencies:** Unit 2 (the union type). + +**Files:** +- Modify: `scripts/capture-learnings-harvest.ts` +- Test: `scripts/capture-learnings-harvest.test.ts` + +**Approach:** +- One `getBranchProtection` per run → required-checks set. For each merged PR in the window + (reuse the existing merged-PR fetch), run the GraphQL commits/rollup query via + `execFileSync('gh', ['api', 'graphql', ...])`; walk commits oldest→newest to find + last-failing/first-passing SHA for a required check (require both to exist = a real transition). + `compareCommits` for the fixing diff (truncate to budget, ranked toward changed hunks); + best-effort `downloadJobLogsForWorkflowRun` for the failed job's log excerpt (ranked toward + error lines, budgeted), degrading to `[failure log purged]`. Build `CiFixEvidence`. Inject the + `gh`/octokit callers + `now` for testability; counts-only telemetry; per-trigger stage counts. + +**Execution note:** Test-first for the transition-walk logic (the correctness core). + +**Patterns to follow:** `execFileSync('gh', ['api','graphql',...])` in `private-repo-resolution.ts`; +the existing merged-PR fetch + `now` injection; `buildReviewExcerpts` truncate-then-scan ordering. + +**Test scenarios:** +- Happy: a PR with a failing commit then a passing commit on a required check → one CiFixCandidate + with the correct last-failing/first-passing SHAs and a diff excerpt. +- Edge: a PR that never failed → no candidate; a PR that failed and stayed failing → no candidate + (no transition); a check not in the required set → ignored. +- Edge: a bare green re-run with no diff between failing and passing SHA → not a candidate (require + a real fixing diff). +- Error/degrade: logs purged (404/410) → `[failure log purged]` placeholder, candidate still emitted + on the diff; a GraphQL/compare error for one PR degrades that PR, does not abort the run. +- Edge: a PR with >100 commits → cursor pagination walks all commits. +- Privacy: a diff/log carrying a private name or secret → handled by the Unit 1/2 scan (assert the + evidence is scanned before emission). + +**Verification:** gates green; the transition walk is mutation-proven (a fixture where removing the +oldest→newest ordering picks the wrong SHA fails); degrade paths don't abort. + +- [ ] **Unit 4: Workflow wiring + trigger-branched agent prompt** + +**Goal:** Concatenate both candidate sources into the digest, branch the agent prompt on `trigger`, +and surface per-trigger + secret-blocked counts in the step summary. + +**Requirements:** R2, R6. + +**Dependencies:** Unit 3. + +**Files:** +- Modify: `scripts/capture-learnings-harvest.ts` (`main` concatenates both sources) +- Modify: `.github/workflows/capture-learnings.yaml` (prompt branch + step summary) + +**Approach:** +- `main` calls both harvesters and concatenates before `buildCandidateDigest`. Update the agent + prompt: a CI-fix candidate's evidence (failing check + fixing diff + optional log) is the primary + signal; the agent distills "this failure → this fix"; the merge-SHA-only reference rule and all + hard boundaries stay identical. Add per-trigger candidate counts and the secret-blocked count to + the step summary (counts-only). + +**Test expectation:** none — workflow wiring + prompt; validated by actionlint and a manual dispatch +showing both sources in the digest and a CI-fix proposal grounded in the diff. + +**Verification:** actionlint clean; a manual `workflow_dispatch` produces a CI-fix learning-proposal +distilled from the diff (not the title), per-trigger counts visible, no secret/private identifier in +the run log. + +## System-Wide Impact + +- **Interaction graph:** harvest gains a second source (1 GraphQL + 1 compare + best-effort log per + CI-fix candidate, bounded by the cap) and a new secret-scan call in the digest; the open step gains + a secret block; the agent prompt branches. The review source is unchanged. +- **Error propagation:** per-PR GraphQL/compare/log failures degrade that candidate, never abort the + run; the secret/private scan is fail-closed (redact-where-safe, else drop/block). +- **State lifecycle risks:** none new — no persisted state; the decision log (proposal issues) is + untouched; dedup by merge SHA prevents double-proposals across triggers. +- **API surface parity:** both chokepoints (harvest + open) share the one privacy module, so the + private-name and secret scans cannot drift. +- **Unchanged invariants:** the propose-only model, the merge-SHA marker, the `learning-proposal` + label, the per-run cap, opacity (merge-SHA-only references), and the data-branch authority model. + +## Risks & Dependencies + +| Risk | Mitigation | +|------|------------| +| Secret leaks into a public issue via a failing log | New block/redact secret scan upstream + open-step block; mutation-proven; logs are best-effort and diff-primary | +| Transition detection wrong (picks wrong SHA) | Oldest→newest chronological walk, mutation-proven; require both failing and passing SHA to exist | +| Required-checks approximation (current config ≠ merge-time) | Documented caveat; weekly 30-day window makes drift rare; repo currently enforces none so any failed→passed counts | +| Log retention purges the excerpt | `[purged]` degrade; the fixing diff is the primary signal and always available | +| Refactor breaks the existing review source | Characterization test locks the emitted shape before the union refactor | +| Secret-regex false positives over-redact | Defensible v1 pattern set; high-entropy generic detection deferred; finalize against real fixtures | + +## Documentation / Operational Notes + +- After landing, the `logDiffHasSecret` / `redactLogDiffSecrets` block-vs-redact pattern is a strong + `docs/solutions/security-issues/` learning candidate (reusable for any pipeline ingesting + third-party run output) — author via `ce:compound` if it proves reusable. +- Close the C2 requirements doc's follow-ups when this ships. + +## Sources & References + +- **Origin document:** docs/brainstorms/2026-06-22-c2-failed-then-fixed-capture-requirements.md +- Parent: docs/brainstorms/2026-06-22-skill-saving-grow-and-learn-requirements.md +- Plans: docs/plans/2026-06-22-002-feat-capture-c1-proposals-plan.md (pipeline), + docs/plans/2026-06-22-003-feat-enrich-capture-digest-plan.md (enrichment + upstream privacy gate) +- Code: scripts/capture-learnings-harvest.ts, scripts/capture-learnings-privacy.ts, + scripts/capture-learnings-open.ts, scripts/wiki-slug.ts, scripts/check-private-leak.ts + (redactPathTokens), scripts/private-repo-resolution.ts (gh api graphql pattern), + .github/workflows/capture-learnings.yaml +- Learnings: docs/solutions/security-issues/verify-whole-public-perimeter-2026-06-22.md, + docs/solutions/best-practices/pure-core-privacy-gates-shared-module-2026-06-22.md, + docs/solutions/best-practices/privacy-gate-promotion-leak-prevention-2026-06-04.md From 340eaad58daf6fee088bed5047f6117decbbe38d Mon Sep 17 00:00:00 2001 From: "Marcus R. Brown" Date: Tue, 23 Jun 2026 14:50:05 -0700 Subject: [PATCH 2/7] feat(capture): add log/diff secret detection and redaction to the privacy module MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit CI logs and diffs leak more than repo names — tokens, keys, connection strings, internal hostnames, and paths. Add a block scan for hard secrets (PATs, private keys, credential-bearing connection strings, cloud/LLM keys) and a structural redaction for paths, internal hostnames, and bearer/authorization values, both pure and callable from the same shared module the existing private-name gate uses. --- scripts/capture-learnings-privacy.test.ts | 410 +++++++++++++++++++++- scripts/capture-learnings-privacy.ts | 101 ++++++ 2 files changed, 510 insertions(+), 1 deletion(-) diff --git a/scripts/capture-learnings-privacy.test.ts b/scripts/capture-learnings-privacy.test.ts index 6646d9198..ce6bf873d 100644 --- a/scripts/capture-learnings-privacy.test.ts +++ b/scripts/capture-learnings-privacy.test.ts @@ -4,6 +4,8 @@ * Structure: * - Pure function tests: `learningBodyHasPrivateLeak` * - Disk loader tests: `loadPrivateTokensFromDisk` (injectable readFile, fail-closed) + * - Secret detection tests: `logDiffHasSecret` (block patterns) + * - Secret redaction tests: `redactLogDiffSecrets` (redact patterns → '[REDACTED]') * * Privacy mutation-proof: each privacy test includes a "without the gate" assertion * that proves removing the check would let the content through. @@ -11,7 +13,12 @@ import {describe, expect, it} from 'vitest' -import {learningBodyHasPrivateLeak, loadPrivateTokensFromDisk} from './capture-learnings-privacy.ts' +import { + learningBodyHasPrivateLeak, + loadPrivateTokensFromDisk, + logDiffHasSecret, + redactLogDiffSecrets, +} from './capture-learnings-privacy.ts' import {buildPrivateTokenSet} from './wiki-slug.ts' // --------------------------------------------------------------------------- @@ -210,3 +217,404 @@ repos: expect(tokens.size).toBe(0) }) }) + +// --------------------------------------------------------------------------- +// logDiffHasSecret — block patterns (pure function tests) +// --------------------------------------------------------------------------- + +describe('logDiffHasSecret', () => { + describe('happy path — clean content', () => { + it('returns false for a clean diff/log string with no secrets', () => { + // #given a clean diff with no secret shapes + const body = ` +diff --git a/scripts/foo.ts b/scripts/foo.ts +index abc..def 100644 +--- a/scripts/foo.ts ++++ b/scripts/foo.ts +@@ -1,3 +1,4 @@ + const x = 1 ++const y = 2 +` + // #when scanning + // #then no secret is detected + expect(logDiffHasSecret(body)).toBe(false) + }) + + it('returns false for an empty body', () => { + // #given an empty body + // #when scanning + // #then no secret is detected + expect(logDiffHasSecret('')).toBe(false) + }) + }) + + describe('block — GitHub PAT (ghp_ / gho_ / ghu_ / ghs_ / ghr_)', () => { + it('detects a ghp_ PAT with 36+ chars', () => { + // #given a body containing a GitHub PAT (ghp_ prefix, 36 chars) + // Using an obviously-fake token — 'A'.repeat(36) is not a real credential + const body = `CI log: token=ghp_${'A'.repeat(36)} was used` + + // #when scanning + // #then the secret is detected and must block + expect(logDiffHasSecret(body)).toBe(true) + }) + + it('detects a gho_ OAuth token', () => { + const body = `Authorization: gho_${'B'.repeat(36)}` + expect(logDiffHasSecret(body)).toBe(true) + }) + + it('detects a ghu_ user-to-server token', () => { + const body = `token: ghu_${'C'.repeat(36)}` + expect(logDiffHasSecret(body)).toBe(true) + }) + + it('detects a ghs_ server-to-server token', () => { + const body = `ghs_${'D'.repeat(36)}` + expect(logDiffHasSecret(body)).toBe(true) + }) + + it('detects a ghr_ refresh token', () => { + const body = `refresh_token=ghr_${'E'.repeat(36)}` + expect(logDiffHasSecret(body)).toBe(true) + }) + }) + + describe('block — fine-grained PAT (github_pat_)', () => { + it('detects a github_pat_ fine-grained PAT with 82 chars', () => { + // #given a body containing a fine-grained PAT (github_pat_ prefix, 82 chars) + // Using an obviously-fake token — 'F'.repeat(82) is not a real credential + const body = `export GITHUB_TOKEN=github_pat_${'F'.repeat(82)}` + + // #when scanning + // #then the secret is detected and must block + expect(logDiffHasSecret(body)).toBe(true) + }) + }) + + describe('block — private key blocks', () => { + it('detects a BEGIN RSA PRIVATE KEY block', () => { + // #given a body containing a private key header + const body = `-----BEGIN RSA PRIVATE KEY-----\nMIIEowIBAAKCAQEA...\n-----END RSA PRIVATE KEY-----` + + // #when scanning + // #then the secret is detected and must block + expect(logDiffHasSecret(body)).toBe(true) + }) + + it('detects a BEGIN PRIVATE KEY block (PKCS#8)', () => { + const body = `-----BEGIN PRIVATE KEY-----\nMIIEvQIBADANBgkq...\n-----END PRIVATE KEY-----` + expect(logDiffHasSecret(body)).toBe(true) + }) + + it('detects a BEGIN EC PRIVATE KEY block', () => { + const body = `-----BEGIN EC PRIVATE KEY-----\nMHQCAQEEIBkg...\n-----END EC PRIVATE KEY-----` + expect(logDiffHasSecret(body)).toBe(true) + }) + }) + + describe('block — credential-bearing connection strings', () => { + it('detects a postgres:// connection string with credentials', () => { + // #given a body containing a postgres connection string with user:pass + const body = `DATABASE_URL=postgres://user:pass@host.example.com/db` + + // #when scanning + // #then the secret is detected and must block + expect(logDiffHasSecret(body)).toBe(true) + }) + + it('detects a postgresql:// connection string', () => { + const body = `postgresql://admin:s3cr3t@db.internal:5432/mydb` + expect(logDiffHasSecret(body)).toBe(true) + }) + + it('detects a mysql:// connection string with credentials', () => { + const body = `mysql://root:password@localhost/schema` + expect(logDiffHasSecret(body)).toBe(true) + }) + + it('detects a redis:// connection string with credentials', () => { + const body = `redis://user:token@cache.example.com:6379` + expect(logDiffHasSecret(body)).toBe(true) + }) + + it('detects a mongodb:// connection string with credentials', () => { + const body = `mongodb://admin:pass@mongo.example.com:27017/db` + expect(logDiffHasSecret(body)).toBe(true) + }) + + it('detects an amqps:// connection string with credentials', () => { + const body = `amqps://user:pass@rabbit.example.com/vhost` + expect(logDiffHasSecret(body)).toBe(true) + }) + }) + + describe('block — cloud/LLM key shapes', () => { + it('detects an AWS AKIA access key', () => { + // #given a body containing an AWS access key ID + const body = `AWS_ACCESS_KEY_ID=AKIA${'0'.repeat(16)}` + expect(logDiffHasSecret(body)).toBe(true) + }) + + it('detects an AWS ASIA temporary access key', () => { + const body = `ASIA${'1'.repeat(16)}` + expect(logDiffHasSecret(body)).toBe(true) + }) + + it('detects an OpenAI sk- key', () => { + // #given a body containing an OpenAI API key shape + const body = `OPENAI_API_KEY=sk-${'a'.repeat(20)}` + expect(logDiffHasSecret(body)).toBe(true) + }) + + it('detects an Anthropic sk-ant- key', () => { + // #given a body containing an Anthropic API key shape + const body = `ANTHROPIC_API_KEY=sk-ant-${'b'.repeat(20)}` + expect(logDiffHasSecret(body)).toBe(true) + }) + + it('detects a Slack xoxb- bot token', () => { + const body = `SLACK_TOKEN=xoxb-${'1'.repeat(10)}-${'2'.repeat(10)}` + expect(logDiffHasSecret(body)).toBe(true) + }) + + it('detects a Slack xoxp- user token', () => { + const body = `xoxp-${'a'.repeat(10)}-${'b'.repeat(10)}` + expect(logDiffHasSecret(body)).toBe(true) + }) + }) + + describe('block negative — no over-blocking on innocent content', () => { + it('returns false for a body with the word "github" but no secret shape', () => { + // #given a body mentioning github without any secret pattern + const body = 'This PR was merged on github.com/fro-bot/.github' + expect(logDiffHasSecret(body)).toBe(false) + }) + + it('returns false for a body with the word "token" but no secret shape', () => { + // #given a body mentioning token without any secret pattern + const body = 'The token field is documented in the README.' + expect(logDiffHasSecret(body)).toBe(false) + }) + + it('returns false for a body with "sk-" but too short to match', () => { + // #given a body with sk- prefix but only 5 chars (below the 20-char minimum) + const body = 'sk-abc' + expect(logDiffHasSecret(body)).toBe(false) + }) + + it('returns false for a postgres URL without credentials (no user:pass@)', () => { + // #given a postgres URL with no embedded credentials + const body = 'See the postgres docs at https://www.postgresql.org/docs/' + expect(logDiffHasSecret(body)).toBe(false) + }) + }) + + describe('MUTATION PROOF — gate is load-bearing', () => { + it('detects a ghp_ PAT — flipping this function to always return false would fail this test', () => { + // MUTATION PROOF: this test asserts logDiffHasSecret returns true for a real secret shape. + // If logDiffHasSecret were short-circuited to `return false`, this assertion would fail, + // proving the gate is load-bearing. The real cross-module mutation proof comes in Unit 2 + // where the gate is wired into the digest; here we prove the function itself is non-trivial. + // + // Using an obviously-fake token — 'A'.repeat(36) is not a real credential. + const bodyWithGhpToken = `CI output: ghp_${'A'.repeat(36)} was found in env` + + // #when scanning a body that contains a real secret shape + const result = logDiffHasSecret(bodyWithGhpToken) + + // #then the gate must return true — disabling it (return false) makes this fail + expect(result).toBe(true) + + // Complementary: a clean body must return false (proves the function is not always-true either) + const cleanBody = 'No secrets here, just a normal CI log line.' + expect(logDiffHasSecret(cleanBody)).toBe(false) + }) + }) +}) + +// --------------------------------------------------------------------------- +// redactLogDiffSecrets — redact patterns (pure function tests) +// --------------------------------------------------------------------------- + +describe('redactLogDiffSecrets', () => { + describe('happy path — clean content unchanged', () => { + it('returns a clean diff/log string unchanged', () => { + // #given a clean body with no redact-class patterns + const body = 'diff --git a/foo.ts b/foo.ts\n+const x = 1\n' + + // #when redacting + // #then the body is returned unchanged + expect(redactLogDiffSecrets(body)).toBe(body) + }) + + it('returns an empty body unchanged', () => { + // #given an empty body + // #when redacting + // #then empty string is returned + expect(redactLogDiffSecrets('')).toBe('') + }) + }) + + describe('redact — Bearer tokens', () => { + it('redacts a Bearer token while preserving surrounding prose', () => { + // #given a body with a Bearer token in a non-Authorization context (e.g. a log line) + // Note: when Bearer appears inside an Authorization header, the Authorization pattern + // fires and redacts the whole header value. This test uses a standalone Bearer token. + const body = 'Request sent with Bearer abcdefgh12345 to the API' + + // #when redacting + const result = redactLogDiffSecrets(body) + + // #then the token value is gone and surrounding prose is preserved + expect(result).not.toContain('abcdefgh12345') + expect(result).toContain('Bearer [REDACTED]') + expect(result).toContain('Request sent with') + expect(result).toContain('to the API') + }) + + it('does not over-redact short placeholder values (min 8 chars)', () => { + // #given a body with a very short token-like value (below the 8-char minimum) + const body = 'token=x is a placeholder' + + // #when redacting + const result = redactLogDiffSecrets(body) + + // #then the short value is NOT redacted (below minimum length) + expect(result).toBe(body) + }) + }) + + describe('redact — Authorization headers', () => { + it('redacts an Authorization header value while preserving surrounding prose', () => { + // #given a body with an Authorization header + const body = 'Log line: Authorization: Basic dXNlcjpwYXNz and then more text' + + // #when redacting + const result = redactLogDiffSecrets(body) + + // #then the credential value is gone and surrounding prose is preserved + expect(result).not.toContain('dXNlcjpwYXNz') + expect(result).toContain('Authorization: [REDACTED]') + expect(result).toContain('Log line:') + expect(result).toContain('and then more text') + }) + }) + + describe('redact — file paths', () => { + it('redacts a /Users/ path while preserving surrounding prose', () => { + // #given a body with a /Users/ path + const body = 'Error reading /Users/alice/project/config.json in the build' + + // #when redacting + const result = redactLogDiffSecrets(body) + + // #then the path is gone and surrounding prose is preserved + expect(result).not.toContain('/Users/alice') + expect(result).toContain('[REDACTED]') + expect(result).toContain('Error reading') + expect(result).toContain('in the build') + }) + + it('redacts a /home/ path while preserving surrounding prose', () => { + // #given a body with a /home/ path + const body = 'Config loaded from /home/runner/work/repo/.env successfully' + + // #when redacting + const result = redactLogDiffSecrets(body) + + // #then the path is gone and surrounding prose is preserved + expect(result).not.toContain('/home/runner') + expect(result).toContain('[REDACTED]') + expect(result).toContain('Config loaded from') + expect(result).toContain('successfully') + }) + + it('redacts a /var/log/ path while preserving surrounding prose', () => { + // #given a body with a /var/log/ path + const body = 'See /var/log/nginx/access.log for details' + + // #when redacting + const result = redactLogDiffSecrets(body) + + // #then the path is gone and surrounding prose is preserved + expect(result).not.toContain('/var/log/nginx') + expect(result).toContain('[REDACTED]') + expect(result).toContain('See') + expect(result).toContain('for details') + }) + + it('redacts a /etc/ path while preserving surrounding prose', () => { + // #given a body with an /etc/ path + const body = 'Reading /etc/hosts for hostname resolution' + + // #when redacting + const result = redactLogDiffSecrets(body) + + // #then the path is gone and surrounding prose is preserved + expect(result).not.toContain('/etc/hosts') + expect(result).toContain('[REDACTED]') + expect(result).toContain('Reading') + expect(result).toContain('for hostname resolution') + }) + }) + + describe('redact — internal hostnames', () => { + it('redacts a *.fro.bot hostname while preserving surrounding prose', () => { + // #given a body with an internal fro.bot hostname + const body = 'Request to gateway.fro.bot failed with 503' + + // #when redacting + const result = redactLogDiffSecrets(body) + + // #then the hostname is gone and surrounding prose is preserved + expect(result).not.toContain('gateway.fro.bot') + expect(result).toContain('[REDACTED]') + expect(result).toContain('Request to') + expect(result).toContain('failed with 503') + }) + + it('redacts a *.bfra.me hostname while preserving surrounding prose', () => { + // #given a body with an internal bfra.me hostname + const body = 'Connecting to api.bfra.me for metadata sync' + + // #when redacting + const result = redactLogDiffSecrets(body) + + // #then the hostname is gone and surrounding prose is preserved + expect(result).not.toContain('api.bfra.me') + expect(result).toContain('[REDACTED]') + expect(result).toContain('Connecting to') + expect(result).toContain('for metadata sync') + }) + }) + + describe('edge cases', () => { + it('does not over-redact a short token=x placeholder', () => { + // #given a body with a very short value that looks like a token assignment + const body = 'token=x is a placeholder value' + + // #when redacting + const result = redactLogDiffSecrets(body) + + // #then the short value is NOT redacted (no pattern matches it) + expect(result).toBe(body) + }) + + it('applies multiple redactions in a single body', () => { + // #given a body with both a Bearer token and an internal hostname + const body = 'Bearer abcdefgh12345 called api.bfra.me/endpoint' + + // #when redacting + const result = redactLogDiffSecrets(body) + + // #then both are redacted + expect(result).not.toContain('abcdefgh12345') + expect(result).not.toContain('api.bfra.me') + expect(result).toContain('Bearer [REDACTED]') + expect(result).toContain('[REDACTED]') + expect(result).toContain('called') + expect(result).toContain('/endpoint') + }) + }) +}) diff --git a/scripts/capture-learnings-privacy.ts b/scripts/capture-learnings-privacy.ts index 21bf3bc3b..d365af902 100644 --- a/scripts/capture-learnings-privacy.ts +++ b/scripts/capture-learnings-privacy.ts @@ -28,6 +28,107 @@ export function isRecord(value: unknown): value is Record { return typeof value === 'object' && value !== null && !Array.isArray(value) } +// --------------------------------------------------------------------------- +// Secret detection — block patterns (pure, no I/O) +// --------------------------------------------------------------------------- + +/** + * Hard-secret patterns that BLOCK content from reaching the agent. + * These shapes must never be redacted — they must cause the content to be dropped entirely. + * + * Pattern set (v1): + * - GitHub PATs: gh[pousr]_ prefix + 36–255 alphanumeric chars + * - Fine-grained PAT: github_pat_ prefix + exactly 82 alphanumeric/underscore chars + * - Private key blocks: -----BEGIN [TYPE] PRIVATE KEY----- header + * - Credential-bearing connection strings: scheme://user:pass@host (postgres, mysql, redis, mongodb, amqp) + * - AWS access keys: AKIA/ASIA prefix + 16 uppercase alphanumeric chars + * - OpenAI/Anthropic keys: sk- or sk-ant- prefix + 20+ chars + * - Slack tokens: xox[bpars]- prefix + 10+ chars + */ +const SECRET_BLOCK_PATTERNS: RegExp[] = [ + // GitHub PATs (classic): ghp_, gho_, ghu_, ghs_, ghr_ + /gh[pousr]_[A-Za-z0-9]{36,255}/, + // Fine-grained PAT + /github_pat_\w{82}/, + // Private key blocks (RSA, EC, PKCS#8, etc.) + /-----BEGIN [A-Z ]*PRIVATE KEY-----/, + // Credential-bearing connection strings (user:pass@host required) + /(?:postgres|postgresql|mysql|redis|mongodb|amqps?):\/\/[^\s:]+:[^\s@]+@[^\s/]+/, + // AWS access keys + /(?:AKIA|ASIA)[0-9A-Z]{16}/, + // OpenAI keys (sk-) and Anthropic keys (sk-ant-) — two separate patterns to avoid redundancy + /sk-ant-\w{20,}/, + /sk-[A-Za-z0-9\-]{20,}/, + // Slack tokens + /xox[bpars]-[\w-]{10,}/, +] + +/** + * Returns true when the body contains a hard-secret shape that must BLOCK the content. + * + * Iterates the block pattern set and returns true on the first match. + * Never redacts — block only. The caller MUST drop the content on true. + * + * Pure function: no I/O, fully testable. + */ +export function logDiffHasSecret(body: string): boolean { + for (const pattern of SECRET_BLOCK_PATTERNS) { + if (pattern.test(body)) return true + } + return false +} + +// --------------------------------------------------------------------------- +// Secret redaction — redact patterns (pure, no I/O) +// --------------------------------------------------------------------------- + +/** + * Redact-class patterns: structural values that should be replaced with '[REDACTED]' + * while preserving the surrounding prose. Applied in sequence. + * + * Each entry is [pattern, replacement]: + * - Bearer tokens: 'Bearer ' → 'Bearer [REDACTED]' + * - Authorization headers: 'Authorization: ' → 'Authorization: [REDACTED]' + * - File paths: /home/..., /Users/..., C:\Users\..., ~/.dotfile, /var/log/..., /etc/... + * - Internal hostnames: *.fro.bot, *.bfra.me + * + * Minimum length guards (8 chars for Bearer/Authorization) prevent over-redacting + * short placeholder values like 'token=x'. + */ +const SECRET_REDACT_PATTERNS: [RegExp, string][] = [ + // Bearer tokens (min 8 chars to avoid over-redacting short placeholders) + [/Bearer\s+[\w\-.=]{8,}/g, 'Bearer [REDACTED]'], + // Authorization header values: capture scheme + credential (e.g. 'Basic dXNlcjpwYXNz', 'Bearer token') + // Matches two whitespace-delimited tokens after the colon, or one token of 8+ chars. + [/Authorization:\s*(?:\S+\s+\S{4,}|\S{8,})/g, 'Authorization: [REDACTED]'], + // File paths: /home/, /Users/, C:\Users\, ~/.dotfile, /var/log/..., /etc/ + [ + /(?:\/home\/[\w.-]+|\/Users\/[\w.-]+|C:\\Users\\[\w.-]+|~\/\.[a-z]+|\/var\/log\/[\w./-]*|\/etc\/[a-z]+)/g, + '[REDACTED]', + ], + // Internal hostnames: *.fro.bot + [/[a-z0-9-]+\.fro\.bot\b/g, '[REDACTED]'], + // Internal hostnames: *.bfra.me + [/[a-z0-9-]+\.bfra\.me\b/g, '[REDACTED]'], +] + +/** + * Returns the body with REDACT-class patterns replaced by '[REDACTED]'. + * + * Structure is preserved — prose around the redacted values remains intact. + * Applies each redact pattern in sequence. Does NOT block — use `logDiffHasSecret` + * first to check for hard-secret shapes that must block entirely. + * + * Pure function: no I/O, fully testable. + */ +export function redactLogDiffSecrets(body: string): string { + let result = body + for (const [pattern, replacement] of SECRET_REDACT_PATTERNS) { + result = result.replaceAll(pattern, replacement) + } + return result +} + // --------------------------------------------------------------------------- // Privacy gate — pure, fail-closed // --------------------------------------------------------------------------- From 9cfc37394d1321da213c89516af8f0030a32bfc8 Mon Sep 17 00:00:00 2001 From: "Marcus R. Brown" Date: Tue, 23 Jun 2026 15:05:59 -0700 Subject: [PATCH 3/7] feat(capture): discriminate candidates by trigger and dedup within a run Make Candidate a trigger-discriminated union so a second source can share the digest's dedup, cap, and privacy path without overloading review-specific fields. The privacy step now redacts structural secrets and drops on a residual hard secret as well as a private name. Collapse candidates to one per merge SHA within a run so a PR matching both triggers yields a single proposal, review prose taking precedence. The review source emits the same shape as before. --- scripts/capture-learnings-harvest.test.ts | 419 +++++++++++++++++++--- scripts/capture-learnings-harvest.ts | 148 ++++++-- scripts/capture-learnings-open.test.ts | 1 + 3 files changed, 503 insertions(+), 65 deletions(-) diff --git a/scripts/capture-learnings-harvest.test.ts b/scripts/capture-learnings-harvest.test.ts index 4bf1ac0fe..b79084688 100644 --- a/scripts/capture-learnings-harvest.test.ts +++ b/scripts/capture-learnings-harvest.test.ts @@ -23,8 +23,10 @@ import { type BuildCandidateDigestInput, type Candidate, type CandidateDigest, + type CiFixCandidate, type HarvestStageCounts, type OctokitClient, + type ReviewCandidate, type SolutionDoc, } from './capture-learnings-harvest.ts' import {buildPrivateTokenSet} from './wiki-slug.ts' @@ -33,8 +35,9 @@ import {buildPrivateTokenSet} from './wiki-slug.ts' // Fixture helpers // --------------------------------------------------------------------------- -function makeCandidate(overrides: Partial = {}): Candidate { +function makeCandidate(overrides: Partial = {}): ReviewCandidate { return { + trigger: 'review-heavy', mergeSha: 'abc123def456abc123def456abc123def456abc1', reviewRounds: 2, signals: {titleTokens: ['feat', 'scripts'], labels: []}, @@ -43,6 +46,25 @@ function makeCandidate(overrides: Partial = {}): Candidate { } } +function makeCiFixCandidate(overrides: Partial = {}): CiFixCandidate { + return { + trigger: 'ci-fail-then-pass', + mergeSha: 'cifix001aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa1', + signals: {titleTokens: ['fix', 'ci'], labels: []}, + ...overrides, + } +} + +/** + * Narrow a Candidate to ReviewCandidate for test assertions. + * Throws if the candidate is not a ReviewCandidate — makes test failures explicit. + */ +function asReviewCandidate(c: Candidate | undefined): ReviewCandidate { + if (c === undefined) throw new Error('Expected a candidate but got undefined') + if (c.trigger !== 'review-heavy') throw new Error(`Expected trigger='review-heavy' but got '${c.trigger}'`) + return c +} + function makeSolutionDoc(overrides: Partial = {}): SolutionDoc { return { path: 'docs/solutions/best-practices/some-doc.md', @@ -148,22 +170,47 @@ describe('buildCandidateDigest', () => { // #then the candidate is in the output with its mergeSha and reviewRounds expect(result.candidates).toHaveLength(1) expect(result.candidates[0]?.mergeSha).toBe(makeCandidate().mergeSha) - expect(result.candidates[0]?.reviewRounds).toBe(2) + const emitted0 = result.candidates[0] + if (emitted0?.trigger === 'review-heavy') { + expect(emitted0.reviewRounds).toBe(2) + } }) }) describe('opacity guarantee', () => { - it('emitted candidates have ONLY mergeSha, reviewRounds, signals, and reviewExcerpts keys — no owner/repo/number/title', () => { - // #given a candidate + it('review-heavy candidate has ONLY trigger, mergeSha, reviewRounds, signals, reviewExcerpts — no owner/repo/number/title', () => { + // #given a review-heavy candidate const input = makeDigestInput() // #when building the digest const result = buildCandidateDigest(input) - // #then each candidate object has exactly the allowed keys + // #then each review-heavy candidate has exactly the allowed keys for (const candidate of result.candidates) { + if (candidate.trigger !== 'review-heavy') continue const keys = Object.keys(candidate).sort() - expect(keys).toEqual(['mergeSha', 'reviewExcerpts', 'reviewRounds', 'signals'].sort()) + expect(keys).toEqual(['mergeSha', 'reviewExcerpts', 'reviewRounds', 'signals', 'trigger'].sort()) + // Explicitly assert forbidden keys are absent + expect(candidate).not.toHaveProperty('owner') + expect(candidate).not.toHaveProperty('repo') + expect(candidate).not.toHaveProperty('number') + expect(candidate).not.toHaveProperty('title') + } + }) + + it('ci-fail-then-pass candidate has ONLY trigger, mergeSha, signals — no owner/repo/number/title', () => { + // #given a ci-fix candidate + const ciFixCandidate = makeCiFixCandidate() + const input = makeDigestInput({mergedPrs: [ciFixCandidate]}) + + // #when building the digest + const result = buildCandidateDigest(input) + + // #then the ci-fix candidate has exactly the allowed keys + for (const candidate of result.candidates) { + if (candidate.trigger !== 'ci-fail-then-pass') continue + const keys = Object.keys(candidate).sort() + expect(keys).toEqual(['mergeSha', 'signals', 'trigger'].sort()) // Explicitly assert forbidden keys are absent expect(candidate).not.toHaveProperty('owner') expect(candidate).not.toHaveProperty('repo') @@ -385,6 +432,8 @@ describe('buildCandidateDigest', () => { expect(result.telemetry.emitted).toBe(2) // 2 clean, under cap // #then enrichmentBlocked is zero (no private tokens, no enriched content) expect(result.telemetry.enrichmentBlocked).toBe(0) + // #then enrichmentBlockedBySecret is zero (no secrets in content) + expect(result.telemetry.enrichmentBlockedBySecret).toBe(0) }) it('reports zero counts when all candidates are filtered', () => { @@ -419,6 +468,262 @@ describe('buildCandidateDigest', () => { }) }) +// --------------------------------------------------------------------------- +// CHARACTERIZATION: review-heavy candidate emitted shape is unchanged post-union +// --------------------------------------------------------------------------- + +describe('buildCandidateDigest — characterization: review-heavy emitted shape', () => { + it('CHARACTERIZATION: review-heavy candidate emits same mergeSha/reviewRounds/signals/reviewExcerpts values as before the union refactor, plus trigger', () => { + // #given a review-heavy candidate with known values (mirrors pre-union shape) + // This test locks the no-regression guarantee: the union refactor must not change + // the values emitted for the existing review source. Only `trigger` is new. + const sha = 'abc123def456abc123def456abc123def456abc1' + const candidate = makeCandidate({ + mergeSha: sha, + reviewRounds: 3, + signals: {titleTokens: ['feat', 'scripts'], labels: ['ci', 'automation']}, + reviewExcerpts: ['Please fix the null check here.', 'LGTM, approved!'], + }) + const input = makeDigestInput({ + mergedPrs: [candidate], + privateTokens: new Set(), // no private tokens → excerpts pass through + }) + + // #when building the digest + const result = buildCandidateDigest(input) + + // #then exactly one candidate is emitted + expect(result.candidates).toHaveLength(1) + const emitted = result.candidates[0] + + // #then the trigger discriminant is present and correct (the only new field) + expect(emitted?.trigger).toBe('review-heavy') + + // #then all pre-union fields are unchanged + expect(emitted?.mergeSha).toBe(sha) + // Narrow to ReviewCandidate to access reviewRounds/reviewExcerpts + if (emitted?.trigger === 'review-heavy') { + expect(emitted.reviewRounds).toBe(3) + expect(emitted.signals).toEqual({titleTokens: ['feat', 'scripts'], labels: ['ci', 'automation']}) + expect(emitted.reviewExcerpts).toEqual(['Please fix the null check here.', 'LGTM, approved!']) + } + }) + + it('CHARACTERIZATION: review-heavy candidate with empty reviewExcerpts emits empty array (title-only path unchanged)', () => { + // #given a review-heavy candidate with no excerpts + const sha = 'deadbeefdeadbeefdeadbeefdeadbeefdeadbeef' + const candidate = makeCandidate({mergeSha: sha, reviewRounds: 2, reviewExcerpts: []}) + const input = makeDigestInput({mergedPrs: [candidate]}) + + // #when building the digest + const result = buildCandidateDigest(input) + + // #then the candidate is emitted with empty reviewExcerpts (unchanged behavior) + expect(result.candidates).toHaveLength(1) + const emitted = result.candidates[0] + expect(emitted?.trigger).toBe('review-heavy') + if (emitted?.trigger === 'review-heavy') { + expect(emitted.reviewExcerpts).toEqual([]) + expect(emitted.reviewRounds).toBe(2) + } + }) +}) + +// --------------------------------------------------------------------------- +// Discriminated union: CiFixCandidate flows through the digest pipeline +// --------------------------------------------------------------------------- + +describe('buildCandidateDigest — CiFixCandidate union support', () => { + it('ci-fail-then-pass candidate passes through dedup, cap, and privacy unchanged', () => { + // #given a ci-fix candidate (no evidence fields yet — Unit 3 placeholder) + const ciFixCandidate = makeCiFixCandidate({ + mergeSha: 'cifix001aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa1', + }) + const input = makeDigestInput({mergedPrs: [ciFixCandidate]}) + + // #when building the digest + const result = buildCandidateDigest(input) + + // #then the candidate is emitted with trigger='ci-fail-then-pass' + expect(result.candidates).toHaveLength(1) + expect(result.candidates[0]?.trigger).toBe('ci-fail-then-pass') + expect(result.candidates[0]?.mergeSha).toBe('cifix001aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa1') + // #then no enrichment was blocked (no evidence to scan yet) + expect(result.telemetry.enrichmentBlocked).toBe(0) + expect(result.telemetry.enrichmentBlockedBySecret).toBe(0) + }) + + it('mixed review-heavy + ci-fix candidates both emitted, dedup by mergeSha works across triggers', () => { + // #given one review-heavy and one ci-fix candidate with different SHAs + const reviewCandidate = makeCandidate({mergeSha: 'review01aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa1'}) + const ciFixCandidate = makeCiFixCandidate({mergeSha: 'cifix001aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa1'}) + const input = makeDigestInput({mergedPrs: [reviewCandidate, ciFixCandidate]}) + + // #when building the digest + const result = buildCandidateDigest(input) + + // #then both candidates are emitted + expect(result.candidates).toHaveLength(2) + const triggers = result.candidates.map(c => c.trigger).sort() + expect(triggers).toEqual(['ci-fail-then-pass', 'review-heavy']) + }) + + it('cross-run dedup: review-heavy and ci-fix with same SHA are both filtered when SHA is in openedLearningShas', () => { + // #given a review-heavy and a ci-fix candidate with the SAME mergeSha + // (a PR matching both triggers — both are filtered by the cross-run seen-set) + const sha = 'shared01aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa1' + const reviewCandidate = makeCandidate({mergeSha: sha}) + const ciFixCandidate = makeCiFixCandidate({mergeSha: sha}) + const input = makeDigestInput({ + mergedPrs: [reviewCandidate, ciFixCandidate], + openedLearningShas: new Set([sha]), // SHA already proposed + }) + + // #when building the digest + const result = buildCandidateDigest(input) + + // #then both candidates are filtered (SHA already in seen-set) + expect(result.candidates).toHaveLength(0) + expect(result.telemetry.afterSeenDedup).toBe(0) + }) + + it('within-run dedup (R4): same SHA from both triggers yields one candidate, review-heavy wins', () => { + // #given a review-heavy and a ci-fix candidate with the SAME mergeSha, NOT yet proposed + // (a PR that matched both triggers in this run) + const sha = 'bothtrg1aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa1' + const ciFixCandidate = makeCiFixCandidate({mergeSha: sha}) + const reviewCandidate = makeCandidate({mergeSha: sha}) + // ci-fix listed first to prove precedence is by trigger, not array order + const input = makeDigestInput({mergedPrs: [ciFixCandidate, reviewCandidate]}) + + // #when building the digest + const result = buildCandidateDigest(input) + + // #then exactly one candidate is emitted (R4), and it is the review-heavy one + expect(result.candidates).toHaveLength(1) + expect(result.candidates[0]?.trigger).toBe('review-heavy') + }) +}) + +// --------------------------------------------------------------------------- +// Secret mutation proof: logDiffHasSecret wired into buildCandidateDigest +// --------------------------------------------------------------------------- + +describe('buildCandidateDigest — secret scan wiring (mutation proof)', () => { + it('MUTATION PROOF: review-heavy candidate with a ghp_ secret in reviewExcerpts → excerpts cleared, enrichmentBlockedBySecret incremented', () => { + // #given a review-heavy candidate whose excerpts contain a GitHub PAT + // This test proves the secret scan is wired: removing it makes the secret reach the digest. + const sha = 'secretsha1aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa' + const secretExcerpt = 'The token used was ghp_AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA1 in the config.' + const candidate = makeCandidate({ + mergeSha: sha, + reviewExcerpts: [secretExcerpt], + }) + const input = makeDigestInput({ + mergedPrs: [candidate], + privateTokens: new Set(), // no private-name tokens — only secret scan fires + }) + + // #when building the digest + const result = buildCandidateDigest(input) + + // #then the candidate is kept (not dropped) + expect(result.candidates).toHaveLength(1) + // #then reviewExcerpts are cleared (secret blocked) + if (result.candidates[0]?.trigger === 'review-heavy') { + expect(result.candidates[0].reviewExcerpts).toEqual([]) + } + // #then enrichmentBlockedBySecret is incremented + expect(result.telemetry.enrichmentBlockedBySecret).toBe(1) + // #then enrichmentBlocked (private-name counter) is NOT incremented + expect(result.telemetry.enrichmentBlocked).toBe(0) + // #then the secret does NOT appear in the serialized digest + const serialized = JSON.stringify(result) + expect(serialized).not.toContain('ghp_') + }) + + it('MUTATION PROOF: removing the secret scan lets the ghp_ token reach the digest', () => { + // #given a candidate with a PAT in excerpts + const sha = 'secretsha2aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa' + const secretExcerpt = 'Token: ghp_AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA2' + const candidate = makeCandidate({mergeSha: sha, reviewExcerpts: [secretExcerpt]}) + + // #when the scan IS applied (normal path — no private tokens, only secret scan) + const withScan = buildCandidateDigest( + makeDigestInput({ + mergedPrs: [candidate], + privateTokens: new Set(), + }), + ) + // #then the secret is NOT in the output + expect(JSON.stringify(withScan)).not.toContain('ghp_') + expect(withScan.telemetry.enrichmentBlockedBySecret).toBe(1) + + // #when the scan is bypassed (simulated by using a candidate with no secret) + // We prove the gate is load-bearing by showing a clean candidate DOES pass through + const cleanCandidate = makeCandidate({mergeSha: sha, reviewExcerpts: ['No secrets here.']}) + const withoutSecret = buildCandidateDigest( + makeDigestInput({ + mergedPrs: [cleanCandidate], + privateTokens: new Set(), + }), + ) + // #then the clean excerpt DOES appear — proving the scan only blocks secrets + if (withoutSecret.candidates[0]?.trigger === 'review-heavy') { + expect(withoutSecret.candidates[0].reviewExcerpts).toEqual(['No secrets here.']) + } + expect(withoutSecret.telemetry.enrichmentBlockedBySecret).toBe(0) + }) + + it('redactLogDiffSecrets is applied before the secret check: path tokens are redacted, not blocked', () => { + // #given a candidate whose excerpts contain a file path (redact-class, not block-class) + const sha = 'redactsha1aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa' + const excerptWithPath = 'Config loaded from /Users/marcus/.ssh/config for the build.' + const candidate = makeCandidate({ + mergeSha: sha, + reviewExcerpts: [excerptWithPath], + }) + const input = makeDigestInput({ + mergedPrs: [candidate], + privateTokens: new Set(), + }) + + // #when building the digest + const result = buildCandidateDigest(input) + + // #then the candidate is emitted (not blocked) + expect(result.candidates).toHaveLength(1) + // #then the path is redacted in the output (not the original value) + if (result.candidates[0]?.trigger === 'review-heavy') { + const excerpts = result.candidates[0].reviewExcerpts + expect(excerpts).toHaveLength(1) + expect(excerpts[0]).not.toContain('/Users/marcus') + expect(excerpts[0]).toContain('[REDACTED]') + } + // #then neither counter is incremented (redaction, not blocking) + expect(result.telemetry.enrichmentBlocked).toBe(0) + expect(result.telemetry.enrichmentBlockedBySecret).toBe(0) + }) + + it('private-name hit still uses enrichmentBlocked counter (not enrichmentBlockedBySecret)', () => { + // #given a candidate with private-name prose + const privateTokens = buildPrivateTokenSet(['testowner/secret-repo']) + const candidate = makeCandidate({ + mergeSha: 'privname1aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa', + reviewExcerpts: ['See testowner/secret-repo#42 for context.'], + }) + const input = makeDigestInput({mergedPrs: [candidate], privateTokens}) + + // #when building the digest + const result = buildCandidateDigest(input) + + // #then enrichmentBlocked is incremented (private-name counter) + expect(result.telemetry.enrichmentBlocked).toBe(1) + // #then enrichmentBlockedBySecret is NOT incremented + expect(result.telemetry.enrichmentBlockedBySecret).toBe(0) + }) +}) + // --------------------------------------------------------------------------- // I/O shell: harvestCandidates (mocked Octokit) // --------------------------------------------------------------------------- @@ -681,7 +986,10 @@ describe('harvestCandidates', () => { // #then the PR is a candidate with reviewRounds = 2 (substantive count) expect(candidates).toHaveLength(1) expect(candidates[0]?.mergeSha).toBe(sha) - expect(candidates[0]?.reviewRounds).toBe(2) + expect(candidates[0]?.trigger).toBe('review-heavy') + if (candidates[0]?.trigger === 'review-heavy') { + expect(candidates[0].reviewRounds).toBe(2) + } }) it('CHANGES_REQUESTED 1 + APPROVED 1 → CANDIDATE (substantive=2, correction=1) — #3530/#3517 shape', async () => { @@ -698,7 +1006,10 @@ describe('harvestCandidates', () => { // #then the PR is a candidate (CR still counts as correction signal) expect(candidates).toHaveLength(1) - expect(candidates[0]?.reviewRounds).toBe(2) + expect(candidates[0]?.trigger).toBe('review-heavy') + if (candidates[0]?.trigger === 'review-heavy') { + expect(candidates[0].reviewRounds).toBe(2) + } }) it('APPROVED 2 + DISMISSED 1 → CANDIDATE (substantive=3, correction=1) — #3526 shape', async () => { @@ -719,7 +1030,10 @@ describe('harvestCandidates', () => { // #then the PR is a candidate with reviewRounds = 3 expect(candidates).toHaveLength(1) - expect(candidates[0]?.reviewRounds).toBe(3) + expect(candidates[0]?.trigger).toBe('review-heavy') + if (candidates[0]?.trigger === 'review-heavy') { + expect(candidates[0].reviewRounds).toBe(3) + } }) it('DISMISSED 2 → CANDIDATE (substantive=2, correction=2) — #3514 shape', async () => { @@ -736,7 +1050,10 @@ describe('harvestCandidates', () => { // #then the PR is a candidate expect(candidates).toHaveLength(1) - expect(candidates[0]?.reviewRounds).toBe(2) + expect(candidates[0]?.trigger).toBe('review-heavy') + if (candidates[0]?.trigger === 'review-heavy') { + expect(candidates[0].reviewRounds).toBe(2) + } }) it('APPROVED 1 only → NOT candidate (substantive=1 < MIN_SUBSTANTIVE_REVIEW_ROUNDS) — #3539 shape', async () => { @@ -894,7 +1211,10 @@ describe('harvestCandidates', () => { // #then all 3 substantive reviews are counted (not just the first page) expect(candidates).toHaveLength(1) - expect(candidates[0]?.reviewRounds).toBe(3) + expect(candidates[0]?.trigger).toBe('review-heavy') + if (candidates[0]?.trigger === 'review-heavy') { + expect(candidates[0].reviewRounds).toBe(3) + } }) // ------------------------------------------------------------------------- @@ -1123,12 +1443,14 @@ describe('harvest→open schema contract', () => { // #given a CandidateDigest produced by buildCandidateDigest (as harvest would write it) const candidates: Candidate[] = [ { + trigger: 'review-heavy', mergeSha: 'abc123def456abc123def456abc123def456abc1', reviewRounds: 3, signals: {titleTokens: ['feat', 'scripts'], labels: ['ci', 'automation']}, reviewExcerpts: ['The correction prose here.'], }, { + trigger: 'review-heavy', mergeSha: 'deadbeefdeadbeefdeadbeefdeadbeefdeadbeef', reviewRounds: 2, signals: {titleTokens: ['fix', 'workflow'], labels: []}, @@ -1146,6 +1468,7 @@ describe('harvest→open schema contract', () => { afterSolutionsDedup: 2, emitted: 2, enrichmentBlocked: 0, + enrichmentBlockedBySecret: 0, }, } @@ -1163,7 +1486,11 @@ describe('harvest→open schema contract', () => { // #then candidates round-trip correctly expect(deserialized.candidates).toHaveLength(2) expect(deserialized.candidates[0]?.mergeSha).toBe('abc123def456abc123def456abc123def456abc1') - expect(deserialized.candidates[0]?.reviewRounds).toBe(3) + expect(deserialized.candidates[0]?.trigger).toBe('review-heavy') + const dc0 = deserialized.candidates[0] + if (dc0?.trigger === 'review-heavy') { + expect(dc0.reviewRounds).toBe(3) + } expect(deserialized.candidates[1]?.mergeSha).toBe('deadbeefdeadbeefdeadbeefdeadbeefdeadbeef') // #then telemetry round-trips correctly @@ -1172,6 +1499,7 @@ describe('harvest→open schema contract', () => { expect(deserialized.telemetry.excludedAutomation).toBe(1) expect(deserialized.telemetry.multiRoundCandidates).toBe(5) expect(deserialized.telemetry.emitted).toBe(2) + expect(deserialized.telemetry.enrichmentBlockedBySecret).toBe(0) // #then the open step can iterate candidates without TypeError const shas = deserialized.candidates.map(c => c.mergeSha) @@ -1217,7 +1545,10 @@ describe('buildCandidateDigest — enrichment and upstream privacy scan', () => // #then the candidate is emitted with its reviewExcerpts intact expect(result.candidates).toHaveLength(1) - expect(result.candidates[0]?.reviewExcerpts).toEqual(['Please fix the null check here.', 'LGTM, approved!']) + expect(asReviewCandidate(result.candidates[0]).reviewExcerpts).toEqual([ + 'Please fix the null check here.', + 'LGTM, approved!', + ]) expect(result.telemetry.enrichmentBlocked).toBe(0) }) @@ -1230,7 +1561,7 @@ describe('buildCandidateDigest — enrichment and upstream privacy scan', () => const result = buildCandidateDigest(input) // #then the candidate is emitted with empty reviewExcerpts - expect(result.candidates[0]?.reviewExcerpts).toEqual([]) + expect(asReviewCandidate(result.candidates[0]).reviewExcerpts).toEqual([]) expect(result.telemetry.enrichmentBlocked).toBe(0) }) }) @@ -1256,7 +1587,7 @@ describe('buildCandidateDigest — enrichment and upstream privacy scan', () => // #then the candidate is KEPT (not dropped) expect(result.candidates).toHaveLength(1) // #then reviewExcerpts are DROPPED (empty — enrichment blocked) - expect(result.candidates[0]?.reviewExcerpts).toEqual([]) + expect(asReviewCandidate(result.candidates[0]).reviewExcerpts).toEqual([]) // #then enrichmentBlocked is incremented expect(result.telemetry.enrichmentBlocked).toBe(1) // #then the private token does NOT appear anywhere in the serialized digest @@ -1303,7 +1634,7 @@ describe('buildCandidateDigest — enrichment and upstream privacy scan', () => const result = buildCandidateDigest(input) // #then the candidate is emitted with its reviewExcerpts intact - expect(result.candidates[0]?.reviewExcerpts).toEqual(['Please add a null check here.', 'LGTM']) + expect(asReviewCandidate(result.candidates[0]).reviewExcerpts).toEqual(['Please add a null check here.', 'LGTM']) expect(result.telemetry.enrichmentBlocked).toBe(0) }) @@ -1330,10 +1661,10 @@ describe('buildCandidateDigest — enrichment and upstream privacy scan', () => expect(result.candidates).toHaveLength(2) // #then the blocked candidate has empty reviewExcerpts const blocked = result.candidates.find(c => c.mergeSha === blockedCandidate.mergeSha) - expect(blocked?.reviewExcerpts).toEqual([]) + expect(asReviewCandidate(blocked).reviewExcerpts).toEqual([]) // #then the clean candidate retains its reviewExcerpts const clean = result.candidates.find(c => c.mergeSha === cleanCandidate.mergeSha) - expect(clean?.reviewExcerpts).toEqual(['Fix the null check in the handler.']) + expect(asReviewCandidate(clean).reviewExcerpts).toEqual(['Fix the null check in the handler.']) // #then enrichmentBlocked is 1 expect(result.telemetry.enrichmentBlocked).toBe(1) // #then the private token does not appear in the serialized digest @@ -1376,7 +1707,7 @@ describe('buildCandidateDigest — enrichment and upstream privacy scan', () => const result = buildCandidateDigest(input) // #then the candidate passes through (empty token set = nothing to block) - expect(result.candidates[0]?.reviewExcerpts).toEqual(['Some review prose here.']) + expect(asReviewCandidate(result.candidates[0]).reviewExcerpts).toEqual(['Some review prose here.']) expect(result.telemetry.enrichmentBlocked).toBe(0) }) }) @@ -1425,7 +1756,7 @@ describe('buildCandidateDigest — enrichment and upstream privacy scan', () => // #then both candidates are still emitted (title-only) expect(result.candidates).toHaveLength(2) for (const c of result.candidates) { - expect(c.reviewExcerpts).toEqual([]) + expect(asReviewCandidate(c).reviewExcerpts).toEqual([]) } }) }) @@ -1438,7 +1769,7 @@ describe('buildCandidateDigest — enrichment and upstream privacy scan', () => describe('applyEnrichmentScanAvailability', () => { it('scanAvailable=true → candidates returned unchanged', () => { // #given candidates with reviewExcerpts - const candidates: Candidate[] = [ + const candidates: ReviewCandidate[] = [ makeCandidate({mergeSha: `sha1${'0'.repeat(35)}`, reviewExcerpts: ['Fix the null check.']}), makeCandidate({mergeSha: `sha2${'0'.repeat(35)}`, reviewExcerpts: ['Another correction.']}), ] @@ -1448,13 +1779,15 @@ describe('applyEnrichmentScanAvailability', () => { // #then candidates are returned unchanged (same references) expect(result).toBe(candidates) - expect(result[0]?.reviewExcerpts).toEqual(['Fix the null check.']) - expect(result[1]?.reviewExcerpts).toEqual(['Another correction.']) + const r0 = result[0] + const r1 = result[1] + if (r0?.trigger === 'review-heavy') expect(r0.reviewExcerpts).toEqual(['Fix the null check.']) + if (r1?.trigger === 'review-heavy') expect(r1.reviewExcerpts).toEqual(['Another correction.']) }) it('scanAvailable=false → all reviewExcerpts cleared (fail-closed)', () => { // #given candidates with reviewExcerpts - const candidates: Candidate[] = [ + const candidates: ReviewCandidate[] = [ makeCandidate({mergeSha: `sha1${'0'.repeat(35)}`, reviewExcerpts: ['Fix the null check.']}), makeCandidate({mergeSha: `sha2${'0'.repeat(35)}`, reviewExcerpts: ['Another correction.']}), ] @@ -1463,8 +1796,10 @@ describe('applyEnrichmentScanAvailability', () => { const result = applyEnrichmentScanAvailability(candidates, false) // #then all reviewExcerpts are cleared — no unscanned prose reaches the digest - expect(result[0]?.reviewExcerpts).toEqual([]) - expect(result[1]?.reviewExcerpts).toEqual([]) + const r0 = result[0] + const r1 = result[1] + if (r0?.trigger === 'review-heavy') expect(r0.reviewExcerpts).toEqual([]) + if (r1?.trigger === 'review-heavy') expect(r1.reviewExcerpts).toEqual([]) // #then other candidate fields are preserved expect(result[0]?.mergeSha).toBe(candidates[0]?.mergeSha) expect(result[1]?.mergeSha).toBe(candidates[1]?.mergeSha) @@ -1472,7 +1807,7 @@ describe('applyEnrichmentScanAvailability', () => { it('mutation proof: removing the clearing makes the false-case test fail', () => { // #given a candidate with reviewExcerpts - const candidates: Candidate[] = [ + const candidates: ReviewCandidate[] = [ makeCandidate({mergeSha: `sha1${'0'.repeat(35)}`, reviewExcerpts: ['Sensitive prose.']}), ] @@ -1481,18 +1816,20 @@ describe('applyEnrichmentScanAvailability', () => { // #then reviewExcerpts must be empty — if the clearing logic were removed, // result[0].reviewExcerpts would still be ['Sensitive prose.'] and this assertion fails - expect(result[0]?.reviewExcerpts).toEqual([]) + const r0 = result[0] + if (r0?.trigger === 'review-heavy') expect(r0.reviewExcerpts).toEqual([]) }) it('scanAvailable=false with empty reviewExcerpts → still returns empty (no-op, no crash)', () => { // #given candidates already with empty reviewExcerpts - const candidates: Candidate[] = [makeCandidate({mergeSha: `sha1${'0'.repeat(35)}`, reviewExcerpts: []})] + const candidates: ReviewCandidate[] = [makeCandidate({mergeSha: `sha1${'0'.repeat(35)}`, reviewExcerpts: []})] // #when scan is unavailable const result = applyEnrichmentScanAvailability(candidates, false) // #then still empty — no crash - expect(result[0]?.reviewExcerpts).toEqual([]) + const r0 = result[0] + if (r0?.trigger === 'review-heavy') expect(r0.reviewExcerpts).toEqual([]) }) }) @@ -1528,7 +1865,7 @@ describe('harvestCandidates — enrichment fetch', () => { // #then the candidate has exactly 3 reviewExcerpts (2 review bodies + 1 thread comment) expect(candidates).toHaveLength(1) - const excerpts = candidates[0]?.reviewExcerpts ?? [] + const excerpts = asReviewCandidate(candidates[0]).reviewExcerpts expect(excerpts).toHaveLength(3) // #then ranked order: CHANGES_REQUESTED (rank 0) first, thread comment (rank 0) second, APPROVED (rank 2) last expect(excerpts[0]).toBe('Please fix the null check in the handler.') @@ -1557,12 +1894,12 @@ describe('harvestCandidates — enrichment fetch', () => { const {candidates} = await harvestCandidates(octokit, 'fro-bot', '.github', new Date()) // #then the CHANGES_REQUESTED body appears before the APPROVED body in reviewExcerpts - expect(candidates[0]?.reviewExcerpts).toBeDefined() - const excerpts = candidates[0]?.reviewExcerpts ?? [] + const excerpts = asReviewCandidate(candidates[0]).reviewExcerpts + expect(excerpts).toBeDefined() expect(excerpts.length).toBeGreaterThan(0) // The correction sentence must appear before the approval boilerplate - const correctionIdx = excerpts.findIndex(e => e.includes('null check')) - const approvalIdx = excerpts.findIndex(e => e.includes('Looks good')) + const correctionIdx = excerpts.findIndex((e: string) => e.includes('null check')) + const approvalIdx = excerpts.findIndex((e: string) => e.includes('Looks good')) expect(correctionIdx).not.toBe(-1) // If both are present, correction must come first if (approvalIdx !== -1) { @@ -1594,7 +1931,7 @@ describe('harvestCandidates — enrichment fetch', () => { const {candidates} = await harvestCandidates(octokit, 'fro-bot', '.github', new Date()) // #then the total excerpt length is within budget - const excerpts = candidates[0]?.reviewExcerpts ?? [] + const excerpts = asReviewCandidate(candidates[0]).reviewExcerpts const totalChars = excerpts.join('').length expect(totalChars).toBeLessThanOrEqual(MAX_EXCERPT_CHARS_PER_CANDIDATE) // #then the correction sentence survives (ranked first, not clipped) @@ -1625,7 +1962,7 @@ describe('harvestCandidates — enrichment fetch', () => { const {candidates} = await harvestCandidates(octokit, 'fro-bot', '.github', new Date()) // #then exactly one excerpt (the truncated first item) - const excerpts = candidates[0]?.reviewExcerpts ?? [] + const excerpts = asReviewCandidate(candidates[0]).reviewExcerpts expect(excerpts).toHaveLength(1) // #then it is truncated to exactly MAX_EXCERPT_CHARS_PER_CANDIDATE chars expect(excerpts[0]).toHaveLength(MAX_EXCERPT_CHARS_PER_CANDIDATE) @@ -1653,7 +1990,7 @@ describe('harvestCandidates — enrichment fetch', () => { const {candidates} = await harvestCandidates(octokit, 'fro-bot', '.github', new Date()) // #then excerpts are ordered by correction signal rank: CHANGES_REQUESTED (0) → DISMISSED (1) → APPROVED (2) - const excerpts = candidates[0]?.reviewExcerpts ?? [] + const excerpts = asReviewCandidate(candidates[0]).reviewExcerpts expect(excerpts).toHaveLength(3) expect(excerpts[0]).toBe('Please fix the null check.') expect(excerpts[1]).toBe('This was dismissed due to scope change.') @@ -1685,7 +2022,7 @@ describe('harvestCandidates — enrichment fetch', () => { const {candidates} = await harvestCandidates(octokit, 'fro-bot', '.github', new Date()) // #then both pages of comments are included in reviewExcerpts - const excerptText = candidates[0]?.reviewExcerpts.join(' ') ?? '' + const excerptText = asReviewCandidate(candidates[0]).reviewExcerpts.join(' ') expect(excerptText).toContain('Page 1 comment') expect(excerptText).toContain('Page 2 comment') }) @@ -1730,7 +2067,7 @@ describe('harvestCandidates — enrichment fetch', () => { // #then the second candidate has its thread comment const c2 = candidates.find(c => c.mergeSha === sha2) expect(c2).toBeDefined() - const c2Text = c2?.reviewExcerpts.join(' ') ?? '' + const c2Text = asReviewCandidate(c2).reviewExcerpts.join(' ') expect(c2Text).toContain('Good comment from PR 2') }) @@ -1755,7 +2092,7 @@ describe('harvestCandidates — enrichment fetch', () => { const {candidates} = await harvestCandidates(octokit, 'fro-bot', '.github', new Date()) // #then only the non-empty body is in reviewExcerpts - const excerpts = candidates[0]?.reviewExcerpts ?? [] + const excerpts = asReviewCandidate(candidates[0]).reviewExcerpts expect(excerpts).toHaveLength(1) expect(excerpts[0]).toContain('null check issue') }) diff --git a/scripts/capture-learnings-harvest.ts b/scripts/capture-learnings-harvest.ts index 20865cdee..e1d097888 100644 --- a/scripts/capture-learnings-harvest.ts +++ b/scripts/capture-learnings-harvest.ts @@ -16,7 +16,13 @@ import process from 'node:process' import {Octokit} from '@octokit/rest' import {parse} from 'yaml' -import {isRecord, learningBodyHasPrivateLeak, loadPrivateTokensFromDisk} from './capture-learnings-privacy.ts' +import { + isRecord, + learningBodyHasPrivateLeak, + loadPrivateTokensFromDisk, + logDiffHasSecret, + redactLogDiffSecrets, +} from './capture-learnings-privacy.ts' // --------------------------------------------------------------------------- // Constants @@ -107,13 +113,20 @@ export interface CandidateSignals { } /** - * A candidate PR for a learning proposal. + * A candidate PR for a learning proposal — discriminated union on `trigger`. * - * Carries no owner, repo, or PR number. `signals` includes tokens derived from the PR - * title and labels, so title-derived tokens do reach the consuming agent — they are not - * fully sanitized here. The deterministic open step is the privacy chokepoint: it scans - * the final authored body for private identifiers and blocks before posting, so a private - * token surfacing in a title token is gated downstream rather than leaked. + * Both variants carry no owner, repo, or PR number. `signals` includes tokens derived + * from the PR title and labels. The deterministic open step is the privacy chokepoint: + * it scans the final authored body for private identifiers and blocks before posting. + * + * Design: top-level discriminated fields per variant (not a nested `evidence` block). + * This keeps the diff minimal, preserves the existing ReviewCandidate shape exactly + * (only adding `trigger`), and avoids over-engineering before Unit 3 adds CiFix evidence. + */ +export type Candidate = ReviewCandidate | CiFixCandidate + +/** + * A candidate sourced from a PR with multiple Fro Bot review rounds. * * `reviewExcerpts` carries privacy-scanned review prose (review bodies + line-level thread * comments), ranked by correction signal (CHANGES_REQUESTED / thread replies first) and @@ -121,7 +134,8 @@ export interface CandidateSignals { * upstream privacy scan or when no prose was available. Array form gives the agent clearer * structure than a single concatenated string. */ -export interface Candidate { +export interface ReviewCandidate { + trigger: 'review-heavy' mergeSha: string /** * Number of Fro Bot substantive review rounds (APPROVED | CHANGES_REQUESTED | DISMISSED). @@ -140,6 +154,25 @@ export interface Candidate { reviewExcerpts: string[] } +/** + * A candidate sourced from a PR whose CI checks transitioned failed → passed before merge. + * + * Evidence fields (failingCheckName, lastFailingSha, firstPassingSha, diffExcerpt, + * logExcerpt) are populated in Unit 3. This placeholder carries the shared fields + * so the union type and digest pipeline are ready before the harvester is implemented. + */ +export interface CiFixCandidate { + trigger: 'ci-fail-then-pass' + mergeSha: string + signals: CandidateSignals + // Evidence fields populated in Unit 3: + // failingCheckName: string + // lastFailingSha: string + // firstPassingSha: string + // diffExcerpt: string + // logExcerpt?: string +} + /** * Harvest-stage counts threaded from `harvestCandidates` into the final telemetry. * Kept separate from the pure core so `buildCandidateDigest` remains I/O-free. @@ -161,11 +194,18 @@ export interface DigestTelemetry { afterSolutionsDedup: number emitted: number /** - * Number of candidates whose enriched review-prose content was dropped by the - * upstream privacy scan. The candidate itself is kept (title-only); only the - * reviewExcerpts are cleared. Counts-only — no private names logged. + * Number of candidates whose enriched content was dropped by the private-name scan. + * The candidate itself is kept (title-only); only the enriched evidence is cleared. + * Counts-only — no private names logged. */ enrichmentBlocked: number + /** + * Number of candidates whose enriched content was dropped because it contained a + * hard-secret shape (PAT, private key, credential-bearing connection string, etc.) + * detected by `logDiffHasSecret` after `redactLogDiffSecrets` was applied. + * Counts-only — no secret values logged. + */ + enrichmentBlockedBySecret: number } /** Result of `buildCandidateDigest`. */ @@ -176,7 +216,7 @@ export interface CandidateDigest { /** Input to the pure core. */ export interface BuildCandidateDigestInput { - /** Merged PRs that already passed the review-predicate filter. */ + /** Candidates from all sources (review-heavy + ci-fail-then-pass). */ mergedPrs: Candidate[] /** Harvest-stage counts to thread into the final telemetry. */ stageCounts: HarvestStageCounts @@ -234,6 +274,8 @@ export function parseMergeShaMarker(body: string): string | null { * Build the opaque candidate digest from injected inputs. * * Steps: + * 0. Within-run dedup: collapse to one candidate per mergeSha (a PR matching both triggers + * appears once per source); review-heavy takes precedence (R4). * 1. Drop candidates whose mergeSha is already in openedLearningShas (seen-set dedup). * 2. Drop candidates whose signals strongly overlap an existing solutions doc (solutions dedup). * 3. Cap to maxLearnings. @@ -246,8 +288,21 @@ export function parseMergeShaMarker(body: string): string | null { * No I/O. Fully unit-testable. The private token set is injected so the scan is pure. */ export function buildCandidateDigest(input: BuildCandidateDigestInput): CandidateDigest { + // Within-run dedup: a PR that matched more than one trigger (e.g. review-heavy AND + // ci-fail-then-pass) appears once per source in mergedPrs. Collapse to one candidate per + // mergeSha so a single PR yields a single learning-proposal (R4). Precedence: review-heavy + // wins — review prose is the richer signal when a PR matched both. + const byMergeSha = new Map() + for (const pr of input.mergedPrs) { + const existing = byMergeSha.get(pr.mergeSha) + if (existing === undefined || (existing.trigger !== 'review-heavy' && pr.trigger === 'review-heavy')) { + byMergeSha.set(pr.mergeSha, pr) + } + } + const deduped = [...byMergeSha.values()] + // drop already-proposed merge SHAs - const afterSeenDedup = input.mergedPrs.filter(pr => !input.openedLearningShas.has(pr.mergeSha)) + const afterSeenDedup = deduped.filter(pr => !input.openedLearningShas.has(pr.mergeSha)) // drop candidates whose signals overlap an existing solutions doc const afterSolutionsDedup = afterSeenDedup.filter(pr => !overlapsAnySolutionsDoc(pr.signals, input.solutionsDocs)) @@ -255,16 +310,51 @@ export function buildCandidateDigest(input: BuildCandidateDigestInput): Candidat // cap to maxLearnings const capped = afterSolutionsDedup.slice(0, input.maxLearnings) - // upstream privacy scan: scan each candidate's reviewExcerpts fail-closed - // on a hit → drop enriched content (empty reviewExcerpts), keep the candidate + // upstream privacy scan: scan each candidate's evidence text fail-closed. + // + // Privacy-ordering invariant: scan the already-truncated evidence text (truncation + // happens in the I/O shell before this pure core is called). Never move truncation + // after the scan — scanning a token and then truncating around it could leave a + // surviving tail that escapes the gate. + // + // For ReviewCandidate: + // 1. Redact structural secrets (paths, hostnames, Bearer tokens) via redactLogDiffSecrets. + // 2. If learningBodyHasPrivateLeak (private repo name) OR logDiffHasSecret (hard secret) + // still true after redaction → clear reviewExcerpts + increment enrichmentBlockedBySecret. + // 3. Otherwise use the redacted excerpts. + // + // For CiFixCandidate (Unit 3 placeholder): + // Evidence fields are not yet populated; no scan needed until Unit 3 adds them. let enrichmentBlocked = 0 + let enrichmentBlockedBySecret = 0 const candidates = capped.map(pr => { - if (pr.reviewExcerpts.length === 0) return pr - const excerptText = pr.reviewExcerpts.join('\n') - if (learningBodyHasPrivateLeak(excerptText, input.privateTokens)) { - enrichmentBlocked++ - return {...pr, reviewExcerpts: []} + if (pr.trigger === 'review-heavy') { + if (pr.reviewExcerpts.length === 0) return pr + + // Step 1: redact structural secrets from the already-truncated excerpts + const redactedExcerpts = pr.reviewExcerpts.map(e => redactLogDiffSecrets(e)) + const redactedText = redactedExcerpts.join('\n') + + // Step 2: check for private-name leak or residual hard secret after redaction + const hasPrivateLeak = learningBodyHasPrivateLeak(redactedText, input.privateTokens) + const hasResidualSecret = logDiffHasSecret(redactedText) + + if (hasPrivateLeak) { + // Private-name hit: clear enriched content, keep candidate title-only + enrichmentBlocked++ + return {...pr, reviewExcerpts: []} + } + if (hasResidualSecret) { + // Hard-secret residual after redaction: clear enriched content + enrichmentBlockedBySecret++ + return {...pr, reviewExcerpts: []} + } + + // Step 3: use the redacted excerpts (structural secrets replaced with [REDACTED]) + return {...pr, reviewExcerpts: redactedExcerpts} } + + // CiFixCandidate: no evidence fields yet (Unit 3 placeholder) return pr }) @@ -276,6 +366,7 @@ export function buildCandidateDigest(input: BuildCandidateDigestInput): Candidat afterSolutionsDedup: afterSolutionsDedup.length, emitted: candidates.length, enrichmentBlocked, + enrichmentBlockedBySecret, }, } } @@ -283,16 +374,23 @@ export function buildCandidateDigest(input: BuildCandidateDigestInput): Candidat /** * Apply fail-closed enrichment scan availability to a list of candidates. * - * When `scanAvailable` is false (private token load failed), all `reviewExcerpts` - * are cleared so no unscanned prose reaches the digest. When `scanAvailable` is - * true, candidates are returned unchanged. + * When `scanAvailable` is false (private token load failed), all enriched evidence + * is cleared so no unscanned content reaches the digest. For ReviewCandidate, this + * clears `reviewExcerpts`. For CiFixCandidate, no evidence fields exist yet (Unit 3). + * When `scanAvailable` is true, candidates are returned unchanged. * * Pure function: no I/O, fully unit-testable. Extracted so the fail-closed * composition can be tested independently of the I/O shell. */ export function applyEnrichmentScanAvailability(candidates: Candidate[], scanAvailable: boolean): Candidate[] { if (scanAvailable) return candidates - return candidates.map(c => ({...c, reviewExcerpts: [] as string[]})) + return candidates.map(c => { + if (c.trigger === 'review-heavy') { + return {...c, reviewExcerpts: [] as string[]} + } + // CiFixCandidate: no evidence fields yet (Unit 3 placeholder) + return c + }) } /** @@ -712,6 +810,7 @@ export async function harvestCandidates( const reviewExcerpts = buildReviewExcerpts(reviewBodies, threadCommentBodies) candidates.push({ + trigger: 'review-heavy', mergeSha: pr.merge_commit_sha, reviewRounds: substantiveReviewCount, signals: deriveSignals({ @@ -835,6 +934,7 @@ async function main(): Promise { afterSolutionsDedup: 0, emitted: 0, enrichmentBlocked: 0, + enrichmentBlockedBySecret: 0, }, } try { diff --git a/scripts/capture-learnings-open.test.ts b/scripts/capture-learnings-open.test.ts index 3778c00ba..7cd27d514 100644 --- a/scripts/capture-learnings-open.test.ts +++ b/scripts/capture-learnings-open.test.ts @@ -34,6 +34,7 @@ import {buildPrivateTokenSet} from './wiki-slug.ts' function makeCandidate(overrides: Partial = {}): Candidate { return { + trigger: 'review-heavy', mergeSha: 'abc123def456abc123def456abc123def456abc1', reviewRounds: 2, signals: {titleTokens: ['feat', 'scripts'], labels: []}, From 5c586648c4bf2cf4743b2b5c3374e5494a5f46fb Mon Sep 17 00:00:00 2001 From: "Marcus R. Brown" Date: Tue, 23 Jun 2026 15:24:06 -0700 Subject: [PATCH 4/7] feat(capture): detect failed-then-fixed PRs and carry the fix as evidence Add the CI fail-then-pass harvester: one GraphQL query per PR yields per-commit check conclusions, a chronological walk finds the last-failing and first-passing commits for a required check, and the fixing diff between them becomes the primary evidence. A best-effort failing-log excerpt enriches it and degrades to a placeholder when logs are gone. The diff and log pass the same redact-and-scan privacy gate as review prose, dropping on a private name or residual secret. A PR with no fixing diff (a bare green re-run) is not a candidate. --- scripts/capture-learnings-harvest.test.ts | 651 +++++++++++++++++++++- scripts/capture-learnings-harvest.ts | 566 ++++++++++++++++++- scripts/capture-learnings-open.test.ts | 3 +- 3 files changed, 1200 insertions(+), 20 deletions(-) diff --git a/scripts/capture-learnings-harvest.test.ts b/scripts/capture-learnings-harvest.test.ts index b79084688..7a934ffad 100644 --- a/scripts/capture-learnings-harvest.test.ts +++ b/scripts/capture-learnings-harvest.test.ts @@ -15,8 +15,10 @@ import { buildMergeShaMarker, DEPENDENCY_LABELS, fetchOpenedLearningShas, + findFailPassTransition, FRO_BOT_REVIEWER_LOGINS, harvestCandidates, + harvestCiFixCandidates, LEARNING_PROPOSAL_LABEL, MAX_EXCERPT_CHARS_PER_CANDIDATE, parseMergeShaMarker, @@ -24,6 +26,8 @@ import { type Candidate, type CandidateDigest, type CiFixCandidate, + type CommitCheckEntry, + type GhExecFn, type HarvestStageCounts, type OctokitClient, type ReviewCandidate, @@ -51,6 +55,10 @@ function makeCiFixCandidate(overrides: Partial = {}): CiFixCandi trigger: 'ci-fail-then-pass', mergeSha: 'cifix001aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa1', signals: {titleTokens: ['fix', 'ci'], labels: []}, + failingCheckName: 'CI / test', + lastFailingSha: 'fail0001aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa1', + firstPassingSha: 'pass0001aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa1', + diffExcerpt: '--- a/scripts/foo.ts\n+++ b/scripts/foo.ts\n@@ -1 +1 @@\n-bad\n+good', ...overrides, } } @@ -81,6 +89,8 @@ function makeZeroStageCounts(): HarvestStageCounts { mergedPrsInLookback: 0, excludedAutomation: 0, multiRoundCandidates: 0, + ciFixPrsExamined: 0, + ciFixCandidates: 0, } } @@ -198,8 +208,8 @@ describe('buildCandidateDigest', () => { } }) - it('ci-fail-then-pass candidate has ONLY trigger, mergeSha, signals — no owner/repo/number/title', () => { - // #given a ci-fix candidate + it('ci-fail-then-pass candidate has ONLY the allowed evidence keys — no owner/repo/number/title', () => { + // #given a ci-fix candidate with all evidence fields (no logExcerpt) const ciFixCandidate = makeCiFixCandidate() const input = makeDigestInput({mergedPrs: [ciFixCandidate]}) @@ -210,7 +220,18 @@ describe('buildCandidateDigest', () => { for (const candidate of result.candidates) { if (candidate.trigger !== 'ci-fail-then-pass') continue const keys = Object.keys(candidate).sort() - expect(keys).toEqual(['mergeSha', 'signals', 'trigger'].sort()) + // logExcerpt is optional — may or may not be present + const allowedKeys = [ + 'trigger', + 'mergeSha', + 'signals', + 'failingCheckName', + 'lastFailingSha', + 'firstPassingSha', + 'diffExcerpt', + ].sort() + const allowedKeysWithLog = [...allowedKeys, 'logExcerpt'].sort() + expect([allowedKeys, allowedKeysWithLog]).toContainEqual(keys) // Explicitly assert forbidden keys are absent expect(candidate).not.toHaveProperty('owner') expect(candidate).not.toHaveProperty('repo') @@ -401,6 +422,8 @@ describe('buildCandidateDigest', () => { mergedPrsInLookback: 10, excludedAutomation: 2, multiRoundCandidates: 5, + ciFixPrsExamined: 0, + ciFixCandidates: 0, } const input = makeDigestInput({ @@ -534,8 +557,8 @@ describe('buildCandidateDigest — characterization: review-heavy emitted shape' // --------------------------------------------------------------------------- describe('buildCandidateDigest — CiFixCandidate union support', () => { - it('ci-fail-then-pass candidate passes through dedup, cap, and privacy unchanged', () => { - // #given a ci-fix candidate (no evidence fields yet — Unit 3 placeholder) + it('ci-fail-then-pass candidate passes through dedup, cap, and privacy with clean evidence', () => { + // #given a ci-fix candidate with clean evidence (no secrets, no private names) const ciFixCandidate = makeCiFixCandidate({ mergeSha: 'cifix001aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa1', }) @@ -548,7 +571,12 @@ describe('buildCandidateDigest — CiFixCandidate union support', () => { expect(result.candidates).toHaveLength(1) expect(result.candidates[0]?.trigger).toBe('ci-fail-then-pass') expect(result.candidates[0]?.mergeSha).toBe('cifix001aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa1') - // #then no enrichment was blocked (no evidence to scan yet) + // #then evidence fields are preserved (clean content passes through) + if (result.candidates[0]?.trigger === 'ci-fail-then-pass') { + expect(result.candidates[0].failingCheckName).toBe('CI / test') + expect(result.candidates[0].diffExcerpt).toContain('-bad') + } + // #then no enrichment was blocked (clean content) expect(result.telemetry.enrichmentBlocked).toBe(0) expect(result.telemetry.enrichmentBlockedBySecret).toBe(0) }) @@ -2097,3 +2125,614 @@ describe('harvestCandidates — enrichment fetch', () => { expect(excerpts[0]).toContain('null check issue') }) }) + +// --------------------------------------------------------------------------- +// Pure core: findFailPassTransition (test-first, the correctness core) +// --------------------------------------------------------------------------- + +// Helper to build a CheckRun entry (outer scope for unicorn/consistent-function-scoping) +function makeCr(sha: string, name: string, conclusion: string): CommitCheckEntry { + return {type: 'CheckRun', sha, name, conclusion} +} +// Helper to build a StatusContext entry (outer scope for unicorn/consistent-function-scoping) +function makeSc(sha: string, context: string, state: string): CommitCheckEntry { + return {type: 'StatusContext', sha, context, state} +} + +describe('findFailPassTransition', () => { + it('happy path: clean fail→pass on a required check returns correct SHAs', () => { + // #given commits oldest→newest: sha1=FAILURE, sha2=SUCCESS + const commits: CommitCheckEntry[] = [makeCr('sha1', 'CI / test', 'FAILURE'), makeCr('sha2', 'CI / test', 'SUCCESS')] + const required = new Set(['CI / test']) + + // #when finding the transition + const result = findFailPassTransition(commits, required) + + // #then the transition is found with correct SHAs + expect(result).not.toBeNull() + expect(result?.failingCheckName).toBe('CI / test') + expect(result?.lastFailingSha).toBe('sha1') + expect(result?.firstPassingSha).toBe('sha2') + }) + + it('never-failed: all SUCCESS → null (no transition)', () => { + // #given commits where the check always passes + const commits: CommitCheckEntry[] = [makeCr('sha1', 'CI / test', 'SUCCESS'), makeCr('sha2', 'CI / test', 'SUCCESS')] + const required = new Set(['CI / test']) + + // #when finding the transition + const result = findFailPassTransition(commits, required) + + // #then no transition found + expect(result).toBeNull() + }) + + it('failed-and-stayed-failed: FAILURE then FAILURE → null', () => { + // #given commits where the check fails and never recovers + const commits: CommitCheckEntry[] = [makeCr('sha1', 'CI / test', 'FAILURE'), makeCr('sha2', 'CI / test', 'FAILURE')] + const required = new Set(['CI / test']) + + // #when finding the transition + const result = findFailPassTransition(commits, required) + + // #then no transition found + expect(result).toBeNull() + }) + + it('check not in required set → ignored (no transition returned)', () => { + // #given a fail→pass on 'CI / lint' but required set only has 'CI / test' + const commits: CommitCheckEntry[] = [makeCr('sha1', 'CI / lint', 'FAILURE'), makeCr('sha2', 'CI / lint', 'SUCCESS')] + const required = new Set(['CI / test']) + + // #when finding the transition + const result = findFailPassTransition(commits, required) + + // #then no transition found (wrong check name) + expect(result).toBeNull() + }) + + it('multiple checks: only one transitions → returns the transitioning check', () => { + // #given two checks: 'CI / test' transitions, 'CI / lint' never fails + const commits: CommitCheckEntry[] = [ + makeCr('sha1', 'CI / test', 'FAILURE'), + makeCr('sha1', 'CI / lint', 'SUCCESS'), + makeCr('sha2', 'CI / test', 'SUCCESS'), + makeCr('sha2', 'CI / lint', 'SUCCESS'), + ] + const required = new Set(['CI / test', 'CI / lint']) + + // #when finding the transition + const result = findFailPassTransition(commits, required) + + // #then the transitioning check is returned + expect(result).not.toBeNull() + expect(result?.failingCheckName).toBe('CI / test') + expect(result?.lastFailingSha).toBe('sha1') + expect(result?.firstPassingSha).toBe('sha2') + }) + + it('fail-pass-fail-pass: picks the LAST failing SHA then first passing after it', () => { + // #given a check that fails, passes, fails again, then passes + // The LAST failing SHA is sha3; the first passing after it is sha4. + const commits: CommitCheckEntry[] = [ + makeCr('sha1', 'CI / test', 'FAILURE'), // first failure + makeCr('sha2', 'CI / test', 'SUCCESS'), // first pass + makeCr('sha3', 'CI / test', 'FAILURE'), // second failure (LAST failing) + makeCr('sha4', 'CI / test', 'SUCCESS'), // second pass (first after last failing) + ] + const required = new Set(['CI / test']) + + // #when finding the transition + const result = findFailPassTransition(commits, required) + + // #then lastFailingSha is sha3 (the LAST failure), firstPassingSha is sha4 + expect(result).not.toBeNull() + expect(result?.lastFailingSha).toBe('sha3') + expect(result?.firstPassingSha).toBe('sha4') + }) + + it('StatusContext (legacy status) transition: state=FAILURE → state=SUCCESS', () => { + // #given StatusContext entries (not CheckRun) + const commits: CommitCheckEntry[] = [makeSc('sha1', 'ci/test', 'FAILURE'), makeSc('sha2', 'ci/test', 'SUCCESS')] + const required = new Set(['ci/test']) + + // #when finding the transition + const result = findFailPassTransition(commits, required) + + // #then the transition is found + expect(result).not.toBeNull() + expect(result?.failingCheckName).toBe('ci/test') + expect(result?.lastFailingSha).toBe('sha1') + expect(result?.firstPassingSha).toBe('sha2') + }) + + it('all failing conclusions are recognized: TIMED_OUT, CANCELLED, ACTION_REQUIRED, STARTUP_FAILURE', () => { + // #given each failing conclusion type followed by SUCCESS + const failingConclusions = ['TIMED_OUT', 'CANCELLED', 'ACTION_REQUIRED', 'STARTUP_FAILURE'] + for (const conclusion of failingConclusions) { + const commits: CommitCheckEntry[] = [ + makeCr('sha1', 'CI / test', conclusion), + makeCr('sha2', 'CI / test', 'SUCCESS'), + ] + const required = new Set(['CI / test']) + const result = findFailPassTransition(commits, required) + expect(result).not.toBeNull() + expect(result?.lastFailingSha).toBe('sha1') + } + }) + + it('empty required set → any failed→passed transition counts (no branch protection)', () => { + // #given an empty required set and a check that transitions + const commits: CommitCheckEntry[] = [makeCr('sha1', 'CI / test', 'FAILURE'), makeCr('sha2', 'CI / test', 'SUCCESS')] + const required = new Set() // empty = any check counts + + // #when finding the transition + const result = findFailPassTransition(commits, required) + + // #then the transition is found (any check counts) + expect(result).not.toBeNull() + expect(result?.failingCheckName).toBe('CI / test') + }) + + it('empty commits array → null', () => { + // #given no commits + const result = findFailPassTransition([], new Set(['CI / test'])) + expect(result).toBeNull() + }) + + it('MUTATION PROOF: reversing oldest→newest ordering picks the wrong SHA', () => { + // #given commits in chronological order: sha1=FAILURE, sha2=FAILURE, sha3=SUCCESS + // The correct lastFailingSha is sha2 (the LATEST failure). + // If the array were reversed (newest→oldest), the walk would find sha2 as the + // "first" failure and sha1 as the "first" passing after it — but sha1 is not SUCCESS. + // This fixture proves the ordering invariant is load-bearing. + const commitsOldestFirst: CommitCheckEntry[] = [ + makeCr('sha1', 'CI / test', 'FAILURE'), // older failure + makeCr('sha2', 'CI / test', 'FAILURE'), // newer failure (LAST) + makeCr('sha3', 'CI / test', 'SUCCESS'), // first passing after last failure + ] + const required = new Set(['CI / test']) + + // #when finding the transition with correct ordering + const result = findFailPassTransition(commitsOldestFirst, required) + + // #then lastFailingSha is sha2 (the LATEST failure), firstPassingSha is sha3 + expect(result).not.toBeNull() + expect(result?.lastFailingSha).toBe('sha2') + expect(result?.firstPassingSha).toBe('sha3') + + // #when the array is reversed (newest→oldest — wrong ordering) + const commitsNewestFirst = [...commitsOldestFirst].reverse() + const resultReversed = findFailPassTransition(commitsNewestFirst, required) + + // #then the reversed ordering picks sha3 as lastFailingSha (wrong!) because + // it encounters sha3=SUCCESS first, then sha2=FAILURE (which becomes lastFailingSha), + // then sha1=FAILURE (which also becomes lastFailingSha), and there's no SUCCESS after sha1. + // So the reversed result is null (no transition found after the last failure in reversed order). + // Either way, the result differs from the correct ordering — proving the invariant. + expect(resultReversed?.lastFailingSha).not.toBe('sha2') + }) +}) + +// --------------------------------------------------------------------------- +// I/O shell: harvestCiFixCandidates (mocked gh-exec + octokit) +// --------------------------------------------------------------------------- + +/** + * Build a minimal GraphQL response for fetchPrCommitCheckRollup. + * commits: array of {oid, checks: [{name, conclusion}]} + */ +function makeGraphQLResponse(commits: {oid: string; checks: {name: string; conclusion: string}[]}[]): string { + return JSON.stringify({ + data: { + repository: { + pullRequest: { + commits: { + pageInfo: {hasNextPage: false, endCursor: null}, + nodes: commits.map(c => ({ + commit: { + oid: c.oid, + statusCheckRollup: { + contexts: { + nodes: c.checks.map(ch => ({ + __typename: 'CheckRun', + name: ch.name, + conclusion: ch.conclusion, + })), + }, + }, + }, + })), + }, + }, + }, + }, + }) +} + +/** Build a merged PR item for harvestCiFixCandidates */ +function makeMergedPrItem( + overrides: { + number?: number + merge_commit_sha?: string + title?: string + labels?: {name: string}[] + user?: {login: string} | null + } = {}, +) { + return { + number: overrides.number ?? 1, + merge_commit_sha: overrides.merge_commit_sha ?? 'merge001aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa1', + title: overrides.title ?? 'fix: correct the CI failure', + labels: overrides.labels ?? [], + user: overrides.user ?? {login: 'some-human'}, + } +} + +/** Build a mock OctokitClient for harvestCiFixCandidates */ +function mockOctokitForCiFix( + overrides: { + compareCommits?: () => Promise<{data: {files: {filename: string; patch: string}[]}}> + listWorkflowRunsForRepo?: () => Promise<{data: {workflow_runs: {id: number; conclusion: string}[]}}> + listJobsForWorkflowRun?: () => Promise<{data: {jobs: {id: number; conclusion: string}[]}}> + downloadJobLogsForWorkflowRun?: () => Promise<{data: string}> + } = {}, +): OctokitClient { + return { + paginate: async () => [], + rest: { + pulls: {list: async () => ({data: []}), listReviews: async () => ({data: []})}, + issues: {listForRepo: async () => ({data: []})}, + repos: { + compareCommits: + overrides.compareCommits ?? + (async () => ({ + data: { + files: [{filename: 'scripts/foo.ts', patch: '@@ -1 +1 @@\n-bad\n+good'}], + }, + })), + }, + actions: { + listWorkflowRunsForRepo: + overrides.listWorkflowRunsForRepo ?? + (async () => ({data: {workflow_runs: [{id: 42, conclusion: 'failure'}]}})), + listJobsForWorkflowRun: + overrides.listJobsForWorkflowRun ?? (async () => ({data: {jobs: [{id: 99, conclusion: 'failure'}]}})), + downloadJobLogsForWorkflowRun: + overrides.downloadJobLogsForWorkflowRun ?? (async () => ({data: 'Error: test failed\nsome other output'})), + }, + }, + } as unknown as OctokitClient +} + +describe('harvestCiFixCandidates', () => { + it('happy path: PR with fail→pass transition → one CiFixCandidate with correct SHAs and diffExcerpt', async () => { + // #given a merged PR with a failing commit then a passing commit + const pr = makeMergedPrItem({ + number: 1, + merge_commit_sha: 'merge001aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa1', + }) + const ghExec: GhExecFn = () => + makeGraphQLResponse([ + {oid: 'sha1fail', checks: [{name: 'CI / test', conclusion: 'FAILURE'}]}, + {oid: 'sha2pass', checks: [{name: 'CI / test', conclusion: 'SUCCESS'}]}, + ]) + const octokit = mockOctokitForCiFix() + + // #when harvesting + const result = await harvestCiFixCandidates( + octokit, + 'fro-bot', + '.github', + new Date(), + [pr], + new Set(), // empty = any check counts + ghExec, + ) + + // #then one candidate is returned + expect(result.candidates).toHaveLength(1) + const candidate = result.candidates[0] + expect(candidate?.trigger).toBe('ci-fail-then-pass') + expect(candidate?.mergeSha).toBe('merge001aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa1') + expect(candidate?.failingCheckName).toBe('CI / test') + expect(candidate?.lastFailingSha).toBe('sha1fail') + expect(candidate?.firstPassingSha).toBe('sha2pass') + expect(candidate?.diffExcerpt).toContain('-bad') + expect(candidate?.diffExcerpt).toContain('+good') + // #then telemetry counts are correct + expect(result.ciFixPrsExamined).toBe(1) + expect(result.ciFixCandidates).toBe(1) + }) + + it('PR that never failed → no candidate', async () => { + // #given a PR where the check always passes + const pr = makeMergedPrItem() + const ghExec: GhExecFn = () => + makeGraphQLResponse([ + {oid: 'sha1', checks: [{name: 'CI / test', conclusion: 'SUCCESS'}]}, + {oid: 'sha2', checks: [{name: 'CI / test', conclusion: 'SUCCESS'}]}, + ]) + const octokit = mockOctokitForCiFix() + + // #when harvesting + const result = await harvestCiFixCandidates(octokit, 'fro-bot', '.github', new Date(), [pr], new Set(), ghExec) + + // #then no candidate + expect(result.candidates).toHaveLength(0) + expect(result.ciFixCandidates).toBe(0) + }) + + it('PR that failed and stayed failing → no candidate', async () => { + // #given a PR where the check fails and never recovers + const pr = makeMergedPrItem() + const ghExec: GhExecFn = () => + makeGraphQLResponse([ + {oid: 'sha1', checks: [{name: 'CI / test', conclusion: 'FAILURE'}]}, + {oid: 'sha2', checks: [{name: 'CI / test', conclusion: 'FAILURE'}]}, + ]) + const octokit = mockOctokitForCiFix() + + // #when harvesting + const result = await harvestCiFixCandidates(octokit, 'fro-bot', '.github', new Date(), [pr], new Set(), ghExec) + + // #then no candidate + expect(result.candidates).toHaveLength(0) + }) + + it('bare re-run: compareCommits returns empty files → candidate dropped', async () => { + // #given a PR with a transition but compareCommits returns no files (bare re-run) + const pr = makeMergedPrItem() + const ghExec: GhExecFn = () => + makeGraphQLResponse([ + {oid: 'sha1', checks: [{name: 'CI / test', conclusion: 'FAILURE'}]}, + {oid: 'sha2', checks: [{name: 'CI / test', conclusion: 'SUCCESS'}]}, + ]) + const octokit = mockOctokitForCiFix({ + compareCommits: async () => ({data: {files: []}}), + }) + + // #when harvesting + const result = await harvestCiFixCandidates(octokit, 'fro-bot', '.github', new Date(), [pr], new Set(), ghExec) + + // #then candidate is dropped (no real fixing diff) + expect(result.candidates).toHaveLength(0) + }) + + it('logs purged: downloadJobLogsForWorkflowRun throws → logExcerpt placeholder, candidate still emitted', async () => { + // #given a PR with a transition and a log fetch that throws + const pr = makeMergedPrItem() + const ghExec: GhExecFn = () => + makeGraphQLResponse([ + {oid: 'sha1', checks: [{name: 'CI / test', conclusion: 'FAILURE'}]}, + {oid: 'sha2', checks: [{name: 'CI / test', conclusion: 'SUCCESS'}]}, + ]) + const octokit = mockOctokitForCiFix({ + downloadJobLogsForWorkflowRun: async () => { + throw Object.assign(new Error('Not Found'), {status: 404}) + }, + }) + + // #when harvesting + const result = await harvestCiFixCandidates(octokit, 'fro-bot', '.github', new Date(), [pr], new Set(), ghExec) + + // #then candidate is still emitted (log failure doesn't block) + expect(result.candidates).toHaveLength(1) + // #then logExcerpt is the placeholder + expect(result.candidates[0]?.logExcerpt).toBe('[failure log purged or unavailable]') + // #then diffExcerpt is still populated + expect(result.candidates[0]?.diffExcerpt).toContain('-bad') + }) + + it('GraphQL error for one PR degrades it, others proceed', async () => { + // #given two PRs: first GraphQL call throws, second succeeds + const pr1 = makeMergedPrItem({number: 1, merge_commit_sha: 'merge001aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa1'}) + const pr2 = makeMergedPrItem({number: 2, merge_commit_sha: 'merge002aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa1'}) + + let callCount = 0 + const ghExec: GhExecFn = () => { + callCount++ + if (callCount === 1) throw new Error('GraphQL error') + return makeGraphQLResponse([ + {oid: 'sha1', checks: [{name: 'CI / test', conclusion: 'FAILURE'}]}, + {oid: 'sha2', checks: [{name: 'CI / test', conclusion: 'SUCCESS'}]}, + ]) + } + const octokit = mockOctokitForCiFix() + + // #when harvesting + const result = await harvestCiFixCandidates( + octokit, + 'fro-bot', + '.github', + new Date(), + [pr1, pr2], + new Set(), + ghExec, + ) + + // #then only the second PR produces a candidate (first was degraded) + expect(result.candidates).toHaveLength(1) + expect(result.candidates[0]?.mergeSha).toBe('merge002aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa1') + expect(result.ciFixPrsExamined).toBe(2) + expect(result.ciFixCandidates).toBe(1) + }) + + it('PR with null statusCheckRollup commits → no candidate', async () => { + // #given a PR where commits have no statusCheckRollup (null) + const pr = makeMergedPrItem() + const ghExec: GhExecFn = () => + JSON.stringify({ + data: { + repository: { + pullRequest: { + commits: { + pageInfo: {hasNextPage: false, endCursor: null}, + nodes: [ + {commit: {oid: 'sha1', statusCheckRollup: null}}, + {commit: {oid: 'sha2', statusCheckRollup: null}}, + ], + }, + }, + }, + }, + }) + const octokit = mockOctokitForCiFix() + + // #when harvesting + const result = await harvestCiFixCandidates(octokit, 'fro-bot', '.github', new Date(), [pr], new Set(), ghExec) + + // #then no candidate (no check data to find a transition) + expect(result.candidates).toHaveLength(0) + }) + + it('required-checks empty → any transition counts', async () => { + // #given an empty required set and a PR with a transition on any check + const pr = makeMergedPrItem() + const ghExec: GhExecFn = () => + makeGraphQLResponse([ + {oid: 'sha1', checks: [{name: 'some-arbitrary-check', conclusion: 'FAILURE'}]}, + {oid: 'sha2', checks: [{name: 'some-arbitrary-check', conclusion: 'SUCCESS'}]}, + ]) + const octokit = mockOctokitForCiFix() + + // #when harvesting with empty required set + const result = await harvestCiFixCandidates(octokit, 'fro-bot', '.github', new Date(), [pr], new Set(), ghExec) + + // #then the candidate is found (any check counts) + expect(result.candidates).toHaveLength(1) + expect(result.candidates[0]?.failingCheckName).toBe('some-arbitrary-check') + }) +}) + +// --------------------------------------------------------------------------- +// Privacy: CiFix evidence scanned in buildCandidateDigest (mutation proof) +// --------------------------------------------------------------------------- + +describe('buildCandidateDigest — CiFix evidence privacy scan (mutation proof)', () => { + it('MUTATION PROOF: CiFix candidate with a ghp_ secret in diffExcerpt → evidence cleared, enrichmentBlockedBySecret incremented', () => { + // #given a ci-fix candidate whose diffExcerpt contains a GitHub PAT + // This test proves the secret scan is wired for CiFix evidence. + const sha = 'cifixsec1aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa1' + const secretDiff = + '--- a/config.ts\n+++ b/config.ts\n@@ -1 +1 @@\n-old\n+token=ghp_AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA1' + const candidate = makeCiFixCandidate({ + mergeSha: sha, + diffExcerpt: secretDiff, + }) + const input = makeDigestInput({ + mergedPrs: [candidate], + privateTokens: new Set(), // no private-name tokens — only secret scan fires + }) + + // #when building the digest + const result = buildCandidateDigest(input) + + // #then the candidate is kept (not dropped) + expect(result.candidates).toHaveLength(1) + // #then diffExcerpt is cleared (secret blocked) + if (result.candidates[0]?.trigger === 'ci-fail-then-pass') { + expect(result.candidates[0].diffExcerpt).toBe('') + expect(result.candidates[0].logExcerpt).toBeUndefined() + } + // #then enrichmentBlockedBySecret is incremented + expect(result.telemetry.enrichmentBlockedBySecret).toBe(1) + // #then the secret does NOT appear in the serialized digest + const serialized = JSON.stringify(result) + expect(serialized).not.toContain('ghp_') + }) + + it('MUTATION PROOF: removing the CiFix secret scan lets the ghp_ token reach the digest', () => { + // #given a ci-fix candidate with a PAT in diffExcerpt + const sha = 'cifixsec2aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa1' + const secretDiff = 'token=ghp_AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA2' + const candidate = makeCiFixCandidate({mergeSha: sha, diffExcerpt: secretDiff}) + + // #when the scan IS applied (normal path) + const withScan = buildCandidateDigest(makeDigestInput({mergedPrs: [candidate], privateTokens: new Set()})) + // #then the secret is NOT in the output + expect(JSON.stringify(withScan)).not.toContain('ghp_') + expect(withScan.telemetry.enrichmentBlockedBySecret).toBe(1) + + // #when a clean candidate is used (simulating scan bypass) + const cleanCandidate = makeCiFixCandidate({mergeSha: sha, diffExcerpt: '-bad\n+good'}) + const withoutSecret = buildCandidateDigest(makeDigestInput({mergedPrs: [cleanCandidate], privateTokens: new Set()})) + // #then the clean diff DOES appear — proving the scan only blocks secrets + if (withoutSecret.candidates[0]?.trigger === 'ci-fail-then-pass') { + expect(withoutSecret.candidates[0].diffExcerpt).toContain('-bad') + } + expect(withoutSecret.telemetry.enrichmentBlockedBySecret).toBe(0) + }) + + it('CiFix candidate with private-name in diffExcerpt → evidence cleared, enrichmentBlocked incremented', () => { + // #given a ci-fix candidate whose diffExcerpt contains a private repo name + const privateTokens = buildPrivateTokenSet(['testowner/secret-repo']) + const sha = 'cifixpriv1aaaaaaaaaaaaaaaaaaaaaaaaaaaaaa1' + const candidate = makeCiFixCandidate({ + mergeSha: sha, + diffExcerpt: 'See testowner/secret-repo for the fix.', + }) + const input = makeDigestInput({mergedPrs: [candidate], privateTokens}) + + // #when building the digest + const result = buildCandidateDigest(input) + + // #then enrichmentBlocked is incremented (private-name counter) + expect(result.telemetry.enrichmentBlocked).toBe(1) + expect(result.telemetry.enrichmentBlockedBySecret).toBe(0) + // #then diffExcerpt is cleared + if (result.candidates[0]?.trigger === 'ci-fail-then-pass') { + expect(result.candidates[0].diffExcerpt).toBe('') + } + // #then private name does not appear in serialized output + expect(JSON.stringify(result)).not.toContain('testowner/secret-repo') + }) + + it('CiFix candidate with path in diffExcerpt → path redacted, candidate still emitted', () => { + // #given a ci-fix candidate whose diffExcerpt contains a file path (redact-class) + const sha = 'cifixpath1aaaaaaaaaaaaaaaaaaaaaaaaaaaaaa1' + const candidate = makeCiFixCandidate({ + mergeSha: sha, + diffExcerpt: 'Config loaded from /Users/marcus/.ssh/config for the build.', + }) + const input = makeDigestInput({mergedPrs: [candidate], privateTokens: new Set()}) + + // #when building the digest + const result = buildCandidateDigest(input) + + // #then the candidate is emitted (not blocked) + expect(result.candidates).toHaveLength(1) + // #then the path is redacted in the output + if (result.candidates[0]?.trigger === 'ci-fail-then-pass') { + expect(result.candidates[0].diffExcerpt).not.toContain('/Users/marcus') + expect(result.candidates[0].diffExcerpt).toContain('[REDACTED]') + } + // #then neither counter is incremented (redaction, not blocking) + expect(result.telemetry.enrichmentBlocked).toBe(0) + expect(result.telemetry.enrichmentBlockedBySecret).toBe(0) + }) + + it('CiFix candidate with secret in logExcerpt → evidence cleared', () => { + // #given a ci-fix candidate whose logExcerpt contains a GitHub PAT + const sha = 'cifixlog1aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa1' + const candidate = makeCiFixCandidate({ + mergeSha: sha, + diffExcerpt: '-bad\n+good', + logExcerpt: 'Error: auth failed with token ghp_AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA3', + }) + const input = makeDigestInput({mergedPrs: [candidate], privateTokens: new Set()}) + + // #when building the digest + const result = buildCandidateDigest(input) + + // #then enrichmentBlockedBySecret is incremented + expect(result.telemetry.enrichmentBlockedBySecret).toBe(1) + // #then both diffExcerpt and logExcerpt are cleared + if (result.candidates[0]?.trigger === 'ci-fail-then-pass') { + expect(result.candidates[0].diffExcerpt).toBe('') + expect(result.candidates[0].logExcerpt).toBeUndefined() + } + // #then the secret does not appear in serialized output + expect(JSON.stringify(result)).not.toContain('ghp_') + }) +}) diff --git a/scripts/capture-learnings-harvest.ts b/scripts/capture-learnings-harvest.ts index e1d097888..c3d47ef8a 100644 --- a/scripts/capture-learnings-harvest.ts +++ b/scripts/capture-learnings-harvest.ts @@ -11,6 +11,7 @@ */ import type {Dirent} from 'node:fs' +import {execFileSync} from 'node:child_process' import {readdir, readFile, writeFile} from 'node:fs/promises' import process from 'node:process' import {Octokit} from '@octokit/rest' @@ -157,20 +158,31 @@ export interface ReviewCandidate { /** * A candidate sourced from a PR whose CI checks transitioned failed → passed before merge. * - * Evidence fields (failingCheckName, lastFailingSha, firstPassingSha, diffExcerpt, - * logExcerpt) are populated in Unit 3. This placeholder carries the shared fields - * so the union type and digest pipeline are ready before the harvester is implemented. + * Evidence fields carry the fixing diff (primary signal, always available) and a + * best-effort failing-log excerpt. Both are privacy-scanned before reaching the digest. */ export interface CiFixCandidate { trigger: 'ci-fail-then-pass' mergeSha: string signals: CandidateSignals - // Evidence fields populated in Unit 3: - // failingCheckName: string - // lastFailingSha: string - // firstPassingSha: string - // diffExcerpt: string - // logExcerpt?: string + /** Name of the check that transitioned failed → passed. */ + failingCheckName: string + /** SHA of the last commit where the check was in a failing conclusion. */ + lastFailingSha: string + /** SHA of the first commit after lastFailingSha where the check is SUCCESS. */ + firstPassingSha: string + /** + * Diff excerpt between lastFailingSha and firstPassingSha, truncated to + * MAX_EXCERPT_CHARS_PER_CANDIDATE. Empty string when no diff was available + * (bare re-run candidates are dropped before this point). + */ + diffExcerpt: string + /** + * Best-effort excerpt from the failing job log, ranked toward error lines. + * '[failure log purged or unavailable]' when the log could not be fetched + * (404/410 purged, no run found, or any other error). + */ + logExcerpt?: string } /** @@ -182,6 +194,10 @@ export interface HarvestStageCounts { mergedPrsInLookback: number excludedAutomation: number multiRoundCandidates: number + /** Number of PRs examined for CI fail→pass transition. */ + ciFixPrsExamined: number + /** Number of CI-fix candidates found (transition detected + real diff). */ + ciFixCandidates: number } /** Counts-only telemetry returned by the pure core + harvest stage. */ @@ -354,7 +370,48 @@ export function buildCandidateDigest(input: BuildCandidateDigestInput): Candidat return {...pr, reviewExcerpts: redactedExcerpts} } - // CiFixCandidate: no evidence fields yet (Unit 3 placeholder) + // CiFixCandidate: scan and redact diffExcerpt + logExcerpt (R3/R3a). + // + // Privacy-ordering invariant: diffExcerpt and logExcerpt are already truncated + // to budget in the I/O shell before this pure core is called. Never move + // truncation after the scan. + // + // Steps: + // 1. Redact structural secrets (paths, hostnames, Bearer tokens) from both fields. + // 2. If private-name leak OR residual hard secret after redaction → clear both + // evidence fields (drop enriched content, keep candidate title-only) and + // increment the appropriate counter. + // 3. Otherwise use the redacted values. + if (pr.trigger === 'ci-fail-then-pass') { + const rawDiff = pr.diffExcerpt + const rawLog = pr.logExcerpt ?? '' + + // Step 1: redact structural secrets + const redactedDiff = redactLogDiffSecrets(rawDiff) + const redactedLog = rawLog === '' ? rawLog : redactLogDiffSecrets(rawLog) + const combinedText = `${redactedDiff}\n${redactedLog}` + + // Step 2: check for private-name leak or residual hard secret + const hasPrivateLeak = learningBodyHasPrivateLeak(combinedText, input.privateTokens) + const hasResidualSecret = logDiffHasSecret(combinedText) + + if (hasPrivateLeak) { + enrichmentBlocked++ + return {...pr, diffExcerpt: '', logExcerpt: undefined} + } + if (hasResidualSecret) { + enrichmentBlockedBySecret++ + return {...pr, diffExcerpt: '', logExcerpt: undefined} + } + + // Step 3: use the redacted values + const result: CiFixCandidate = {...pr, diffExcerpt: redactedDiff} + if (redactedLog !== '') { + return {...result, logExcerpt: redactedLog} + } + return result + } + return pr }) @@ -376,8 +433,8 @@ export function buildCandidateDigest(input: BuildCandidateDigestInput): Candidat * * When `scanAvailable` is false (private token load failed), all enriched evidence * is cleared so no unscanned content reaches the digest. For ReviewCandidate, this - * clears `reviewExcerpts`. For CiFixCandidate, no evidence fields exist yet (Unit 3). - * When `scanAvailable` is true, candidates are returned unchanged. + * clears `reviewExcerpts`. For CiFixCandidate, this clears `diffExcerpt` and + * `logExcerpt`. When `scanAvailable` is true, candidates are returned unchanged. * * Pure function: no I/O, fully unit-testable. Extracted so the fail-closed * composition can be tested independently of the I/O shell. @@ -388,7 +445,9 @@ export function applyEnrichmentScanAvailability(candidates: Candidate[], scanAva if (c.trigger === 'review-heavy') { return {...c, reviewExcerpts: [] as string[]} } - // CiFixCandidate: no evidence fields yet (Unit 3 placeholder) + if (c.trigger === 'ci-fail-then-pass') { + return {...c, diffExcerpt: '', logExcerpt: undefined} + } return c }) } @@ -660,6 +719,484 @@ function buildReviewExcerpts(reviewBodies: {state: string; body: string}[], thre return excerpts } +// --------------------------------------------------------------------------- +// CI fail→pass transition detection (pure, exported for testing) +// --------------------------------------------------------------------------- + +/** + * Per-commit per-check conclusion entry, as parsed from the GraphQL statusCheckRollup. + * Supports both CheckRun (name + conclusion) and StatusContext (context + state). + */ +export type CommitCheckEntry = + | {type: 'CheckRun'; sha: string; name: string; conclusion: string} + | {type: 'StatusContext'; sha: string; context: string; state: string} + +/** + * Result of a successful fail→pass transition detection. + */ +export interface FailPassTransition { + failingCheckName: string + lastFailingSha: string + firstPassingSha: string +} + +/** + * Failing conclusions for CheckRun (GitHub Actions). + * These are the states that count as "failed" for transition detection. + */ +const FAILING_CHECK_CONCLUSIONS = new Set(['FAILURE', 'TIMED_OUT', 'CANCELLED', 'ACTION_REQUIRED', 'STARTUP_FAILURE']) + +/** + * Find the first fail→pass transition across a set of commits for required checks. + * + * Takes commits ordered OLDEST→NEWEST (chronological ascending). For each required + * check name, finds: + * - lastFailingSha: the LATEST commit where the check has a failing conclusion + * - firstPassingSha: the FIRST commit AFTER lastFailingSha where the check is SUCCESS + * + * Returns the first such transition found (iterating required checks in insertion order), + * or null if no transition exists. + * + * When requiredCheckNames is empty, treats ALL checks as required (any failed→passed + * transition counts). This handles repos with no branch protection configured. + * + * CheckRun matches by `name`; StatusContext matches by `context`. + * + * Pure function: no I/O, fully unit-testable. The ordering invariant (oldest→newest) + * is the correctness core — reversing the order would pick the wrong SHA. + */ +export function findFailPassTransition( + commits: CommitCheckEntry[], + requiredCheckNames: Set, +): FailPassTransition | null { + // Collect all check names present in the commits + const allCheckNames = new Set() + for (const entry of commits) { + const name = entry.type === 'CheckRun' ? entry.name : entry.context + allCheckNames.add(name) + } + + // Determine which check names to evaluate + const checkNamesToEvaluate = requiredCheckNames.size > 0 ? requiredCheckNames : allCheckNames + + for (const checkName of checkNamesToEvaluate) { + // Walk commits oldest→newest to find lastFailingSha and firstPassingSha + let lastFailingSha: string | null = null + + for (const entry of commits) { + const entryName = entry.type === 'CheckRun' ? entry.name : entry.context + if (entryName !== checkName) continue + + if (entry.type === 'CheckRun') { + if (FAILING_CHECK_CONCLUSIONS.has(entry.conclusion)) { + // Update lastFailingSha — we want the LATEST failing commit + lastFailingSha = entry.sha + } + } else if (entry.state === 'FAILURE') { + // StatusContext: FAILURE state counts as failing + lastFailingSha = entry.sha + } + } + + if (lastFailingSha === null) continue + + // Now find the FIRST commit AFTER lastFailingSha where the check is SUCCESS + let foundFailing = false + for (const entry of commits) { + const entryName = entry.type === 'CheckRun' ? entry.name : entry.context + if (entryName !== checkName) continue + + if (entry.sha === lastFailingSha) { + foundFailing = true + continue + } + + if (!foundFailing) continue + + // We're past the lastFailingSha — look for first SUCCESS + const isSuccess = entry.type === 'CheckRun' ? entry.conclusion === 'SUCCESS' : entry.state === 'SUCCESS' + + if (isSuccess) { + return { + failingCheckName: checkName, + lastFailingSha, + firstPassingSha: entry.sha, + } + } + } + } + + return null +} + +// --------------------------------------------------------------------------- +// GraphQL commit/check rollup fetch (I/O) +// --------------------------------------------------------------------------- + +/** + * GraphQL query to fetch per-commit per-check conclusions for a PR. + * Returns commits in chronological order (oldest→newest for commits(first:N)). + * Pagination: cursor walk for PRs with >100 commits. + */ +const PR_COMMITS_ROLLUP_QUERY = ` +query($owner:String!,$repo:String!,$number:Int!,$after:String){ + repository(owner:$owner,name:$repo){ + pullRequest(number:$number){ + commits(first:100,after:$after){ + pageInfo{ hasNextPage endCursor } + nodes{ + commit{ + oid + statusCheckRollup{ + contexts(first:100){ + nodes{ + __typename + ... on CheckRun{ name conclusion } + ... on StatusContext{ context state } + } + } + } + } + } + } + } + } +} +`.trim() + +/** + * Injectable type for the gh execFileSync call (for testability). + * Matches the signature used in private-repo-resolution.ts. + */ +export type GhExecFn = (args: string[], env?: NodeJS.ProcessEnv) => string + +/** + * Default gh exec implementation using execFileSync. + * Follows the pattern from private-repo-resolution.ts. + */ +export function defaultGhExec(args: string[], env?: NodeJS.ProcessEnv): string { + return execFileSync('gh', args, { + encoding: 'utf8', + env: env ?? process.env, + stdio: ['pipe', 'pipe', 'pipe'], + }) +} + +/** + * Fetch per-commit per-check conclusions for a PR via GraphQL. + * Walks cursor pages for PRs with >100 commits. + * Returns commits in chronological order (oldest→newest). + * Returns null on any error (caller degrades the PR). + */ +export function fetchPrCommitCheckRollup( + owner: string, + repo: string, + prNumber: number, + token: string | undefined, + ghExec: GhExecFn = defaultGhExec, +): CommitCheckEntry[] | null { + const env: NodeJS.ProcessEnv = {...process.env} + if (token !== undefined && token !== '') { + env.GH_TOKEN = token + delete env.GITHUB_TOKEN + } + + const allEntries: CommitCheckEntry[] = [] + let after: string | null = null + + // Cursor walk — handles PRs with >100 commits + for (;;) { + const args = [ + 'api', + 'graphql', + '-f', + `query=${PR_COMMITS_ROLLUP_QUERY}`, + '-F', + `owner=${owner}`, + '-F', + `repo=${repo}`, + '-F', + `number=${prNumber}`, + ] + if (after !== null) { + args.push('-f', `after=${after}`) + } + + let stdout: string + try { + stdout = ghExec(args, env) + } catch { + return null + } + + let parsed: unknown + try { + parsed = JSON.parse(stdout) + } catch { + return null + } + + if (!isRecord(parsed)) return null + const data = parsed.data + if (!isRecord(data)) return null + const repository = data.repository + if (!isRecord(repository)) return null + const pullRequest = repository.pullRequest + if (!isRecord(pullRequest)) return null + const commits = pullRequest.commits + if (!isRecord(commits)) return null + const nodes = commits.nodes + if (!Array.isArray(nodes)) return null + + for (const node of nodes) { + if (!isRecord(node)) continue + const commit = node.commit + if (!isRecord(commit)) continue + const oid = commit.oid + if (typeof oid !== 'string') continue + const rollup = commit.statusCheckRollup + if (!isRecord(rollup)) continue + const contexts = rollup.contexts + if (!isRecord(contexts)) continue + const contextNodes = contexts.nodes + if (!Array.isArray(contextNodes)) continue + + for (const ctx of contextNodes) { + if (!isRecord(ctx)) continue + const typename = ctx.__typename + if (typename === 'CheckRun') { + const name = ctx.name + const conclusion = ctx.conclusion + if (typeof name === 'string' && typeof conclusion === 'string') { + allEntries.push({type: 'CheckRun', sha: oid, name, conclusion}) + } + } else if (typename === 'StatusContext') { + const context = ctx.context + const state = ctx.state + if (typeof context === 'string' && typeof state === 'string') { + allEntries.push({type: 'StatusContext', sha: oid, context, state}) + } + } + } + } + + // Check pagination + const pageInfo = commits.pageInfo + if (!isRecord(pageInfo)) break + const hasNextPage = pageInfo.hasNextPage + if (hasNextPage !== true) break + const endCursor = pageInfo.endCursor + if (typeof endCursor !== 'string') break + after = endCursor + } + + return allEntries +} + +// --------------------------------------------------------------------------- +// Diff excerpt builder +// --------------------------------------------------------------------------- + +/** + * Build a diff excerpt from compareCommits file patches, truncated to budget. + * Ranks toward changed hunks (lines starting with + or -). + * Returns empty string when no patches are available. + */ +function buildDiffExcerpt(files: {filename?: string; patch?: string}[], budget: number): string { + const parts: string[] = [] + let remaining = budget + + for (const file of files) { + if (remaining <= 0) break + const patch = file.patch + if (typeof patch !== 'string' || patch === '') continue + const header = `--- ${file.filename ?? 'unknown'}\n` + const chunk = header + patch + if (chunk.length <= remaining) { + parts.push(chunk) + remaining -= chunk.length + } else { + parts.push(chunk.slice(0, remaining)) + remaining = 0 + } + } + + return parts.join('\n') +} + +// --------------------------------------------------------------------------- +// Log excerpt builder +// --------------------------------------------------------------------------- + +/** + * Build a log excerpt from raw job log text, ranked toward error lines. + * Truncates to budget. Returns the excerpt string. + */ +function buildLogExcerpt(logText: string, budget: number): string { + const lines = logText.split('\n') + + // Rank error lines first (lines containing 'error', 'Error', 'ERROR', 'FAILED', 'failed') + const errorLines: string[] = [] + const otherLines: string[] = [] + + for (const line of lines) { + if (/error|failed/i.test(line)) { + errorLines.push(line) + } else { + otherLines.push(line) + } + } + + const ranked = [...errorLines, ...otherLines] + const joined = ranked.join('\n') + return joined.slice(0, budget) +} + +// --------------------------------------------------------------------------- +// CI fix harvester (I/O shell) +// --------------------------------------------------------------------------- + +/** + * Fetch merged PRs in the lookback window and detect CI fail→pass transitions. + * + * For each merged PR (reusing the same closed-PR list pattern as harvestCandidates): + * 1. Run the GraphQL commits/rollup query to get per-commit per-check conclusions. + * 2. Walk commits oldest→newest to find lastFailingSha/firstPassingSha for a required check. + * 3. compareCommits for the fixing diff (drop if no diff — bare re-run). + * 4. Best-effort downloadJobLogsForWorkflowRun for the failed job log excerpt. + * 5. Build CiFixCandidate with deriveSignals for the signals field. + * + * Per-PR errors (GraphQL/compare) degrade that PR (skip), never abort. + * Log fetch errors degrade to '[failure log purged or unavailable]' placeholder. + * + * Injectables: ghExec + octokit + now + mergedPrs (for testability). + * Counts-only telemetry — no owner/repo/name in any log. + */ +export async function harvestCiFixCandidates( + octokit: OctokitClient, + owner: string, + repo: string, + _now: Date, + mergedPrs: { + number: number + merge_commit_sha: string + title: string + labels: {name: string}[] + user: {login: string} | null + }[], + requiredCheckNames: Set, + ghExec: GhExecFn = defaultGhExec, +): Promise<{candidates: CiFixCandidate[]; ciFixPrsExamined: number; ciFixCandidates: number}> { + const token = process.env.GITHUB_TOKEN ?? process.env.GH_TOKEN + + let ciFixPrsExamined = 0 + let ciFixCandidatesCount = 0 + const candidates: CiFixCandidate[] = [] + + for (const pr of mergedPrs) { + ciFixPrsExamined++ + + // Step 1: fetch per-commit per-check conclusions via GraphQL + const entries = fetchPrCommitCheckRollup(owner, repo, pr.number, token, ghExec) + if (entries === null) { + // GraphQL error — degrade this PR, continue + process.stderr.write(`capture-learnings-harvest: ci-fix GraphQL failed (pr=${ciFixPrsExamined})\n`) + continue + } + + // Step 2: find fail→pass transition + const transition = findFailPassTransition(entries, requiredCheckNames) + if (transition === null) { + // No transition — not a CI-fix candidate + continue + } + + const {failingCheckName, lastFailingSha, firstPassingSha} = transition + + // Step 3: compareCommits for the fixing diff + let diffExcerpt: string + try { + const compareResult = await octokit.rest.repos.compareCommits({ + owner, + repo, + base: lastFailingSha, + head: firstPassingSha, + }) + const files = compareResult.data.files ?? [] + if (files.length === 0) { + // Bare re-run with no diff — drop the candidate (scope boundary) + continue + } + diffExcerpt = buildDiffExcerpt(files as {filename?: string; patch?: string}[], MAX_EXCERPT_CHARS_PER_CANDIDATE) + if (diffExcerpt === '') { + // No patches in any file — drop + continue + } + } catch { + // compareCommits error — degrade this PR + process.stderr.write(`capture-learnings-harvest: ci-fix compareCommits failed (pr=${ciFixPrsExamined})\n`) + continue + } + + // Step 4: best-effort log fetch + let logExcerpt: string | undefined + try { + const runsResult = await octokit.rest.actions.listWorkflowRunsForRepo({ + owner, + repo, + head_sha: lastFailingSha, + status: 'completed', + per_page: 10, + }) + const failedRun = runsResult.data.workflow_runs.find(r => r.conclusion === 'failure') + if (failedRun !== undefined) { + const jobsResult = await octokit.rest.actions.listJobsForWorkflowRun({ + owner, + repo, + run_id: failedRun.id, + per_page: 50, + }) + const failedJob = jobsResult.data.jobs.find(j => j.conclusion === 'failure') + if (failedJob !== undefined) { + const logResponse = await octokit.rest.actions.downloadJobLogsForWorkflowRun({ + owner, + repo, + job_id: failedJob.id, + }) + const logText = typeof logResponse.data === 'string' ? logResponse.data : '' + if (logText !== '') { + logExcerpt = buildLogExcerpt(logText, MAX_EXCERPT_CHARS_PER_CANDIDATE) + } + } + } + } catch { + // Log fetch failed — degrade to placeholder, never block the candidate + logExcerpt = '[failure log purged or unavailable]' + } + + // Step 5: build CiFixCandidate + const prLabels = pr.labels.map(l => l.name) + const candidate: CiFixCandidate = { + trigger: 'ci-fail-then-pass', + mergeSha: pr.merge_commit_sha, + signals: deriveSignals({ + title: pr.title, + labels: prLabels.map(name => ({name})), + }), + failingCheckName, + lastFailingSha, + firstPassingSha, + diffExcerpt, + logExcerpt, + } + + candidates.push(candidate) + ciFixCandidatesCount++ + } + + return {candidates, ciFixPrsExamined, ciFixCandidates: ciFixCandidatesCount} +} + /** * Fetch all merged PRs in the lookback window and return those where Fro Bot's * substantive reviews meet the predicate: @@ -828,6 +1365,9 @@ export async function harvestCandidates( mergedPrsInLookback, excludedAutomation, multiRoundCandidates: candidates.length, + // CI-fix counts are populated by harvestCiFixCandidates separately + ciFixPrsExamined: 0, + ciFixCandidates: 0, }, } } diff --git a/scripts/capture-learnings-open.test.ts b/scripts/capture-learnings-open.test.ts index 7cd27d514..848828c98 100644 --- a/scripts/capture-learnings-open.test.ts +++ b/scripts/capture-learnings-open.test.ts @@ -16,6 +16,7 @@ import { LEARNING_PROPOSAL_LABEL, type Candidate, type OctokitClient, + type ReviewCandidate, } from './capture-learnings-harvest.ts' import { ensureLabelsExist, @@ -32,7 +33,7 @@ import {buildPrivateTokenSet} from './wiki-slug.ts' // Fixture helpers // --------------------------------------------------------------------------- -function makeCandidate(overrides: Partial = {}): Candidate { +function makeCandidate(overrides: Partial = {}): Candidate { return { trigger: 'review-heavy', mergeSha: 'abc123def456abc123def456abc123def456abc1', From d8bd78a31323daf89ffbe83be5e4bf247c38e7a0 Mon Sep 17 00:00:00 2001 From: "Marcus R. Brown" Date: Tue, 23 Jun 2026 15:35:19 -0700 Subject: [PATCH 5/7] feat(capture): wire the failed-then-fixed source into the run and prompt Fetch merged PRs once and feed both harvesters, resolve the required-check set from branch protection (falling back to all checks when none are configured), and concatenate both sources into the digest. The agent prompt now branches on the trigger so a failed-then-fixed candidate is distilled from its fixing diff and failure log, and the run summary reports the failed-then-fixed counts and the secret-blocked count. --- .github/workflows/capture-learnings.yaml | 66 +++-- ...-feat-c2-failed-then-fixed-capture-plan.md | 8 +- scripts/capture-learnings-harvest.test.ts | 235 ++++++++++++++++++ scripts/capture-learnings-harvest.ts | 143 ++++++++++- 4 files changed, 424 insertions(+), 28 deletions(-) diff --git a/.github/workflows/capture-learnings.yaml b/.github/workflows/capture-learnings.yaml index 582cf31c0..73930507a 100644 --- a/.github/workflows/capture-learnings.yaml +++ b/.github/workflows/capture-learnings.yaml @@ -91,30 +91,63 @@ jobs: CAPTURE_LEARNINGS_DIGEST_PATH: ${{ runner.temp }}/capture-learnings-digest.json TASK_PROMPT: | You are authoring learning-proposal bodies. Each body distills a reusable - learning from a pull request that went through multiple substantive review rounds. + learning from a pull request — either one that went through multiple substantive + review rounds, or one whose CI checks transitioned failed → passed before merge. Your ONLY job in this step: 1. Read the harvest digest from $CAPTURE_LEARNINGS_DIGEST_PATH (or equivalently $RUNNER_TEMP/capture-learnings-digest.json). The digest is a JSON object: { "candidates": [...], "telemetry": {...} } - Each candidate in the candidates array has: + Each candidate has a `trigger` field that determines its shape: + + trigger: "review-heavy" candidates have: - mergeSha: opaque merge commit SHA (the ONLY identifier you receive) - - reviewRounds: number of substantive Fro Bot review rounds (approved, changes-requested, or dismissed — a dismissed approval marks a re-review forced by a new push) - - reviewExcerpts: string[] — the actual review prose from the rounds (review bodies and line-level review comments), ranked with the correction feedback first. This is your PRIMARY signal: it is what the reviewer actually said about what was wrong and how it was fixed. It has been privacy-scanned; use it as given. May be empty for some candidates. + - reviewRounds: number of substantive Fro Bot review rounds (approved, + changes-requested, or dismissed — a dismissed approval marks a re-review + forced by a new push) + - reviewExcerpts: string[] — the actual review prose from the rounds (review + bodies and line-level review comments), ranked with the correction feedback + first. This is your PRIMARY signal: it is what the reviewer actually said + about what was wrong and how it was fixed. It has been privacy-scanned; use + it as given. May be empty for some candidates. - signals: { titleTokens: string[], labels: string[] } — secondary context only - 2. For each candidate you judge worth proposing (reviewRounds ≥ 2 is - already guaranteed), write a concise learning-proposal body (2–4 - paragraphs) that: - - Distills the reusable learning from reviewExcerpts — what the - review rounds actually said was wrong and how it was corrected. - Ground the learning in that prose, not in the title tokens. When - reviewExcerpts is empty, fall back to signals but keep the body - modest and clearly general rather than inventing specifics. - - References the source PR by merge SHA ONLY (never look up or - include owner/repo/PR number/title/author — you do not have that - information and must not attempt to retrieve it) + + trigger: "ci-fail-then-pass" candidates have: + - mergeSha: opaque merge commit SHA (the ONLY identifier you receive) + - failingCheckName: name of the CI check that transitioned failed → passed + - diffExcerpt: string — the fixing diff between the last-failing and + first-passing commit. This is your PRIMARY signal: it shows exactly what + change turned the check green. It has been privacy-scanned; use it as given. + May be empty if evidence was blocked by the privacy scan. + - logExcerpt: string (optional) — excerpt from the failing job log, ranked + toward error lines. May be '[failure log purged or unavailable]' when the + log could not be fetched. Secondary signal — the diff is primary. + - signals: { titleTokens: string[], labels: string[] } — tertiary context only + + 2. For each candidate you judge worth proposing, write a concise learning-proposal + body (2–4 paragraphs) that: + + For trigger: "review-heavy" — + - Distills the reusable learning from reviewExcerpts — what the review rounds + actually said was wrong and how it was corrected. Ground the learning in that + prose, not in the title tokens. When reviewExcerpts is empty, fall back to + signals but keep the body modest and clearly general rather than inventing + specifics. + + For trigger: "ci-fail-then-pass" — + - Distills the learning as "this failure → this fix": what the failing check + caught (from failingCheckName + logExcerpt) and how diffExcerpt corrected it. + Ground the learning in diffExcerpt — that is the concrete evidence of what + changed. Do NOT invent specifics from the title tokens. When diffExcerpt is + empty (evidence blocked), keep the body modest and clearly general. + + For all triggers — + - References the source PR by merge SHA ONLY (never look up or include + owner/repo/PR number/title/author — you do not have that information and + must not attempt to retrieve it) - Proposes what the learning would be - Is suitable for a human to review and author into docs/solutions/ + 3. Write the bodies as a JSON object to $RUNNER_TEMP/capture-learnings-bodies.json: { "": "", ... } Include only candidates you judge worth proposing. Omit candidates @@ -172,10 +205,13 @@ jobs: echo "| Merged PRs in lookback | $(jq '.telemetry.mergedPrsInLookback // 0' "$harvest") |" echo "| Excluded automation | $(jq '.telemetry.excludedAutomation // 0' "$harvest") |" echo "| Multi-round candidates | $(jq '.telemetry.multiRoundCandidates // 0' "$harvest") |" + echo "| CI fail-then-pass examined | $(jq '.telemetry.ciFixPrsExamined // 0' "$harvest") |" + echo "| CI fail-then-pass candidates | $(jq '.telemetry.ciFixCandidates // 0' "$harvest") |" echo "| After seen dedup | $(jq '.telemetry.afterSeenDedup // 0' "$harvest") |" echo "| After solutions dedup | $(jq '.telemetry.afterSolutionsDedup // 0' "$harvest") |" echo "| Candidates emitted | $(jq '.telemetry.emitted // 0' "$harvest") |" echo "| Enrichment blocked by privacy | $(jq '.telemetry.enrichmentBlocked // 0' "$harvest") |" + echo "| Enrichment blocked by secret | $(jq '.telemetry.enrichmentBlockedBySecret // 0' "$harvest") |" else echo "| Closed PRs fetched | (harvest result unavailable) |" fi diff --git a/docs/plans/2026-06-22-004-feat-c2-failed-then-fixed-capture-plan.md b/docs/plans/2026-06-22-004-feat-c2-failed-then-fixed-capture-plan.md index 9ca0fee82..93702c8a0 100644 --- a/docs/plans/2026-06-22-004-feat-c2-failed-then-fixed-capture-plan.md +++ b/docs/plans/2026-06-22-004-feat-c2-failed-then-fixed-capture-plan.md @@ -164,7 +164,7 @@ open step — body privacy scan (private-name + secret block) → learning-propo ## Implementation Units -- [ ] **Unit 1: Secret/redaction scan in the shared privacy module** +- [x] **Unit 1: Secret/redaction scan in the shared privacy module** **Goal:** Add log/diff secret detection + structural redaction to the shared gate, callable from both chokepoints, without touching the existing private-repo-name scan. @@ -201,7 +201,7 @@ both chokepoints, without touching the existing private-repo-name scan. **Verification:** gates green; the mutation-proof test bites; no secret value appears in any test assertion message. -- [ ] **Unit 2: Discriminated Candidate union + generalized digest core** +- [x] **Unit 2: Discriminated Candidate union + generalized digest core** **Goal:** Refactor `Candidate` into a `trigger`-discriminated union with per-source `evidence`, and generalize `buildCandidateDigest` to iterate the union uniformly (dedup, cap, privacy), with @@ -243,7 +243,7 @@ test; `buildReviewExcerpts` ordering invariant. **Verification:** gates green; review-source output unchanged; opacity holds for both triggers; the privacy mutation-proof bites. -- [ ] **Unit 3: CI fail→pass harvester (GraphQL detector + diff/log evidence)** +- [x] **Unit 3: CI fail→pass harvester (GraphQL detector + diff/log evidence)** **Goal:** Implement `harvestCiFixCandidates` — detect fail→pass PRs via one GraphQL query each, derive the fixing diff and best-effort log excerpt, and emit `CiFixCandidate`s. @@ -287,7 +287,7 @@ the existing merged-PR fetch + `now` injection; `buildReviewExcerpts` truncate-t **Verification:** gates green; the transition walk is mutation-proven (a fixture where removing the oldest→newest ordering picks the wrong SHA fails); degrade paths don't abort. -- [ ] **Unit 4: Workflow wiring + trigger-branched agent prompt** +- [x] **Unit 4: Workflow wiring + trigger-branched agent prompt** **Goal:** Concatenate both candidate sources into the digest, branch the agent prompt on `trigger`, and surface per-trigger + secret-blocked counts in the step summary. diff --git a/scripts/capture-learnings-harvest.test.ts b/scripts/capture-learnings-harvest.test.ts index 7a934ffad..893da70d1 100644 --- a/scripts/capture-learnings-harvest.test.ts +++ b/scripts/capture-learnings-harvest.test.ts @@ -14,6 +14,7 @@ import { buildCandidateDigest, buildMergeShaMarker, DEPENDENCY_LABELS, + fetchMergedPrsInWindow, fetchOpenedLearningShas, findFailPassTransition, FRO_BOT_REVIEWER_LOGINS, @@ -1492,6 +1493,8 @@ describe('harvest→open schema contract', () => { mergedPrsInLookback: 10, excludedAutomation: 1, multiRoundCandidates: 5, + ciFixPrsExamined: 0, + ciFixCandidates: 0, afterSeenDedup: 3, afterSolutionsDedup: 2, emitted: 2, @@ -2736,3 +2739,235 @@ describe('buildCandidateDigest — CiFix evidence privacy scan (mutation proof)' expect(JSON.stringify(result)).not.toContain('ghp_') }) }) + +// --------------------------------------------------------------------------- +// I/O shell: fetchMergedPrsInWindow (mocked Octokit) +// --------------------------------------------------------------------------- + +function mockOctokitForFetch(prs: unknown[]): OctokitClient { + return { + paginate: async () => prs, + rest: { + pulls: { + list: async () => ({data: []}), + listReviews: async () => ({data: []}), + }, + issues: {listForRepo: async () => ({data: []})}, + }, + } as unknown as OctokitClient +} + +describe('fetchMergedPrsInWindow', () => { + it('returns merged PRs within the lookback window with required fields', async () => { + // #given a merged PR within the lookback window + const sha = 'merge001aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa1' + const pr = { + number: 1, + merged_at: new Date().toISOString(), + merge_commit_sha: sha, + title: 'fix: correct the CI failure', + labels: [{name: 'ci'}], + user: {login: 'some-human'}, + } + const octokit = mockOctokitForFetch([pr]) + + // #when fetching + const result = await fetchMergedPrsInWindow(octokit, 'fro-bot', '.github', new Date()) + + // #then the PR is returned with the required fields + expect(result).toHaveLength(1) + expect(result[0]?.merge_commit_sha).toBe(sha) + expect(result[0]?.number).toBe(1) + expect(result[0]?.title).toBe('fix: correct the CI failure') + expect(result[0]?.labels).toEqual([{name: 'ci'}]) + expect(result[0]?.user).toEqual({login: 'some-human'}) + expect(result[0]?.merged_at).toBeDefined() + }) + + it('excludes unmerged PRs (merged_at === null)', async () => { + // #given a closed but unmerged PR + const pr = { + number: 1, + merged_at: null, + merge_commit_sha: 'sha001aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa1', + title: 'fix: something', + labels: [], + user: {login: 'human'}, + } + const octokit = mockOctokitForFetch([pr]) + + // #when fetching + const result = await fetchMergedPrsInWindow(octokit, 'fro-bot', '.github', new Date()) + + // #then the PR is excluded + expect(result).toHaveLength(0) + }) + + it('excludes PRs merged outside the lookback window', async () => { + // #given a PR merged 60 days ago (beyond LOOKBACK_DAYS=30) + const oldDate = new Date() + oldDate.setDate(oldDate.getDate() - 60) + const pr = { + number: 1, + merged_at: oldDate.toISOString(), + merge_commit_sha: 'sha001aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa1', + title: 'fix: old', + labels: [], + user: {login: 'human'}, + } + const octokit = mockOctokitForFetch([pr]) + + // #when fetching + const result = await fetchMergedPrsInWindow(octokit, 'fro-bot', '.github', new Date()) + + // #then the PR is excluded + expect(result).toHaveLength(0) + }) + + it('excludes PRs with null merge_commit_sha', async () => { + // #given a merged PR with no merge commit SHA + const pr = { + number: 1, + merged_at: new Date().toISOString(), + merge_commit_sha: null, + title: 'fix: something', + labels: [], + user: {login: 'human'}, + } + const octokit = mockOctokitForFetch([pr]) + + // #when fetching + const result = await fetchMergedPrsInWindow(octokit, 'fro-bot', '.github', new Date()) + + // #then the PR is excluded + expect(result).toHaveLength(0) + }) + + it('returns multiple PRs within the window', async () => { + // #given two merged PRs within the lookback window + const prs = [ + { + number: 1, + merged_at: new Date().toISOString(), + merge_commit_sha: 'sha001aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa1', + title: 'fix: first', + labels: [], + user: {login: 'human'}, + }, + { + number: 2, + merged_at: new Date().toISOString(), + merge_commit_sha: 'sha002aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa1', + title: 'fix: second', + labels: [], + user: {login: 'human'}, + }, + ] + const octokit = mockOctokitForFetch(prs) + + // #when fetching + const result = await fetchMergedPrsInWindow(octokit, 'fro-bot', '.github', new Date()) + + // #then both PRs are returned + expect(result).toHaveLength(2) + }) + + it('handles null user gracefully', async () => { + // #given a merged PR with null user + const pr = { + number: 1, + merged_at: new Date().toISOString(), + merge_commit_sha: 'sha001aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa1', + title: 'fix: something', + labels: [], + user: null, + } + const octokit = mockOctokitForFetch([pr]) + + // #when fetching + const result = await fetchMergedPrsInWindow(octokit, 'fro-bot', '.github', new Date()) + + // #then the PR is returned with null user + expect(result).toHaveLength(1) + expect(result[0]?.user).toBeNull() + }) +}) + +// --------------------------------------------------------------------------- +// main() wiring: both harvesters concatenated (integration-level unit test) +// --------------------------------------------------------------------------- + +describe('main() wiring: both candidate sources concatenated via buildCandidateDigest', () => { + it('review-heavy + ci-fix candidates from separate harvesters are both fed to buildCandidateDigest', () => { + // #given one review-heavy candidate and one ci-fix candidate (simulating both harvesters) + const reviewCandidate = makeCandidate({mergeSha: 'review01aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa1'}) + const ciFixCandidate = makeCiFixCandidate({mergeSha: 'cifix001aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa1'}) + + // Concatenate as main() does + const allCandidates: Candidate[] = [reviewCandidate, ciFixCandidate] + + // Merge stageCounts as main() does + const reviewStageCounts: HarvestStageCounts = { + closedPrsFetched: 10, + mergedPrsInLookback: 5, + excludedAutomation: 1, + multiRoundCandidates: 1, + ciFixPrsExamined: 0, + ciFixCandidates: 0, + } + const mergedStageCounts: HarvestStageCounts = { + ...reviewStageCounts, + ciFixPrsExamined: 3, + ciFixCandidates: 1, + } + + // #when building the digest with both sources + const result = buildCandidateDigest({ + mergedPrs: allCandidates, + stageCounts: mergedStageCounts, + openedLearningShas: new Set(), + solutionsDocs: [], + maxLearnings: 5, + privateTokens: new Set(), + }) + + // #then both candidates are emitted + expect(result.candidates).toHaveLength(2) + const triggers = result.candidates.map(c => c.trigger).sort() + expect(triggers).toEqual(['ci-fail-then-pass', 'review-heavy']) + + // #then merged stage counts are threaded into telemetry + expect(result.telemetry.multiRoundCandidates).toBe(1) + expect(result.telemetry.ciFixPrsExamined).toBe(3) + expect(result.telemetry.ciFixCandidates).toBe(1) + expect(result.telemetry.closedPrsFetched).toBe(10) + }) + + it('fail-closed fallback telemetry includes ciFixPrsExamined and ciFixCandidates as 0', () => { + // #given the empty digest shape used in the catch block of main() + // This test verifies the shape is consistent (ciFixPrsExamined + ciFixCandidates present) + const empty: CandidateDigest = { + candidates: [], + telemetry: { + closedPrsFetched: 0, + mergedPrsInLookback: 0, + excludedAutomation: 0, + multiRoundCandidates: 0, + ciFixPrsExamined: 0, + ciFixCandidates: 0, + afterSeenDedup: 0, + afterSolutionsDedup: 0, + emitted: 0, + enrichmentBlocked: 0, + enrichmentBlockedBySecret: 0, + }, + } + + // #then the shape is consistent and serializes correctly + const serialized = JSON.stringify(empty) + const parsed = JSON.parse(serialized) as CandidateDigest + expect(parsed.telemetry.ciFixPrsExamined).toBe(0) + expect(parsed.telemetry.ciFixCandidates).toBe(0) + expect(parsed.telemetry.enrichmentBlockedBySecret).toBe(0) + }) +}) diff --git a/scripts/capture-learnings-harvest.ts b/scripts/capture-learnings-harvest.ts index c3d47ef8a..ce845c10e 100644 --- a/scripts/capture-learnings-harvest.ts +++ b/scripts/capture-learnings-harvest.ts @@ -206,6 +206,10 @@ export interface DigestTelemetry { mergedPrsInLookback: number excludedAutomation: number multiRoundCandidates: number + /** Number of PRs examined for CI fail→pass transition. Threaded from HarvestStageCounts. */ + ciFixPrsExamined: number + /** Number of CI-fix candidates found (transition detected + real diff). Threaded from HarvestStageCounts. */ + ciFixCandidates: number afterSeenDedup: number afterSolutionsDedup: number emitted: number @@ -1405,6 +1409,81 @@ export async function fetchOpenedLearningShas( return seen } +// --------------------------------------------------------------------------- +// Shared merged-PR fetch (for harvestCiFixCandidates in main) +// --------------------------------------------------------------------------- + +/** + * Fetch all merged PRs in the lookback window and return the raw list. + * + * This is a separate fetch from harvestCandidates's internal fetch. The two-fetch + * approach was chosen over refactoring harvestCandidates to accept a pre-fetched list + * because harvestCandidates's fetch is deeply entangled with its filtering logic + * (merged_at, cutoff, merge_commit_sha, automation exclusion). Extracting it would + * require changing the function signature and all its tests, increasing blast radius + * with no behavioral benefit. The extra pulls.list call is bounded by LOOKBACK_DAYS + * and is not on the hot path. + * + * Returns PRs with the fields required by harvestCiFixCandidates. + * Exported for testability. + */ +export async function fetchMergedPrsInWindow( + octokit: OctokitClient, + owner: string, + repo: string, + now: Date, +): Promise< + { + number: number + merge_commit_sha: string + title: string + labels: {name: string}[] + user: {login: string} | null + merged_at: string + }[] +> { + const cutoff = new Date(now.getTime() - LOOKBACK_DAYS * 24 * 60 * 60 * 1000) + + const allPrs = await octokit.paginate(octokit.rest.pulls.list, { + owner, + repo, + state: 'closed', + per_page: 100, + } as unknown as Parameters[0]) + + const result: { + number: number + merge_commit_sha: string + title: string + labels: {name: string}[] + user: {login: string} | null + merged_at: string + }[] = [] + + for (const pr of allPrs) { + // Exclude unmerged PRs + if (pr.merged_at === null || pr.merged_at === undefined) continue + + // Exclude PRs outside the lookback window + const mergedAt = new Date(pr.merged_at) + if (mergedAt < cutoff) continue + + // Exclude PRs without a merge commit SHA + if (pr.merge_commit_sha === null || pr.merge_commit_sha === undefined) continue + + result.push({ + number: pr.number, + merge_commit_sha: pr.merge_commit_sha, + title: pr.title, + labels: pr.labels.map(l => ({name: typeof l === 'string' ? l : l.name})), + user: pr.user !== null && pr.user !== undefined ? {login: pr.user.login} : null, + merged_at: pr.merged_at, + }) + } + + return result +} + // --------------------------------------------------------------------------- // Entry point // --------------------------------------------------------------------------- @@ -1418,11 +1497,53 @@ async function main(): Promise { // Use Date.now() — UTC ms — as the reference point for the lookback cutoff. const now = new Date(Date.now()) - const [{candidates: mergedPrs, stageCounts}, openedLearningShas, solutionsFiles] = await Promise.all([ - harvestCandidates(octokit, owner, repo, now), - fetchOpenedLearningShas(octokit, owner, repo), - loadSolutionsFilesFromDisk(), - ]) + // Fetch merged PRs once for the ci-fix harvester (separate from harvestCandidates's + // internal fetch — see fetchMergedPrsInWindow for the rationale). + const [{candidates: reviewCandidates, stageCounts}, openedLearningShas, solutionsFiles, mergedPrsForCiFix] = + await Promise.all([ + harvestCandidates(octokit, owner, repo, now), + fetchOpenedLearningShas(octokit, owner, repo), + loadSolutionsFilesFromDisk(), + fetchMergedPrsInWindow(octokit, owner, repo, now), + ]) + + // Get the required-checks set from branch protection. + // Wrap in try/catch — if branch protection is absent (404) or the call fails, + // use an empty Set (findFailPassTransition treats empty = all checks count). + // Counts-only logging on failure. + let requiredCheckNames = new Set() + try { + const branchProtection = await octokit.rest.repos.getBranchProtection({ + owner, + repo, + branch: 'main', + }) + const checks = branchProtection.data.required_status_checks?.checks ?? [] + requiredCheckNames = new Set(checks.map((c: {context: string}) => c.context)) + } catch { + // 404 = no branch protection configured; any other error = degrade gracefully. + // Empty set means all failed→passed transitions count (documented approximation). + process.stderr.write( + `capture-learnings-harvest: getBranchProtection failed or absent, using empty required-checks set (all transitions count)\n`, + ) + } + + // Harvest CI fail→pass candidates using the shared merged-PR list. + const { + candidates: ciFixCandidates, + ciFixPrsExamined, + ciFixCandidates: ciFixCandidatesCount, + } = await harvestCiFixCandidates(octokit, owner, repo, now, mergedPrsForCiFix, requiredCheckNames) + + // Concatenate both candidate sources. + const allCandidates: Candidate[] = [...reviewCandidates, ...ciFixCandidates] + + // Merge stage counts: add ci-fix counts into the review-harvester's stageCounts. + const mergedStageCounts: HarvestStageCounts = { + ...stageCounts, + ciFixPrsExamined, + ciFixCandidates: ciFixCandidatesCount, + } const solutionsDocs = collectSolutionDocs(solutionsFiles) @@ -1443,13 +1564,15 @@ async function main(): Promise { enrichmentScanAvailable = false } - // If the scan is unavailable, clear all reviewExcerpts before passing to the pure core. + // If the scan is unavailable, clear all enriched evidence before passing to the pure core. // This ensures no unscanned prose reaches the digest under any path. - const safeMergedPrs = applyEnrichmentScanAvailability(mergedPrs, enrichmentScanAvailable) + // applyEnrichmentScanAvailability handles both ReviewCandidate (clears reviewExcerpts) + // and CiFixCandidate (clears diffExcerpt + logExcerpt) — Unit 3 verified this. + const safeCandidates = applyEnrichmentScanAvailability(allCandidates, enrichmentScanAvailable) const digest = buildCandidateDigest({ - mergedPrs: safeMergedPrs, - stageCounts, + mergedPrs: safeCandidates, + stageCounts: mergedStageCounts, openedLearningShas, solutionsDocs, maxLearnings: MAX_LEARNINGS_PER_RUN, @@ -1470,6 +1593,8 @@ async function main(): Promise { mergedPrsInLookback: 0, excludedAutomation: 0, multiRoundCandidates: 0, + ciFixPrsExamined: 0, + ciFixCandidates: 0, afterSeenDedup: 0, afterSolutionsDedup: 0, emitted: 0, From 93a49173d34911af91dfb6ba96ac5ec21227ed06 Mon Sep 17 00:00:00 2001 From: "Marcus R. Brown" Date: Tue, 23 Jun 2026 16:34:28 -0700 Subject: [PATCH 6/7] fix(capture): harden the failed-then-fixed trigger after review MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Bound each gh invocation with a timeout and cap commit pagination so a hung process can't stall the run. Stop treating cancelled and action-required checks as failures — only real failures qualify — and recognize errored status contexts. Scan the failing check name alongside the diff and log, add the open step's secret check as a final gate before posting, widen the secret patterns (Google, GitLab, JWT, generic credentials), include base64 characters in the bearer redaction, decode buffer log responses, and give failed-then-fixed proposals a neutral title. --- scripts/capture-learnings-harvest.test.ts | 281 +++++++++++++++++++++- scripts/capture-learnings-harvest.ts | 63 +++-- scripts/capture-learnings-open.test.ts | 78 +++++- scripts/capture-learnings-open.ts | 17 +- scripts/capture-learnings-privacy.test.ts | 107 ++++++++ scripts/capture-learnings-privacy.ts | 11 +- 6 files changed, 533 insertions(+), 24 deletions(-) diff --git a/scripts/capture-learnings-harvest.test.ts b/scripts/capture-learnings-harvest.test.ts index 893da70d1..b26411fd3 100644 --- a/scripts/capture-learnings-harvest.test.ts +++ b/scripts/capture-learnings-harvest.test.ts @@ -2249,9 +2249,10 @@ describe('findFailPassTransition', () => { expect(result?.firstPassingSha).toBe('sha2') }) - it('all failing conclusions are recognized: TIMED_OUT, CANCELLED, ACTION_REQUIRED, STARTUP_FAILURE', () => { + it('all failing conclusions are recognized: FAILURE, TIMED_OUT, STARTUP_FAILURE', () => { // #given each failing conclusion type followed by SUCCESS - const failingConclusions = ['TIMED_OUT', 'CANCELLED', 'ACTION_REQUIRED', 'STARTUP_FAILURE'] + // Note: CANCELLED and ACTION_REQUIRED are NOT failing conclusions (FIX 2) + const failingConclusions = ['FAILURE', 'TIMED_OUT', 'STARTUP_FAILURE'] for (const conclusion of failingConclusions) { const commits: CommitCheckEntry[] = [ makeCr('sha1', 'CI / test', conclusion), @@ -2971,3 +2972,279 @@ describe('main() wiring: both candidate sources concatenated via buildCandidateD expect(parsed.telemetry.enrichmentBlockedBySecret).toBe(0) }) }) + +// --------------------------------------------------------------------------- +// FIX 2: FAILING_CHECK_CONCLUSIONS — CANCELLED/ACTION_REQUIRED removed +// --------------------------------------------------------------------------- + +describe('findFailPassTransition — FIX 2: CANCELLED and ACTION_REQUIRED are NOT failing conclusions', () => { + it('CANCELLED→SUCCESS is NOT a candidate (CANCELLED removed from failing set)', () => { + // #given a check that was CANCELLED then SUCCESS + // CANCELLED is a concurrency-cancelled run, not a code failure + const commits: CommitCheckEntry[] = [ + makeCr('sha1', 'CI / test', 'CANCELLED'), + makeCr('sha2', 'CI / test', 'SUCCESS'), + ] + const required = new Set(['CI / test']) + + // #when finding the transition + const result = findFailPassTransition(commits, required) + + // #then no transition found (CANCELLED is not a failing conclusion) + expect(result).toBeNull() + }) + + it('ACTION_REQUIRED→SUCCESS is NOT a candidate (ACTION_REQUIRED removed from failing set)', () => { + // #given a check that was ACTION_REQUIRED then SUCCESS + // ACTION_REQUIRED is a policy gate, not a code failure + const commits: CommitCheckEntry[] = [ + makeCr('sha1', 'CI / test', 'ACTION_REQUIRED'), + makeCr('sha2', 'CI / test', 'SUCCESS'), + ] + const required = new Set(['CI / test']) + + // #when finding the transition + const result = findFailPassTransition(commits, required) + + // #then no transition found (ACTION_REQUIRED is not a failing conclusion) + expect(result).toBeNull() + }) + + it('FAILURE→SUCCESS IS a candidate (FAILURE still in failing set)', () => { + // #given a check that was FAILURE then SUCCESS + const commits: CommitCheckEntry[] = [makeCr('sha1', 'CI / test', 'FAILURE'), makeCr('sha2', 'CI / test', 'SUCCESS')] + const required = new Set(['CI / test']) + + // #when finding the transition + const result = findFailPassTransition(commits, required) + + // #then transition IS found + expect(result).not.toBeNull() + expect(result?.lastFailingSha).toBe('sha1') + expect(result?.firstPassingSha).toBe('sha2') + }) + + it('StatusContext ERROR→SUCCESS IS detected (FIX 2b: ERROR treated as failing)', () => { + // #given a StatusContext entry with state=ERROR then SUCCESS + const commits: CommitCheckEntry[] = [makeSc('sha1', 'ci/test', 'ERROR'), makeSc('sha2', 'ci/test', 'SUCCESS')] + const required = new Set(['ci/test']) + + // #when finding the transition + const result = findFailPassTransition(commits, required) + + // #then the transition IS found (ERROR counts as failing) + expect(result).not.toBeNull() + expect(result?.failingCheckName).toBe('ci/test') + expect(result?.lastFailingSha).toBe('sha1') + expect(result?.firstPassingSha).toBe('sha2') + }) + + it('StatusContext PENDING→SUCCESS is NOT a candidate (PENDING is not failing)', () => { + // #given a StatusContext entry with state=PENDING then SUCCESS + const commits: CommitCheckEntry[] = [makeSc('sha1', 'ci/test', 'PENDING'), makeSc('sha2', 'ci/test', 'SUCCESS')] + const required = new Set(['ci/test']) + + // #when finding the transition + const result = findFailPassTransition(commits, required) + + // #then no transition found (PENDING is not a failing state) + expect(result).toBeNull() + }) +}) + +// --------------------------------------------------------------------------- +// FIX 3: failingCheckName included in privacy scan +// --------------------------------------------------------------------------- + +describe('buildCandidateDigest — FIX 3: failingCheckName scanned for private names', () => { + it('private repo name in failingCheckName → evidence cleared, enrichmentBlocked incremented', () => { + // #given a ci-fix candidate whose failingCheckName contains a private repo name + const privateTokens = buildPrivateTokenSet(['testowner/secret-repo']) + const sha = 'fix3priv1aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa1' + const candidate = makeCiFixCandidate({ + mergeSha: sha, + failingCheckName: 'testowner/secret-repo CI check', + diffExcerpt: '-bad\n+good', + }) + const input = makeDigestInput({mergedPrs: [candidate], privateTokens}) + + // #when building the digest + const result = buildCandidateDigest(input) + + // #then enrichmentBlocked is incremented (private-name hit in failingCheckName) + expect(result.telemetry.enrichmentBlocked).toBe(1) + // #then evidence is cleared and failingCheckName is redacted + if (result.candidates[0]?.trigger === 'ci-fail-then-pass') { + expect(result.candidates[0].diffExcerpt).toBe('') + expect(result.candidates[0].failingCheckName).toBe('[REDACTED]') + } + // #then private name does not appear in serialized output + expect(JSON.stringify(result)).not.toContain('testowner/secret-repo') + }) + + it('clean failingCheckName passes through unmodified', () => { + // #given a ci-fix candidate with a clean failingCheckName + const sha = 'fix3clean1aaaaaaaaaaaaaaaaaaaaaaaaaaaaaa1' + const candidate = makeCiFixCandidate({ + mergeSha: sha, + failingCheckName: 'CI / test', + diffExcerpt: '-bad\n+good', + }) + const input = makeDigestInput({mergedPrs: [candidate], privateTokens: new Set()}) + + // #when building the digest + const result = buildCandidateDigest(input) + + // #then candidate is emitted with failingCheckName intact + expect(result.candidates).toHaveLength(1) + if (result.candidates[0]?.trigger === 'ci-fail-then-pass') { + expect(result.candidates[0].failingCheckName).toBe('CI / test') + } + expect(result.telemetry.enrichmentBlocked).toBe(0) + }) +}) + +// --------------------------------------------------------------------------- +// FIX 9: within-run dedup precedence (review-heavy wins regardless of array order) +// --------------------------------------------------------------------------- + +describe('buildCandidateDigest — FIX 9: within-run dedup precedence by trigger, not array order', () => { + it('review-heavy FIRST, ci-fix SECOND → review-heavy wins (array-order-first case)', () => { + // #given review-heavy listed first, ci-fix second — same SHA + const sha = 'dedup001aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa1' + const reviewCandidate = makeCandidate({mergeSha: sha}) + const ciFixCandidate = makeCiFixCandidate({mergeSha: sha}) + const input = makeDigestInput({mergedPrs: [reviewCandidate, ciFixCandidate]}) + + // #when building the digest + const result = buildCandidateDigest(input) + + // #then exactly one candidate, review-heavy wins + expect(result.candidates).toHaveLength(1) + expect(result.candidates[0]?.trigger).toBe('review-heavy') + }) + + it('ci-fix FIRST, review-heavy SECOND → review-heavy STILL wins (proves precedence is by trigger, not array order)', () => { + // #given ci-fix listed first, review-heavy second — same SHA + // A 'last-wins' bug would emit ci-fail-then-pass here; correct behavior emits review-heavy + const sha = 'dedup002aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa1' + const ciFixCandidate = makeCiFixCandidate({mergeSha: sha}) + const reviewCandidate = makeCandidate({mergeSha: sha}) + const input = makeDigestInput({mergedPrs: [ciFixCandidate, reviewCandidate]}) + + // #when building the digest + const result = buildCandidateDigest(input) + + // #then exactly one candidate, review-heavy wins regardless of array order + expect(result.candidates).toHaveLength(1) + expect(result.candidates[0]?.trigger).toBe('review-heavy') + }) +}) + +// --------------------------------------------------------------------------- +// FIX 9: applyEnrichmentScanAvailability — ci-fix candidate test +// --------------------------------------------------------------------------- + +describe('applyEnrichmentScanAvailability — FIX 9: ci-fix candidate clears diffExcerpt + logExcerpt', () => { + it('scanAvailable=false clears diffExcerpt and logExcerpt for ci-fix candidates', () => { + // #given a ci-fix candidate with evidence fields + const candidates: CiFixCandidate[] = [ + makeCiFixCandidate({ + mergeSha: `sha1${'0'.repeat(35)}`, + diffExcerpt: '-bad\n+good', + logExcerpt: 'Error: test failed', + }), + ] + + // #when scan is unavailable + const result = applyEnrichmentScanAvailability(candidates, false) + + // #then diffExcerpt is cleared and logExcerpt is undefined + const r0 = result[0] + if (r0?.trigger === 'ci-fail-then-pass') { + expect(r0.diffExcerpt).toBe('') + expect(r0.logExcerpt).toBeUndefined() + } + // #then other fields are preserved + expect(result[0]?.mergeSha).toBe(candidates[0]?.mergeSha) + }) + + it('scanAvailable=true → ci-fix candidate returned unchanged', () => { + // #given a ci-fix candidate with evidence fields + const candidates: CiFixCandidate[] = [ + makeCiFixCandidate({ + mergeSha: `sha1${'0'.repeat(35)}`, + diffExcerpt: '-bad\n+good', + logExcerpt: 'Error: test failed', + }), + ] + + // #when scan is available + const result = applyEnrichmentScanAvailability(candidates, true) + + // #then candidates are returned unchanged (same reference) + expect(result).toBe(candidates) + const r0 = result[0] + if (r0?.trigger === 'ci-fail-then-pass') { + expect(r0.diffExcerpt).toBe('-bad\n+good') + expect(r0.logExcerpt).toBe('Error: test failed') + } + }) +}) + +// --------------------------------------------------------------------------- +// FIX 9: CiFix logExcerpt privacy — private name and path in logExcerpt +// --------------------------------------------------------------------------- + +describe('buildCandidateDigest — FIX 9: CiFix logExcerpt privacy scan', () => { + it('private repo name in logExcerpt (not diffExcerpt) → evidence dropped', () => { + // #given a ci-fix candidate whose logExcerpt contains a private repo name + // but diffExcerpt is clean + const privateTokens = buildPrivateTokenSet(['testowner/secret-repo']) + const sha = 'fix9log1aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa1' + const candidate = makeCiFixCandidate({ + mergeSha: sha, + diffExcerpt: '-bad\n+good', + logExcerpt: 'Error: testowner/secret-repo check failed', + }) + const input = makeDigestInput({mergedPrs: [candidate], privateTokens}) + + // #when building the digest + const result = buildCandidateDigest(input) + + // #then enrichmentBlocked is incremented (private name in logExcerpt) + expect(result.telemetry.enrichmentBlocked).toBe(1) + // #then both diffExcerpt and logExcerpt are cleared + if (result.candidates[0]?.trigger === 'ci-fail-then-pass') { + expect(result.candidates[0].diffExcerpt).toBe('') + expect(result.candidates[0].logExcerpt).toBeUndefined() + } + // #then private name does not appear in serialized output + expect(JSON.stringify(result)).not.toContain('testowner/secret-repo') + }) + + it('path in logExcerpt is redacted (not blocked), candidate still emitted', () => { + // #given a ci-fix candidate whose logExcerpt contains a file path (redact-class) + const sha = 'fix9log2aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa1' + const candidate = makeCiFixCandidate({ + mergeSha: sha, + diffExcerpt: '-bad\n+good', + logExcerpt: 'Config loaded from /Users/marcus/.ssh/config', + }) + const input = makeDigestInput({mergedPrs: [candidate], privateTokens: new Set()}) + + // #when building the digest + const result = buildCandidateDigest(input) + + // #then the candidate is emitted (not blocked) + expect(result.candidates).toHaveLength(1) + // #then the path is redacted in logExcerpt + if (result.candidates[0]?.trigger === 'ci-fail-then-pass') { + expect(result.candidates[0].logExcerpt).not.toContain('/Users/marcus') + expect(result.candidates[0].logExcerpt).toContain('[REDACTED]') + } + // #then neither counter is incremented (redaction, not blocking) + expect(result.telemetry.enrichmentBlocked).toBe(0) + expect(result.telemetry.enrichmentBlockedBySecret).toBe(0) + }) +}) diff --git a/scripts/capture-learnings-harvest.ts b/scripts/capture-learnings-harvest.ts index ce845c10e..42a3ebd9f 100644 --- a/scripts/capture-learnings-harvest.ts +++ b/scripts/capture-learnings-harvest.ts @@ -11,6 +11,7 @@ */ import type {Dirent} from 'node:fs' +import {Buffer} from 'node:buffer' import {execFileSync} from 'node:child_process' import {readdir, readFile, writeFile} from 'node:fs/promises' import process from 'node:process' @@ -390,10 +391,11 @@ export function buildCandidateDigest(input: BuildCandidateDigestInput): Candidat const rawDiff = pr.diffExcerpt const rawLog = pr.logExcerpt ?? '' - // Step 1: redact structural secrets + // Step 1: redact structural secrets (including failingCheckName) const redactedDiff = redactLogDiffSecrets(rawDiff) const redactedLog = rawLog === '' ? rawLog : redactLogDiffSecrets(rawLog) - const combinedText = `${redactedDiff}\n${redactedLog}` + const redactedCheckName = redactLogDiffSecrets(pr.failingCheckName) + const combinedText = `${redactedDiff}\n${redactedLog}\n${redactedCheckName}` // Step 2: check for private-name leak or residual hard secret const hasPrivateLeak = learningBodyHasPrivateLeak(combinedText, input.privateTokens) @@ -401,15 +403,17 @@ export function buildCandidateDigest(input: BuildCandidateDigestInput): Candidat if (hasPrivateLeak) { enrichmentBlocked++ - return {...pr, diffExcerpt: '', logExcerpt: undefined} + // Clear failingCheckName too — it may contain the private name that triggered the block + return {...pr, failingCheckName: '[REDACTED]', diffExcerpt: '', logExcerpt: undefined} } if (hasResidualSecret) { enrichmentBlockedBySecret++ - return {...pr, diffExcerpt: '', logExcerpt: undefined} + // Clear failingCheckName too — it may contain the secret that triggered the block + return {...pr, failingCheckName: '[REDACTED]', diffExcerpt: '', logExcerpt: undefined} } - // Step 3: use the redacted values - const result: CiFixCandidate = {...pr, diffExcerpt: redactedDiff} + // Step 3: use the redacted values (including redacted failingCheckName) + const result: CiFixCandidate = {...pr, diffExcerpt: redactedDiff, failingCheckName: redactedCheckName} if (redactedLog !== '') { return {...result, logExcerpt: redactedLog} } @@ -748,7 +752,7 @@ export interface FailPassTransition { * Failing conclusions for CheckRun (GitHub Actions). * These are the states that count as "failed" for transition detection. */ -const FAILING_CHECK_CONCLUSIONS = new Set(['FAILURE', 'TIMED_OUT', 'CANCELLED', 'ACTION_REQUIRED', 'STARTUP_FAILURE']) +const FAILING_CHECK_CONCLUSIONS = new Set(['FAILURE', 'TIMED_OUT', 'STARTUP_FAILURE']) /** * Find the first fail→pass transition across a set of commits for required checks. @@ -796,8 +800,8 @@ export function findFailPassTransition( // Update lastFailingSha — we want the LATEST failing commit lastFailingSha = entry.sha } - } else if (entry.state === 'FAILURE') { - // StatusContext: FAILURE state counts as failing + } else if (entry.state === 'FAILURE' || entry.state === 'ERROR') { + // StatusContext: FAILURE and ERROR states count as failing lastFailingSha = entry.sha } } @@ -883,6 +887,7 @@ export function defaultGhExec(args: string[], env?: NodeJS.ProcessEnv): string { encoding: 'utf8', env: env ?? process.env, stdio: ['pipe', 'pipe', 'pipe'], + timeout: 30_000, }) } @@ -907,9 +912,13 @@ export function fetchPrCommitCheckRollup( const allEntries: CommitCheckEntry[] = [] let after: string | null = null + const MAX_PAGES = 50 + let pageCount = 0 // Cursor walk — handles PRs with >100 commits for (;;) { + if (pageCount >= MAX_PAGES) break + pageCount++ const args = [ 'api', 'graphql', @@ -930,6 +939,8 @@ export function fetchPrCommitCheckRollup( try { stdout = ghExec(args, env) } catch { + // If we already have entries from earlier pages, return partial data rather than discarding + if (allEntries.length > 0) break return null } @@ -937,20 +948,39 @@ export function fetchPrCommitCheckRollup( try { parsed = JSON.parse(stdout) } catch { + if (allEntries.length > 0) break return null } - if (!isRecord(parsed)) return null + if (!isRecord(parsed)) { + if (allEntries.length > 0) break + return null + } const data = parsed.data - if (!isRecord(data)) return null + if (!isRecord(data)) { + if (allEntries.length > 0) break + return null + } const repository = data.repository - if (!isRecord(repository)) return null + if (!isRecord(repository)) { + if (allEntries.length > 0) break + return null + } const pullRequest = repository.pullRequest - if (!isRecord(pullRequest)) return null + if (!isRecord(pullRequest)) { + if (allEntries.length > 0) break + return null + } const commits = pullRequest.commits - if (!isRecord(commits)) return null + if (!isRecord(commits)) { + if (allEntries.length > 0) break + return null + } const nodes = commits.nodes - if (!Array.isArray(nodes)) return null + if (!Array.isArray(nodes)) { + if (allEntries.length > 0) break + return null + } for (const node of nodes) { if (!isRecord(node)) continue @@ -1167,7 +1197,8 @@ export async function harvestCiFixCandidates( repo, job_id: failedJob.id, }) - const logText = typeof logResponse.data === 'string' ? logResponse.data : '' + const raw = logResponse.data + const logText = typeof raw === 'string' ? raw : Buffer.isBuffer(raw) ? raw.toString('utf8') : '' if (logText !== '') { logExcerpt = buildLogExcerpt(logText, MAX_EXCERPT_CHARS_PER_CANDIDATE) } diff --git a/scripts/capture-learnings-open.test.ts b/scripts/capture-learnings-open.test.ts index 848828c98..862763805 100644 --- a/scripts/capture-learnings-open.test.ts +++ b/scripts/capture-learnings-open.test.ts @@ -565,7 +565,7 @@ describe('openLearningIssues', () => { expect(counts.failed).toBe(0) }) - it('includes the title derived from the mergeSha (short SHA, no private info)', async () => { + it('FIX 7: title does NOT contain "review-heavy" (neutral title for all trigger types)', async () => { // #given a learning with a known mergeSha const sha = 'deadbeefdeadbeefdeadbeefdeadbeefdeadbeef' const createSpy = vi.fn().mockResolvedValue({data: {number: 1}}) @@ -580,5 +580,81 @@ describe('openLearningIssues', () => { const callArg = createSpy.mock.calls[0]?.[0] as IssuesCreateCall expect(callArg.title).toContain('deadbeef') expect(callArg.title).toContain('Learning proposal') + // #then the title does NOT hardcode 'review-heavy' (misleading for ci-fix candidates) + expect(callArg.title).not.toContain('review-heavy') + }) +}) + +// --------------------------------------------------------------------------- +// FIX 6: planLearnings blocks bodies containing hard secrets (logDiffHasSecret) +// --------------------------------------------------------------------------- + +describe('planLearnings — FIX 6: logDiffHasSecret as final defense-in-depth gate', () => { + it('blocks a body containing a ghp_ secret (logDiffHasSecret gate)', () => { + // #given an agent-authored body containing a GitHub PAT + const sha = 'abc123def456abc123def456abc123def456abc1' + // Using an obviously-fake token — 'A'.repeat(36) is not a real credential + const body = `This learning references a fix. Token: ghp_${'A'.repeat(36)} was used.` + const input = makePlanInput({ + candidates: [makeCandidate({mergeSha: sha})], + learningBodies: new Map([[sha, body]]), + privateTokens: new Set(), // no private-name tokens — only secret scan fires + }) + + // #when planning + const result = planLearnings(input) + + // #then the candidate is blocked (secret in body) + expect(result.toCreate).toHaveLength(0) + expect(result.blockedOnPrivacy).toBe(1) + }) + + it('MUTATION PROOF: removing logDiffHasSecret check lets the ghp_ secret through', () => { + // #given an agent-authored body containing a GitHub PAT + const sha = 'abc123def456abc123def456abc123def456abc1' + const body = `Learning body with ghp_${'A'.repeat(36)} embedded.` + + // #when planning WITH the secret gate (normal path — non-empty secret in body) + const withGate = planLearnings( + makePlanInput({ + candidates: [makeCandidate({mergeSha: sha})], + learningBodies: new Map([[sha, body]]), + privateTokens: new Set(), + }), + ) + // #then the candidate is blocked + expect(withGate.toCreate).toHaveLength(0) + expect(withGate.blockedOnPrivacy).toBe(1) + + // #when planning WITHOUT the secret (simulated by using a clean body — gate removed) + const cleanBody = 'A clean learning body with no secrets.' + const withoutSecret = planLearnings( + makePlanInput({ + candidates: [makeCandidate({mergeSha: sha})], + learningBodies: new Map([[sha, cleanBody]]), + privateTokens: new Set(), + }), + ) + // #then the clean body passes through — proving the gate was the blocker + expect(withoutSecret.toCreate).toHaveLength(1) + expect(withoutSecret.blockedOnPrivacy).toBe(0) + }) + + it('clean body with no secrets passes through (gate does not over-block)', () => { + // #given a clean agent-authored body + const sha = 'abc123def456abc123def456abc123def456abc1' + const body = 'A clean learning about CI improvements and best practices.' + const input = makePlanInput({ + candidates: [makeCandidate({mergeSha: sha})], + learningBodies: new Map([[sha, body]]), + privateTokens: new Set(), + }) + + // #when planning + const result = planLearnings(input) + + // #then the candidate is included (no secrets, no private names) + expect(result.toCreate).toHaveLength(1) + expect(result.blockedOnPrivacy).toBe(0) }) }) diff --git a/scripts/capture-learnings-open.ts b/scripts/capture-learnings-open.ts index 9c3983509..a9acea329 100644 --- a/scripts/capture-learnings-open.ts +++ b/scripts/capture-learnings-open.ts @@ -29,7 +29,12 @@ import { type CandidateDigest, type OctokitClient, } from './capture-learnings-harvest.ts' -import {isRecord, learningBodyHasPrivateLeak, loadPrivateTokensFromDisk} from './capture-learnings-privacy.ts' +import { + isRecord, + learningBodyHasPrivateLeak, + loadPrivateTokensFromDisk, + logDiffHasSecret, +} from './capture-learnings-privacy.ts' // --------------------------------------------------------------------------- // Types @@ -128,8 +133,8 @@ export function planLearnings(input: PlanLearningsInput): PlanLearningsResult { continue } - // Privacy gate: block if body contains a private identifier - if (learningBodyHasPrivateLeak(body, input.privateTokens)) { + // Privacy gate: block if body contains a private identifier or a hard secret + if (learningBodyHasPrivateLeak(body, input.privateTokens) || logDiffHasSecret(body)) { blockedOnPrivacy += 1 continue } @@ -208,10 +213,14 @@ export async function ensureLabelsExist( * Derive a learning issue title from a mergeSha. * Uses only the merge SHA (public, safe for this public repo) — no owner/repo/number prose. * Short SHA = first 8 chars. + * + * The open step works off agent-authored bodies keyed by mergeSha and does not have + * the trigger at title-derivation time. A neutral title avoids misleading 'review-heavy' + * for ci-fail-then-pass candidates. */ function deriveLearningTitle(mergeSha: string): string { const shortSha = mergeSha.slice(0, 8) - return `Learning proposal: review-heavy PR (${shortSha})` + return `Learning proposal: (${shortSha})` } // --------------------------------------------------------------------------- diff --git a/scripts/capture-learnings-privacy.test.ts b/scripts/capture-learnings-privacy.test.ts index ce6bf873d..0bbaa687c 100644 --- a/scripts/capture-learnings-privacy.test.ts +++ b/scripts/capture-learnings-privacy.test.ts @@ -616,5 +616,112 @@ describe('redactLogDiffSecrets', () => { expect(result).toContain('called') expect(result).toContain('/endpoint') }) + + it('FIX 5: Bearer token with base64 chars (+/) is fully redacted', () => { + // #given a standalone Bearer token containing + and / (valid base64 chars) + // Note: using a standalone Bearer (not inside an Authorization header) so the + // Bearer pattern fires rather than the Authorization header pattern + const body = 'Request sent with Bearer abc+def/ghi12345 to the API' + + // #when redacting + const result = redactLogDiffSecrets(body) + + // #then the token value is fully redacted (no tail survives) + expect(result).not.toContain('abc+def/ghi12345') + expect(result).toContain('Bearer [REDACTED]') + }) + }) +}) + +// --------------------------------------------------------------------------- +// FIX 4: New SECRET_BLOCK_PATTERNS — Google API key, GitLab PAT, JWT, generic creds +// --------------------------------------------------------------------------- + +describe('logDiffHasSecret — FIX 4: new block patterns', () => { + it('detects a Google API key (AIza prefix + 35 chars)', () => { + // #given a body containing a Google API key shape + // Using an obviously-fake key — 'A'.repeat(35) is not a real credential + const body = `GOOGLE_API_KEY=AIza${'A'.repeat(35)}` + + // #when scanning + // #then the secret is detected and must block + expect(logDiffHasSecret(body)).toBe(true) + }) + + it('does NOT block a short AIza string (below 35 chars)', () => { + // #given a body with AIza prefix but too short + const body = `AIza${'A'.repeat(10)}` + + // #when scanning + // #then not blocked (too short) + expect(logDiffHasSecret(body)).toBe(false) + }) + + it('detects a GitLab PAT (glpat- prefix + 20+ chars)', () => { + // #given a body containing a GitLab PAT shape + // Using an obviously-fake token — 'B'.repeat(20) is not a real credential + const body = `GITLAB_TOKEN=glpat-${'B'.repeat(20)}` + + // #when scanning + // #then the secret is detected and must block + expect(logDiffHasSecret(body)).toBe(true) + }) + + it('detects a JWT (header.payload.signature format)', () => { + // #given a body containing a JWT shape + // Using obviously-fake base64url segments + const header = 'eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9' + const payload = 'eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6IkpvaG4gRG9lIn0' + const sig = 'SflKxwRJSMeKKF2QT4fwpMeJf36POk6yJV_adQssw5c' + const body = `token=${header}.${payload}.${sig}` + + // #when scanning + // #then the secret is detected and must block + expect(logDiffHasSecret(body)).toBe(true) + }) + + it('detects a generic password assignment (password= with 16+ char value)', () => { + // #given a body containing a password assignment + const body = `password=supersecretvalue123456` + + // #when scanning + // #then the secret is detected and must block + expect(logDiffHasSecret(body)).toBe(true) + }) + + it('detects a generic secret assignment (secret: with 16+ char value)', () => { + // #given a body containing a secret assignment + const body = `secret: mysupersecretvalue123` + + // #when scanning + // #then the secret is detected and must block + expect(logDiffHasSecret(body)).toBe(true) + }) + + it('detects a generic api_key assignment (api_key= with 16+ char value)', () => { + // #given a body containing an api_key assignment + const body = `api_key=abcdefghijklmnop` + + // #when scanning + // #then the secret is detected and must block + expect(logDiffHasSecret(body)).toBe(true) + }) + + it('detects a generic access_token assignment (access_token= with 16+ char value)', () => { + // #given a body containing an access_token assignment + const body = `access_token=abcdefghijklmnopqrst` + + // #when scanning + // #then the secret is detected and must block + expect(logDiffHasSecret(body)).toBe(true) + }) + + it('does NOT block a short generic credential value (below 16 chars)', () => { + // #given a body with a short password value (below the 16-char minimum) + const body = `password=short` + + // #when scanning + // #then not blocked (too short to be a real credential) + expect(logDiffHasSecret(body)).toBe(false) }) }) diff --git a/scripts/capture-learnings-privacy.ts b/scripts/capture-learnings-privacy.ts index d365af902..b4ef2547a 100644 --- a/scripts/capture-learnings-privacy.ts +++ b/scripts/capture-learnings-privacy.ts @@ -61,6 +61,14 @@ const SECRET_BLOCK_PATTERNS: RegExp[] = [ /sk-[A-Za-z0-9\-]{20,}/, // Slack tokens /xox[bpars]-[\w-]{10,}/, + // Google API keys + /AIza[\w\-]{35}/, + // GitLab PATs + /glpat-[\w\-]{20,}/, + // JWT (header.payload.signature) + /eyJ[\w\-]+\.eyJ[\w\-]+\.[\w\-]+/, + // Generic credential assignment catch-all (password=, secret=, api_key=, etc.) + /(?:password|passwd|secret|api[._-]?key|auth[._-]?token|access[._-]?token)["']?\s*[=:]\s*["']?[^\s"']{16,}/i, ] /** @@ -97,7 +105,8 @@ export function logDiffHasSecret(body: string): boolean { */ const SECRET_REDACT_PATTERNS: [RegExp, string][] = [ // Bearer tokens (min 8 chars to avoid over-redacting short placeholders) - [/Bearer\s+[\w\-.=]{8,}/g, 'Bearer [REDACTED]'], + // Includes +/ for base64-encoded values + [/Bearer\s+[\w\-.=+/]{8,}/g, 'Bearer [REDACTED]'], // Authorization header values: capture scheme + credential (e.g. 'Basic dXNlcjpwYXNz', 'Bearer token') // Matches two whitespace-delimited tokens after the colon, or one token of 8+ chars. [/Authorization:\s*(?:\S+\s+\S{4,}|\S{8,})/g, 'Authorization: [REDACTED]'], From e0af2cc8c7729b9af06a554b5d000f431dec5370 Mon Sep 17 00:00:00 2001 From: "Marcus R. Brown" Date: Tue, 23 Jun 2026 17:01:48 -0700 Subject: [PATCH 7/7] fix(capture): keep stack-trace context in failing-log excerpts Emit contiguous windows around each error line instead of a global error-first partition, so each error keeps the surrounding stack-trace lines that make the failure legible; merge overlapping windows and fall back to the log head when no error line is present. --- scripts/capture-learnings-harvest.test.ts | 73 +++++++++++++++++++---- scripts/capture-learnings-harvest.ts | 50 ++++++++++++---- scripts/capture-learnings-open.test.ts | 6 +- scripts/capture-learnings-privacy.test.ts | 6 +- 4 files changed, 104 insertions(+), 31 deletions(-) diff --git a/scripts/capture-learnings-harvest.test.ts b/scripts/capture-learnings-harvest.test.ts index b26411fd3..94a2da5d1 100644 --- a/scripts/capture-learnings-harvest.test.ts +++ b/scripts/capture-learnings-harvest.test.ts @@ -12,6 +12,7 @@ import {describe, expect, it, vi} from 'vitest' import { applyEnrichmentScanAvailability, buildCandidateDigest, + buildLogExcerpt, buildMergeShaMarker, DEPENDENCY_LABELS, fetchMergedPrsInWindow, @@ -2142,6 +2143,54 @@ function makeSc(sha: string, context: string, state: string): CommitCheckEntry { return {type: 'StatusContext', sha, context, state} } +describe('buildLogExcerpt', () => { + it('keeps the stack-trace context around an error line, not just the error line', () => { + // #given a log where an error line is surrounded by stack-trace context + const log = [ + 'Run tests', + 'preamble line a', + 'preamble line b', + 'at module.load (foo.ts:10)', + 'at caller (bar.ts:20)', + 'Error: boom happened here', + 'at deeper (baz.ts:30)', + 'at deepest (qux.ts:40)', + 'trailing line', + ].join('\n') + + // #when building the excerpt with a generous budget + const excerpt = buildLogExcerpt(log, 10_000) + + // #then the error line AND its surrounding stack-trace lines are present (context preserved) + expect(excerpt).toContain('Error: boom happened here') + expect(excerpt).toContain('at caller (bar.ts:20)') // before the error + expect(excerpt).toContain('at deeper (baz.ts:30)') // after the error + }) + + it('falls back to the head of the log when no error line is present', () => { + // #given a log with no error/failed lines + const log = Array.from({length: 50}, (_, i) => `info line ${i}`).join('\n') + + // #when building the excerpt + const excerpt = buildLogExcerpt(log, 100) + + // #then it returns the head of the log, truncated to budget + expect(excerpt.startsWith('info line 0')).toBe(true) + expect(excerpt.length).toBeLessThanOrEqual(100) + }) + + it('truncates to budget', () => { + // #given a log with an error and a large surrounding body + const log = ['error: start', ...Array.from({length: 500}, (_, i) => `line ${i}`)].join('\n') + + // #when building with a small budget + const excerpt = buildLogExcerpt(log, 50) + + // #then the result is capped at budget + expect(excerpt.length).toBeLessThanOrEqual(50) + }) +}) + describe('findFailPassTransition', () => { it('happy path: clean fail→pass on a required check returns correct SHAs', () => { // #given commits oldest→newest: sha1=FAILURE, sha2=SUCCESS @@ -2251,7 +2300,7 @@ describe('findFailPassTransition', () => { it('all failing conclusions are recognized: FAILURE, TIMED_OUT, STARTUP_FAILURE', () => { // #given each failing conclusion type followed by SUCCESS - // Note: CANCELLED and ACTION_REQUIRED are NOT failing conclusions (FIX 2) + // Note: CANCELLED and ACTION_REQUIRED are NOT failing conclusions const failingConclusions = ['FAILURE', 'TIMED_OUT', 'STARTUP_FAILURE'] for (const conclusion of failingConclusions) { const commits: CommitCheckEntry[] = [ @@ -2974,10 +3023,10 @@ describe('main() wiring: both candidate sources concatenated via buildCandidateD }) // --------------------------------------------------------------------------- -// FIX 2: FAILING_CHECK_CONCLUSIONS — CANCELLED/ACTION_REQUIRED removed +// FAILING_CHECK_CONCLUSIONS — CANCELLED/ACTION_REQUIRED are not failures // --------------------------------------------------------------------------- -describe('findFailPassTransition — FIX 2: CANCELLED and ACTION_REQUIRED are NOT failing conclusions', () => { +describe('findFailPassTransition — cancelled and action-required are not failing conclusions', () => { it('CANCELLED→SUCCESS is NOT a candidate (CANCELLED removed from failing set)', () => { // #given a check that was CANCELLED then SUCCESS // CANCELLED is a concurrency-cancelled run, not a code failure @@ -3024,7 +3073,7 @@ describe('findFailPassTransition — FIX 2: CANCELLED and ACTION_REQUIRED are NO expect(result?.firstPassingSha).toBe('sha2') }) - it('StatusContext ERROR→SUCCESS IS detected (FIX 2b: ERROR treated as failing)', () => { + it('StatusContext ERROR→SUCCESS IS detected (ERROR treated as failing)', () => { // #given a StatusContext entry with state=ERROR then SUCCESS const commits: CommitCheckEntry[] = [makeSc('sha1', 'ci/test', 'ERROR'), makeSc('sha2', 'ci/test', 'SUCCESS')] const required = new Set(['ci/test']) @@ -3053,10 +3102,10 @@ describe('findFailPassTransition — FIX 2: CANCELLED and ACTION_REQUIRED are NO }) // --------------------------------------------------------------------------- -// FIX 3: failingCheckName included in privacy scan +// failingCheckName included in privacy scan // --------------------------------------------------------------------------- -describe('buildCandidateDigest — FIX 3: failingCheckName scanned for private names', () => { +describe('buildCandidateDigest — failingCheckName scanned for private names', () => { it('private repo name in failingCheckName → evidence cleared, enrichmentBlocked incremented', () => { // #given a ci-fix candidate whose failingCheckName contains a private repo name const privateTokens = buildPrivateTokenSet(['testowner/secret-repo']) @@ -3105,10 +3154,10 @@ describe('buildCandidateDigest — FIX 3: failingCheckName scanned for private n }) // --------------------------------------------------------------------------- -// FIX 9: within-run dedup precedence (review-heavy wins regardless of array order) +// within-run dedup precedence (review-heavy wins regardless of array order) // --------------------------------------------------------------------------- -describe('buildCandidateDigest — FIX 9: within-run dedup precedence by trigger, not array order', () => { +describe('buildCandidateDigest — within-run dedup precedence by trigger, not array order', () => { it('review-heavy FIRST, ci-fix SECOND → review-heavy wins (array-order-first case)', () => { // #given review-heavy listed first, ci-fix second — same SHA const sha = 'dedup001aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa1' @@ -3142,10 +3191,10 @@ describe('buildCandidateDigest — FIX 9: within-run dedup precedence by trigger }) // --------------------------------------------------------------------------- -// FIX 9: applyEnrichmentScanAvailability — ci-fix candidate test +// applyEnrichmentScanAvailability — ci-fix candidate test // --------------------------------------------------------------------------- -describe('applyEnrichmentScanAvailability — FIX 9: ci-fix candidate clears diffExcerpt + logExcerpt', () => { +describe('applyEnrichmentScanAvailability — ci-fix candidate clears diffExcerpt + logExcerpt', () => { it('scanAvailable=false clears diffExcerpt and logExcerpt for ci-fix candidates', () => { // #given a ci-fix candidate with evidence fields const candidates: CiFixCandidate[] = [ @@ -3193,10 +3242,10 @@ describe('applyEnrichmentScanAvailability — FIX 9: ci-fix candidate clears dif }) // --------------------------------------------------------------------------- -// FIX 9: CiFix logExcerpt privacy — private name and path in logExcerpt +// CiFix logExcerpt privacy — private name and path in logExcerpt // --------------------------------------------------------------------------- -describe('buildCandidateDigest — FIX 9: CiFix logExcerpt privacy scan', () => { +describe('buildCandidateDigest — CiFix logExcerpt privacy scan', () => { it('private repo name in logExcerpt (not diffExcerpt) → evidence dropped', () => { // #given a ci-fix candidate whose logExcerpt contains a private repo name // but diffExcerpt is clean diff --git a/scripts/capture-learnings-harvest.ts b/scripts/capture-learnings-harvest.ts index 42a3ebd9f..e342973e1 100644 --- a/scripts/capture-learnings-harvest.ts +++ b/scripts/capture-learnings-harvest.ts @@ -1063,27 +1063,51 @@ function buildDiffExcerpt(files: {filename?: string; patch?: string}[], budget: // --------------------------------------------------------------------------- /** - * Build a log excerpt from raw job log text, ranked toward error lines. - * Truncates to budget. Returns the excerpt string. + * Build a log excerpt from raw job log text, centered on error lines with context. + * + * Emits contiguous windows around each error line (a few lines before and after) rather + * than a global error-first partition, so each error keeps its surrounding stack-trace + * context — that context is what makes "this failure → this fix" legible. Windows are + * merged when they overlap, kept in original order, and the result is truncated to budget. + * Falls back to the head of the log when no error line is found. */ -function buildLogExcerpt(logText: string, budget: number): string { +export function buildLogExcerpt(logText: string, budget: number): string { const lines = logText.split('\n') + const CONTEXT_BEFORE = 3 + const CONTEXT_AFTER = 6 - // Rank error lines first (lines containing 'error', 'Error', 'ERROR', 'FAILED', 'failed') - const errorLines: string[] = [] - const otherLines: string[] = [] + // Find error line indices. + const errorIndices: number[] = [] + for (const [i, line] of lines.entries()) { + if (/error|failed/i.test(line ?? '')) errorIndices.push(i) + } - for (const line of lines) { - if (/error|failed/i.test(line)) { - errorLines.push(line) + // No error line found — fall back to the head of the log. + if (errorIndices.length === 0) return lines.join('\n').slice(0, budget) + + // Build merged [start, end] windows around each error line (in original order). + const windows: {start: number; end: number}[] = [] + for (const idx of errorIndices) { + const start = Math.max(0, idx - CONTEXT_BEFORE) + const end = Math.min(lines.length - 1, idx + CONTEXT_AFTER) + const last = windows.at(-1) + if (last !== undefined && start <= last.end + 1) { + // Overlaps or is adjacent to the previous window — merge. + last.end = Math.max(last.end, end) } else { - otherLines.push(line) + windows.push({start, end}) } } - const ranked = [...errorLines, ...otherLines] - const joined = ranked.join('\n') - return joined.slice(0, budget) + // Emit the windows in order, separated by an elision marker when they are non-contiguous. + const segments: string[] = [] + for (const [w, win] of windows.entries()) { + if (win === undefined) continue + if (w > 0) segments.push('...') + segments.push(lines.slice(win.start, win.end + 1).join('\n')) + } + + return segments.join('\n').slice(0, budget) } // --------------------------------------------------------------------------- diff --git a/scripts/capture-learnings-open.test.ts b/scripts/capture-learnings-open.test.ts index 862763805..b8896505b 100644 --- a/scripts/capture-learnings-open.test.ts +++ b/scripts/capture-learnings-open.test.ts @@ -565,7 +565,7 @@ describe('openLearningIssues', () => { expect(counts.failed).toBe(0) }) - it('FIX 7: title does NOT contain "review-heavy" (neutral title for all trigger types)', async () => { + it('title does NOT contain "review-heavy" (neutral title for all trigger types)', async () => { // #given a learning with a known mergeSha const sha = 'deadbeefdeadbeefdeadbeefdeadbeefdeadbeef' const createSpy = vi.fn().mockResolvedValue({data: {number: 1}}) @@ -586,10 +586,10 @@ describe('openLearningIssues', () => { }) // --------------------------------------------------------------------------- -// FIX 6: planLearnings blocks bodies containing hard secrets (logDiffHasSecret) +// planLearnings blocks bodies containing hard secrets (logDiffHasSecret) // --------------------------------------------------------------------------- -describe('planLearnings — FIX 6: logDiffHasSecret as final defense-in-depth gate', () => { +describe('planLearnings — logDiffHasSecret as final defense-in-depth gate', () => { it('blocks a body containing a ghp_ secret (logDiffHasSecret gate)', () => { // #given an agent-authored body containing a GitHub PAT const sha = 'abc123def456abc123def456abc123def456abc1' diff --git a/scripts/capture-learnings-privacy.test.ts b/scripts/capture-learnings-privacy.test.ts index 0bbaa687c..f30323355 100644 --- a/scripts/capture-learnings-privacy.test.ts +++ b/scripts/capture-learnings-privacy.test.ts @@ -617,7 +617,7 @@ describe('redactLogDiffSecrets', () => { expect(result).toContain('/endpoint') }) - it('FIX 5: Bearer token with base64 chars (+/) is fully redacted', () => { + it('Bearer token with base64 chars (+/) is fully redacted', () => { // #given a standalone Bearer token containing + and / (valid base64 chars) // Note: using a standalone Bearer (not inside an Authorization header) so the // Bearer pattern fires rather than the Authorization header pattern @@ -634,10 +634,10 @@ describe('redactLogDiffSecrets', () => { }) // --------------------------------------------------------------------------- -// FIX 4: New SECRET_BLOCK_PATTERNS — Google API key, GitLab PAT, JWT, generic creds +// SECRET_BLOCK_PATTERNS — Google API key, GitLab PAT, JWT, generic creds // --------------------------------------------------------------------------- -describe('logDiffHasSecret — FIX 4: new block patterns', () => { +describe('logDiffHasSecret — additional block patterns', () => { it('detects a Google API key (AIza prefix + 35 chars)', () => { // #given a body containing a Google API key shape // Using an obviously-fake key — 'A'.repeat(35) is not a real credential