
Final tally for PR #20: **5 Bugbot findings across 4 review cycles**. **4 closed by Cursor Agent autofix** (1 High, 2 Medium, 1 Low). **1 closed by manual fix** (Low). The High finding (undeclared `ctx` in `searchByEntity`) is the class of bug that TypeScript would normally catch — but esbuild's transpile-only pipeline silently compiles free-variable references, so the type system was not the guard here; adversarial review was. The J-003→J-004 meta-lesson repeats: type contract + adversarial review is the durable bar; either one alone is insufficient. When the type system itself is disabled in the build pipeline (as with esbuild transpile-only), adversarial review becomes the only line of defense between compile and runtime, which is precisely the condition that made #1 a High severity.
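
A minimal sketch of the bug class named above (illustrative code, not the real `searchByEntity`): a free-variable reference that esbuild's transpile-only pipeline happily emits, surfacing only as a runtime `ReferenceError`, whereas `tsc --noEmit` would reject it at build time.

```typescript
// Hypothetical reduction of the undeclared-`ctx` class of bug.
// esbuild in transpile-only mode strips types without checking them,
// so this compiles; the failure moves from build time to first call.
function searchByEntitySketch(entity: string): string {
  // @ts-expect-error -- `ctx` is never declared; a type check would flag it
  ctx.waitUntil(Promise.resolve(entity));
  return entity;
}

let failure: unknown = null;
try {
  searchByEntitySketch("person:Obadiah");
} catch (e) {
  failure = e; // ReferenceError: ctx is not defined -- at runtime, not build
}
console.log(failure instanceof ReferenceError); // → true
```

Running this through plain `tsc` fails the build; running the transpile-only output throws only when the code path executes, which is exactly why adversarial review was the guard here.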

---

### J-006 — H11b validated in production; the principle holds

**Observation:** PR #20 (H11b) merged at 2026-04-24T02:42:39Z. Workers Builds completed the production redeploy at 02:43Z. Live validation probe at 02:43–02:44Z against `aquifer.klappy.dev/mcp` exercised the cold-first-visit → background-warm → warm-second-visit flow for two known-cold entities (`person:Obadiah`, `person:Onesimus`) plus a spot-check of the previously-warmed `person:Paul`.

First visit to `person:Obadiah` returned in **985 ms** with `partial=True` and the disclosure note `"0/33 indexed, 33 warming"`. The trace header showed every per-resource `entities.json` read returning `miss` — fan-out confirmed no entity indexes existed for the current composite SHA. `ctx.waitUntil(warmEntityIndexesForResources(33 missing resources, ...))` was scheduled. Pre-H11b baseline for the same call (pre-deploy probe at 01:11Z) had returned **HTTP 502 at 11.2 s** — Cloudflare's frontend cut the response stream while the Worker was still running the inline bootstrap scan.
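
The serve-partial-then-warm shape described above can be sketched in a few lines (names and types are illustrative, not the actual Worker code): return whatever the fan-out found, disclose the gap, and hand the warm to `ctx.waitUntil` so it runs after the response is sent.

```typescript
// Hedged sketch, assuming a fan-out that reports per-resource index state.
type FanOutResult = { resource: string; indexed: boolean };

interface WarmCtx {
  // Shape of Cloudflare Workers' ExecutionContext.waitUntil: extends the
  // invocation past the response so background work can finish.
  waitUntil(promise: Promise<unknown>): void;
}

function servePartial(
  ctx: WarmCtx,
  results: FanOutResult[],
  warm: (missing: string[]) => Promise<void>
): { partial: boolean; note: string } {
  const missing = results.filter(r => !r.indexed).map(r => r.resource);
  if (missing.length > 0) {
    ctx.waitUntil(warm(missing)); // background warm; the user never waits on it
  }
  const indexed = results.length - missing.length;
  return {
    partial: missing.length > 0,
    note: `${indexed}/${results.length} indexed, ${missing.length} warming`,
  };
}
```

On a fully cold visit this produces `partial: true` with a note like `"0/33 indexed, 33 warming"`, matching the disclosure text observed in the probe.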

Second visit 20 seconds later returned in **1094 ms** with 78 matches and `partial=True, "28/33 indexed, 5 warming"`. Trace header showed per-resource entity-index reads now returning `cache` hits — the background warm from the first visit had populated 28 of 33 per-resource entity indexes during the 20-second interval. The remaining 5 either take longer to scan (larger content-file counts) or the `ctx.waitUntil` budget on the first visit's invocation expired before those finished. The self-healing property held: next query will re-trigger warms for the remaining 5.

Third visit (`person:Onesimus`, a distinct entity) returned `partial=False` with 73 complete matches in 1347 ms — because the prior warm for Obadiah wrote full per-resource entity maps (every entity association in every article), not Obadiah-only data. Any entity present in the articles of the 28 warmed resources is now servable complete, including ones never queried. This is the intended amortization behavior: the cost of warming one resource benefits every future entity lookup touching that resource.
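
The amortization property can be shown with a toy index (hypothetical shapes; the real per-resource entity maps live in R2): warming a resource writes its complete entity-to-articles map, so a later lookup for a never-queried entity in that resource is already complete.

```typescript
// Illustrative in-memory stand-in for the warmed per-resource entity indexes.
type EntityMap = Map<string, string[]>; // entity id -> article ids

const warmedIndexes = new Map<string, EntityMap>(); // resource -> entity map

// Warming scans every article once and records *every* entity association,
// not just the entity that triggered the warm.
function warmResource(resource: string, articles: Record<string, string[]>): void {
  const map: EntityMap = new Map();
  for (const [articleId, entities] of Object.entries(articles)) {
    for (const entity of entities) {
      const hits = map.get(entity) ?? [];
      hits.push(articleId);
      map.set(entity, hits);
    }
  }
  warmedIndexes.set(resource, map);
}

// Complete result if the resource is warm, null if still cold.
function lookupEntity(resource: string, entity: string): string[] | null {
  return warmedIndexes.get(resource)?.get(entity) ?? null;
}
```

A warm triggered by an Obadiah query also makes `person:Onesimus` servable complete from the same resource, which is the behavior the third probe validated.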

Workers Logs aggregate query over 02:42Z–02:48Z: 6 invocations, 100% outcome=success, 0 errors, 0 `exceededMemory`, 0 `exceededCpu`, 0 exceptions. Pre-H11b invocations of the same entities produced the 255-subrequest OOM signature (J-002) and/or Cloudflare 502 (pre-deploy probe). Both failure modes absent from post-merge logs.

**Learning:** The operator-stated principle — *"partial data with transparency to come back for more; background fetch warms the cache before you come back"* — works precisely as named. The first visit's 985 ms wall clock is the fan-out latency (33 parallel R2 reads of small blobs, most returning null); that's the floor of what cold-cache entity lookup can cost when no per-resource index exists. Subsequent visits share the warming cost across all entities in touched resources — the amortization is substantially better than per-entity bootstrap caching would have produced, because one warm benefits N entities and not just the queried one. The partial-note disclosure text (`"N/M indexed, K warming"`) is sufficiently concrete that the user has a bounded expectation for when to retry; the second visit's note (`"28/33 indexed, 5 warming"`) proves the mechanism visibly, not just in theory.

The H11→H11b arc is itself a lesson in distinguishing "the corpus scan is the problem" from "the corpus scan being on the user-blocking path is the problem." H11 moved the scan from the user path to the index-build path and called it solved; the scan moved, but the user-blocking property did not. It was merely hidden by the composite-SHA-staleness cache keeping the H11-era build from ever re-running. H11b removes the user-blocking property from both the build path and the query path by never running the scan in either: it runs only in the background, after a partial query has already been served. The principle is not about where expensive work runs. It is about what the user waits for.

**Decision:** Close H11, H11b, J-005 as resolved. The entity lookup path has been proven in production to serve partial-with-disclosure responses immediately and self-heal via background warm. Promote H14 (paired pattern: type contract + adversarial review) to canon — the H11b PR produced five Bugbot findings, including one High-severity crash that the type system would have caught if esbuild transpile-only weren't bypassing type checking in the build pipeline; this is the third documented recurrence of the pattern being the deciding reason for a decision, and it satisfies the canon-graduation test per the `klappy://canon/principles/cache-fetches-and-parses` method. Do not delete `bootstrapEntityMatches` / `BootstrapEntityResult` / `formatPartialBootstrapNote` yet — they are now provably unused from user paths, but the 24-hour observation window for production stability hasn't elapsed; H16 remains open for a follow-up PR after stability is confirmed.

**Constraint:** The observed 985 ms first-visit wall clock is the floor for un-indexed entity lookups. If future corpus growth pushes fan-out latency past ~3 s (i.e., if R2 read parallelism degrades at 62+ resources relative to the current 33), the partial-note path will start producing client-side timeouts even though the Worker completes cleanly. Monitor fan-out latency as the 23 unserved repos come online per the multi-language metadata probe work. The Workers Logs subrequest counts for the validation window (63 total across 6 invocations) are lower than expected if every `ctx.waitUntil` warm fired a full 33-resource scan; either the warms completed outside the 5-minute window I queried, or most of the metadata and content fetches hit R2 storage cache from prior sessions, or Cloudflare rolls up subrequest counts in a way that excludes `ctx.waitUntil` children of completed invocations. The distinction would matter only if production fan-out slowness pointed back at the warm not actually running; the second visit's 28/33 cache hits prove the warm did run, so this is a measurement-convention question, not a correctness question.

**Handoff:**
- **H15 closed** — 24-hour observation is the last safety check. Workers Logs outcome distribution remains 100% success post-H11b merge across the 6 invocations in the first ~5 minutes. Continue monitoring; re-query at 24h and 48h for confirmation of durability under organic load.
- **H16 unblocked** — delete `bootstrapEntityMatches`, `BootstrapEntityResult`, `formatPartialBootstrapNote` (and the tests that call `bootstrapEntityMatches` directly) in a standalone PR after 24 hours of clean production observation. They are confirmed dead code on the user path; the direct tests should be deleted alongside the function, since they exercise an internal that is about to be removed.
- **H19 promoted from J-005's H17** — the disclosure note text is static ("retry in a few seconds"). Observed behavior: 20-second wait produced 28/33 warmed, so "a few seconds" understates the realistic warm time for full corpus coverage. Consider dynamic wording based on `missing_resources.length` and an estimated seconds-per-resource, or a simpler "retry in N seconds" where N = missing_resources.length. Low-priority UX tuning; not a correctness issue.
- **H14 promoted** — encode "type contract + adversarial review" as a paired pattern in canon. Three recurrences documented: J-003 (BootstrapEntityResult contract + Bugbot catching implementation drift); J-005 (H11b FanOutEntityResult contract + Bugbot catching the undeclared `ctx` crash); and the generalized form observed across both — when the type system is present at author time but bypassed in the build pipeline (esbuild transpile-only), adversarial review is the sole defense between compile and runtime. Candidate canon path: `klappy://canon/principles/type-contract-plus-adversarial-review`.
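
H19's dynamic-wording idea from the list above can be sketched directly (hypothetical helper; the per-resource rate is an assumed constant, not a measured one):

```typescript
// Hypothetical sketch of a dynamic retry note for H19. Assumes roughly one
// second of warm time per missing resource -- a guess to be replaced by a
// measured seconds-per-resource rate once one exists.
const ASSUMED_SECONDS_PER_RESOURCE = 1;

function partialNote(indexed: number, total: number): string {
  const missing = total - indexed;
  if (missing === 0) return `${indexed}/${total} indexed`;
  const etaSeconds = missing * ASSUMED_SECONDS_PER_RESOURCE;
  return `${indexed}/${total} indexed, ${missing} warming; retry in ~${etaSeconds} s`;
}
```

With the observed second-visit state this yields `"28/33 indexed, 5 warming; retry in ~5 s"` instead of a static "a few seconds".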