
feat(entity): H11 — eager per-resource entity index, end user-blocking corpus scan#19

Merged
klappy merged 2 commits into main from h11-eager-entity-index
Apr 24, 2026

Conversation


@klappy klappy commented Apr 23, 2026

Why

Closes H11 from J-003. The J-002/J-003 fix bounded the OOM but didn't address the underlying bad pattern: every cold-cache entity lookup was scanning every content file in every resource on the user's blocking path. Post-merge production observation (J-004) confirmed the OOM is dead but the cold-cache lookup wall-clock is ~22s — at the edge of Cloudflare's frontend gateway tolerance, surfacing as occasional 502s for the user even when the Worker itself returned outcome=success.

The canonical fix per the project's pre-built-index principle: stop scanning the corpus at query time. Pre-build per-resource entity indexes at index-build time, exactly the way passage and title indexes are already built.

What

storage.ts — adds entityIndexKey(code, sha) mirroring passageIndexKey/titleIndexKey/articleIndexKey. Stores Array<[entityId, ArticleRef[]]> per resource at index/{code}/{sha}/entities.json.

registry.ts — three additions, no removals:

  • populateEntityIndexes(results, env, storage, repoShas) — invoked from buildIndex after the existing per-resource writes complete. Scans every JSON content file in every resource's scripture_burrito.ingredients, collects article.associations.acai references, writes per-resource entity maps to R2. Memory bounded by ENTITY_BUILD_RESOURCE_CONCURRENCY=4 × ENTITY_BUILD_FILE_CONCURRENCY=8 = max 32 in flight (same caps as J-003's bootstrap, same memory profile).
  • fanOutEntitySearch(entityId, index, storage, tracer) — post-H11 query path. Loads all per-resource entity indexes in parallel, unions matches, backfills resource_type from index.registry. Mirrors fanOutPassageSearch/fanOutTitleSearch exactly.
  • Local copy of settledInChunks helper (registry.ts cannot import from tools.ts; the tradeoff and follow-up are tracked under H12).
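The 4 × 8 = 32 in-flight bound comes from nesting two chunked loops: the outer loop admits at most 4 resources at a time, and each resource admits at most 8 file fetches at a time. A minimal sketch of that nesting (names and shapes are illustrative stand-ins, not the PR's code):

```typescript
// Run `fn` over `items` in batches of `chunkSize`, awaiting each batch
// before starting the next — memory is bounded by the chunk size.
async function settledInChunks<T, R>(
  items: readonly T[],
  chunkSize: number,
  fn: (item: T) => Promise<R>,
): Promise<PromiseSettledResult<R>[]> {
  const results: PromiseSettledResult<R>[] = [];
  for (let i = 0; i < items.length; i += chunkSize) {
    const batch = items.slice(i, i + chunkSize);
    results.push(...(await Promise.allSettled(batch.map(fn))));
  }
  return results;
}

// Outer cap of 4 resources × inner cap of 8 files per resource means at
// most 32 fetches are ever in flight, regardless of corpus size.
async function scanAll(
  resources: string[][],
  fetchFile: (file: string) => Promise<void>,
): Promise<void> {
  await settledInChunks(resources, 4, (files) =>
    settledInChunks(files, 8, fetchFile).then(() => undefined),
  );
}
```

The product of the two caps, not either cap alone, is what bounds peak memory — the same profile as J-003's bootstrap fanout.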

tools.ts — both entity callers (handleEntity, searchByEntity) now follow a 3-tier lookup:

  1. In-memory index.entity.get(normalized) (tests, warm bootstrap cache)
  2. fanOutEntitySearch (the new H11 fast path)
  3. bootstrapEntityMatches only if (2) returns empty (defensive backstop for indexes built pre-H11)

bootstrapEntityMatches and BootstrapEntityResult stay intact — both for migration safety and as a permanent fallback.

Performance prediction

| Path | Pre-H11 (post-J-003) | Post-H11 |
| --- | --- | --- |
| Warm in-memory cache | <1 ms | <1 ms (unchanged) |
| Warm R2 bootstrap cache | ~95 ms | ~95 ms (unchanged — tier-1 hits this) |
| Cold-cache entity lookup | ~22 s wall-clock for 33-resource scan | ~200-500 ms for 33 parallel R2 reads of small entity blobs |
| Index-build cost | unchanged | +~22 s on true cold start (the same scan, moved off the user-blocking path) |
| Background refresh | absorbs the index-build cost via ctx.waitUntil (existing mechanism) | unchanged |

Order-of-magnitude latency improvement on cold-cache entity lookup. No regression on any other path.

Vodka check

  • New helpers (populateEntityIndexes, fanOutEntitySearch) are generic over their data shape — no domain-specific branching by resource_type or content category.
  • entityIndexKey follows the existing per-resource key pattern.
  • Zero new if (resource_type === ...) branches anywhere in the changed code.
  • Server LOC grows ~2.5% net (+~250 lines across registry.ts and storage.ts); the entire entity tool's safety + performance budget for J-002/J-003/H11 has cost ~9% LOC growth — well under the ~88% KB coverage growth this work enables.

Verification (DoD)

  • Change description — 5 files modified: src/storage.ts (+key fn), src/registry.ts (+helpers + builder + fan-out), src/tools.ts (caller updates), src/tools.test.ts (+4 H11 tests), odd/ledger/journal.md (+J-004).
  • Verification performed — npm ci && npm run build && npm run test with GITHUB_TOKEN set, mirroring the CI build-test job.
  • Observed behavior — 165/165 tests pass (was 161 on main; +4 H11 fan-out tests). wrangler deploy --dry-run clean. typecheck output unchanged from main's pre-existing state — no new errors.
  • Evidence produced — vitest summary (165 passing) captured in commit message; CI will reproduce on merge.
  • Self-audit — Intended outcome: drop cold-cache entity lookup latency by an order of magnitude without changing the warm-path. Constraints applied: vodka (helpers generic, no domain branches); KISS (no global entity blob, just per-resource mirroring of the existing pattern); DoD (5 requirements). Tradeoffs: bootstrapEntityMatches stays as defensive fallback rather than being deleted now, because indexes built pre-H11 still exist in R2 and the migration is graceful. Remaining risks: build-time cost is now ~22s for the cold-start path (was per-lookup); background refresh absorbs this for non-first builds, but the very first cold start after deploy will pay it once. Acceptable.

Post-deploy validation plan

  1. Merge this PR.
  2. Workers Builds redeploys.
  3. Wait for the first cold start (or trigger via getOrBuildIndex) — index build will run populateEntityIndexes and write per-resource entity blobs.
  4. Trigger entity entity_id=person:Paul against an instance that hasn't seen the entity warm. Should respond well under 1s with the full result set (was ~22s pre-H11).
  5. Workers Logs outcome distribution should remain 100% success for entity calls — no clientDisconnected from the user-side timeout, no edge-injected 502s.

J-005 will be encoded after that observation, closing H11 affirmatively (or naming what didn't work as expected and what to do about it).

Mode trail

Investigation-driven session: J-002 (incident) → J-003 (root-cause + bounded-fanout fix) → bug-fix arc (9 Bugbot findings closed) → J-004 (post-merge verification) → H11 (canonical fix this PR). Each step grounded in direct observation of the deployed system; no speculation about behavior we hadn't measured.


Note

Medium Risk
Shifts entity resolution from on-demand corpus scans to index-build-time scanning of all content files, which adds a potentially heavy cold-build cost and new R2 objects; query-path changes are straightforward but could affect completeness/latency if indexes are missing or partially built.

Overview
Entity lookup no longer relies on a user-blocking corpus scan by default. Index builds now precompute and store a per-resource entity index (entity_id → ArticleRef[]) in R2, and entity queries fan out across these small per-resource blobs in parallel.

handleEntity/searchByEntity now use a 3-tier lookup: in-memory index.entity, then fanOutEntitySearch, and only then fall back to the existing bootstrapEntityMatches path for older indexes. Adds bounded-concurrency scanning during buildIndex to populate the entity indexes, a new entityIndexKey in storage, and new tests covering the fan-out path, multi-resource merge, normalization, and bootstrap fallback.

Reviewed by Cursor Bugbot for commit 1a0641a.

…g corpus scan

Closes H11 from J-003. Refactors entity lookup to use per-resource entity
indexes built once at index-build time, replacing the user-blocking
on-demand corpus scan that produced the J-002 OOM. Bootstrap stays as a
defensive fallback only — it now runs only when no per-resource entity
index has been written for any registry resource (e.g. on indexes built
pre-H11). Once any complete bootstrap result is written and any
subsequent index rebuild populates entity indexes, the fallback path
goes dead.

Architecture:

- New `entityIndexKey(code, sha)` in storage.ts mirrors the existing
  passageIndexKey/titleIndexKey/articleIndexKey pattern: per-resource
  blob at `index/{code}/{sha}/entities.json` storing
  Array<[entityId, ArticleRef[]]>. Each blob is small (typically <100KB)
  because it's just one resource's references; the SHA-keyed lifecycle
  works identically to the other indexes.
- New `populateEntityIndexes(results, env, storage, repoShas)` in
  registry.ts is invoked from `buildIndex` after the existing
  passage/title/article writes complete. It scans every content file in
  every resource's scripture_burrito.ingredients, collects ACAI
  associations, and writes per-resource entity maps to R2. Memory is
  bounded by `ENTITY_BUILD_RESOURCE_CONCURRENCY=4` and
  `ENTITY_BUILD_FILE_CONCURRENCY=8` — same caps as J-003's bootstrap
  fanout, same safety profile.
- New `fanOutEntitySearch(entityId, index, storage, tracer)` in
  registry.ts is the post-H11 query path. Loads all per-resource entity
  indexes in parallel, unions matches, backfills resource_type from
  index.registry on read. Mirrors fanOutPassageSearch/fanOutTitleSearch
  exactly — vodka-consistent.
- handleEntity and searchByEntity now call fanOutEntitySearch first;
  only fall through to bootstrapEntityMatches when fan-out returns
  empty. The bootstrap function and BootstrapEntityResult type stay
  intact for the migration period and as a permanent backstop.
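The Array<[entityId, ArticleRef[]]> storage shape exists because a Map doesn't JSON-serialize directly; it round-trips through an entries array. An illustrative sketch (type shape simplified from the real ArticleRef):

```typescript
type ArticleRef = { content_id: string; title: string };

// Serialize: Map → Array<[entityId, ArticleRef[]]> → JSON string.
// This is the shape stored at index/{code}/{sha}/entities.json.
function serializeEntityIndex(m: Map<string, ArticleRef[]>): string {
  return JSON.stringify(Array.from(m.entries()));
}

// Deserialize: JSON string → entries array → Map. The Map constructor
// accepts the entries array directly, so the round-trip is lossless.
function deserializeEntityIndex(json: string): Map<string, ArticleRef[]> {
  return new Map(JSON.parse(json) as Array<[string, ArticleRef[]]>);
}
```

Readers can also iterate the entries array directly without rebuilding the Map, which is what a fan-out lookup for a single entityId would do.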

Why per-resource not global:
- Matches the existing passageIndexKey/titleIndexKey/articleIndexKey
  shape — the registry was already designed for this access pattern
  (vodka consistency).
- Each per-resource blob can be invalidated independently when its
  SHA changes; no global rebuild needed when one resource updates.
- Read pattern is N small parallel R2 reads (cheap, memory-bounded)
  vs one large sequential read (subject to R2 object size limits).
- A single global entity blob across the full corpus could grow into
  the multi-MB range; per-resource keeps each blob small enough to
  fit comfortably in Cache API.

Why settledInChunks duplicated in registry.ts:
- registry.ts cannot import from tools.ts (the dependency runs the
  other way). Helper is small enough that the dup isn't worth a
  shared-module refactor; H12 already tracks the audit.

Verification:

- npm ci && npm run build && npm run test (with GITHUB_TOKEN set,
  mirroring CI build-test job)
- 165/165 tests pass (was 161 on main, +4 H11 fan-out tests covering:
  fan-out hit returns matches without invoking bootstrap; fall-through
  to bootstrap when no entity index exists; multi-resource union;
  case-insensitive entity_id normalization)
- wrangler deploy --dry-run: clean, no binding/compatibility-flag
  changes, no new external dependencies

Performance prediction:

Pre-H11 cold-cache entity lookup (post-J-003 fix): ~22s for the full
33-resource bootstrap scan (observed in post-merge production).

Post-H11 cold-cache entity lookup: 33 parallel R2 reads of small
per-resource entity blobs. Conservatively ~200-500ms for the
fan-out, plus the existing index hydration cost. Order of magnitude
improvement on cold-cache; warm-cache (in-memory tier) unchanged.

Index-build cost grows by ~22s on a true cold start (the same scan
the bootstrap was doing per-lookup, now done once per composite SHA).
Background refresh via ctx.waitUntil absorbs this for non-first cold
builds.

Vodka constraint:

- New helpers (`populateEntityIndexes`, `fanOutEntitySearch`) are
  generic over their data shape; no domain-specific branching by
  resource_type or content category.
- `entityIndexKey` follows the existing per-resource key pattern.
- No new `if (resource_type === ...)` branches anywhere in the
  changed code.
- Server LOC grows ~2.5% (+~250 lines net across registry.ts and
  storage.ts); the entire entity tool's safety + performance budget
  for J-002/J-003/H11 has cost ~9% LOC growth — well below the
  ~88% KB-coverage growth this work enables.

Journal:

- Appends J-004 (post-merge verification: J-002/J-003 closed by direct
  observation, 70 invocations / 0 errors / 0 exceededMemory across
  the post-merge window).
- The H11 work itself will get J-005 in a follow-up entry once
  post-deploy production observation confirms the cold-path latency
  drop. Not encoding that prediction as a fact in this journal is
  deliberate per the operator's "claim is a debt" axiom — the
  fan-out latency claim is verifiable only post-deploy.

cloudflare-workers-and-pages Bot commented Apr 23, 2026

Deploying with Cloudflare Workers


| Status | Name | Latest Commit | Preview URL | Updated (UTC) |
| --- | --- | --- | --- | --- |
| ✅ Deployment successful! (View logs) | aquifer-mcp | 1a0641a | Commit Preview URL / Branch Preview URL | Apr 23 2026, 10:08 PM |


cursor Bot left a comment


Cursor Bugbot has reviewed your changes and found 1 potential issue.


Bugbot Autofix prepared a fix for the issue found in the latest run.

  • ✅ Fixed: Missing language segment in content file fetch URL
    • Added the missing ${language}/ segment before json/${file} in the URL built inside populateEntityIndexes so cold R2 entity-index builds now fetch the correct GitHub raw path.
Preview (1a0641ad81)
diff --git a/odd/ledger/journal.md b/odd/ledger/journal.md
--- a/odd/ledger/journal.md
+++ b/odd/ledger/journal.md
@@ -1384,3 +1384,20 @@
 
 The PR is now `mergeable_state: clean` with all required checks green (`build-test`, `Workers Builds: aquifer-mcp`, `Cursor Bugbot` neutral). Branch tip after this commit; no further action expected from this side.
 
+---
+
+### J-004 — J-002 / J-003 closed by direct observation; OOM mechanism dead in production
+
+**Observation:** PR #18 merged at 2026-04-23T21:23:52Z (`df7320a7e3665cf1a0da612e719dcaeb6b94b380`). Workers Builds redeployed `aquifer-mcp` immediately, completing successfully at 21:24:22Z. Workers Logs aggregate query (`workersInvocationsAdaptive`, scoped to `scriptName="aquifer-mcp"`, time window 21:24Z–21:55Z) shows **70 invocations, 69 outcome=success, 1 outcome=clientDisconnected, 0 outcome=exceededMemory, 0 outcome=exceededCpu, 0 outcome=scriptThrewException, 1486 total subrequests**. The only non-success row is the J-002-replay attempt where this investigation's own client gave up at its 30s timeout while the cold bootstrap was running — which Workers Logs records as `clientDisconnected`, not as a Worker-side failure. Direct probe of `entity entity_id=person:Paul` (the original J-002 victim) returned a complete 1,155-article result in 95ms from the warm R2 bootstrap cache, with the partial-result note correctly absent. The trace header `storage:entity/166fc4af.../person:pau...=9ms(cache)` confirms a complete bootstrap was previously memoized — the cache-write gate (`complete && deduped.length > 0`) functions as designed. The pre-fix incident on 2026-04-22T00:17:02Z showed `outcome=exceededMemory, 1 req, 1 err, 2.82ms CPU, 255 subrequests` for the same call. The post-fix invocations show the same call now completing under the bounded-fanout cap (~32 in flight) and writing a complete result to cache. Same call, same client, same content corpus; only the application code differs.
+
+**Learning:** The transparency machinery developed across PR #18's bug-fix arc functions correctly in production: complete results carry no partial note, and the cache only memoizes `complete=true` results. The Workers Logs ingestion enabled by PR #17 (J-002 H8) is what made this verification possible at all — without per-invocation outcome data, "no exceededMemory in the last 30 minutes" would have been an inference from absence of complaints rather than a directly observed property. The 9-finding bug-fix arc on PR #18 (1 High, 3 Medium, 5 Low; 6 closed by Cursor Agent autofix, 3 by manual fix) demonstrates that even an explicit-data-flow type contract (`BootstrapEntityResult`) is necessary but not sufficient — adversarial review caught implementation bugs that quietly violated the contract the type promised. The composite pattern worth canonizing: **type contract + adversarial review** as the minimum bar for transparency-critical code paths, where the type forces the API boundary to be honest and the review forces the implementation to honor what the type promised.
+
+A new observation surfaced during validation: the cold bootstrap can take ~22s of wall-clock for the full 33-resource scan (per the 22.9s 502 my client received on a cold-cache `entity:Paul` attempt while the Worker was still running the fanout). The Worker recorded the eventual outcome as `success` with 232 subrequests — it completed and returned within its own deadline budget — but Cloudflare's frontend may have cut the response stream for the client before the Worker's response landed. This is at the edge of Cloudflare's gateway request-duration tolerance and is the empirical signal that 20s is borderline and the inline-on-cold-path bootstrap pattern is fragile. The fix is not to lower the deadline (that would just shift partial results from rare to common); it is to remove the user-blocking corpus scan entirely. That is H11.
+
+**Decision:** Close J-002 and J-003 as resolved by direct observation of post-merge production behavior. Begin H11 work in this same session: replace `bootstrapEntityMatches` as a user-blocking corpus scan with eager population of `index.entity` during `buildIndex`, so cold-path entity lookups become O(1) map reads instead of O(N) corpus scans. Defer H12 (audit other fanout sites) and H13 (long-lived CF observability token) to future sessions; neither blocks correctness.
+
+**Constraint:** Workers Logs aggregate retention is 7 days under Cloudflare's current GA terms. The 70 successful invocations cited above are the only direct evidence of the post-fix outcome, and they will roll out of the queryable window on 2026-04-30. If the H11 work needs to compare future behavior against this baseline, the comparison must be made before then. The CF API token used for the GraphQL queries that produced this evidence has scope `Account Analytics: Read` only (no `Workers Observability: Read`, no `Workers Scripts: Edit`); future per-invocation log inspection or live tail will still require either a broader token or maintainer dashboard access. The cold-bootstrap latency observation (~22s for 33 resources) is the new bound that H11 must beat — any H11 implementation that doesn't reduce cold-path entity lookup wall-clock by at least an order of magnitude has not solved the problem.
+
+**Handoff:**
+- **H14** — Encode "type contract + adversarial review" as a paired pattern for transparency-critical code in canon. Founding observation: the BootstrapEntityResult contract was correct, but its first three implementations on PR #18 each violated it on independent axes; only adversarial review (Cursor Bugbot) caught the implementations that the type couldn't. Pattern: when a function's correctness depends on accurately reporting its own incompleteness, the type signature is the first defense and adversarial review is the second; ship neither alone. Lower priority than H11; tracked here so the lesson doesn't get lost.
+- **H11 promoted from J-003** — Begin work this session: refactor `buildIndex` to eagerly populate `index.entity` so `bootstrapEntityMatches` becomes dead code. Acceptance criteria: cold-path `entity` lookup wall-clock drops from ~22s to sub-second (R2 read of pre-built index); `bootstrapEntityMatches` either deleted or reduced to a defensive fallback that runs only when the eager population is empty (pure backstop, never the primary path); `complete=true` stays a real promise (eager population either succeeds or the index is marked stale, never partial-without-disclosure).

diff --git a/src/registry.ts b/src/registry.ts
--- a/src/registry.ts
+++ b/src/registry.ts
@@ -7,7 +7,7 @@
 } from "./types.js";
 import { metadataUrl, fetchJson, fetchRepoSha, fetchOrgRepos } from "./github.js";
 import { isValidIndexReference, rangesOverlap } from "./references.js";
-import { AquiferStorage, indexKey, metadataKey, passageIndexKey, titleIndexKey, articleIndexKey } from "./storage.js";
+import { AquiferStorage, indexKey, metadataKey, passageIndexKey, titleIndexKey, articleIndexKey, entityIndexKey, contentKey } from "./storage.js";
 
 /** Per-resource article lookup: content_id → file location + minimal metadata. */
 export interface ArticleLookupEntry {
@@ -22,6 +22,44 @@
 const SHA_STALE_MS = 15 * 60 * 1000; // 15 minutes
 const INDEX_MEMORY_TTL_MS = 5 * 60 * 1000; // 5 minutes
 
+/**
+ * Concurrency caps for entity index population during buildIndex. Scanning all
+ * content files for ACAI entity references would otherwise blow the Worker
+ * memory budget — the same OOM mechanism (J-002 / J-003) that affected the
+ * old user-blocking bootstrap path. The same caps apply here for the same
+ * reason; this code runs at index-build time instead of per-query, which
+ * removes the user-visible latency but does not change the per-fetch memory
+ * cost.
+ */
+const ENTITY_BUILD_RESOURCE_CONCURRENCY = 4;
+const ENTITY_BUILD_FILE_CONCURRENCY = 8;
+
+/**
+ * Run `fn` over `items` in batches of `chunkSize`, awaiting each batch to
+ * settle before starting the next. Same shape as Promise.allSettled but with
+ * memory usage bounded by the chunk size rather than the total item count.
+ *
+ * Duplicated from tools.ts intentionally: registry.ts cannot import from
+ * tools.ts (tools.ts depends on registry.ts), and the helper is small enough
+ * that a single canonical source isn't worth the dependency-graph contortion.
+ * See odd/ledger/journal.md J-005 for H12 — both call sites should eventually
+ * collapse onto a shared helper module if this duplication ever grows.
+ */
+async function settledInChunks<T, R>(
+  items: readonly T[],
+  chunkSize: number,
+  fn: (item: T, index: number) => Promise<R>,
+): Promise<PromiseSettledResult<R>[]> {
+  if (chunkSize <= 0) throw new Error("settledInChunks: chunkSize must be > 0");
+  const results: PromiseSettledResult<R>[] = [];
+  for (let i = 0; i < items.length; i += chunkSize) {
+    const batch = items.slice(i, i + chunkSize);
+    const settled = await Promise.allSettled(batch.map((item, j) => fn(item, i + j)));
+    for (const r of settled) results.push(r);
+  }
+  return results;
+}
+
 /** Module-level memory cache — survives across requests within the same isolate. */
 let cachedIndex: NavigabilityIndex | null = null;
 let indexFetchedAt = 0;
@@ -305,9 +343,20 @@
     }
   }
 
-  // Write all per-resource indexes to R2 in parallel
+  // Write all per-resource indexes (passage/title/article) to R2 in parallel
   await Promise.allSettled(writePromises);
 
+  // H11: populate per-resource entity indexes. This scans every content file
+  // for ACAI entity references and writes one entityIndexKey per resource.
+  // It does the SAME work the pre-H11 bootstrap was doing on every cold-cache
+  // entity lookup — moved here to index-build time so user-facing entity
+  // queries become O(N_resources) parallel R2 reads (~30 small parallel
+  // requests, each <100KB) instead of O(N_files) blocking R2 reads gated by
+  // the per-isolate memory budget. The bounded fanout caps keep the per-fetch
+  // memory profile identical to bootstrap; only the latency cost moves off
+  // the user-blocking path.
+  await populateEntityIndexes(results, env, storage, repoShas);
+
   // Return lightweight index — passage/title/entity are empty.
   // Queries use fan-out functions to load per-resource indexes on demand.
   return {
@@ -321,6 +370,144 @@
   };
 }
 
+/**
+ * H11: Build per-resource entity indexes by scanning content files. For each
+ * resource, walks every JSON content file in scripture_burrito.ingredients,
+ * collects every (entity_id → ArticleRef[]) mapping found in
+ * `article.associations.acai`, and writes the resulting Map to R2 keyed by
+ * entityIndexKey(code, sha). Memory is bounded by ENTITY_BUILD_RESOURCE_*
+ * and ENTITY_BUILD_FILE_CONCURRENCY caps using settledInChunks, mirroring
+ * the pre-H11 bootstrap path's safety profile.
+ *
+ * Failure handling: per-file failures are swallowed (the file's entities
+ * just don't appear in the index for this build). Per-resource failures
+ * mean the resource has no entityIndexKey written; fanOutEntitySearch will
+ * see a miss for that resource on the next query, which is the correct
+ * truthful-degradation behavior. The next index rebuild gets another shot.
+ *
+ * Performance: this adds a one-time cost to cold index builds. The pre-H11
+ * bootstrap was paying this cost per-entity-lookup; H11 pays it once and
+ * memoizes for the life of the composite SHA. Background refresh
+ * (refreshAndUpdateCurrentIndex via ctx.waitUntil) absorbs the cost away
+ * from user-visible latency for non-first cold builds.
+ */
+async function populateEntityIndexes(
+  results: PromiseSettledResult<{ code: string; metadata: ResourceMetadata } | null>[],
+  env: Env,
+  storage: AquiferStorage,
+  repoShas: Map<string, string>,
+): Promise<void> {
+  const resources: Array<{ code: string; language: string; files: string[]; sha: string }> = [];
+  for (const result of results) {
+    if (result.status !== "fulfilled" || !result.value) continue;
+    const { code, metadata } = result.value;
+    const sha = repoShas.get(code);
+    if (!sha) continue;
+    const ingredients = Object.keys(metadata.scripture_burrito?.ingredients ?? {});
+    const files = ingredients
+      .filter((k) => k.startsWith("json/") && k.endsWith(".content.json"))
+      .map((k) => k.replace(/^json\//, ""))
+      .sort();
+    if (files.length === 0) continue;
+    resources.push({ code, language: metadata.resource_metadata.language, files, sha });
+  }
+
+  await settledInChunks(resources, ENTITY_BUILD_RESOURCE_CONCURRENCY, async ({ code, language, files, sha }) => {
+    const entityMap = new Map<string, ArticleRef[]>();
+
+    await settledInChunks(files, ENTITY_BUILD_FILE_CONCURRENCY, async (file) => {
+      const url = `https://raw.githubusercontent.com/${env.AQUIFER_ORG}/${code}/${sha}/${language}/json/${file}`;
+      const key = contentKey(code, sha, language, file);
+      let articles: import("./types.js").ArticleContent[] | null = null;
+      try {
+        articles = await fetchJson<import("./types.js").ArticleContent[]>(url, storage, key);
+      } catch {
+        return; // per-file failure — swallow, see comment above
+      }
+      if (!articles?.length) return;
+      for (const article of articles) {
+        const acaiAssociations = article.associations?.acai ?? [];
+        for (const a of acaiAssociations) {
+          const entityId = String(a.id || "").toLowerCase();
+          if (!entityId) continue;
+          const ref: ArticleRef = {
+            resource_code: code,
+            language: article.language || language,
+            content_id: String(article.content_id),
+            title: article.title || `Article ${article.content_id}`,
+            resource_type: "",
+            index_reference: article.index_reference,
+          };
+          const existing = entityMap.get(entityId);
+          if (existing) {
+            existing.push(ref);
+          } else {
+            entityMap.set(entityId, [ref]);
+          }
+        }
+      }
+    });
+
+    if (entityMap.size > 0) {
+      // Serialize Map → array of [entityId, ArticleRef[]] entries for JSON.
+      await storage.putJSON(entityIndexKey(code, sha), Array.from(entityMap.entries()));
+    }
+  });
+}
+
+/**
+ * H11: load all per-resource entity indexes in parallel and union-merge any
+ * matches for the requested entityId. This is the post-H11 hot path for
+ * entity lookup — replaces the pre-H11 user-blocking bootstrap scan with N
+ * parallel R2 reads of small per-resource entity blobs (typically <100KB
+ * each). Resource-types are filled in from index.registry on read, since the
+ * stored per-resource entity index doesn't carry that metadata.
+ */
+export async function fanOutEntitySearch(
+  entityId: string,
+  index: NavigabilityIndex,
+  storage: AquiferStorage,
+  tracer?: RequestTracer,
+): Promise<ArticleRef[]> {
+  const normalized = entityId.toLowerCase();
+
+  // If entity data is already in memory (tests provide this), use it directly.
+  const memHit = index.entity.get(normalized);
+  if (memHit?.length) return memHit;
+
+  const fanStart = performance.now();
+  let hits = 0;
+  let misses = 0;
+
+  const results = await Promise.allSettled(
+    index.registry.map(async (entry) => {
+      const sha = index.repo_shas.get(entry.resource_code);
+      if (!sha) { misses++; return []; }
+      const key = entityIndexKey(entry.resource_code, sha);
+      const { data } = await storage.getJSON<Array<[string, ArticleRef[]]>>(key, tracer);
+      if (!data) { misses++; return []; }
+      hits++;
+      // Find this entityId in the per-resource entity map (entries form).
+      for (const [eid, refs] of data) {
+        if (eid === normalized) {
+          // Backfill resource_type from registry — per-resource index doesn't store it.
+          return refs.map((r) => ({ ...r, resource_type: entry.resource_type }));
+        }
+      }
+      return [];
+    }),
+  );
+
+  tracer?.addSpan("fanout-entities", Math.round(performance.now() - fanStart), undefined,
+    `${index.registry.length} resources, ${hits} hits, ${misses} misses`);
+
+  const matches: ArticleRef[] = [];
+  for (const r of results) {
+    if (r.status === "fulfilled") matches.push(...r.value);
+  }
+  return matches;
+}
+
 // --- Fan-out query functions ---
 
 /**

diff --git a/src/storage.ts b/src/storage.ts
--- a/src/storage.ts
+++ b/src/storage.ts
@@ -175,3 +175,20 @@
 export function articleIndexKey(resourceCode: string, sha: string): string {
   return `index/${resourceCode}/${sha}/articles.json`;
 }
+
+/**
+ * Per-resource entity index. Maps lowercase entity_id (e.g. "person:paul") to
+ * the ArticleRefs in this resource that reference that entity. Built once at
+ * index-build time by scanning the resource's content files; queried via
+ * fanOutEntitySearch which loads all per-resource entity indexes in parallel.
+ *
+ * Why per-resource and not a single global blob: keeps each R2 object small
+ * (typically <100KB), makes the SHA-keyed lifecycle work the same as the
+ * other indexes, and matches the established passageIndexKey/titleIndexKey
+ * pattern. The fan-out at query time is N small reads in parallel, which is
+ * fast and memory-bounded — the opposite of the bootstrap path's pre-H11
+ * behavior of scanning every content file on every cold entity lookup.
+ */
+export function entityIndexKey(resourceCode: string, sha: string): string {
+  return `index/${resourceCode}/${sha}/entities.json`;
+}

diff --git a/src/tools.test.ts b/src/tools.test.ts
--- a/src/tools.test.ts
+++ b/src/tools.test.ts
@@ -1299,3 +1299,167 @@
     expect(text).not.toContain("failed");
   });
 });
+
+
+describe("H11 — fanOutEntitySearch eager entity index", () => {
+  // These tests verify the post-H11 behavior: when per-resource entity indexes
+  // exist in storage (built at index-build time), entity lookups return data
+  // from those small per-resource blobs in parallel WITHOUT scanning content
+  // files at query time. Bootstrap remains as a defensive fallback only.
+
+  beforeEach(() => {
+    vi.clearAllMocks();
+    mockGetOrBuildIndex.mockReset();
+    mockFetchJson.mockReset();
+  });
+
+  it("returns matches from per-resource entity index without bootstrap scan", async () => {
+    // Arrange: storage pre-seeded with a per-resource entity index for
+    // STUDY_NOTES_ENTRY containing person:Paul; no metadata fetches
+    // configured (mockFetchJson would throw if anything tries them).
+    const env: Env = {
+      AQUIFER_CACHE: createMockKV(),
+      AQUIFER_CONTENT: {} as R2Bucket,
+      AQUIFER_ORG: "BibleAquifer",
+      DOCS_REPO: "docs",
+      WORKER_ENV: "production",
+    };
+    const storage = createMockStorage();
+    const idx = buildMockIndex([STUDY_NOTES_ENTRY]);
+    // Clear in-memory entity map so the fan-out path is the one under test
+    // (default buildMockIndex pre-seeds it with a fixture for tier-1 tests).
+    idx.entity.clear();
+    mockGetOrBuildIndex.mockResolvedValue(idx);
+
+    // Pre-seed the per-resource entity index in storage. Format matches what
+    // populateEntityIndexes writes: array of [entityId, ArticleRef[]] entries.
+    const studyNotesSha = idx.repo_shas.get(STUDY_NOTES_ENTRY.resource_code)!;
+    const entityIndexKey = `index/${STUDY_NOTES_ENTRY.resource_code}/${studyNotesSha}/entities.json`;
+    const seededRefs: ArticleRef[] = [{
+      resource_code: STUDY_NOTES_ENTRY.resource_code,
+      language: "eng",
+      content_id: "9640",
+      title: "Acts 7:58",
+      resource_type: "",  // Backfilled by fanOutEntitySearch from registry
+      index_reference: "ACT 7:58",
+    }];
+    await storage.putJSON(entityIndexKey, [["person:paul", seededRefs]]);
+
+    // mockFetchJson must NOT be called — if it is, that means the bootstrap
+    // path (which scans content files via fetchJson) ran when it shouldn't.
+    mockFetchJson.mockImplementation(() => {
+      throw new Error("UNEXPECTED: bootstrap fetched content when fanout should have served");
+    });
+
+    const result = await handleEntity({ entity_id: "person:Paul" }, env, storage);
+    const text = result.content[0]!.text;
+
+    expect(text).toContain("Found 1 article(s)");
+    expect(text).toContain("Acts 7:58");
+    // resource_type should be filled in from registry
+    expect(text).toContain("Study Notes");
+    // No partial note — fan-out path doesn't produce them
+    expect(text).not.toContain("Partial result");
+    // Bootstrap should NOT have been invoked
+    expect(mockFetchJson).not.toHaveBeenCalled();
+  });
+
+  it("falls through to bootstrap when no per-resource entity index exists", async () => {
+    // Defensive fallback: if no per-resource entity index has been written to
+    // storage (e.g. index built pre-H11), fan-out returns empty and the
+    // bootstrap kicks in. This is the migration-safety path; once any
+    // bootstrap result is cached, fan-out will start finding it on next
+    // index rebuild.
+    const env: Env = {
+      AQUIFER_CACHE: createMockKV(),
+      AQUIFER_CONTENT: {} as R2Bucket,
+      AQUIFER_ORG: "BibleAquifer",
+      DOCS_REPO: "docs",
+      WORKER_ENV: "production",
+    };
+    const storage = createMockStorage();
+    mockGetOrBuildIndex.mockResolvedValue(buildMockIndex([STUDY_NOTES_ENTRY]));
+    // No entity index pre-seeded, mockFetchJson returns null for everything
+    // → bootstrap walks the empty corpus, returns complete=true with no matches.
+    mockFetchJson.mockResolvedValue(null);
+
+    const result = await handleEntity({ entity_id: "person:Whoever" }, env, storage);
+    const text = result.content[0]!.text;
+    expect(text).toContain("No articles found");
+    // Bootstrap WAS invoked (because fan-out returned empty), and that's
+    // signaled by mockFetchJson being called at least once for the metadata.
+    expect(mockFetchJson).toHaveBeenCalled();
+  });
+
+  it("merges results from multiple per-resource entity indexes", async () => {
+    // Arrange: TWO per-resource entity indexes both contain entries for
+    // person:paul. Fan-out should union them and return all refs.
+    const env: Env = {
+      AQUIFER_CACHE: createMockKV(),
+      AQUIFER_CONTENT: {} as R2Bucket,
+      AQUIFER_ORG: "BibleAquifer",
+      DOCS_REPO: "docs",
+      WORKER_ENV: "production",
+    };
+    const storage = createMockStorage();
+    const idx = buildMockIndex([STUDY_NOTES_ENTRY, FIA_MAPS_ENTRY]);
+    idx.entity.clear();  // Force fan-out path
+    mockGetOrBuildIndex.mockResolvedValue(idx);
+
+    const sha1 = idx.repo_shas.get(STUDY_NOTES_ENTRY.resource_code)!;
+    const sha2 = idx.repo_shas.get(FIA_MAPS_ENTRY.resource_code)!;
+    await storage.putJSON(`index/${STUDY_NOTES_ENTRY.resource_code}/${sha1}/entities.json`, [
+      ["person:paul", [{
+        resource_code: STUDY_NOTES_ENTRY.resource_code, language: "eng",
+        content_id: "9640", title: "Acts 7:58", resource_type: "", index_reference: "ACT 7:58",
+      }]],
+    ]);
+    await storage.putJSON(`index/${FIA_MAPS_ENTRY.resource_code}/${sha2}/entities.json`, [
+      ["person:paul", [{
+        resource_code: FIA_MAPS_ENTRY.resource_code, language: "eng",
+        content_id: "500001", title: "Paul's Missionary Journeys", resource_type: "", index_reference: "",
+      }]],
+    ]);
+    mockFetchJson.mockImplementation(() => { throw new Error("should not be called"); });
+
+    const result = await handleEntity({ entity_id: "person:Paul" }, env, storage);
+    const text = result.content[0]!.text;
+    expect(text).toContain("Found 2 article(s)");
+    expect(text).toContain("Acts 7:58");
+    expect(text).toContain("Paul's Missionary Journeys");
+    // Both resource_types backfilled from registry
+    expect(text).toContain("Study Notes");
+    expect(text).toContain("Maps");
+  });
+
+  it("normalizes entity_id case before lookup", async () => {
+    // Per-resource entity indexes store entityIds lowercase; the fan-out
+    // function must normalize the query the same way so case differences
+    // don't produce false misses.
+    const env: Env = {
+      AQUIFER_CACHE: createMockKV(),
+      AQUIFER_CONTENT: {} as R2Bucket,
+      AQUIFER_ORG: "BibleAquifer",
+      DOCS_REPO: "docs",
+      WORKER_ENV: "production",
+    };
+    const storage = createMockStorage();
+    const idx = buildMockIndex([STUDY_NOTES_ENTRY]);
+    idx.entity.clear();  // Force fan-out path
+    mockGetOrBuildIndex.mockResolvedValue(idx);
+    const sha = idx.repo_shas.get(STUDY_NOTES_ENTRY.resource_code)!;
+    await storage.putJSON(`index/${STUDY_NOTES_ENTRY.resource_code}/${sha}/entities.json`, [
+      ["person:paul", [{
+        resource_code: STUDY_NOTES_ENTRY.resource_code, language: "eng",
+        content_id: "9640", title: "Acts 7:58", resource_type: "", index_reference: "ACT 7:58",
+      }]],
+    ]);
+    mockFetchJson.mockImplementation(() => { throw new Error("should not be called"); });
+
+    // Query with mixed case
+    const result = await handleEntity({ entity_id: "PERSON:Paul" }, env, storage);
+    const text = result.content[0]!.text;
+    expect(text).toContain("Found 1 article(s)");
+    expect(text).toContain("Acts 7:58");
+  });
+});

diff --git a/src/tools.ts b/src/tools.ts
--- a/src/tools.ts
+++ b/src/tools.ts
@@ -1,7 +1,7 @@
 import type { Env, ArticleRef, ArticleContent, NavigabilityIndex, ResourceEntry, ResourceMetadata } from "./types.js";
 import { parseReference, rangesOverlap, rangeToReadable, isValidIndexReference, bbcccvvvToReadable } from "./references.js";
 import { contentUrl, metadataUrl, fetchJson, GC_TTL } from "./github.js";
-import { getOrBuildIndex, fanOutPassageSearch, fanOutTitleSearch, loadArticleLookup, type ArticleLookupEntry } from "./registry.js";
+import { getOrBuildIndex, fanOutPassageSearch, fanOutTitleSearch, fanOutEntitySearch, loadArticleLookup, type ArticleLookupEntry } from "./registry.js";
 import { getPublicTelemetrySnapshot } from "./telemetry.js";
 import { AquiferStorage, contentKey, metadataKey, catalogKey, entityKey } from "./storage.js";
 import type { RequestTracer } from "./tracing.js";
@@ -420,10 +420,19 @@
 
   let partialNote = "";
   if (matches.length === 0) {
-    const bootstrap = await bootstrapEntityMatches(normalized, index, env, storage, tracer);
-    matches.push(...bootstrap.matches);
-    if (!bootstrap.complete) {
-      partialNote = formatPartialBootstrapNote(bootstrap);
+    // H11: fan out to per-resource entity indexes first (fast, parallel).
+    const fanned = await fanOutEntitySearch(normalized, index, storage, tracer);
+    matches.push(...fanned);
+    if (matches.length === 0) {
+      // Defensive fallback: if no per-resource entity indexes have been
+      // populated yet (e.g. the index pre-dates H11 deploy), fall back to
+      // the on-demand bootstrap. Once any complete bootstrap result has
+      // been cached, future entity lookups skip this path entirely.
+      const bootstrap = await bootstrapEntityMatches(normalized, index, env, storage, tracer);
+      matches.push(...bootstrap.matches);
+      if (!bootstrap.complete) {
+        partialNote = formatPartialBootstrapNote(bootstrap);
+      }
     }
   }
 
@@ -1353,13 +1362,18 @@
   const index = await getOrBuildIndex(env, storage, ctx, tracer);
   const normalized = entityId.toLowerCase();
 
-  // Find all articles referencing this entity. Hot path: pre-built index map.
-  // Cold path: scan the article corpus via bootstrapEntityMatches, which may
-  // return PARTIAL results under wall-clock pressure — surface that to the
-  // user so they can decide whether to retry.
+  // Find all articles referencing this entity. Three-tier lookup:
+  //   (1) in-memory index.entity map (tests + warm bootstrap cache)
+  //   (2) H11: fan out to per-resource entity indexes — fast, parallel
+  //   (3) Defensive fallback: on-demand bootstrap if (2) returns empty
+  //       (e.g. for indexes built pre-H11). Once any complete bootstrap has
+  //       cached its result, future entity lookups skip this fallback.
   let refs = index.entity.get(normalized);
   let partialNote = "";
   if (!refs?.length) {
+    refs = await fanOutEntitySearch(normalized, index, storage, tracer);
+  }
+  if (!refs?.length) {
     const bootstrap = await bootstrapEntityMatches(normalized, index, env, storage, tracer);
     refs = bootstrap.matches;
     if (!bootstrap.complete) {

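The build-side memory bound described above (ENTITY_BUILD_RESOURCE_CONCURRENCY=4 × ENTITY_BUILD_FILE_CONCURRENCY=8 = max 32 file reads in flight) boils down to a nested worker-pool map. A minimal sketch of that pattern, with illustrative names rather than the project's actual helper:

```typescript
// Bounded-concurrency map: at most `limit` invocations of `fn` are in
// flight at once, so memory stays flat regardless of input size.
async function mapWithConcurrency<T, R>(
  items: T[],
  limit: number,
  fn: (item: T) => Promise<R>,
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0;
  // `limit` workers pull indexes from a shared cursor until drained.
  // The cursor increment is synchronous (single-threaded event loop),
  // so no two workers ever claim the same index.
  const workers = Array.from(
    { length: Math.min(limit, items.length) },
    async () => {
      while (next < items.length) {
        const i = next++;
        results[i] = await fn(items[i]);
      }
    },
  );
  await Promise.all(workers);
  return results;
}
```

Nesting two of these — an outer map over resources at limit 4 whose `fn` is itself an inner map over that resource's content files at limit 8 — yields the 4 × 8 = 32 cap, the same shape as J-003's bootstrap caps.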
@klappy klappy merged commit ebe281e into main Apr 24, 2026
3 checks passed