K66 context ready ingestion by ryanjosebrosas · Pull Request #8 · ryanjosebrosas/secondbrain-engine

ryanjosebrosas · 2026-03-25T01:52:08Z

No description provided.

coderabbitai · 2026-03-25T01:52:14Z

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Free

Run ID: 087414ec-3190-4e3a-b98b-9c9c5a825c99

📥 Commits

Reviewing files that changed from the base of the PR and between 56fe6c0 and 14437d7.

⛔ Files ignored due to path filters (1)

.beads/verify.log is excluded by !**/*.log

📒 Files selected for processing (15)

.beads/artifacts/second-brain-engine-k66/plan.md
.beads/artifacts/second-brain-engine-k66/prd.json
.beads/artifacts/second-brain-engine-k66/prd.md
.beads/artifacts/second-brain-engine-k66/progress.txt
.beads/artifacts/second-brain-engine-k66/research.md
.beads/issues.jsonl
src/index.ts
src/ingestion/service.ts
src/retrieval/service.ts
src/subsystems/reranker/port.ts
src/subsystems/supabase/repository.ts
src/subsystems/supabase/schema.ts
test/ingestion.relational.test.mjs
test/retrieval.hybrid.test.mjs
test/retrieval.packet.test.mjs

📝 Walkthrough

Summary by CodeRabbit

New Features
- Added reranker subsystem support for improved retrieval ranking
- Enhanced packet grouping with persisted metadata for better organization
- Improved citations with item identity and provenance tracking
Improvements
- Enriched vector metadata with additional context fields
- Extended source and item attributes for hierarchy and grouping support
- Added fused and reranked scoring for context packets

Walkthrough

The changes implement the "second-brain-engine-k66" feature, enriching ingestion persistence with retrieval-oriented metadata. Schema extensions add optional hierarchy/grouping fields (rootSourceId, parentSourceId, sourceGroupKey) for sources and tracking fields (ordinal, parentItemId, packetKey, sectionKey, provenanceLocation) for items. Ingestion and retrieval pipelines are updated to persist and consume these fields, with new vector metadata enrichment (occurredAt, packetKey). A new reranker subsystem port is introduced, integrated into the retrieval pipeline with packet-level reranking and fusion scoring. Tests verify persisted structure, packet assembly behavior, and retrieval grounding.
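To make the walkthrough concrete, the optional fields it lists could be modelled roughly as follows. This is a sketch rather than the PR's code: the field names come from the walkthrough, while the field types, the `SourceRecord` and `ItemRecord` names, and the defaulting helpers `withSourceDefaults` and `withItemDefaults` are assumptions made for illustration.

```typescript
// Hypothetical record shapes. Field names are taken from the walkthrough;
// the types and the defaulting rules below are assumptions.
interface SourceRecord {
  id: string;
  rootSourceId?: string;
  parentSourceId?: string;
  sourceGroupKey?: string;
}

interface ItemRecord {
  id: string;
  sourceId: string;
  ordinal?: number;
  parentItemId?: string;
  packetKey?: string;
  sectionKey?: string;
  provenanceLocation?: string;
}

// Assumed default: a source without lineage is treated as its own root
// and as its own group.
function withSourceDefaults(source: SourceRecord): SourceRecord {
  return {
    ...source,
    rootSourceId: source.rootSourceId ?? source.parentSourceId ?? source.id,
    sourceGroupKey: source.sourceGroupKey ?? source.id,
  };
}

// Assumed default: prefer an explicitly provided value, then the previously
// persisted one, then a fallback (position for ordinal, source id for packetKey).
function withItemDefaults(item: ItemRecord, index: number, existing?: ItemRecord): ItemRecord {
  return {
    ...item,
    ordinal: item.ordinal ?? existing?.ordinal ?? index,
    packetKey: item.packetKey ?? existing?.packetKey ?? item.sourceId,
  };
}
```

Falling back to the persisted value before the computed default is one way to keep a custom `packetKey` stable when an item is re-ingested without its grouping hint.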

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Planning & Documentation `.beads/artifacts/second-brain-engine-k66/plan.md`, `.beads/artifacts/second-brain-engine-k66/prd.json`, `.beads/artifacts/second-brain-engine-k66/prd.md`, `.beads/artifacts/second-brain-engine-k66/progress.txt`, `.beads/artifacts/second-brain-engine-k66/research.md`, `.beads/issues.jsonl` | Added comprehensive bead documentation including implementation plan, PRD (JSON and Markdown), research guidance, and progress log. Closed issue k66 with "Shipped" status after all tasks passed verification. |
| Schema & Persistence Layer `src/subsystems/supabase/schema.ts`, `src/subsystems/supabase/repository.ts` | Extended Supabase schema to add optional retrieval-oriented fields: sources gain `rootSourceId`, `parentSourceId`, `sourceGroupKey`; items gain `ordinal`, `parentItemId`, `packetKey`, `sectionKey`, `provenanceLocation`. Updated repository upsert to compute and persist these fields with sensible defaults based on existing records and provided values. |
| Ingestion Model `src/ingestion/service.ts` | Added optional properties to `IngestionSource` (`rootSourceId`, `parentSourceId`, `sourceGroupKey`) and `IngestionItem` (`ordinal`, `parentItemId`, `packetKey`, `sectionKey`, `provenanceLocation`) to carry retrieval hints through the ingestion pipeline. |
| Reranker Subsystem `src/subsystems/reranker/port.ts` | Introduced new `RerankerPort` abstraction with async `rerank` method accepting `queryText`, `packets`, and `limit`; includes `createNoopReranker()` for deterministic fallback scoring and slicing. |
| Retrieval Core Logic `src/index.ts`, `src/retrieval/service.ts` | Substantially refactored retrieval pipeline: vector metadata now includes `occurredAt` and `packetKey`; added packet-level reranking with fallback to noop reranker; introduced packet-by-key aggregation replacing packet-by-source; added vector-item hydration matching and candidate merging; extended score tracking with `fused` and `reranked` fields; updated `RetrievalCitation` to include optional `itemId`. Multiple new helper functions for packet key generation, reranking, fusion scoring, citation deduplication, and candidate merging. |
| Ingestion Tests `test/ingestion.relational.test.mjs` | Added schema validation assertions for new retrieval fields in source and item records; updated vector metadata expectations to include `occurredAt` and `packetKey`; added tests verifying persisted defaults for source hierarchy and item ordinals; added test confirming `packetKey` stability across re-ingestion. |
| Retrieval Tests `test/retrieval.hybrid.test.mjs`, `test/retrieval.packet.test.mjs` | Substantially rewrote retrieval test expectations to reflect reranking integration, packet-key-based grouping, and grounding behavior; added comprehensive packet-level tests verifying vector-match grounding, packet expansion to sibling chunks, relational score propagation, and vector-to-relational candidate merging. Updated mock returns to use new packet/item identity fields. |
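The reranker row above describes a port with an async `rerank` method and a no-op fallback. A minimal sketch of what such a seam might look like follows. Only the names `RerankerPort`, `rerank`, `createNoopReranker`, `queryText`, `packets`, and `limit` come from the summary; the packet and result shapes, the positional scoring, and the `applyRerank` helper are assumptions. The helper illustrates the "rank omitted rerank packets below explicit results" behaviour mentioned in the commit messages.

```typescript
// Sketch only. The packet/result shapes and the scoring are assumptions.
interface RerankPacket {
  packetKey: string;
  fusedScore: number;
}

interface RerankResult {
  packetKey: string;
  score: number;
}

interface RerankInput {
  queryText: string;
  packets: RerankPacket[];
  limit: number;
}

interface RerankerPort {
  rerank(input: RerankInput): Promise<RerankResult[]>;
}

// No-op fallback: keeps the incoming order, assigns deterministic
// descending scores by position, and slices to the limit.
function createNoopReranker(): RerankerPort {
  return {
    async rerank({ packets, limit }) {
      return packets.slice(0, Math.max(0, limit)).map((packet, index) => ({
        packetKey: packet.packetKey,
        score: 1 / (index + 1),
      }));
    },
  };
}

// Hypothetical caller-side helper: packets the reranker did not score sort
// below every packet it did score, keeping their prior relative order.
function applyRerank(packets: RerankPacket[], results: RerankResult[]): RerankPacket[] {
  const scoreByKey = new Map<string, number>(
    results.map((r): [string, number] => [r.packetKey, r.score]),
  );
  const scored = packets.filter((p) => scoreByKey.has(p.packetKey));
  const omitted = packets.filter((p) => !scoreByKey.has(p.packetKey));
  scored.sort((a, b) => scoreByKey.get(b.packetKey)! - scoreByKey.get(a.packetKey)!);
  return [...scored, ...omitted];
}
```

Keeping the port provider-neutral means a hosted reranking model could be swapped in later without touching packet assembly, while the no-op implementation keeps tests deterministic.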

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Poem

🐰 The persisted whispers now hold their ground,
With packets grouped by keys so sound,
Retrieval hints in every store—
No more shall context need recompute's chore!
From schema to cite, the memories align,
Second-brain enriched: a blessing divine! 🧠✨

chatgpt-codex-connector · 2026-03-25T01:52:15Z

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.

ryanjosebrosas · 2026-03-25T20:37:12Z

@copilot review

Copilot · 2026-03-25T20:43:30Z

@copilot review

Reviewed and addressed three issues in 3a6255f:

- Merged double-fetch (`hydrateVectorMatchedItems` + `hydrateVectorItemsByDocumentId`): both functions independently called `getPersistedItems` for the same set of `sourceIds` on every retrieval request. They are now combined into a single `hydrateVectorData` pass that builds both outputs (`matchedItems` and `itemsByDocumentId`) in one loop, halving the store round-trips.
- Removed dead `?? candidate.entityId` fallback: `GraphContextCandidate.sourceId` is a required `string` in the port contract, which makes the `entityId` fallback unreachable. Removed in both `buildContextPackets` and `collectExpansionSourceIds`.
- Simplified `Math.max(input.limit * 3, input.limit)` to `input.limit * 3`: `limit` is always > 0 at this call site (guarded by the `limit === 0 ? [] : ...` branches above), so `input.limit * 3` is always the larger value and the `Math.max` call was redundant.
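The double-fetch merge described above can be pictured with a short sketch. The names `hydrateVectorData`, `getPersistedItems`, `matchedItems`, and `itemsByDocumentId` come from the comment; the record shapes, the `ItemStore` interface, and the matching rule (by `documentId`) are assumptions, so the actual code in 3a6255f may differ.

```typescript
// Illustrative shapes; these are assumptions, not the repository's types.
interface PersistedItem {
  id: string;
  sourceId: string;
  documentId: string;
}

interface VectorMatch {
  documentId: string;
  sourceId: string;
}

interface ItemStore {
  getPersistedItems(sourceIds: string[]): Promise<PersistedItem[]>;
}

async function hydrateVectorData(store: ItemStore, matches: VectorMatch[]) {
  const sourceIds = Array.from(new Set(matches.map((m) => m.sourceId)));
  const matchedDocumentIds = new Set(matches.map((m) => m.documentId));

  // A single store round-trip feeds both outputs.
  const items = await store.getPersistedItems(sourceIds);

  const matchedItems: PersistedItem[] = [];
  const itemsByDocumentId = new Map<string, PersistedItem>();
  for (const item of items) {
    itemsByDocumentId.set(item.documentId, item);
    if (matchedDocumentIds.has(item.documentId)) {
      matchedItems.push(item);
    }
  }
  return { matchedItems, itemsByDocumentId };
}
```

Building both structures in one loop keeps them consistent with each other, since they are derived from the same snapshot of persisted items.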

Everything else looks solid — the three-lane hybrid pipeline (relational + graph + vector), RRF fusion pre-reranking, pluggable RerankerPort with graceful fallback, workspace access guard before any I/O, and the ContextPacket abstraction are all well-structured. All 56 tests pass, typecheck and lint clean.
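For readers unfamiliar with the RRF step mentioned above, here is a minimal reciprocal rank fusion sketch over the three lanes. The lane names follow the comment; the `fuseLanes` function, its input shape, and the constant `k = 60` (a commonly used default) are assumptions and may not match the PR's implementation. The `limit * 3` shortlist mirrors the simplification described in the review.

```typescript
// Minimal RRF sketch: each lane contributes 1 / (k + rank) per packet,
// where rank is the packet's 1-based position within that lane only.
type Lane = "relational" | "graph" | "vector";

interface FusedPacket {
  packetKey: string;
  fused: number;
}

function fuseLanes(lanes: Record<Lane, string[]>, k = 60): FusedPacket[] {
  const laneOrder: Lane[] = ["relational", "graph", "vector"];
  const scoreByKey = new Map<string, number>();
  for (const lane of laneOrder) {
    lanes[lane].forEach((packetKey, index) => {
      const contribution = 1 / (k + index + 1);
      scoreByKey.set(packetKey, (scoreByKey.get(packetKey) ?? 0) + contribution);
    });
  }
  return Array.from(scoreByKey, ([packetKey, fused]) => ({ packetKey, fused }))
    .sort((a, b) => b.fused - a.fused);
}

// Usage: fuse, then hand a shortlist of limit * 3 packets to the reranker.
const requestedLimit = 2;
const fusedPackets = fuseLanes({
  relational: ["a", "b"],
  graph: ["a"],
  vector: ["c", "b"],
});
const shortlist = fusedPackets.slice(0, requestedLimit * 3);
// fused order: a, b, c
```

Because each rank is computed inside its own lane, a lane that returns many candidates cannot push down the top results of a shorter lane.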

ryanjosebrosas and others added 11 commits March 23, 2026 09:02

feat(bead-k66): enrich ingestion for retrieval readiness

8253181

- add explicit lineage/grouping/provenance fields to canonical storage
- persist retrieval-oriented defaults and richer vector payload identity
- prove packet assembly can consume persisted grouping hints

fix(bead-k66): address stacked review and PR findings

ce16afa

- preserve custom packet grouping on partial re-ingest
- keep grouped packets source-linked and graph-safe
- align k66 artifact wording and verification state

feat(bead-6il): expand packet-local retrieval context

9822ea7

- include sibling supporting chunks for packet-grounded matches
- preserve packet-first grounded output during context expansion

feat(bead-6il): add packet reranking over hybrid recall

5acbda2

- fuse retrieval lanes before final packet ordering
- add provider-neutral reranker seam with no-op fallback

test(bead-6il): verify contextual retrieval quality end to end

bdca577

- lock rerank score assertions for packet ordering
- cover deterministic reranker fallback during retrieval

fix(bead-6il): preserve rerank candidates and vector grounding

b6a978c

- keep full packet pool until fused shortlist selection
- retain fallback vector citations after chunk reordering

fix(pr-3): tighten packet scoring semantics

ba9d362

merge(pr-3): sync retrieval foundation base

04a65f6

fix(bead-6il): address rerank review findings

275b148

- rank omitted rerank packets below explicit results
- keep fusion lane ordering lane-local
- add regression coverage for both cases

merge(pr-4): sync contextual retrieval branch with base

72e8540

- merge latest retrieval/runtime foundation changes
- preserve PR4 reranking and packet-context fixes
- verify merged branch with typecheck, lint, and tests

Merge pull request #4 from ryanjosebrosas/6il-contextual-retrieval-pr

14437d7

feat(bead-6il): contextual retrieval and packet reranking

Copilot started work on behalf of ryanjosebrosas March 25, 2026 20:37 View session

refactor(retrieval): fix dead code, merge double-fetch, simplify shor…

3a6255f

…tlist size Co-authored-by: ryanjosebrosas <178813774+ryanjosebrosas@users.noreply.github.com> Agent-Logs-Url: https://github.com/ryanjosebrosas/secondbrain-engine/sessions/3631daec-2464-4b06-9437-56276b1aaaf9

Copilot AI review requested due to automatic review settings March 25, 2026 20:43

ryanjosebrosas review requested due to automatic review settings March 25, 2026 20:43

Copilot finished work on behalf of ryanjosebrosas March 25, 2026 20:44

ryanjosebrosas wants to merge 12 commits into main from k66-context-ready-ingestion
