feat: exact tag-scoped recall via getByIds (#141) #143
Merged
rahilp merged 8 commits on Jun 10, 2026
Adds a cosine similarity function for comparing vector embeddings. BGE embeddings are not normalized, so the denominator matters for keeping tag-path scores on the same scale as Vectorize's cosine scores.
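A minimal sketch of what such a helper could look like, assuming plain numeric arrays as input. This is not the PR's actual `cosineSim`; in particular, the zero-norm guard below is only one reading of "guards on raw norms": each squared norm is checked on its own, and the square roots are taken separately so the product of two tiny norms cannot underflow to zero.

```typescript
// Illustrative only: the real helper in this PR may differ in signature and guards.
function cosineSim(a: ArrayLike<number>, b: ArrayLike<number>): number {
  // Mismatched or empty inputs have no meaningful angle; score them as 0.
  if (a.length !== b.length || a.length === 0) return 0;

  let dot = 0;
  let normA = 0; // squared L2 norm of a
  let normB = 0; // squared L2 norm of b
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }

  // Guard each raw squared norm separately instead of their product,
  // which can underflow to 0 even when both norms are nonzero.
  if (normA === 0 || normB === 0) return 0;

  // Unnormalized embeddings need the full denominator to land in [-1, 1].
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

Because the denominator divides out vector length, `cosineSim([1, 2, 3], [2, 4, 6])` scores the same as two unit vectors pointing the same way, which is what keeps tag-path scores comparable to a cosine-metric index.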
Tag-filtered recall now scores the tag's own vectors directly instead of post-filtering a global top-50 Vectorize query, so tagged memories can no longer be lost to the cap. Untagged recall is unchanged.
…141) The tag path can produce more than 100 candidates (that's the point of the feature), which would exceed D1's per-query bound parameter limit.
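The chunking described here can be sketched as a pure helper that splits candidate IDs into groups and builds one parameterized `IN` query per group. The table and column names (`entries`, `id`, `content`) and the function names are hypothetical; only the limit of 100 comes from this PR.

```typescript
// Limit value taken from the PR summary; everything else here is illustrative.
const D1_MAX_BOUND_PARAMS = 100;

interface BoundQuery {
  sql: string;
  params: string[];
}

// Split a list into consecutive groups of at most `size` items.
function chunk<T>(items: readonly T[], size: number): T[][] {
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError("size must be a positive integer");
  }
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

// Build one IN query per chunk so no single statement exceeds the limit.
// Table and column names are assumptions, not the project's schema.
function buildCandidateQueries(ids: readonly string[]): BoundQuery[] {
  return chunk(ids, D1_MAX_BOUND_PARAMS).map((group) => ({
    sql: `SELECT id, content FROM entries WHERE id IN (${group.map(() => "?").join(", ")})`,
    params: group,
  }));
}
```

Each resulting query could then be run with D1's `prepare(sql).bind(...params).all()` and the rows concatenated; an empty ID list yields no queries at all, which avoids emitting an invalid `IN ()` clause.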
…141) Production returned VECTOR_GET_ERROR code 40007 ("more than the maximum allowed count of 20") for tags with more than 20 vectors. The 500 batch size was a guess; the real limit is 20.
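A rough sketch of batching under that limit. `VectorStore` is a minimal stand-in for the part of the Vectorize binding used here, and `getVectorsBatched` is a hypothetical name; the sketch also assumes that IDs with no stored vector are simply absent from the response, which callers should verify.

```typescript
// Minimal stand-in types (assumptions), not the full Vectorize binding.
interface StoredVector {
  id: string;
  values: number[];
}
interface VectorStore {
  getByIds(ids: string[]): Promise<StoredVector[]>;
}

// Per-call limit reported by the 40007 error message quoted in this PR.
const GET_BY_IDS_MAX = 20;

// Fetch vectors in sequential batches of at most GET_BY_IDS_MAX IDs.
async function getVectorsBatched(
  store: VectorStore,
  ids: readonly string[],
): Promise<StoredVector[]> {
  // Drop duplicate IDs so a vector shared by several entries is fetched once.
  const unique = Array.from(new Set(ids));
  const found: StoredVector[] = [];
  for (let i = 0; i < unique.length; i += GET_BY_IDS_MAX) {
    const batch = unique.slice(i, i + GET_BY_IDS_MAX);
    // Stale IDs are expected to be missing from the result (assumption).
    found.push(...(await store.getByIds(batch)));
  }
  return found;
}
```

Batches run one after another here for simplicity; running them concurrently with `Promise.all` is a possible variation if the extra parallel load is acceptable.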
…rsing MCP text The Recall chat flow parsed the MCP tool's formatted text to build source cards, splitting on anything that looked like a list item — so memories containing bullets or numbered lines were counted as multiple sources (e.g. 25 sources from 5 matches). The UI now calls GET /recall for structured JSON and serializes the results itself for the /chat context.
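To illustrate why counting from structured results is more robust, here is a small sketch. The `RecallResult` shape and its field names are assumptions, not the actual `GET /recall` response, and `countListLikeLines` is only a naive counter in the spirit of the old text parsing, not the removed code.

```typescript
// Hypothetical response item; the real /recall shape may differ.
interface RecallResult {
  id: string;
  content: string;
  tags: string[];
  source: string;
  createdAt: string; // ISO 8601 date string (assumption)
}

// Serialize results for the chat context, keeping date, tags, and source.
function serializeForChat(results: readonly RecallResult[]): string {
  return results
    .map((r, i) => {
      const tags = r.tags.length > 0 ? r.tags.join(", ") : "none";
      return `[${i + 1}] id=${r.id} date=${r.createdAt} tags=${tags} source=${r.source}\n${r.content}`;
    })
    .join("\n\n");
}

// Naive approach: treat every bullet or numbered line as a separate source.
function countListLikeLines(text: string): number {
  return text.split("\n").filter((line) => /^\s*(?:[-*]|\d+\.)\s+/.test(line)).length;
}

// One memory whose content happens to contain list items.
const sample: RecallResult[] = [
  {
    id: "m1",
    content: "- buy milk\n- call Sam\n1. ship release",
    tags: ["todo"],
    source: "mcp",
    createdAt: "2026-06-01",
  },
];

const structuredCount = sample.length;
const naiveCount = countListLikeLines(serializeForChat(sample));
```

With this sample, the structured count is 1 while the naive line-based count is 3, which mirrors the inflated source counts described above.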
bffb340 to c9d7e7f
Resolves #141
Summary
- Tag-filtered recall now scores the tag's own vectors directly (… `vector_ids` → batched `VECTORIZE.getByIds` → new `cosineSim` helper vs the query embedding) instead of post-filtering a global top-50 Vectorize query, so tagged memories can no longer be silently lost to the cap. Untagged recall is unchanged, including the low-score retry.
- The `tagFilterIds` in-memory post-filter was removed.
- `cosineSim` guards on raw norms (float underflow); the candidate-scoring `IN` query is chunked at `D1_MAX_BOUND_PARAMS = 100` (the tag path can produce >100 candidates, exceeding D1's bound-parameter limit); `getByIds` batches are capped at 20 (Vectorize rejects more with `VECTOR_GET_ERROR` 40007).
- The UI now calls the `GET /recall` REST endpoint instead of re-parsing the MCP tool's formatted text, fixing the inflated "sources" count when memory content contains list items.
- Source cards also gain real entry IDs (working Append/Forget), server errors no longer render as "no results", and the `/chat` context keeps dates/tags/source for temporal questions.

Test Plan
- … `test/integration/recall.test.ts`: beyond-top-50 surfacing, cosine ranking, stale vector IDs (partial + total), no-vector short-circuit, 20-ID `getByIds` batching, shared-vector dedupe, chunk parentId dedupe, topK in tag path, >100-candidate D1 chunking
- … `cosineSim`; 1 new UI contract test for the REST `/recall` response shape
- `tsc --noEmit` clean