test(memory/graph): regression test for entity-graph seed stopword filter #51

Merged
skynetcmd merged 1 commit into main from test/graph-seed-stopword-regression
Jun 15, 2026

@skynetcmd

Owner

Summary

Adds the missing regression test for the entity-graph seed stopword filter (shipped in #13 / dd6f1ed). That fix landed with a pre-registered metric but no test gating it, which is a §3/§11 (robustness / bench-discipline) behavior-baseline gap per docs/DESIGN_PHILOSOPHIES.md. This PR closes that gap.

Why it matters

The fix prevents sentence-initial Title-Case common words ("Can", "What", "How", "I") from being captured as entity-graph seeds. Before the fix, the seed "Can" LIKE-matched "Canada"/"Canva" and pulled spurious BFS neighbors into the top-k. Measured impact (from #13): −13.3pp → 0pp at session-hit-rate@k=5 on single-session-preference questions, +1.4pp @k=3 overall. No test locked in that behavior until now.
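To make the false match concrete, here is a minimal sketch of how a short seed such as "Can" can LIKE-match unrelated entity names. The `entities` table, its `name` column, the substring pattern, and the `like_matches` helper are all assumptions for illustration; the actual schema and query in `memory.graph` are not shown in this PR.

```python
import sqlite3


def like_matches(conn, seed):
    """Return entity names that LIKE-match the seed as a substring.

    Hypothetical query: the real lookup in memory.graph may use a
    different pattern or schema.
    """
    rows = conn.execute(
        "SELECT name FROM entities WHERE name LIKE '%' || ? || '%' ORDER BY name",
        (seed,),
    ).fetchall()
    return [name for (name,) in rows]


# In-memory stand-in for the entity store (illustrative data only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entities (name TEXT)")
conn.executemany(
    "INSERT INTO entities VALUES (?)",
    [("Canada",), ("Canva",), ("Stranger Things",)],
)

print(like_matches(conn, "Can"))              # ['Canada', 'Canva']
print(like_matches(conn, "Stranger Things"))  # ['Stranger Things']
```

Under these assumptions the sentence-initial "Can" seeds two unrelated entities, while the genuine mention resolves only to itself.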

What the test does

Exercises the real _ENTITY_MENTION_RE + _QUERY_STARTER_STOPWORDS from memory.graph (no DB / embedder), replicating the Step-1 seed-extraction loop, and asserts:

  • common query starters (Can/What/How/I/…) are in the stoplist;
  • "Can you recommend a show like Stranger Things" → seeds keep Stranger Things, drop Can (the canonical false-match regression);
  • a query of only starters yields zero seeds (BFS short-circuits);
  • a genuine Title-Case entity not in the stoplist survives.
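
The assertions above can be sketched as follows. This is a hedged approximation, not the merged test: `ENTITY_MENTION_RE`, `QUERY_STARTER_STOPWORDS`, and `extract_seeds` here are hypothetical stand-ins for the real `_ENTITY_MENTION_RE`, `_QUERY_STARTER_STOPWORDS`, and Step-1 loop in `memory.graph`, whose exact definitions are not shown in this PR.

```python
import re

# Hypothetical stand-in: runs of Title-Case words, e.g. "Stranger Things".
ENTITY_MENTION_RE = re.compile(r"\b[A-Z][A-Za-z]*(?:\s+[A-Z][A-Za-z]*)*")

# Hypothetical stand-in: only the starters named in the PR description.
QUERY_STARTER_STOPWORDS = frozenset({"Can", "What", "How", "I"})


def extract_seeds(query, stopwords=QUERY_STARTER_STOPWORDS):
    """Collect Title-Case mentions, skipping common query starters."""
    seeds = []
    for match in ENTITY_MENTION_RE.finditer(query):
        mention = match.group(0)
        if mention in stopwords:
            continue  # sentence-initial common word, not an entity
        seeds.append(mention)
    return seeds


def test_starters_are_in_stoplist():
    for word in ("Can", "What", "How", "I"):
        assert word in QUERY_STARTER_STOPWORDS


def test_canonical_regression():
    seeds = extract_seeds("Can you recommend a show like Stranger Things")
    assert "Stranger Things" in seeds
    assert "Can" not in seeds


def test_only_starters_yield_no_seeds():
    assert extract_seeds("What should I do") == []


def test_genuine_entity_survives():
    assert extract_seeds("What is Netflix") == ["Netflix"]
```

Because the sketch touches only a regex and a set, it needs no DB or embedder, which matches the description of the real test.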

Testing

  • 16 tests, all passing. No DB, no network, so the suite is fast (§8).
  • Mutation-verified: emptying the stoplist turns the suite red, so it genuinely gates the fix (§3).
  • Runs clean alongside test_entity_graph.py (52 passed together); ruff clean (§13).
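
The mutation check can be reproduced by hand along these lines. This is a sketch under assumptions: `MENTION_RE`, `STARTERS`, `seeds`, and `regression_holds` are hypothetical names, and the real check would empty `_QUERY_STARTER_STOPWORDS` in `memory.graph` and rerun the suite.

```python
import re

# Hypothetical stand-ins for the real regex and stoplist in memory.graph.
MENTION_RE = re.compile(r"\b[A-Z][A-Za-z]*(?:\s+[A-Z][A-Za-z]*)*")
STARTERS = frozenset({"Can", "What", "How", "I"})


def seeds(query, stopwords):
    """Title-Case mentions in the query that are not in the stoplist."""
    return [
        m.group(0)
        for m in MENTION_RE.finditer(query)
        if m.group(0) not in stopwords
    ]


def regression_holds(stopwords):
    """True when the canonical query keeps the entity and drops 'Can'."""
    got = seeds("Can you recommend a show like Stranger Things", stopwords)
    return "Can" not in got and "Stranger Things" in got


# With the stoplist in place the assertion holds.
assert regression_holds(STARTERS)

# Mutation: emptying the stoplist lets "Can" through, so the check fails.
assert not regression_holds(frozenset())
```

A test that still passed with the emptied stoplist would not be gating the fix at all, which is what the mutation run rules out.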

🤖 Generated with Claude Code

…lter

The stopword filter (PR #13 / dd6f1ed) shipped with a pre-registered metric
(-13.3pp -> 0pp at session-hit-rate@k=5 on single-session-preference) but no test
gated it — a §3/§11 behavior-baseline gap. This closes it.

Exercises the real _ENTITY_MENTION_RE + _QUERY_STARTER_STOPWORDS from memory.graph
(no DB/embedder), replicating the Step-1 seed-extraction loop, and asserts:
- common query starters (Can/What/How/I/...) are in the stoplist;
- "Can you recommend a show like Stranger Things" -> seeds keep "Stranger Things",
  drop "Can" (the canonical "Can"->"Canada" false-match regression);
- a query of only starters yields zero seeds;
- a genuine Title-Case entity not in the stoplist survives.

Mutation-verified: emptying the stoplist turns the suite red. 16 tests, no DB,
ruff clean.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@skynetcmd skynetcmd merged commit af65067 into main Jun 15, 2026
9 checks passed
@skynetcmd skynetcmd deleted the test/graph-seed-stopword-regression branch June 15, 2026 03:45