fix(rag): drop embedding-not-null filter from kb_has_pair_coverage (parity with #1308) #2213
Mikecranesync wants to merge 1 commit into
…arity with #1308)

kb_has_coverage dropped `AND embedding IS NOT NULL` in #1308 because a row reachable only via BM25 is still KB coverage and seeded-but-unembedded rows were invisible (routing Modbus questions to the hallucination fallback). Its sibling kb_has_pair_coverage kept the filter, so a real (vendor, model) pair whose chunks are seeded but not yet embedded is judged "not covered", and the resolver drops the model (UNS_PAIR_DROPPED), degrading the UNS path, the product-name rerank stream, and the citation label for a product the KB *does* cover lexically.

Removed the filter (parity with the sibling) and extracted the query to `_KB_PAIR_COVERAGE_SQL` so it is unit-testable offline.

Found by the 2026-06-21 retrieval-grounding investigation. Correct-by-construction (matches the #1308 decision the sibling already proved). Offline contract tests assert no embedding filter + still pair/tenant-scoped.

Engine/RAG path: run the staging eval before prod deploy.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01CS9fxC3gdSUJDJqHw1uMiu
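The effect of the filter can be reproduced offline with a small sketch. Everything below is an assumption made for illustration: the `kb_chunks` table, its columns, the two query strings, and the vendor/model names are hypothetical, and SQLite stands in for NeonDB. It only demonstrates the reasoning in the commit message: a seeded row with a NULL embedding is invisible to a coverage check that requires an embedding, and visible once that condition is removed.

```python
import sqlite3

# Hypothetical queries; the real SQL in neon_recall.py is not shown in this PR text.
WITH_FILTER = """
    SELECT EXISTS (
        SELECT 1 FROM kb_chunks
        WHERE tenant_id = ? AND vendor = ? AND model = ?
          AND embedding IS NOT NULL
    )
"""
WITHOUT_FILTER = """
    SELECT EXISTS (
        SELECT 1 FROM kb_chunks
        WHERE tenant_id = ? AND vendor = ? AND model = ?
    )
"""


def has_pair_coverage(conn, sql, tenant_id, vendor, model):
    """Return True if the query finds at least one matching chunk."""
    row = conn.execute(sql, (tenant_id, vendor, model)).fetchone()
    return bool(row[0])


conn = sqlite3.connect(":memory:")
# Hypothetical schema: one chunk per row, embedding left NULL until the embedder runs.
conn.execute(
    "CREATE TABLE kb_chunks ("
    "tenant_id TEXT, vendor TEXT, model TEXT, content TEXT, embedding BLOB)"
)
# A seeded-but-unembedded chunk: still reachable lexically, but embedding is NULL.
conn.execute(
    "INSERT INTO kb_chunks VALUES (?, ?, ?, ?, NULL)",
    ("tenant-a", "ExampleVendor", "EX-100", "Modbus register map"),
)

old_result = has_pair_coverage(conn, WITH_FILTER, "tenant-a", "ExampleVendor", "EX-100")
new_result = has_pair_coverage(conn, WITHOUT_FILTER, "tenant-a", "ExampleVendor", "EX-100")
other_tenant = has_pair_coverage(conn, WITHOUT_FILTER, "tenant-b", "ExampleVendor", "EX-100")

print(old_result, new_result, other_tenant)
```

In this sketch the filtered query reports the pair as not covered, the unfiltered query reports it as covered, and a different tenant still sees no coverage, which is the scoping the change is meant to preserve.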
🤖 AI Code Review
Review by: groq (llama-3.3-70b-versatile)

Review

🔴 IMPORTANT: Security vulnerabilities
No security vulnerabilities were found in the provided diff. However, it's essential to ensure that the

🔴 IMPORTANT: Missing error handling on network/IO operations
No network/IO operations are present in the provided diff that would require additional error handling. The

🟡 WARNING: Logic bugs or incorrect assumptions
The removal of the

🟡 WARNING: Missing input validation at API boundaries
No API boundaries are directly affected by the provided diff. However, it's essential to ensure that the

🔵 SUGGESTION: Code quality improvements, naming, maintainability
The code is generally well-structured and readable. The extraction of the SQL query into a separate constant

✅ GOOD: Noteworthy good practices found
The use of parameterized queries (e.g.,

Generated by the MIRA automated code review pipeline (Groq → Cerebras → Gemini cascade)
MIRA staging gate: ✅ PASS
Engine + NeonDB staging branch + Groq cascade against fixed questions, graded on the 5-dimension rubric in
Rubric:
Why
From the 2026-06-21 retrieval-grounding investigation (issues #2207–#2212). This is the safe, correct-by-construction finding — the rest are filed for the staging-gated work.
`kb_has_coverage` deliberately dropped `AND embedding IS NOT NULL` in #1308 (a BM25-reachable row is still coverage; seeded-but-unembedded rows were invisible → Modbus questions routed to the hallucination fallback). Its sibling `kb_has_pair_coverage` kept the filter (`neon_recall.py:962`). So a real `(vendor, model)` pair whose chunks are seeded but not yet embedded (the #2083/#2085/#2117 NULL-embedding class) is judged "not covered" → the resolver drops the model (`UNS_PAIR_DROPPED`), degrading the UNS path, the product-name rerank stream, and the citation label, all for a product the KB *does* cover lexically.

What
- Removed `AND embedding IS NOT NULL` from `kb_has_pair_coverage`, restoring parity with `kb_has_coverage`.
- Extracted the query to `_KB_PAIR_COVERAGE_SQL` so it's unit-testable without NeonDB.
- `tests/test_kb_pair_coverage_sql.py`: asserts no embedding filter + still pair/tenant-scoped.

Verification
`pytest tests/test_kb_pair_coverage_sql.py` → 3 passed; `ruff` clean; module imports. No DB needed (the fix is a contract change, the sibling already proved the behavior in #1308).

Deploy gate
Engine/RAG path — run the staging eval (garage_conveyor_field --live, #2202) before prod deploy. Low risk (aligns to an already-proven sibling), but it changes coverage judgments.
Related (filed this session, not in this PR)
#2207 (dead 0.70-floor relaxation), #2208 (chat query poisoning), #2209 (follow-up context loss), #2210 (bot recall hybrid filter, P0), #2211 (extraction precision bundle), #2212 (non-GS10 corpus seeding).
🤖 Generated with Claude Code