[codex] security(rag): harden tenant isolation and prompt boundaries#2295
Conversation
🤖 AI Code ReviewReview by: groq (llama-3.3-70b-versatile) Review of [codex] security(rag): harden tenant isolation and prompt boundaries🔴 IMPORTANT: Security Vulnerabilities
🔴 IMPORTANT: Missing Error Handling
🟡 WARNING: Logic Bugs or Incorrect Assumptions
🟡 WARNING: Missing Input Validation at API Boundaries
🔵 SUGGESTION: Code Quality Improvements
✅ GOOD: Noteworthy Good Practices
Generated by the MIRA automated code review pipeline (Groq → Cerebras → Gemini cascade) |
MIRA staging gate — ✅ PASSEngine + NeonDB staging branch + Groq cascade against fixed questions, graded on the 5-dimension rubric in
Rubric: |
🤖 AI Code ReviewReview by: groq (llama-3.3-70b-versatile) Review of PR: [codex] security(rag): harden tenant isolation and prompt boundaries🔴 IMPORTANT: Security vulnerabilitiesNo hardcoded secrets, SQL injection, path traversal, or command injection vulnerabilities were found in the provided diff. 🔴 IMPORTANT: Missing error handling on network/IO operationsNo missing error handling on network/IO operations that could crash in production were found in the provided diff. 🟡 WARNING: Logic bugs or incorrect assumptionsThe changes to the Specifically, the 🟡 WARNING: Missing input validation at API boundariesThe diff does not include any new API endpoints or changes to existing API endpoints. However, it's worth reviewing the API endpoints that are being used to ensure that they have proper input validation. 🔵 SUGGESTION: Code quality improvements, naming, maintainabilityThe code changes are mostly improvements to existing code. However, some variable names could be more descriptive. For example, The ✅ GOOD: Noteworthy good practices foundThe changes to the The test cases (in Overall, the changes in this PR seem to be improvements to the existing code, and no major security vulnerabilities or logic bugs were found. However, it's always a good idea to review the code again to ensure that it's correct and follows best practices. Specific file and line number comments
Generated by the MIRA automated code review pipeline (Groq → Cerebras → Gemini cascade) |
Summary
Refs #2112 and supersedes the stale #2253 hardening branch.
This PR hardens RAG tenant isolation and prompt-boundary handling end to end:
/api/knowledge/searchregression coverage so private tenant snippets are withheld at the DB/query boundary while shared OEM snippets still return.VERSIONto3.42.1andmira-hubto2.18.2.Root cause
The narrow #2112 SQL issue was already fixed on
main, but coverage only asserted query shape. The broader risk was that retrieved RAG documents were still being assembled into system-role prompt content in Hub and bot paths. That gave malicious retrieved text too much prompt authority even when delimiter stripping existed.Verification
Passed:
git diff --checkcd mira-hub && ./node_modules/.bin/vitest run src/lib/__tests__/manual-rag.test.ts src/app/api/knowledge/search/__tests__/route.is-private.test.ts && ./node_modules/.bin/tsc --noEmitcd mira-hub && ./node_modules/.bin/vitest run— 108 files, 875 tests passedcd mira-bots && /Users/charlienode/MIRA/.venv/bin/python -m pytest tests/test_unit2_citations.py tests/test_reranking.py -q && /Users/charlienode/MIRA/.venv/bin/python -m py_compile shared/workers/rag_worker.py— 55 tests passedKnown pre-existing blocker:
cd mira-bots && /Users/charlienode/MIRA/.venv/bin/python -m pytest -qstill stops during collection on unrelated adapter/dependency imports:GoogleChatAdapter,SlackChatAdapter,TeamsChatAdapter,telegram.Update, andslack_bolt.adapter.socket_mode.async_handler.Remaining risk
neon_recallstill relies on the existing tenant/shared-tenant model; this PR does not redesign that storage model.