docs: Phase 2 gap analysis - code-specific retrieval (#27) #54
Merged
Specifies how OGRE layers code semantics on top of oxidizedRAG without forking it: a CodeRetriever trait, AST-aware indexing via the existing tree-sitter feature, a typed dependency graph (calls / imports / references_type) in data-fabric, and an impact-analysis query that joins both layers. Closes the design spec deliverable for #27; prototype implementation is tracked in #33 (Phase 3 retrieval layer) and #37 (Phase 4 PR reviewer).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
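To make the typed dependency graph concrete, here is a minimal in-memory sketch in Rust. Only the three edge kinds (calls, imports, references_type) come from the description above; the names `EdgeKind`, `Edge`, `DepGraph`, `add_edge`, and `dependents` are hypothetical, the symbol strings are made up, and a `HashMap` stands in for whatever data-fabric actually persists.

```rust
use std::collections::HashMap;

/// The three edge kinds named in the spec summary.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum EdgeKind {
    Calls,
    Imports,
    ReferencesType,
}

/// A directed, typed edge between two symbols (hypothetical shape).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Edge {
    pub from: String,
    pub to: String,
    pub kind: EdgeKind,
}

/// In-memory stand-in for the persisted graph, indexed by edge target
/// so that "who depends on X?" is a single lookup.
#[derive(Default)]
pub struct DepGraph {
    incoming: HashMap<String, Vec<Edge>>,
}

impl DepGraph {
    /// Record a typed edge `from -> to`.
    pub fn add_edge(&mut self, from: &str, to: &str, kind: EdgeKind) {
        self.incoming.entry(to.to_string()).or_default().push(Edge {
            from: from.to_string(),
            to: to.to_string(),
            kind,
        });
    }

    /// Sources of all edges of `kind` that point at `target`, in insertion order.
    pub fn dependents(&self, target: &str, kind: EdgeKind) -> Vec<&str> {
        self.incoming
            .get(target)
            .map(|edges| {
                edges
                    .iter()
                    .filter(|e| e.kind == kind)
                    .map(|e| e.from.as_str())
                    .collect::<Vec<&str>>()
            })
            .unwrap_or_default()
    }
}
```

Keeping the edge kind as an enum rather than a free-form string lets a query such as "direct callers only" filter on `EdgeKind::Calls` without string matching; whether the real schema does this is not stated in the source.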
Promote fq-name canonicalization from "open question" to a foundational decision with concrete forms per language. Add type sketches for the CodeRetriever trait (FqName, SymbolQuery, Resolution, CallerEdge, ImpactOptions, ImpactSet, CodeRetrieverError). Bound impact() with a result cap, high-fan-in skip, and explicit truncated flag. Replace the confidence-score language on unresolved edges with a categorical Resolution. Add storage rationale for SurrealDB + data-fabric. Add a failure model section (parse errors, index lag, ambiguous resolution, partial writes, rename handling). Split test detection into attribute-based (preferred) and name-based (fallback). Expand the performance budget with 10K and 1M LOC sanity bounds and cache assumptions.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
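One way the bounded impact() described in this commit could behave is sketched below in Rust. The type names `ImpactOptions` and `ImpactSet` appear in the commit message, but their fields (`max_results`, `max_fan_in`, `symbols`, `truncated`), the free function signature, and the reading of "high-fan-in skip" as "do not expand through a symbol with too many direct callers" are all assumptions made for illustration; the spec itself may define them differently.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Hypothetical options; field names and semantics are assumptions.
pub struct ImpactOptions {
    /// Stop once this many impacted symbols have been collected.
    pub max_results: usize,
    /// Do not expand through symbols with more direct callers than this.
    pub max_fan_in: usize,
}

/// Hypothetical result shape with an explicit truncation flag.
#[derive(Debug, Default)]
pub struct ImpactSet {
    pub symbols: Vec<String>,
    /// True when the result cap or the fan-in skip cut the traversal short.
    pub truncated: bool,
}

/// Breadth-first walk over reverse call edges.
/// `callers[x]` lists the direct callers of symbol `x`.
pub fn impact(
    callers: &HashMap<String, Vec<String>>,
    root: &str,
    opts: &ImpactOptions,
) -> ImpactSet {
    let mut out = ImpactSet::default();
    let mut seen: HashSet<&str> = HashSet::from([root]);
    let mut queue: VecDeque<&str> = VecDeque::from([root]);

    while let Some(current) = queue.pop_front() {
        let direct = match callers.get(current) {
            Some(d) => d,
            None => continue, // nothing calls this symbol
        };
        // High-fan-in skip: report truncation instead of expanding a hub symbol.
        if direct.len() > opts.max_fan_in {
            out.truncated = true;
            continue;
        }
        for caller in direct {
            if !seen.insert(caller.as_str()) {
                continue; // already visited
            }
            // Result cap: another impacted symbol exists but is not returned.
            if out.symbols.len() >= opts.max_results {
                out.truncated = true;
                return out;
            }
            out.symbols.push(caller.clone());
            queue.push_back(caller.as_str());
        }
    }
    out
}
```

With `callers = {a: [b, c], b: [d]}`, generous limits yield `[b, c, d]` with `truncated == false`; setting `max_results` to 2 yields `[b, c]` with `truncated == true`, so a caller can tell a complete answer from a capped one.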
Summary
Closes the design-spec deliverable for #27. Specifies how OGRE adds code semantics on top of oxidizedRAG without forking it: a CodeRetriever trait, AST-aware indexing via the existing tree-sitter feature flag, a typed dependency graph (calls/imports/references_type) persisted to data-fabric, and an impact-analysis query that joins the structural graph with text-embedding retrieval.

Picks up where Phase 1 left off (assessments/PHASE1_OXIDIZEDRAG_ASSESSMENT.md): Phase 1 confirmed tree-sitter exists in oxidizedRAG but isn't threaded through retrieval ranking; the Phase 2 spec resolves that gap as an integration layer, not a core fork.

Deliverable
assessments/PHASE2_GAP_CODE_SPECIFIC_RETRIEVAL.md (197 lines): design spec only. No prototype code; that ships in #33 ([Phase 3] Design OGRE Retrieval - Code-Aware Integration Layer) and #37 ([Phase 4] Implement PR Reviewer Agent Prototype).

What it covers
CodeRetriever trait (6 methods): five structural, one delegating to existing oxidizedRAG semantic search.

What it deliberately does NOT do
lang: namespacing default.

Test plan
🤖 Generated with Claude Code