
fix: add sqlite edge compound indexes#127

Open
xtfer wants to merge 5 commits into tirth8205:main from xtfer:main

xtfer commented Apr 7, 2026

Summary

  • fix a SQLite performance gap where queries filtering edges by both symbol and edge kind could only use single-column indexes
  • add compound indexes on edges(target_qualified, kind) and edges(source_qualified, kind) so summary, impact, and risk-oriented queries avoid unnecessary scans on larger graphs
  • add a v7 schema migration so existing databases pick up the new indexes automatically, not just fresh installs
  • add migration test coverage to verify the new indexes are present after initialization
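
A minimal sketch of how a version-gated migration like the one described above could look. It assumes the schema version is tracked with `PRAGMA user_version`, which may not match the project's actual bookkeeping. The index names are the ones quoted in the review later in this thread; the `edges` column set and the helpers `_migrate_v7`, `apply_migrations`, and `demo` are hypothetical.

```python
import sqlite3


def _migrate_v7(conn: sqlite3.Connection) -> None:
    # Compound indexes so a (symbol, kind) filter can be served by one index.
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_edges_target_kind "
        "ON edges(target_qualified, kind)"
    )
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_edges_source_kind "
        "ON edges(source_qualified, kind)"
    )


# Hypothetical registry: schema version -> migration function.
MIGRATIONS = {7: _migrate_v7}
LATEST_VERSION = max(MIGRATIONS)


def apply_migrations(conn: sqlite3.Connection) -> int:
    """Run every migration newer than the stored version, in order."""
    current = conn.execute("PRAGMA user_version").fetchone()[0]
    for version in sorted(v for v in MIGRATIONS if v > current):
        MIGRATIONS[version](conn)
        # PRAGMA does not accept bound parameters; version is a trusted int.
        conn.execute(f"PRAGMA user_version = {int(version)}")
    conn.commit()
    return conn.execute("PRAGMA user_version").fetchone()[0]


def demo() -> tuple[int, list[str]]:
    conn = sqlite3.connect(":memory:")
    # Stand-in for an existing v6 database (assumed columns).
    conn.execute(
        "CREATE TABLE edges (id INTEGER PRIMARY KEY, kind TEXT, "
        "source_qualified TEXT, target_qualified TEXT)"
    )
    conn.execute("PRAGMA user_version = 6")
    version = apply_migrations(conn)
    names = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'index' ORDER BY name"
        )
    ]
    conn.close()
    return version, names
```

Because the migration is gated on the stored version, an existing database picks up the indexes the next time it is opened, while a fresh install reaches the same state by running every registered migration.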

Issue

Some graph queries commonly filter on both a qualified symbol and the edge kind, but the schema only had separate single-column indexes for those columns. SQLite generally uses at most one index per table for a given lookup, so it could narrow by symbol or by kind through an index but then had to check the other condition row by row. That shows up as avoidable work when generating summaries or running risk-related queries against bigger databases.
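
One way to observe the difference described above is to compare `EXPLAIN QUERY PLAN` output before and after adding a compound index. This is a sketch under assumptions: the `edges` columns, the single-column index names `idx_edges_target` and `idx_edges_kind`, and the sample parameter values are hypothetical, and the exact plan text varies across SQLite versions.

```python
import sqlite3

QUERY = "SELECT id FROM edges WHERE target_qualified = ? AND kind = ?"


def plan(conn: sqlite3.Connection, sql: str, params: tuple) -> list[str]:
    """Return the human-readable detail column of EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[-1] for row in rows]


def compare_plans() -> tuple[list[str], list[str]]:
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE edges (id INTEGER PRIMARY KEY, kind TEXT, "
        "source_qualified TEXT, target_qualified TEXT)"
    )
    # Single-column indexes standing in for the pre-v7 schema (assumed names).
    conn.execute("CREATE INDEX idx_edges_target ON edges(target_qualified)")
    conn.execute("CREATE INDEX idx_edges_kind ON edges(kind)")
    params = ("pkg.module.func", "CALLS")
    before = plan(conn, QUERY, params)

    # Add the compound index and ask the planner again.
    conn.execute(
        "CREATE INDEX idx_edges_target_kind ON edges(target_qualified, kind)"
    )
    after = plan(conn, QUERY, params)
    conn.close()
    return before, after


if __name__ == "__main__":
    before, after = compare_plans()
    print("before:", before)
    print("after: ", after)
```

With only the single-column indexes, the plan can name just one of them; once the compound index exists, the planner can typically match both equality terms against it.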

Testing

  • uv run pytest tests/test_migrations.py tests/test_graph.py -q

xtfer closed this Apr 7, 2026
xtfer reopened this Apr 7, 2026
Author

xtfer commented Apr 7, 2026

Original commit made a bunch of unrelated line changes.

@tirth8205
Owner

CI failed on the schema-sync check — your v7 migration bumps Python's LATEST_VERSION to 7, but the VS Code extension's SUPPORTED_SCHEMA_VERSION in code-review-graph-vscode/src/backend/sqlite.ts is still 6. Please update it to 7 as well, then the schema-sync check will pass.
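
The schema-sync check itself is not shown in this thread. As a rough illustration of what such a check could do, the sketch below extracts both version constants from source text and compares them. The regular expressions assume the constants are declared as plain integer literals (for example `LATEST_VERSION = 7` and `export const SUPPORTED_SCHEMA_VERSION = 6;`), which is an assumption; `extract_version` and `check_schema_sync` are hypothetical names.

```python
import re

# Assumed declaration shapes; the real files may declare these differently.
PY_PATTERN = re.compile(r"^LATEST_VERSION\s*=\s*(\d+)", re.MULTILINE)
TS_PATTERN = re.compile(
    r"SUPPORTED_SCHEMA_VERSION\s*(?::\s*number\s*)?=\s*(\d+)"
)


def extract_version(pattern: re.Pattern, text: str, label: str) -> int:
    """Pull the integer version out of a source file's text."""
    match = pattern.search(text)
    if match is None:
        raise ValueError(f"{label} not found")
    return int(match.group(1))


def check_schema_sync(python_source: str, ts_source: str) -> tuple[bool, str]:
    """Return (ok, message) describing whether the two versions agree."""
    py = extract_version(PY_PATTERN, python_source, "LATEST_VERSION")
    ts = extract_version(TS_PATTERN, ts_source, "SUPPORTED_SCHEMA_VERSION")
    if py != ts:
        return False, (
            f"schema mismatch: Python LATEST_VERSION={py}, "
            f"extension SUPPORTED_SCHEMA_VERSION={ts}"
        )
    return True, f"schema versions in sync at {py}"
```

In CI, a script along these lines would read the Python migrations module and code-review-graph-vscode/src/backend/sqlite.ts from disk and exit non-zero when the first element of the result is `False`.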

gzenz added a commit to gzenz/code-review-graph that referenced this pull request Apr 8, 2026
Resolves migration numbering conflict: our composite edge index
renumbered from v6 to v8 (v7 reserved by upstream PR tirth8205#127).
Skills.py conflict resolved keeping our hooks structure with upstream's
encoding fix.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
gzenz added a commit to gzenz/code-review-graph that referenced this pull request Apr 8, 2026
…izations

Major improvements to code-review-graph spanning parser architecture,
call graph accuracy, and build performance.

Parser refactoring:
- Extract 16 per-language handler modules into code_review_graph/lang/
  using a strategy pattern, replacing monolithic conditionals in parser.py
- Thread-safe parser caches with double-check locking

Call graph enrichment:
- Jedi-based Python method call resolution at build time (jedi_resolver.py)
- Pre-scan filtering by project function names (36s to 3s on large repos)
- Typed variable call enrichment (Python, JS/TS, Kotlin/Java)
- Star import resolution, namespace imports, CommonJS require()
- Angular template parsing, JSX handler tracking
- Module-level import tracking and module-qualified call resolution
- Function/class references passed as call arguments

PreToolUse search enrichment:
- New enrich.py module and code-review-graph enrich CLI command
- Injects graph context (callers, flows, community, tests) into agent
  search results passively via hook

Dead code false positive reduction:
- Framework decorators recognized as entry points
- CDK construct methods, abstract overrides excluded
- E2e test directories excluded from dead code detection

Performance:
- Community detection: 48.6s to 2.3s (21x speedup) via bulk node
  loading and adjacency-indexed cohesion computation
- Jedi enrichment: 36s to 3s (12x) via pre-scan filtering
- Batch file storage (50-file transactions)
- Batch risk_index (2 GROUP BY queries replace per-node loops)

Other:
- Weighted flow risk scoring by criticality
- Transitive TESTED_BY lookup for tests_for and risk scoring
- DB schema v8: composite edge index (v7 reserved by PR tirth8205#127)
- --quiet and --json CLI flags
- Search query deduplication, test function deprioritization
- New [enrichment] optional dependency group for Jedi
- 829+ tests across 26 test files (up from 615)

Evaluated against Gadgetbridge (41k nodes, 280k edges): 8/10 PASS,
call resolution rate improved from 28% to 39.6%.
Author

xtfer commented Apr 8, 2026

@tirth8205 Updated SUPPORTED_SCHEMA_VERSION to 7 as requested.

@tirth8205
Owner

Review: PR #127 — Branch Warning

Branch conflict: This PR is from xtfer:main targeting tirth8205:main. Using your main branch as a feature branch is problematic — any future commits to your fork's main (e.g., syncing with upstream) will silently add unrelated changes to this PR, or make it impossible to rebase cleanly.

Recommendation: Please create a dedicated branch (e.g., fix/sqlite-edge-compound-indexes), cherry-pick your commit onto it, and open a new PR from that branch. You can close this one. This is especially important because PR #158 (which depends on this PR's v7 migration slot) is tracking the branch name.

The code itself is correct: The v7 migration adding idx_edges_target_kind and idx_edges_source_kind compound indexes is well-targeted — these directly accelerate the summary, impact, and risk-oriented queries that filter on both (source/target_qualified, kind). The migration is idempotent (CREATE INDEX IF NOT EXISTS), properly registered in the MIGRATIONS dict, and the VS Code extension's SUPPORTED_SCHEMA_VERSION was correctly bumped to 7. The migration test is minimal but sufficient.

Cannot merge as-is due to the main-branch issue. Please reopen from a feature branch.
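
The idempotency and index-presence points in the review above could be checked with a test along these lines. This is a sketch, not the PR's actual test: the `edges` column set and the helpers `migrate_v7` and `index_columns` are hypothetical, and only the index names and column pairs come from this thread.

```python
import sqlite3

# Index name -> ordered columns, as described in the review.
EXPECTED = {
    "idx_edges_target_kind": ["target_qualified", "kind"],
    "idx_edges_source_kind": ["source_qualified", "kind"],
}


def migrate_v7(conn: sqlite3.Connection) -> None:
    # IF NOT EXISTS makes a repeated run a no-op instead of an error.
    for name, cols in EXPECTED.items():
        conn.execute(
            f"CREATE INDEX IF NOT EXISTS {name} ON edges({', '.join(cols)})"
        )


def index_columns(conn: sqlite3.Connection, table: str) -> dict[str, list[str]]:
    """Map each index on `table` to its ordered column list."""
    result = {}
    for row in conn.execute(f"PRAGMA index_list({table})").fetchall():
        name = row[1]  # index_list rows: (seq, name, unique, origin, partial)
        info = conn.execute(f"PRAGMA index_info({name})").fetchall()
        # index_info rows: (seqno, cid, name); sort by seqno to keep order.
        result[name] = [r[2] for r in sorted(info)]
    return result


def test_v7_indexes_present_and_idempotent() -> None:
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE edges (id INTEGER PRIMARY KEY, kind TEXT, "
        "source_qualified TEXT, target_qualified TEXT)"
    )
    migrate_v7(conn)
    migrate_v7(conn)  # second run must not raise or duplicate anything
    assert index_columns(conn, "edges") == EXPECTED
    conn.close()
```

Checking the column order, not just the index name, matters here because an index on (kind, target_qualified) would serve a different set of lookups than one on (target_qualified, kind).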

gzenz added a commit to gzenz/code-review-graph that referenced this pull request Apr 11, 2026