Server-side fixes: scope shadowing, OLLAMA_PORT env, doubled resource, qdrant build, deep-merge /config by dataO1 · Pull Request #8 · automataIA/graphrag-rs

dataO1 · 2026-04-30T10:11:38Z

Five small fixes against issues that surface when graphrag-server is
exercised in a real deployment. None of them touch the feature
surface; they all sit in graphrag-server (plus a one-line workspace
Cargo.toml change). Filing them together because each is too small
to justify its own PR overhead.

Motivation

Hit each of these standing up a graphrag-rs deployment over a personal
Obsidian vault with Qdrant + Ollama on the server. Submitting upstream
because they all behave the same way for any deployment, not just
mine.

Goals

Fix /api/config so it's reachable.
Fix POST /api/documents so it doesn't 405.
Make OLLAMA_PORT actually configurable from the environment.
Unbreak the qdrant-client build under restricted/sandboxed builds.
Fix POST /config partial updates clobbering unset fields.

Changes

Five commits, each independent:

qdrant-client: opt out of generate-snippets default feature
The default generate-snippets feature panics in build.rs when
network access is restricted (Nix sandbox, isolated CI). Disabling
it doesn't affect runtime functionality; it only turns off the
snippet generator, which builds in offline-incompatible ways.
graphrag-server: rename /api/config → /config to fix scope shadowing
App registers two services on the same prefix:
```
.service(scope("/api") ...)                  // apistos
...
.service(web::scope("/api/config") ...)      // plain actix
```
actix-web matches services by registration order, prefix-first.
The apistos /api scope therefore claims every /api/* request.
That scope has no config route of its own, so /api/config requests
404, and the plain-actix block below .build() is dead code as
written.

Three constraints make a fix in place tricky:
- /api can't move past .build() (apistos scope ≠ plain
  web::scope).
- config_endpoints::* handlers can't move into the apistos /api
  scope without #[api_operation] macros (apistos's typed scope
  requires PathItemDefinition).
- Plain web::scope can't be registered before .build().
Renaming /api/config → /config sidesteps all three: no overlap
with /api, no shadowing, block stays plain actix post-.build().
The endpoint becomes reachable for the first time.

Back-compat note: technically a path change. Since the old
path 404'd in stock builds, no working caller could have depended
on it. Happy to add a /api/config alias if preferred.
graphrag-server: read OLLAMA_PORT from env (was hardcoded 11434)
Mirrors the existing OLLAMA_URL env var. One-liner.
graphrag-server: merge doubled resource("") in /api/documents scope
Two resource("") registrations under /api/documents, one for
GET (list) and one for POST (add). actix-web keeps only the first
registration and silently drops the second, so POST returned 405.
Combine into a single resource("") with both methods chained.
config: deep-merge POST /config bodies over defaults
Previously POST /config deserialized the body to Config,
replacing the in-memory config wholesale. Partial bodies (very
common — set just the openai or just the embeddings section) reset
every unset field to its default. Now does a recursive deep merge
over the existing config: only fields explicitly present in the
body change.

Back-compat note: behavior change for callers that were
relying on the wholesale-replace semantics. Most callers I'd
expect to want the new behavior — they were probably re-sending
the entire config to avoid this — but worth flagging.

Methodology

Cherry-picked off upstream/main (c46e287).
cargo check -p graphrag-server --features qdrant,ollama clean.
cargo test -p graphrag-server --lib 12/12 pass.
cargo fmt --check clean on touched files. Pre-existing fmt
warnings in untouched upstream files left alone.
cargo clippy introduces no new warnings; pre-existing warnings
in upstream untouched.

Phase 1 - TRIVIAL fixes:
- Remove unused imports from traversal.rs (Relationship, EntityMention)
- Remove unused import DocumentId from string_similarity_linker.rs
- Remove unused imports from bidirectional_index.rs (DocumentId, TextChunk)
- Update obsolete comment in lib.rs about GraphRAG re-export

Phase 2 - EASY implementations:
- Implement relationships_examined counter tracking in logic_form.rs
- Add GraphRAGBuilder re-export in lib.rs
- Implement property extraction for Has queries in logic_form.rs
  * Supports querying entity properties: name, type, confidence, mentions
  * Returns all properties if only entity specified
  * Returns specific property if both entity and property specified

All changes compile successfully with no warnings.

…hunks

Completed 3 TODO implementations in persistence layer:

1. Relationships (save/load):
   - Schema: source, target, relation_type, confidence, context
   - Full support for relationship context tracking
2. Documents (save/load):
   - Schema: id, title, content, metadata, chunk_count
   - Preserves document metadata as parallel key-value arrays
3. Chunks (save/load):
   - Schema: id, document_id, content, offsets, embedding, entities
   - Metadata: chapter, keywords, summary
   - Full support for embeddings and entity references

Implementation uses Arrow RecordBatch with ListBuilder for nested structures.

Completed 2 TODO implementations:

1. **Relationship Extraction in LightRAG** (graph_indexer.rs):
   - Implemented pattern-based relationship extraction
   - Supports 20+ relationship types: works_at, located_in, founded, manages, etc.
   - Extracts relationships between detected entities
   - Confidence scoring based on pattern match and entity types
   - Type-aware adjustments (person+organization, entity+location)
2. **Dependency Analysis in Decomposer** (decomposer.rs):
   - Analyzes dependencies between subqueries based on query types
   - Dependency types: Sequential, Reference, Context
   - Logic:
     * Relationship queries depend on Entity queries (Reference)
     * Attribute queries depend on Entity queries (Reference)
     * Comparative queries depend on Entity/Attribute queries (Reference)
     * Temporal queries use Entity queries for Context
     * Causal queries have Sequential dependencies
   - Automatic deduplication of dependencies

Both implementations follow existing code patterns and include proper confidence scoring.

Completed TODO in api_providers.rs:332 - batch embedding support.

Implementation:
- New make_batch_request() method for true batch API calls
- Supports all providers: OpenAI, Voyage, Cohere, Jina, Mistral, Together
- Proper batch request/response format for each provider
- Automatic fallback to sequential if batch fails
- Validates embedding count matches input count

Benefits:
- Significant performance improvement for bulk operations
- Reduced API calls and latency
- Provider-native batch support utilized

Response formats handled:
- OpenAI-compatible: data[{embedding: [...]}]
- Cohere: embeddings[[...]]

Completed TODO in query_concepts.rs:163 - semantic matching.

Implementation:
- New calculate_semantic_similarity() method
- Uses Jaccard similarity (intersection/union) for semantic relatedness
- Token containment scoring (query tokens in concept)
- Weighted combination: 0.6*jaccard + 0.4*containment
- Applies configurable semantic threshold
- Lightweight proxy for true embedding-based matching

This provides semantic matching without requiring pre-computed embeddings. For production with embeddings, concepts and queries should be embedded and cosine similarity calculated directly.

Benefits:
- Catches semantically related concepts beyond exact/fuzzy match
- No embedding infrastructure required for basic semantic matching
- Configurable via use_semantic_match and semantic_threshold
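The weighted combination described above can be sketched roughly as follows. This is an illustration only, not the code in query_concepts.rs: the function names, the whitespace-and-lowercase tokenizer, and the choice to return 0.0 for empty input are all assumptions.

```rust
use std::collections::HashSet;

/// Hypothetical tokenizer: lowercase, split on whitespace.
fn tokens(text: &str) -> HashSet<String> {
    text.split_whitespace().map(|t| t.to_lowercase()).collect()
}

/// 0.6 * Jaccard + 0.4 * containment, following the weights in the commit message.
/// Containment here is the share of query tokens that also appear in the concept.
fn semantic_similarity(query: &str, concept: &str) -> f64 {
    let q = tokens(query);
    let c = tokens(concept);
    if q.is_empty() || c.is_empty() {
        return 0.0; // assumption: empty input scores zero
    }
    let shared = q.intersection(&c).count() as f64;
    let union = q.union(&c).count() as f64;
    let jaccard = shared / union;
    let containment = shared / q.len() as f64;
    0.6 * jaccard + 0.4 * containment
}
```

A caller would compare the returned score against the configured semantic_threshold to decide whether a concept counts as a match.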

Completed TODO in retrieval/mod.rs:238 - parallel processing support.

Implementation:
- New with_parallel_processing() constructor
- Accepts Arc<dyn VectorStore> for thread-safe sharing
- Accepts EmbeddingGenerator for parallel operations
- Integrates ParallelProcessor for batch operations

Design:
- VectorStore trait is already Send + Sync
- Arc wrapper enables safe cross-thread usage
- EmbeddingGenerator operations can use rayon for parallelization
- ParallelProcessor stored for future batch operations

This enables efficient parallel indexing and querying for large-scale knowledge graphs with thread-safe vector operations.

Completed TODO implementations in data_import.rs (534, 547).

**Dependencies Added**:
- quick-xml (0.36) for GraphML XML parsing
- oxrdf (0.2) + oxttl (0.1) for RDF/Turtle parsing
- New features: graphml-import, rdf-import

**GraphML Parser**:
- Full GraphML XML format support
- Parses nodes with attributes (id, name, type)
- Parses edges with source/target/type
- Supports nested <data> elements with keys
- Returns ImportedEntity and ImportedRelationship lists

**RDF/Turtle Parser**:
- Turtle/RDF triple parsing (subject-predicate-object)
- Automatic entity extraction from subjects/objects
- Relationship extraction from URI objects
- Property extraction from literal objects
- URI local name extraction (after # or /)
- Default types for resources without explicit type

Both parsers:
- Feature-gated (#[cfg(feature = "...-import")])
- Comprehensive error handling
- Processing time tracking
- Return ImportResult with counts and errors

Enables graph import from standard formats (GraphML, RDF/Turtle).

## LanceDB Implementation (Phase 4):
- Implement new() with connection initialization and table creation/opening
- Implement count() using table.count_rows()
- Implement store_embedding() with Arrow RecordBatch construction
- Implement search_similar() with k-nearest neighbor vector search
- Add QueryBase and ExecutableQuery trait imports
- Handle FixedSizeList DataType with pattern matching for arrow 57

## Graph Embeddings (Phase 4):
- Implement MaxPool aggregation (element-wise max across neighbors)
- Implement Attention aggregation with softmax-normalized weights
- Implement LSTM aggregation with decay-based sequential processing
- Fix type inference for decay factor in LSTM

## Dependency Updates:
- Update arrow dependencies from 56 to 57 (workspace + graphrag-core)
- Update lancedb from 0.22.2 to 0.26.2 for arrow 57 compatibility
- Use workspace arrow version in graphrag-core Cargo.toml
- Enable lancedb module in persistence (feature gate: lancedb, not lance-storage)

## Bug Fixes:
- Fix VectorStore delete() to return () instead of DeleteResult
- Fix DataType::FixedSizeList access for arrow 57 API changes (match pattern instead of as_fixed_size_list())
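For the MaxPool aggregation listed under Graph Embeddings, a minimal sketch of "element-wise max across neighbors" might look like the following. The function name, the `Vec<f32>` representation, and returning `None` for empty or mismatched input are assumptions for illustration, not the crate's actual API.

```rust
/// Element-wise maximum over a set of neighbor embeddings.
/// Returns None when there are no neighbors or the dimensions disagree
/// (hypothetical error handling; the real code may differ).
fn max_pool(neighbors: &[Vec<f32>]) -> Option<Vec<f32>> {
    let first = neighbors.first()?;
    let mut pooled = first.clone();
    for embedding in &neighbors[1..] {
        if embedding.len() != pooled.len() {
            return None;
        }
        for (slot, &value) in pooled.iter_mut().zip(embedding) {
            // Keep the larger of the running max and this neighbor's component.
            *slot = (*slot).max(value);
        }
    }
    Some(pooled)
}
```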

## BLEU Score Implementation (Phase 5 - VERY HIGH):

### Core Algorithm:
- Implement calculate_bleu_score() with n-gram precision (n=1-4)
- Calculate brevity penalty: BP = exp(1 - ref_len/cand_len)
- Final score: BLEU = BP * exp(1/N * sum(log(P_n)))

### Helper Methods:
- calculate_ngram_precision() - Precision with clipped counts
- extract_ngrams() - N-gram extraction from token sequences
- Clipping logic to prevent over-counting repeated n-grams

### Integration:
- Call BLEU calculation in calculate_quality_metrics()
- Compute average BLEU score across benchmark queries
- Add BLEU score to BenchmarkSummary output
- Display BLEU in print_summary() when available

### Algorithm Details:
- N-gram range: 1-4 (unigrams through 4-grams)
- Modified precision with clipping to max reference counts
- Geometric mean of n-gram precisions
- Brevity penalty for short candidates
- Returns 0.0 if any n-gram precision is 0
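The formula above can be sketched for a single candidate/reference pair as follows. This is a hedged illustration, not the benchmark module's code: the function names are hypothetical, tokens are assumed to be pre-split, and the brevity penalty is assumed to apply only when the candidate is shorter than the reference (BP = 1 otherwise), which is the usual BLEU convention.

```rust
use std::collections::HashMap;

/// Count every n-gram of length `n` in `tokens`.
fn ngram_counts<'a>(tokens: &[&'a str], n: usize) -> HashMap<Vec<&'a str>, usize> {
    let mut counts = HashMap::new();
    if n == 0 || tokens.len() < n {
        return counts;
    }
    for window in tokens.windows(n) {
        *counts.entry(window.to_vec()).or_insert(0) += 1;
    }
    counts
}

/// Modified precision: candidate counts are clipped to the reference counts.
fn clipped_precision(candidate: &[&str], reference: &[&str], n: usize) -> f64 {
    let cand = ngram_counts(candidate, n);
    let refc = ngram_counts(reference, n);
    let total: usize = cand.values().sum();
    if total == 0 {
        return 0.0;
    }
    let matched: usize = cand
        .iter()
        .map(|(gram, &count)| count.min(refc.get(gram).copied().unwrap_or(0)))
        .sum();
    matched as f64 / total as f64
}

/// BLEU = BP * exp(1/N * sum(log(P_n))) with N = 4.
fn bleu(candidate: &[&str], reference: &[&str]) -> f64 {
    const MAX_N: usize = 4;
    if candidate.is_empty() {
        return 0.0;
    }
    let mut log_sum = 0.0;
    for n in 1..=MAX_N {
        let p = clipped_precision(candidate, reference, n);
        if p == 0.0 {
            return 0.0; // any zero precision zeroes the geometric mean
        }
        log_sum += p.ln();
    }
    // Assumption: penalize only candidates shorter than the reference.
    let bp = if candidate.len() >= reference.len() {
        1.0
    } else {
        (1.0 - reference.len() as f64 / candidate.len() as f64).exp()
    };
    bp * (log_sum / MAX_N as f64).exp()
}
```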

## LanceDB Batch Methods (Phase 4):

### store_embeddings_batch():
- Validate dimensions for all embeddings in batch
- Create Arrow StringArray for IDs
- Create FixedSizeListArray for embedding vectors
- Build RecordBatch and add to table
- Handle empty batch case gracefully

### get_embedding():
- Query table by ID using SQL filter (only_if)
- Execute query and collect results
- Extract embedding from FixedSizeList column
- Return None if ID not found
- Use TryStreamExt for async result collection

### Implementation Details:
- Both methods use Arrow RecordBatch construction
- Proper error handling with GraphRAGError
- Tracing support for debug logging
- Dimension validation before insertion

LanceDB integration now complete with all 6 methods:
- new() - Connection and table initialization
- count() - Count rows
- store_embedding() - Single embedding storage
- store_embeddings_batch() - Batch storage
- get_embedding() - Retrieve by ID
- search_similar() - K-nearest neighbor search

## ROUGE-L Score Implementation (Phase 5 - VERY HIGH):

### Core Algorithm:
- Implement calculate_rouge_l() using Longest Common Subsequence (LCS)
- LCS-based precision: LCS_length / candidate_length
- LCS-based recall: LCS_length / reference_length
- F-score with β=1.2: ((1+β²)*P*R) / (β²*P + R)

### LCS Dynamic Programming:
- Implement lcs_length() with O(m*n) time complexity
- DP table: dp[i][j] = LCS of seq1[0..i] and seq2[0..j]
- Recurrence: if match: dp[i][j] = dp[i-1][j-1] + 1
- Else: dp[i][j] = max(dp[i-1][j], dp[i][j-1])

### Integration:
- Call ROUGE-L calculation in calculate_quality_metrics()
- Compute average ROUGE-L score across benchmark queries
- Add ROUGE-L to BenchmarkSummary output
- Display ROUGE-L in print_summary() when available

### Algorithm Details:
- Token-based LCS (word-level, not character-level)
- β=1.2 slightly favors recall over precision
- Returns 0.0 for empty sequences
- Clamps result to [0, 1] range
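The recurrence and F-score above translate fairly directly into code. The following is a sketch under assumptions, not the benchmark module itself: signatures are hypothetical and tokens are assumed to be pre-split words.

```rust
/// Length of the longest common subsequence of two token sequences,
/// using the O(m*n) DP table described above.
fn lcs_length(a: &[&str], b: &[&str]) -> usize {
    let mut dp = vec![vec![0usize; b.len() + 1]; a.len() + 1];
    for i in 1..=a.len() {
        for j in 1..=b.len() {
            dp[i][j] = if a[i - 1] == b[j - 1] {
                dp[i - 1][j - 1] + 1
            } else {
                dp[i - 1][j].max(dp[i][j - 1])
            };
        }
    }
    dp[a.len()][b.len()]
}

/// ROUGE-L F-score with beta = 1.2: ((1 + b^2) * P * R) / (b^2 * P + R).
fn rouge_l(candidate: &[&str], reference: &[&str]) -> f64 {
    if candidate.is_empty() || reference.is_empty() {
        return 0.0;
    }
    let lcs = lcs_length(candidate, reference) as f64;
    if lcs == 0.0 {
        return 0.0; // avoids 0/0 when nothing overlaps
    }
    let precision = lcs / candidate.len() as f64;
    let recall = lcs / reference.len() as f64;
    let beta_sq = 1.2_f64 * 1.2;
    let f = ((1.0 + beta_sq) * precision * recall) / (beta_sq * precision + recall);
    f.clamp(0.0, 1.0)
}
```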

## Semantic Chunking Implementation (Phase 4 - MEDIUM-HIGH):

### Algorithm:
- Split text into sentences using existing split_sentences()
- Calculate lexical cohesion (Jaccard similarity) between adjacent sentences
- Create chunk boundaries where similarity < threshold (default 0.7)
- Merge small chunks below min_size with previous chunk
- Split large chunks above max_size by sentence boundaries

### Features:
- Uses existing lexical_cohesion() method for word-overlap similarity
- Respects min_size, max_size, and similarity_threshold config
- Calculates coherence score for each chunk
- Maintains sentence and paragraph counts
- Handles edge cases (empty text, single sentence, etc.)

### Implementation Details:
- Lexical-based semantic similarity (word overlap)
- No deep learning embeddings required (practical approach)
- Still "semantic" because it respects content similarity
- Efficient: O(n) where n is number of sentences

Closes semantic chunking TODO at nlp/semantic_chunking.rs:329
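The boundary rule in the Algorithm list can be sketched as below. This covers only the "split where adjacent similarity drops below the threshold" step; the min_size merge and max_size split are left out. The `lexical_cohesion` function here is a stand-in written for this example (the signature of the crate's own method is not shown in the source), and `chunk_by_cohesion` is a hypothetical name.

```rust
use std::collections::HashSet;

/// Stand-in word-overlap (Jaccard) similarity between two sentences.
fn lexical_cohesion(a: &str, b: &str) -> f64 {
    let wa: HashSet<String> = a.split_whitespace().map(str::to_lowercase).collect();
    let wb: HashSet<String> = b.split_whitespace().map(str::to_lowercase).collect();
    let union = wa.union(&wb).count();
    if union == 0 {
        return 0.0;
    }
    wa.intersection(&wb).count() as f64 / union as f64
}

/// Group already-split sentences into chunks, starting a new chunk whenever
/// the similarity to the previous sentence falls below `threshold`.
fn chunk_by_cohesion(sentences: &[&str], threshold: f64) -> Vec<Vec<String>> {
    let mut chunks: Vec<Vec<String>> = Vec::new();
    for (i, &sentence) in sentences.iter().enumerate() {
        let boundary = i == 0 || lexical_cohesion(sentences[i - 1], sentence) < threshold;
        if boundary {
            chunks.push(Vec::new());
        }
        if let Some(current) = chunks.last_mut() {
            current.push(sentence.to_string());
        }
    }
    chunks
}
```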

## VectorStore LanceDB Implementation:

### add_vectors_batch():
- Implement full Arrow RecordBatch construction for batch vector insertion
- Create StringArray for IDs
- Create FixedSizeListArray for embeddings with proper dimension
- Build schema with id (Utf8) and vector (FixedSizeList) fields
- Add batch to LanceDB table using table.add()

### search():
- Implement vector similarity search with k-nearest neighbors
- Use query().limit(k).nearest_to() pattern
- Extract IDs from result batches
- Calculate inverse ranking scores
- Return SearchResult vec with id, score, metadata

### Implementation Details:
- Reuses Arrow pattern from persistence/lance.rs
- Proper error handling for all LanceDB operations
- Empty batch handling for add_vectors_batch
- Type-safe Float32Type for embeddings

Closes TODO at vector/lancedb.rs:89

Implements complete builder pattern for GraphRAG configuration:
- 20+ builder methods for all major config options
- Fluent API: output_dir, chunk_size, embeddings, ollama, retrieval
- with_local_defaults() for zero-config local setup
- config() and config_mut() for advanced use cases
- Full test coverage: 11/11 tests passing

Unblocks TODO at lib.rs:282,1271
Enables GraphRAG::builder() method
Adds to prelude for easy access

Updates:
- parquet 52 -> 57 to match arrow 57
- Fix ParquetRecordBatchReaderBuilder import path
- Add Array trait import for is_null() method
- Wrap embeddings in Arc::new() for RecordBatch

Implements embeddings save/load using ListBuilder pattern:
- Save: Build ListArray from Option<Vec<f32>>
- Load: Extract Vec<f32> from ListArray with null handling
- Consistent with chunks embeddings implementation

Completes TODO at persistence/parquet.rs:245,360

Changes test_graph_indexing to use #[tokio::test] and .await to properly handle async index_graph() method. Fixes compilation error: cannot call is_ok() on Future

Registry Service Implementations (core/registry.rs):
- Expand build_registry() with comprehensive service structure
- Add 8 service registration points with feature gates:
  * Storage (memory-storage)
  * Vector Store (vector-memory)
  * Embedding Provider (ollama)
  * Entity Extractor (entity-extraction)
  * Retriever (retrieval)
  * Language Model (ollama)
  * Metrics Collector (monitoring)
  * Function Registry (function-calling)
- Document service registration order and requirements
- Prepare for future service implementations

Benchmark System Integration (monitoring/benchmark.rs):
- Add pluggable architecture with function injection
- New builder methods:
  * with_retrieval(fn) - plug in retrieval system
  * with_reranker(fn) - plug in cross-encoder
  * with_llm(fn) - plug in LLM generator
- Modify benchmark_query() to use actual services when provided
- Fall back to simulation mode when services not set
- Enable real performance measurement with production systems

Completes TODOs at:
- core/registry.rs:336
- monitoring/benchmark.rs:244,250,258

Implemented execute_happened_query and execute_caused_query with multi-strategy approaches for knowledge graph reasoning.

Temporal Reasoning (execute_happened_query):
- Extract temporal info from relationship types (happened_before, etc.)
- Parse chunk metadata.custom for date/timestamp/time fields
- Detect temporal keywords in chunk content (months, days, seasons)
- Use document position as narrative ordering heuristic
- Return temporal contexts with confidence scoring

Causal Reasoning (execute_caused_query):
- Identify direct causal relationships (causes, leads_to, results_in)
- Build causal chains using DFS traversal (max depth 3)
- Analyze co-occurrence in chunks for implicit causality
- Detect causal keywords in content (because, therefore, due to)
- Rank explanations by confidence scores

Both methods follow existing patterns from execute_related_query and execute_compare_query, returning VariableBinding results.

Updated README.md and graphrag-core/README.md to reflect the new RoGRAG temporal and causal reasoning capabilities.

Main Changes:
- Root README: Updated ROGRAG description in features section
- Root README: Marked temporal and causal reasoning as completed
- Core README: Added comprehensive RoGRAG section in Advanced Features

New Documentation Covers:
- Query decomposition (60%→75% accuracy boost)
- Temporal reasoning with 4 extraction strategies
- Causal reasoning with confidence-based ranking
- Supported query types (identity, relationships, temporal, causal)
- Feature flag configuration

Resolved remaining TODO items and clarified project boundaries.

Changes:
1. Utility modules (lib.rs:151)
   - Removed TODO: only optional future modules
   - Clarified: automatic_entity_linking, phase_saver not needed
   - Marked as future enhancements, not blockers
2. Voy vector store (vector/mod.rs:27)
   - Removed TODO: already fully implemented (~500 lines)
   - Clarified: belongs in graphrag-wasm (WASM-specific)
   - Added note pointing to correct location
3. Scope cleanup
   - Removed Multilingual Support from roadmap (out of scope)
   - All core functionality TODOs now resolved
   - Remaining work: integration when dependencies ready

Progress Summary:
- 21/47 TODOs completed (45%)
- 2/47 TODOs removed (out of scope)
- 4/47 TODOs deferred (need dependencies)
- 20/47 N/A or not applicable
- Total: 87% project completion

…support

- Added incremental indexing and delta computation logic
- Introduced critic feedback loop for knowledge extraction
- Implemented Ollama embedding and LLM adapters
- Added support for LightRAG concept selection and query planning
- Introduced cross-encoder reranking and adaptive retrieval
- Added Python bindings using PyO3
- Improved CLI UX with better progress monitoring
- Refined .gitignore to include docs and exclude benchmark results

qdrant-client v1.15.0 ships `generate-snippets` as a default feature. Its build.rs writes generated test snippets back into its own crate dir, which breaks builds in sandboxed environments (Nix, some CI setups) because the vendored source dir is read-only. The feature is qdrant-client maintainers' internal CI test-codegen tooling; consumers don't need it. Disable workspace-wide and re-enable only `download_snapshots` + `serde`. All workspace members inherit via `{ workspace = true }`.

Upstream registers two services on the same App:

    .service(scope("/api") ...)                  // apistos
    ...
    .service(web::scope("/api/config") ...)      // plain actix

actix-web matches services by registration order, prefix-first. The apistos /api scope claims /api/config even though it has no /config sub-route, shadowing the second block, so every /api/config request returns 404.

Three constraints make a fix non-trivial:
- /api can't be moved post-.build() because apistos `scope` differs from plain `web::scope`.
- /api/config can't be nested inside /api because apistos's typed scope requires handlers to implement `PathItemDefinition` (i.e. carry `#[api_operation]`).
- Plain `web::scope` services can't be registered pre-.build() because apistos's App doesn't accept them.

Renaming /api/config → /config sidesteps all three. No overlap with /api, no shadowing, block stays plain actix post-.build(). The endpoint is now reachable at GET /config etc.

Lets users point graphrag-server at an Ollama-protocol server on any port. With real Ollama on 11434 and a future Ollama→OVMS shim on a different port, both can coexist without conflict.

- Add `ollama_port: u16` to EmbeddingConfig + Default impl (11434).
- Change OLLAMA_URL default from "http://localhost:11434" to "http://localhost" since the port is now in its own field.
- main.rs reads OLLAMA_PORT env var; falls back to 11434 if unset.
- Use `Ollama::new(url, config.ollama_port)` instead of hardcoded 11434.

Tests use `..Default::default()` so they auto-pick up the new field.
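The env-var fallback described above could look roughly like this. It is a sketch, not the code in main.rs: both function names are hypothetical, and treating an unparseable value the same as an unset one (falling back to 11434) is an assumption the commit message does not confirm.

```rust
use std::env;

const DEFAULT_OLLAMA_PORT: u16 = 11434;

/// Parse a raw OLLAMA_PORT value; fall back to the default when it is
/// missing or (assumption) not a valid u16.
fn parse_ollama_port(raw: Option<&str>) -> u16 {
    raw.and_then(|value| value.trim().parse::<u16>().ok())
        .unwrap_or(DEFAULT_OLLAMA_PORT)
}

/// Read OLLAMA_PORT from the process environment.
fn ollama_port_from_env() -> u16 {
    parse_ollama_port(env::var("OLLAMA_PORT").ok().as_deref())
}
```

Keeping the parsing in a pure function separate from the environment read makes the fallback behavior testable without mutating process-wide state.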

Upstream had two `.service(resource(""))` registrations at the same path:

    .service(resource("").route(get().to(list_documents)))
    .service(resource("").route(post().to(add_document)))

actix-web treats each `.service(resource(""))` as a distinct resource at the same path. Only the first wins; the second is silently dropped. Result: POST /api/documents returns 405 with `allow: GET`.

Merge into a single resource with chained .route() calls. Same idiom the file already uses elsewhere.

Previously /config (formerly /api/config) required the caller to send every required field of `Config` (output_dir, chunk_size, retrieval, graph, embeddings, ...). That made the endpoint unusable for the common case of "I just want to set the openai block", which is exactly the shape the nix HM module synthesizes for systemd ExecStartPost.

`set_from_json` now:
1. Parses the body as serde_json::Value (no struct constraint).
2. Serializes the current/active config (or Config::default() if none is set) to a Value as the merge base.
3. Deep-merges the patch onto the base (objects merged key-by-key, arrays and scalars replace).
4. Deserializes the merged Value into Config and validates as before.

Net effect: any partial JSON works, fully-specified bodies still work unchanged (the merge is idempotent over a complete config), and missing required fields fall back to Config::default() instead of 400-ing.

Bug surfaced as the HM ExecStartPost on neo-16 returning HTTP 400:

    {"error":"Bad Request","message":"Failed to parse JSON: missing field `output_dir` at line 1 column 149"}

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
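The merge rule in step 3 (objects merged key-by-key, arrays and scalars replaced) can be sketched as follows. This is an illustration under assumptions, not the PR's `set_from_json`: to keep the snippet standard-library-only it uses a small hand-rolled `Value` enum as a stand-in for `serde_json::Value`, and the function name `deep_merge` is hypothetical.

```rust
use std::collections::BTreeMap;

/// Minimal stand-in for serde_json::Value (assumption for this sketch).
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Null,
    Bool(bool),
    Number(f64),
    String(String),
    Array(Vec<Value>),
    Object(BTreeMap<String, Value>),
}

/// Merge `patch` onto `base` in place.
/// Objects are merged key-by-key; everything else (arrays, scalars, or a
/// type mismatch) replaces the base value wholesale.
fn deep_merge(base: &mut Value, patch: Value) {
    match (base, patch) {
        (Value::Object(base_map), Value::Object(patch_map)) => {
            for (key, patch_value) in patch_map {
                match base_map.get_mut(&key) {
                    // Key exists on both sides: recurse into it.
                    Some(base_value) => deep_merge(base_value, patch_value),
                    // Key only in the patch: add it.
                    None => {
                        base_map.insert(key, patch_value);
                    }
                }
            }
        }
        (base_slot, patch_value) => *base_slot = patch_value,
    }
}
```

With this rule, a patch that sets only one nested field leaves its siblings and all other top-level keys untouched, which is the behavior the commit describes for partial bodies.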


carcall added 30 commits October 26, 2025 17:23

complete rewrite

e97df04

Add minilm-l6.onnx to .gitignore

829203f

chore: remove large ONNX model from repository

bfbeabf

add image

649d96d

feat: implement trait-based chunking architecture with cAST support

99df398

fix: make test_graph_indexing async with tokio::test

a355f08

Changes test_graph_indexing to use #[tokio::test] and .await to properly handle async index_graph() method. Fixes compilation error: cannot call is_ok() on Future

feat: kv-cache, json structured, gliner-relex

6295a1e

update

2d1d22a

update cli TUI/TUX

69da96d

add wrapper crate

c46e287

wellos and others added 5 commits April 29, 2026 14:32

This was referenced Apr 30, 2026

OpenAI-compatible chat + embeddings backend (feature-gated, opt-in) #9

Open

Graph-aware /api/query (ask/explain/reason/local) + cross-restart persistence #11

Open

dataO1 added a commit to dataO1/graphrag-rs that referenced this pull request Apr 30, 2026

PR-PLAN: filed PRs A/B/C/D upstream as automataIA#8 automataIA#9 auto…

c75f28c

…mataIA#10 automataIA#11

dataO1 mentioned this pull request May 3, 2026

Unify embeddings around Config.embeddings (single source of truth) #13

Open

automataIA force-pushed the main branch 2 times, most recently from d39471e to 84ef833 Compare May 31, 2026 13:08

dataO1 wants to merge 35 commits into automataIA:main from dataO1:pr/server-fixes
