Skip to content

fix: reconcile_all_paths spawn_blocking + persist_config in prune path#101

Merged
flupkede merged 2 commits into
developfrom
fix/serve-reconcile-blocking-and-persist-config
Jun 2, 2026
Merged

fix: reconcile_all_paths spawn_blocking + persist_config in prune path#101
flupkede merged 2 commits into
developfrom
fix/serve-reconcile-blocking-and-persist-config

Conversation

@flupkede
Copy link
Copy Markdown
Owner

@flupkede flupkede commented Jun 2, 2026

Summary

  • reconcile_all_paths offloaded to spawn_blocking — this function holds the config RwLock write-guard while spawning git subprocesses and traversing the filesystem. Running it directly on a Tokio async worker thread would starve the runtime during startup. Wrapped in tokio::task::spawn_blocking so the async runtime remains responsive.
  • Phase 1 auto-prune uses persist_config instead of config.save() — the prune path that removes stale aliases was calling config.save() directly, bypassing ServeState::persist_config. All config save sites in ServeState must route through persist_config so the config_path_override (used in integration tests) is honoured. Fixed to self.persist_config(&config).

Test plan

  • cargo fmt --check — PASSED
  • cargo check --all-targets — PASSED
  • cargo clippy --all-targets -- -D warnings — PASSED
  • cargo test --lib — 429 passed, 0 failed

🤖 Generated with Claude Code

flupkede and others added 2 commits June 2, 2026 15:49
Two correctness fixes flagged in post-release review:

1. reconcile_all_paths() was called synchronously inside tokio::spawn, blocking
   a Tokio worker thread while spawning git subprocesses and holding the config
   RwLock write-guard. Moved to spawn_blocking so the async runtime stays
   responsive during startup reconciliation.

2. Phase 1 auto-prune wrote repos.json via config.save() instead of
   self.persist_config(&config). All other ServeState save sites use
   persist_config to honour config_path_override (e.g. in tests). Now
   consistent.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…rsist_config prune fix)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@flupkede flupkede merged commit 8748657 into develop Jun 2, 2026
1 check passed
flupkede added a commit that referenced this pull request Jun 2, 2026
…doc-comments (#102)

* fix: doc-comments accuracy + safe_canonicalize in reload_if_changed

- repos.rs: correct "Pure (no disk I/O)" doc-comments on relocate_missing and
  prune_stale — both transitively call scan_for_remote which does read_dir and
  spawns a git subprocess; callers must use spawn_blocking in async contexts.
- serve/mod.rs: replace raw std::fs::canonicalize with safe_canonicalize in
  reload_if_changed so Windows UNC prefix (\?\) is stripped before comparison,
  consistent with the project-wide safe_canonicalize rule.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix: index/mod.rs quality — extract ensure_hnsw_index_if_needed + tests + metadata consistency + cancel best-effort

- Extract the safety-net HNSW rebuild into ensure_hnsw_index_if_needed() so
  the logic is unit-testable; add 3 tests (unindexed with chunks rebuilds,
  already-indexed is idempotent, empty DB skips rebuild).
- metadata.json schema consistency: add "partial": false to the normal
  (non-cancelled) path so readers always see the field regardless of how
  indexing ended.
- Cancellation finalisation path: change non-critical ? propagations to
  log-and-continue (metadata.json write, FileMetaStore update/save, stats
  read) — keeps partial chunks searchable even if any recovery step fails.
  store.build_index() still propagates errors as before.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix: concurrency — evaluate_csharp_rebuild outside write lock + build_index in spawn_blocking

Three related concurrency fixes in serve/mod.rs, consistent with the
reconcile_all_paths fix from PR #101:

1. evaluate_csharp_rebuild: bootstrap_last_changed (git subprocess + ≤10k
   file fs-walk) was running while holding config.write(), blocking every
   concurrent config.read() for the scan duration. Fixed by checking whether
   bootstrap is needed under a read lock, running the slow I/O with no lock
   held, then taking the write lock only for the brief config update.

2. evaluate_csharp_rebuild call site in run_phase_2_csharp_scip: even with the
   above fix the function still ran synchronously on a Tokio worker thread.
   Wrapped in spawn_blocking so the async runtime stays responsive while
   scanning all C# candidates at startup.

3. warmup_repo and add_repo_handler background task: two build_index() calls
   were running directly on async threads while holding a tokio RwLock write
   guard. build_index() is CPU-heavy (HNSW construction). Both are now
   offloaded via spawn_blocking + blocking_write(), matching the established
   pattern at serve/mod.rs:1249.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* docs: CHANGELOG entry for v1.0.160 (full review fixes)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: flupkede <flupkede@users.noreply.github.com>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant