Closed
Commits
22 commits
cc67986
Merge pull request #56 from flupkede/fix/tui-indexing-status
flupkede May 20, 2026
2015859
fix: CI test resilience + protect-master workflow (#58)
flupkede May 21, 2026
4c1182c
Sync master → develop (tree-sitter) (#60)
flupkede May 21, 2026
78ef160
docs: update CHANGELOG — v1.0.132 consolidated release notes (#61)
flupkede May 22, 2026
71b8ed3
sync: align develop with master — AGENTS.md, Cargo.toml, Cargo.lock (…
flupkede May 22, 2026
4296b1c
fix: MCP local mode project/group fallback + QC script fix (#66)
flupkede May 27, 2026
d0f3929
chore: simplify release workflow — feature-only version bump (#74)
flupkede May 27, 2026
b9b5645
fix: serve-aware indexing — create DB dir before lock + no silent loc…
flupkede Jun 1, 2026
d947d71
sync: align develop with master (post-v1.0.137 release) (#78)
flupkede Jun 1, 2026
b09a930
fix: strip UNC prefix in repos.json + auto-add on missing DB (v1.0.13…
flupkede Jun 1, 2026
70877ef
sync: align develop with master (post-v1.0.138) (#81)
flupkede Jun 1, 2026
e989ee2
refactor: central safe_canonicalize() — eliminate raw .canonicalize()…
flupkede Jun 1, 2026
acc788d
fix: replace last raw .canonicalize() with safe_canonicalize in get_d…
flupkede Jun 1, 2026
0bc928f
fix: wait for serve warmup instead of refusing; fix 409 on DB-recreat…
flupkede Jun 1, 2026
708cf27
fix: keep serve responsive during warmup — offload heavy work to spaw…
flupkede Jun 1, 2026
3c740c0
sync: backfill CHANGELOG 1.0.139-1.0.142 from master (post-release) (…
flupkede Jun 1, 2026
54782bd
feat: semantic Markdown chunking (tree-sitter-md) + /merge & /release…
flupkede Jun 1, 2026
bc88eba
Stale-path relocation, index prune, derived alias (v1.0.152) (#92)
flupkede Jun 1, 2026
e0e3580
Auto-prune stale repos during Phase 1 warmup (v1.0.153) (#94)
flupkede Jun 2, 2026
fb7858c
docs: AGENTS.md plan for fixes-strict-scoping-and-reaper (#95)
flupkede Jun 2, 2026
d7d71b7
docs: AGENTS.md plan for serve single-instance guard (#96)
flupkede Jun 2, 2026
b748977
fix: Windows 8.3 short-name path mismatch in relocation tests + CHANG…
flupkede Jun 2, 2026
1 change: 1 addition & 0 deletions .claude/CLAUDE.md
@@ -22,6 +22,7 @@ Add symbol-aware reference lookups to codesearch via `find_impact` MCP tool. Ret
- **`-with-csharp` release variants** — 6 release archives (3 plain + 3 with helper)
- **Gated integration test** — `csharp_helper_integration` cargo feature for full-pipeline testing
- **CI** — separate `csharp-integration-tests` job in `.github/workflows/ci.yml`
- **Stale-path resilience + derived alias** — moved/renamed indexed folders no longer crash serve: `git_remote` captured at registration, `reconcile_all_paths()` best-effort relocates by matching `remote.origin.url` (bounded depth, `CODESEARCH_RELOCATE_MAX_DEPTH`, default 3) else warn+skip; `codesearch index prune` for manual cleanup. The `--alias` flag was removed (alias always = directory name). `ReposConfig::reconcile()` hardens hand-edited `repos.json` on load. See AGENTS.md for details.
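
The relocation rule in the bullet above (bounded-depth scan, rewrite only on exactly one match, otherwise warn and skip) could look roughly like the sketch below. This is an illustration under assumptions, not the project's `reconcile_all_paths()`: the names `collect_matches` and `find_relocation` and the `remote_of` callback (standing in for reading `remote.origin.url`) are hypothetical.

```rust
use std::fs;
use std::path::{Path, PathBuf};

/// Collect directories under `dir` (at most `max_depth` levels below the scan
/// root) whose remote, as reported by `remote_of`, equals `wanted`.
/// `remote_of` is a hypothetical stand-in for reading `remote.origin.url`.
fn collect_matches(
    dir: &Path,
    depth: usize,
    max_depth: usize,
    wanted: &str,
    remote_of: &dyn Fn(&Path) -> Option<String>,
    out: &mut Vec<PathBuf>,
) {
    if remote_of(dir).as_deref() == Some(wanted) {
        out.push(dir.to_path_buf());
        return; // do not descend into a matched repository
    }
    if depth >= max_depth {
        return; // bounded depth: stop scanning deeper
    }
    let entries = match fs::read_dir(dir) {
        Ok(entries) => entries,
        Err(_) => return, // unreadable directory: best effort, skip it
    };
    for entry in entries.flatten() {
        let path = entry.path();
        if path.is_dir() {
            collect_matches(&path, depth + 1, max_depth, wanted, remote_of, out);
        }
    }
}

/// Return the new location only when exactly one candidate matches;
/// zero or several matches mean "warn and skip" for the caller.
fn find_relocation(
    root: &Path,
    max_depth: usize,
    wanted: &str,
    remote_of: &dyn Fn(&Path) -> Option<String>,
) -> Option<PathBuf> {
    let mut found = Vec::new();
    collect_matches(root, 0, max_depth, wanted, remote_of, &mut found);
    if found.len() == 1 {
        found.pop()
    } else {
        None
    }
}
```

Returning `None` for both "no match" and "ambiguous match" keeps the caller's decision simple: anything other than a single unambiguous candidate leaves `repos.json` untouched.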

## Architecture

91 changes: 91 additions & 0 deletions .claude/commands/merge.md
@@ -0,0 +1,91 @@
---
description: Land the current feature branch on develop — README/CHANGELOG checks, commit, push, PR, auto-merge
argument-hint: [optional PR title]
allowed-tools: Bash(git:*), Bash(gh:*), Bash(cargo:*), Bash(grep:*), Read, Edit, Grep, Glob
---

# /merge — land the current feature branch on `develop`

Run the project's **merge workflow**: verify docs are current, then bring the current
feature branch into `develop` through a pull request. This command does **not** tag a
release — tagging happens only in `/release`.

## Branch & version facts (this repo)
- Flow: `feature/*` | `features/*` | `fix/*` → PR → **`develop`** → (later) PR → **`master`**.
- `master` is protected (`.github/workflows/protect-master.yml`): it accepts PRs only from
`develop` or `release/*`.
- The pre-commit hook **bumps the patch version (+1) and rebuilds the binary on feature
branches only** (`feature/*`, `features/*`, `fix/*`). On `develop`/`master`/`release`/`chore`
it runs `cargo fmt` only — no bump. So **the feature-branch commit here fixes the release
version**; it carries forward unchanged through develop, master, and the tag.
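
The version rule above can be modelled in a few lines. The sketch below is not the pre-commit hook itself (the hook's source is not part of this change); `is_bump_branch`, `bump_patch`, and `version_after_commit` are hypothetical names, and the sketch assumes a plain `X.Y.Z` version string.

```rust
/// Branch prefixes on which the hook is described as bumping the version.
const BUMP_PREFIXES: [&str; 3] = ["feature/", "features/", "fix/"];

/// True for branches where a commit is expected to bump the patch version.
fn is_bump_branch(branch: &str) -> bool {
    BUMP_PREFIXES.iter().any(|p| branch.starts_with(*p))
}

/// Increment the patch component of an `X.Y.Z` version string.
/// Returns `None` when the input is not exactly three numeric components.
fn bump_patch(version: &str) -> Option<String> {
    let mut parts = version.split('.');
    let major: u64 = parts.next()?.parse().ok()?;
    let minor: u64 = parts.next()?.parse().ok()?;
    let patch: u64 = parts.next()?.parse().ok()?;
    if parts.next().is_some() {
        return None;
    }
    Some(format!("{}.{}.{}", major, minor, patch + 1))
}

/// Version expected after one commit on `branch`, starting from `current`.
fn version_after_commit(branch: &str, current: &str) -> Option<String> {
    if is_bump_branch(branch) {
        bump_patch(current)
    } else {
        Some(current.to_string()) // develop/master/release/chore: fmt only
    }
}
```

Under these assumptions, a commit on `fix/tui` starting from `1.0.152` yields `1.0.153`, while the same commit on `develop` leaves the version unchanged.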

## Guardrails
- ABORT unless the current branch matches `feature/*`, `features/*`, or `fix/*` — i.e. the
branches the pre-commit hook version-bumps. Never run from `develop`, `master`, `release/*`,
or `chore/*`: on those the hook does **not** bump, so the version/CHANGELOG premise below
would silently break.
- NEVER push directly to `develop` or `master` — everything lands via a PR.
- NEVER pass `--no-verify` / `--no-gpg-sign` — let the pre-commit hook run (it bumps + rebuilds).
- Do NOT create or push a tag here. That is `/release`'s job.
- Do NOT force-push.

## Steps

1. **Context**
   - `git rev-parse --abbrev-ref HEAD` → current branch. If it is NOT `feature/*`, `features/*`,
     or `fix/*`, STOP with an error (see Guardrails).
   - `git fetch origin`.
   - Compute the change set landing on develop: `git log origin/develop..HEAD --oneline`
     plus `git status --short` for uncommitted work. If there is nothing to land, report and STOP.

2. **README up to date?**
   - Inspect the change set for user-facing changes: new/removed CLI flags or subcommands,
     behavior changes, new env vars, new supported languages, new MCP tools.
   - Compare against `README.md`. If anything is missing, wrong, or stale, **UPDATE `README.md`**
     so it matches reality. Keep examples free of hardcoded config strings (per CLAUDE.md).
   - If README already matches, state that and move on.

3. **CHANGELOG up to date?**
   - Ensure `CHANGELOG.md` has an entry for this change under a `## [X.Y.Z] - YYYY-MM-DD`
     heading with `Added` / `Changed` / `Fixed` subsections describing every user-facing change.
   - **Version for the heading**: the hook bumps the patch by +1 on **every** feature-branch
     commit where the working-tree version still equals HEAD's. The most reliable approach is to
     land this branch in a **single commit** — then the heading version = current
     `Cargo.toml` version + 1 (`grep -m1 '^version' Cargo.toml`). If you commit more than once,
     the version advances once per commit; after the final commit, read the actual
     `Cargo.toml` version and make sure the CHANGELOG heading matches it (fix it if not).
   - Use today's date. If an accurate entry already exists for the pending version, leave it.

4. **Commit**
   - Stage code + doc changes (`git add -A`, plus `git add -f` for any tracked-but-gitignored file).
   - Commit with a clear, scoped message. End the message with:
     `Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>`
   - Let the pre-commit hook finish (fmt → version bump → rebuild). This can take 60–120s.

5. **Validate** (fast loop, per CLAUDE.md — do NOT run `--release`):
   - `cargo fmt --all -- --check`
   - `cargo check --all-targets`
   - `cargo clippy --all-targets -- -D warnings`
   - Fix any failures and commit again before pushing. Never push code that fails these.

6. **Push**
   - `git push -u origin HEAD`.

7. **Open PR → develop**
   - `gh pr create --base develop --head <branch> --title "<title>" --body "<body>"`.
   - Title: use `$ARGUMENTS` if provided; otherwise summarize the branch concisely.
   - Body: bullet summary of changes; end with:
     `🤖 Generated with [Claude Code](https://claude.com/claude-code)`.
   - Capture the PR number for the next step:
     `PR=$(gh pr view --json number --jq .number)`.

8. **Auto-merge after CI**
   - This repo **disallows merge commits** — always use `--squash`. NEVER `--merge`
     (it fails with "Merge commits are not allowed on this repository").
   - `gh pr merge "$PR" --auto --squash` so the PR lands automatically once required checks pass.
   - If auto-merge is not enabled on the repo (command errors), fall back: poll
     `gh pr checks "$PR" --watch`, then `gh pr merge "$PR" --squash` once green.
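
The CHANGELOG check in step 3 boils down to building the `## [X.Y.Z] - YYYY-MM-DD` heading and confirming it exists for the pending version. A minimal sketch follows, assuming the heading format is exactly as written in step 3; `changelog_heading` and `changelog_has_entry` are hypothetical helper names, not part of the repository.

```rust
/// Build the heading line expected in CHANGELOG.md for a given version and date.
fn changelog_heading(version: &str, date: &str) -> String {
    format!("## [{}] - {}", version, date)
}

/// True when the changelog already contains a heading for `version`,
/// regardless of the date recorded on that heading.
fn changelog_has_entry(changelog: &str, version: &str) -> bool {
    let prefix = format!("## [{}] - ", version);
    changelog
        .lines()
        .any(|line| line.starts_with(prefix.as_str()))
}
```

Matching on the bracketed version rather than the whole line means an entry written on an earlier day is still recognised, which fits the instruction to leave an accurate existing entry alone.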

## Report
Branch, pending release version, doc updates made, PR URL, and merge status
(auto-merge enabled / merged).
64 changes: 64 additions & 0 deletions .claude/commands/release.md
@@ -0,0 +1,64 @@
---
description: Cut a release — run /merge (feature → develop), then promote develop → master and push the version tag
argument-hint: [optional PR/release title]
allowed-tools: Bash(git:*), Bash(gh:*), Bash(cargo:*), Bash(grep:*), Read, Edit, Grep, Glob
---

# /release — full release: land on `develop`, promote to `master`, tag

This is `/merge` **plus** the `develop → master` promotion and the version-tag push that
triggers the build/publish pipeline.

## Branch & version facts (this repo)
- Flow: `feature/*` → PR → **`develop`** → PR → **`master`** → push tag `vX.Y.Z`.
- `master` is protected: PRs to it may come **only** from `develop` or `release/*`
(`.github/workflows/protect-master.yml`).
- Pushing a `vX.Y.Z` tag triggers `.github/workflows/release.yml` (builds Windows/Linux/macOS
archives, plain + `-with-csharp`, and publishes the GitHub release). **Push the tag only
AFTER the develop→master PR has merged.**
- The version is fixed by the feature-branch commit (the pre-commit hook bumps only on
feature branches). develop/master merges and the tag all carry that same version.

## Guardrails
- NEVER use `--no-verify`. NEVER force-push shared branches.
- Push the tag exactly once, only after master has the release commit.
- If CI fails at any gate, STOP and report — do not promote or tag a red build.

## Part 1 — land on `develop` (the `/merge` workflow)
Execute every step of **`/merge`** (README/CHANGELOG checks → commit → push → PR → auto-merge
to `develop`). Then **wait for the develop PR to actually merge** (auto-merge waits on CI):
- Capture the PR number (`PR=$(gh pr view --json number --jq .number)`), then poll
`gh pr view "$PR" --json state,mergedAt,mergeStateStatus` until `state` is `MERGED`.
- If checks fail, STOP and report. Do not proceed to Part 2.

## Part 2 — promote `develop` → `master`
1. `git fetch origin && git checkout develop && git pull --ff-only origin develop`.
2. Determine the release version: `VERSION=v$(grep -m1 '^version' Cargo.toml | sed -E 's/.*"(.+)".*/\1/')`.
3. Open the release PR (source `develop`, which protect-master allows):
   - `gh pr create --base master --head develop --title "Release $VERSION — <summary>" --body "<body>"`.
   - Title: prefix `Release $VERSION — ` then a short summary (or `$ARGUMENTS` if provided),
     matching history (e.g. `Release v1.0.142 — serve responsive during warmup`).
   - Body ends with: `🤖 Generated with [Claude Code](https://claude.com/claude-code)`.
   - Capture the PR number: `RELEASE_PR=$(gh pr view develop --json number --jq .number)`.
4. This repo **disallows merge commits** — always use `--squash`, never `--merge`.
   `gh pr merge "$RELEASE_PR" --auto --squash`. Wait until `state` is
   `MERGED` (poll as in Part 1). If auto-merge is unavailable, `gh pr checks "$RELEASE_PR" --watch`
   then `gh pr merge "$RELEASE_PR" --squash`. If CI fails, STOP.
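
For readers less familiar with the `grep`/`sed` pipeline in step 2, the same extraction can be sketched in Rust. This is only an illustration: `release_tag` is a hypothetical name, and the sketch assumes the first line starting with `version` is the package version, as the shell one-liner does.

```rust
/// Derive the `vX.Y.Z` tag from Cargo.toml contents: take the first line that
/// starts with `version` and return the text between its first and last
/// double quote, prefixed with `v`.
fn release_tag(cargo_toml: &str) -> Option<String> {
    let line = cargo_toml.lines().find(|l| l.starts_with("version"))?;
    let start = line.find('"')? + 1;
    let end = line.rfind('"')?;
    if end <= start {
        return None; // only one quote, or an empty version string
    }
    Some(format!("v{}", &line[start..end]))
}
```

Returning `None` instead of an empty tag mirrors the guardrail in Part 3: if the version cannot be determined, stop rather than guess.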

## Part 3 — tag the release
1. `git fetch origin --tags && git checkout master && git pull --ff-only origin master`.
2. Confirm the version on master matches: `grep -m1 '^version' Cargo.toml` equals `$VERSION` (minus the `v`).
If it does not match, STOP and report (do not guess a tag).
3. Guard against a double release: if `$VERSION` already exists as a tag
(`git tag -l "$VERSION"` non-empty, or `git ls-remote --tags origin "$VERSION"` non-empty),
STOP — the release was already cut.
4. `git tag "$VERSION" && git push origin "$VERSION"` → triggers `release.yml`.
5. Report the pushed tag and remind the user to watch the Actions "Release" run for artifacts.
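
The two guards in steps 2 and 3 (version match, then double-release check) can be expressed as one decision function. The sketch below is a model of the rule, not code from the repository; `TagDecision` and `decide_tag` are hypothetical names, and `existing_tags` stands in for the combined output of `git tag -l` and `git ls-remote --tags`.

```rust
/// Outcome of the pre-tag checks described in Part 3.
#[derive(Debug, PartialEq)]
enum TagDecision {
    /// Safe to create and push this tag.
    Push(String),
    /// Cargo.toml on master does not match the expected version: stop.
    VersionMismatch,
    /// The tag already exists locally or on the remote: stop.
    AlreadyReleased,
}

/// Decide whether `expected_tag` (for example `v1.0.153`) may be pushed.
fn decide_tag(expected_tag: &str, version_on_master: &str, existing_tags: &[&str]) -> TagDecision {
    if expected_tag.strip_prefix('v') != Some(version_on_master) {
        TagDecision::VersionMismatch
    } else if existing_tags.iter().any(|t| *t == expected_tag) {
        TagDecision::AlreadyReleased
    } else {
        TagDecision::Push(expected_tag.to_string())
    }
}
```

Checking the version before the tag list keeps the failure modes distinct: a mismatch points at the merge, an existing tag points at a release that was already cut.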

## Part 4 — keep `develop` in sync (only if needed)
If `master` ended up ahead of `develop` (e.g. a CHANGELOG/version edit merged only on master),
open a sync PR `master → develop` (or fast-forward develop) — matching the repo's post-release
sync convention (e.g. PR #90 "sync: backfill CHANGELOG … from master"). Skip if already in sync.

## Report
develop PR URL, release PR URL, tag pushed (`vX.Y.Z`), final version, and sync action (if any).
19 changes: 19 additions & 0 deletions .github/workflows/protect-master.yml
@@ -0,0 +1,19 @@
name: Protect master branch

on:
  pull_request:
    branches: [master]

jobs:
  check-source-branch:
    runs-on: ubuntu-latest
    steps:
      - name: Ensure PR targets master only from develop or release branches
        run: |
          SOURCE="${{ github.head_ref }}"
          if [[ "$SOURCE" != "develop" && ! "$SOURCE" =~ ^release/ ]]; then
            echo "::error::PRs to master must come from develop or release/* branches. Got: $SOURCE"
            echo "Please target your PR to develop instead."
            exit 1
          fi
          echo "OK: PR source is '$SOURCE'"
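
The rule this workflow enforces is small enough to state as a pure function, which can be handy when reasoning about which branch names pass. The sketch below mirrors the shell condition; `check_pr_source` is a hypothetical name and is not used by the workflow.

```rust
/// Mirror of the workflow's condition: a PR to master is accepted only when
/// its source branch is `develop` or starts with `release/`.
fn check_pr_source(source: &str) -> Result<(), String> {
    if source == "develop" || source.starts_with("release/") {
        Ok(())
    } else {
        Err(format!(
            "PRs to master must come from develop or release/* branches. Got: {}",
            source
        ))
    }
}
```

Note that a branch named `release` (without the trailing slash) or `develop-hotfix` is rejected, which matches the `^release/` regex and the exact string comparison in the script.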
62 changes: 61 additions & 1 deletion AGENTS.md
@@ -26,7 +26,10 @@ Add symbol-aware reference lookups to codesearch via `find_impact` MCP tool. Ret
- **Gated integration test** — `csharp_helper_integration` cargo feature for full-pipeline testing
- **CI** — separate `csharp-integration-tests` job in `.github/workflows/ci.yml`
- **Sequential phase-2 startup** — Phase 1 warms repos sequentially, Phase 2 runs gated C# SCIP rebuilds ordered by `last_changed_unix` under `Semaphore(concurrency)` via `CSHARP_SCIP_CONCURRENCY` env (default **2**, clamp [1,4])
- **`repos_meta` tracking** — `RepoMeta` (last_changed_unix, last_scip_indexed_unix) persisted in `repos.json` with debounced save (10s window)
- **`repos_meta` tracking** — `RepoMeta` (last_changed_unix, last_scip_indexed_unix, git_remote) persisted in `repos.json` with debounced save (10s window)
- **Stale-path resilience** — a renamed/moved indexed folder no longer crashes serve. `git_remote` (`remote.origin.url`) is captured at registration; on startup `ServeState::reconcile_all_paths()` best-effort relocates a missing repo by scanning the nearest existing ancestor (bounded depth, env `CODESEARCH_RELOCATE_MAX_DEPTH`, default 3) for a git root with a matching remote — exactly one match rewrites `repos.json`, otherwise warn + skip. Phase-2/Phase-3 also guard `path.exists()`. Manual cleanup via **`codesearch index prune`** (relocate-first, else unregister stale aliases)
- **Alias is always derived** — the user-settable `--alias`/`-a` flag was removed from `index add`; the alias always equals the (sanitized) directory name via `ReposConfig::register()`. The alias remains the internal identifier (repos.json key, groups, `project` arg); only user override is gone. The `index symbol <alias>` positional is a lookup key and is retained
- **Hand-edited `repos.json` tolerated** — `ReposConfig::reconcile()` runs in-memory on every load: drops empty-alias entries, drops orphan `repos_meta`, prunes group members referencing unknown aliases and empty groups. Never renames valid aliases, never crashes
- **TUI C# indicator** — in status column: green `C#·` ready, yellow `C#…` indexing, red `C#!` error; footer shows helper availability; Calls column with tool call count
- **Phase 2 & 3 TUI feedback** — Phase 2 pre-marks all queued candidates as `C#…` immediately on discovery (before semaphore slot); Phase 3 pre-warm sets `csharp_index_status = Indexing` before `batch-find-refs` and restores `Ready` after — TUI shows `C#…` throughout without touching `active_reindexes` (avoids blocking HTTP /reindex)
- **Selective ref cache invalidation** — incremental rebuilds only purge cached refs for affected symbols, not entire cache
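
The clean-up rules listed for `ReposConfig::reconcile()` (drop empty aliases, drop orphan `repos_meta`, prune unknown group members and empty groups) can be sketched on a simplified structure. `ReposSketch` and its field types are assumptions made for illustration; the real `ReposConfig` layout is not shown in this change.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Simplified, hypothetical stand-in for the on-disk repos.json model.
#[derive(Debug, Default)]
struct ReposSketch {
    repos: BTreeMap<String, String>,       // alias -> path
    repos_meta: BTreeMap<String, u64>,     // alias -> last_changed_unix
    groups: BTreeMap<String, Vec<String>>, // group -> member aliases
}

impl ReposSketch {
    /// In-memory clean-up that never renames valid aliases and never fails.
    fn reconcile(&mut self) {
        // 1. Drop entries whose alias is empty (or whitespace only).
        self.repos.retain(|alias, _| !alias.trim().is_empty());

        // 2. Drop metadata that refers to an alias that no longer exists.
        let known: BTreeSet<String> = self.repos.keys().cloned().collect();
        self.repos_meta.retain(|alias, _| known.contains(alias));

        // 3. Prune group members that reference unknown aliases,
        //    then remove groups that ended up empty.
        for members in self.groups.values_mut() {
            members.retain(|m| known.contains(m));
        }
        self.groups.retain(|_, members| !members.is_empty());
    }
}
```

Because every step only removes entries, running the function twice gives the same result as running it once, which suits a routine that executes on every load.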
@@ -176,8 +179,44 @@ Debugging this required reading serve logs — no user-visible indication that D

---

## ⚠️ Canonical Path Policy — MANDATORY

**On Windows `Path::canonicalize()` returns `\\?\C:\...`, i.e. the path with the extended-length
(verbatim) `\\?\` prefix, which this codebase refers to as the UNC prefix.
Using this prefix in `.join()`, `Path::exists()`, or storing it in `repos.json` is
unreliable and has caused multiple recurring bugs.**

### Rule: NEVER call `.canonicalize()` directly. Always use `safe_canonicalize()`.

```rust
// ❌ FORBIDDEN everywhere in the codebase
let p = path.canonicalize()?;

// ✅ REQUIRED — central entry point, strips \\?\ prefix
use crate::cache::safe_canonicalize;
let p = safe_canonicalize(path)?;
```

`safe_canonicalize` is defined in `src/cache/file_meta.rs` and exported via `crate::cache`.
It calls `canonicalize()` then `strip_unc_prefix()` and returns the same `io::Result` type.

If you need to strip the prefix from an already-canonicalized `PathBuf`, use `strip_unc_prefix(path)`.

The regression tests are `safe_canonicalize_on_existing_dir_returns_plain_path` in
`src/cache/file_meta.rs` and `register_strips_unc_prefix_from_stored_path` in
`src/db_discovery/repos.rs`. These tests do not catch new violations: if you add a raw
`canonicalize()` call that bypasses this rule, they will still pass, but a path-operation bug
will manifest at runtime on Windows.
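
For orientation, a wrapper of this shape could look like the sketch below. It is not a copy of the code in `src/cache/file_meta.rs`: the bodies are assumptions based only on the description above (canonicalize, then strip the prefix, same `io::Result` type), and the handling of the `\\?\UNC\` form is an extra assumption.

```rust
use std::io;
use std::path::{Path, PathBuf};

/// Remove a leading `\\?\` from a path. Assumption: `\\?\UNC\server\share`
/// is mapped back to `\\server\share`; other paths are returned unchanged.
fn strip_unc_prefix(path: &Path) -> PathBuf {
    let text = path.to_string_lossy();
    if let Some(rest) = text.strip_prefix(r"\\?\UNC\") {
        PathBuf::from(format!(r"\\{}", rest))
    } else if let Some(rest) = text.strip_prefix(r"\\?\") {
        PathBuf::from(rest)
    } else {
        path.to_path_buf()
    }
}

/// Canonicalize, then strip the prefix; errors from `canonicalize()`
/// (for example a missing path) are passed through unchanged.
fn safe_canonicalize(path: &Path) -> io::Result<PathBuf> {
    path.canonicalize().map(|p| strip_unc_prefix(&p))
}
```

On non-Windows platforms `canonicalize()` never produces the prefix, so under these assumptions the wrapper behaves like a plain `canonicalize()` there.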

---

## Remaining work

- [ ] **Warmup blocks tokio runtime** — `perform_incremental_refresh_with_stores` in
`src/index/manager.rs` does synchronous file I/O (`FileWalker::walk`, `FileMetaStore::load_or_create`,
file hashing) and CPU-intensive embedding directly on the async executor without `spawn_blocking`.
During Phase 1 warmup of 15+ repos this starves the tokio threadpool, causing `/health` to time out.
Mitigation already in place: the CLI waits up to ~2 min for serve to become ready before refusing.
Real fix: wrap the sync-I/O-heavy sections inside `tokio::task::spawn_blocking` so the executor
stays responsive. This is a non-trivial refactor (the fn is async and takes `&SharedStores`).
- [ ] Verify on live large repo: 1st `find_impact` call triggers lazy find-refs, 2nd+ call < 100ms (cache hit)
- [ ] CI green on `csharp-integration-tests` job *(first run after push)*
- [ ] Minor: warn if `--filter-project` passed to `find-refs` CLI (currently silently ignored)
@@ -231,6 +270,27 @@ LMDB **does not allow** two `EnvOpenOptions::open()` handles on the same directo

---

## Release workflow — `/merge` and `/release`

Two committed Claude Code slash commands codify the release process
(`.claude/commands/merge.md`, `.claude/commands/release.md`; force-added past `.gitignore`).

- **`/merge`** — land the current feature branch on `develop`: README/CHANGELOG freshness
checks → commit → `cargo fmt`/`check`/`clippy` → push → PR to `develop` → `gh pr merge --auto --squash`
(lands after CI). Does **not** tag.
- **`/release`** — `/merge`, then promote `develop` → `master` via a `Release vX.Y.Z` PR
(`protect-master.yml` allows PRs to `master` only from `develop` or `release/*`), then push
the `vX.Y.Z` tag that triggers `.github/workflows/release.yml` (6 archives, plain +
`-with-csharp`). Includes an optional post-release `master → develop` sync.

**Version rule (encoded in the commands):** the `pre-commit` hook bumps the patch (+1) and
rebuilds **only on `feature/*` | `features/*` | `fix/*` branches**; `develop`/`master`/`release`/
`chore` get `cargo fmt` only. So the release version is fixed at the feature-branch commit and
carries forward unchanged through develop, master, and the tag. `/merge` therefore aborts unless
run from a feature/fix branch.

---

## Live Test Report — 2026-05-08

**Version**: codesearch v1.0.93+416