Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
89 changes: 89 additions & 0 deletions .claude/commands/merge.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,89 @@
---
description: Land the current feature branch on develop — README/CHANGELOG checks, commit, push, PR, auto-merge
argument-hint: [optional PR title]
allowed-tools: Bash(git:*), Bash(gh:*), Bash(cargo:*), Bash(grep:*), Read, Edit, Grep, Glob
---

# /merge — land the current feature branch on `develop`

Run the project's **merge workflow**: verify docs are current, then bring the current
feature branch into `develop` through a pull request. This command does **not** tag a
release — tagging happens only in `/release`.

## Branch & version facts (this repo)
- Flow: `feature/*` | `features/*` | `fix/*` → PR → **`develop`** → (later) PR → **`master`**.
- `master` is protected (`.github/workflows/protect-master.yml`): it accepts PRs only from
`develop` or `release/*`.
- The pre-commit hook **bumps the patch version (+1) and rebuilds the binary on feature
branches only** (`feature/*`, `features/*`, `fix/*`). On `develop`/`master`/`release`/`chore`
it runs `cargo fmt` only — no bump. So **the feature-branch commit here fixes the release
version**; it carries forward unchanged through develop, master, and the tag.

## Guardrails
- ABORT unless the current branch matches `feature/*`, `features/*`, or `fix/*` — i.e. the
branches the pre-commit hook version-bumps. Never run from `develop`, `master`, `release/*`,
or `chore/*`: on those the hook does **not** bump, so the version/CHANGELOG premise below
would silently break.
- NEVER push directly to `develop` or `master` — everything lands via a PR.
- NEVER pass `--no-verify` / `--no-gpg-sign` — let the pre-commit hook run (it bumps + rebuilds).
- Do NOT create or push a tag here. That is `/release`'s job.
- Do NOT force-push.

## Steps

1. **Context**
- `git rev-parse --abbrev-ref HEAD` → current branch. If it is NOT `feature/*`, `features/*`,
or `fix/*`, STOP with an error (see Guardrails).
- `git fetch origin`.
- Compute the change set landing on develop: `git log origin/develop..HEAD --oneline`
plus `git status --short` for uncommitted work. If there is nothing to land, report and STOP.

2. **README up to date?**
- Inspect the change set for user-facing changes: new/removed CLI flags or subcommands,
behavior changes, new env vars, new supported languages, new MCP tools.
- Compare against `README.md`. If anything is missing, wrong, or stale, **UPDATE `README.md`**
so it matches reality. Keep examples free of hardcoded config strings (per CLAUDE.md).
- If README already matches, state that and move on.

3. **CHANGELOG up to date?**
- Ensure `CHANGELOG.md` has an entry for this change under a `## [X.Y.Z] - YYYY-MM-DD`
heading with `Added` / `Changed` / `Fixed` subsections describing every user-facing change.
- **Version for the heading**: the hook bumps the patch by +1 on **every** feature-branch
commit where the working-tree version still equals HEAD's. The most reliable approach is to
land this branch in a **single commit** — then the heading version = current
`Cargo.toml` version + 1 (`grep -m1 '^version' Cargo.toml`). If you commit more than once,
the version advances once per commit; after the final commit, read the actual
`Cargo.toml` version and make sure the CHANGELOG heading matches it (fix it if not).
- Use today's date. If an accurate entry already exists for the pending version, leave it.

4. **Commit**
- Stage code + doc changes (`git add -A`, plus `git add -f` for any tracked-but-gitignored file).
- Commit with a clear, scoped message. End the message with:
`Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>`
- Let the pre-commit hook finish (fmt → version bump → rebuild). This can take 60–120s.

5. **Validate** (fast loop, per CLAUDE.md — do NOT run `--release`):
- `cargo fmt --all -- --check`
- `cargo check --all-targets`
- `cargo clippy --all-targets -- -D warnings`
- Fix any failures and commit again before pushing. Never push code that fails these.

6. **Push**
- `git push -u origin HEAD`.

7. **Open PR → develop**
- `gh pr create --base develop --head <branch> --title "<title>" --body "<body>"`.
- Title: use `$ARGUMENTS` if provided; otherwise summarize the branch concisely.
- Body: bullet summary of changes; end with:
`🤖 Generated with [Claude Code](https://claude.com/claude-code)`.
- Capture the PR number for the next step:
`PR=$(gh pr view --json number --jq .number)`.

8. **Auto-merge after CI**
- `gh pr merge "$PR" --auto --merge` so the PR lands automatically once required checks pass.
- If auto-merge is not enabled on the repo (command errors), fall back: poll
`gh pr checks "$PR" --watch`, then `gh pr merge "$PR" --merge` once green.

## Report
Branch, pending release version, doc updates made, PR URL, and merge status
(auto-merge enabled / merged).
63 changes: 63 additions & 0 deletions .claude/commands/release.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,63 @@
---
description: Cut a release — run /merge (feature → develop), then promote develop → master and push the version tag
argument-hint: [optional PR/release title]
allowed-tools: Bash(git:*), Bash(gh:*), Bash(cargo:*), Bash(grep:*), Read, Edit, Grep, Glob
---

# /release — full release: land on `develop`, promote to `master`, tag

This is `/merge` **plus** the `develop → master` promotion and the version-tag push that
triggers the build/publish pipeline.

## Branch & version facts (this repo)
- Flow: `feature/*` → PR → **`develop`** → PR → **`master`** → push tag `vX.Y.Z`.
- `master` is protected: PRs to it may come **only** from `develop` or `release/*`
(`.github/workflows/protect-master.yml`).
- Pushing a `vX.Y.Z` tag triggers `.github/workflows/release.yml` (builds Windows/Linux/macOS
archives, plain + `-with-csharp`, and publishes the GitHub release). **Push the tag only
AFTER the develop→master PR has merged.**
- The version is fixed by the feature-branch commit (the pre-commit hook bumps only on
feature branches). develop/master merges and the tag all carry that same version.

## Guardrails
- NEVER use `--no-verify`. NEVER force-push shared branches.
- Push the tag exactly once, only after master has the release commit.
- If CI fails at any gate, STOP and report — do not promote or tag a red build.

## Part 1 — land on `develop` (the `/merge` workflow)
Execute every step of **`/merge`** (README/CHANGELOG checks → commit → push → PR → auto-merge
to `develop`). Then **wait for the develop PR to actually merge** (auto-merge waits on CI):
- Capture the PR number (`PR=$(gh pr view --json number --jq .number)`), then poll
`gh pr view "$PR" --json state,mergedAt,mergeStateStatus` until `state` is `MERGED`.
- If checks fail, STOP and report. Do not proceed to Part 2.

## Part 2 — promote `develop` → `master`
1. `git fetch origin && git checkout develop && git pull --ff-only origin develop`.
2. Determine the release version: `VERSION=v$(grep -m1 '^version' Cargo.toml | sed -E 's/.*"(.+)".*/\1/')`.
3. Open the release PR (source `develop`, which protect-master allows):
- `gh pr create --base master --head develop --title "Release $VERSION — <summary>" --body "<body>"`.
- Title: prefix `Release $VERSION — ` then a short summary (or `$ARGUMENTS` if provided),
matching history (e.g. `Release v1.0.142 — serve responsive during warmup`).
- Body ends with: `🤖 Generated with [Claude Code](https://claude.com/claude-code)`.
- Capture the PR number: `RELEASE_PR=$(gh pr view develop --json number --jq .number)`.
4. `gh pr merge "$RELEASE_PR" --auto --merge`. Wait until `state` is
`MERGED` (poll as in Part 1). If auto-merge is unavailable, `gh pr checks "$RELEASE_PR" --watch`
then `gh pr merge "$RELEASE_PR" --merge`. If CI fails, STOP.

## Part 3 — tag the release
1. `git fetch origin --tags && git checkout master && git pull --ff-only origin master`.
2. Confirm the version on master matches: `grep -m1 '^version' Cargo.toml` equals `$VERSION` (minus the `v`).
If it does not match, STOP and report (do not guess a tag).
3. Guard against a double release: if `$VERSION` already exists as a tag
(`git tag -l "$VERSION"` non-empty, or `git ls-remote --tags origin "$VERSION"` non-empty),
STOP — the release was already cut.
4. `git tag "$VERSION" && git push origin "$VERSION"` → triggers `release.yml`.
5. Report the pushed tag and remind the user to watch the Actions "Release" run for artifacts.

## Part 4 — keep `develop` in sync (only if needed)
If `master` ended up ahead of `develop` (e.g. a CHANGELOG/version edit merged only on master),
open a sync PR `master → develop` (or fast-forward develop) — matching the repo's post-release
sync convention (e.g. PR #90 "sync: backfill CHANGELOG … from master"). Skip if already in sync.

## Report
develop PR URL, release PR URL, tag pushed (`vX.Y.Z`), final version, and sync action (if any).
21 changes: 21 additions & 0 deletions AGENTS.md
Original file line number Diff line number Diff line change
Expand Up @@ -267,6 +267,27 @@ LMDB **does not allow** two `EnvOpenOptions::open()` handles on the same directo

---

## Release workflow — `/merge` and `/release`

Two committed Claude Code slash commands codify the release process
(`.claude/commands/merge.md`, `.claude/commands/release.md`; force-added past `.gitignore`).

- **`/merge`** — land the current feature branch on `develop`: README/CHANGELOG freshness
checks → commit → `cargo fmt`/`check`/`clippy` → push → PR to `develop` → `gh pr merge --auto`
(lands after CI). Does **not** tag.
- **`/release`** — `/merge`, then promote `develop` → `master` via a `Release vX.Y.Z` PR
(`protect-master.yml` allows PRs to `master` only from `develop` or `release/*`), then push
the `vX.Y.Z` tag that triggers `.github/workflows/release.yml` (6 archives, plain +
`-with-csharp`). Includes an optional post-release `master → develop` sync.

**Version rule (encoded in the commands):** the `pre-commit` hook bumps the patch (+1) and
rebuilds **only on `feature/*` | `features/*` | `fix/*` branches**; `develop`/`master`/`release`/
`chore` get `cargo fmt` only. So the release version is fixed at the feature-branch commit and
carries forward unchanged through develop, master, and the tag. `/merge` therefore aborts unless
run from a feature/fix branch.

---

## Live Test Report — 2026-05-08

**Versie**: codesearch v1.0.93+416
Expand Down
16 changes: 16 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -7,6 +7,22 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0



## [1.0.146] - 2026-06-02

### Added

- **Semantic Markdown chunking** — Markdown files (`.md`, `.markdown`, `.txt`) are
now parsed with the **tree-sitter-md block grammar**, so chunks align to sections,
headings, and code fences instead of arbitrary line ranges. `Language::Markdown`
now reports `supports_tree_sitter() == true` and has a compiled-in grammar.

### Changed

- **Supported-languages documentation corrected** — the README language table now
lists all 15 tree-sitter languages actually supported (Rust, Python, JavaScript,
TypeScript, C, C++, C#, Go, Java, Shell, Ruby, PHP, YAML, JSON, Markdown);
it previously showed only 9, omitting Shell, Ruby, PHP, YAML, JSON, and Markdown.

## [1.0.142] - 2026-06-01

### Fixed
Expand Down
13 changes: 12 additions & 1 deletion Cargo.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

3 changes: 2 additions & 1 deletion Cargo.toml
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
[package]
name = "codesearch"
version = "1.0.142"
version = "1.0.146"
edition = "2021"
authors = ["codesearch contributors"]
license = "Apache-2.0"
Expand Down Expand Up @@ -52,6 +52,7 @@ tree-sitter-ruby = "0.23.1"
tree-sitter-php = "0.24.2"
tree-sitter-yaml = "0.7.2"
tree-sitter-json = "0.24.8"
tree-sitter-md = "0.5.3"

# File handling
ignore = "0.4"
Expand Down
21 changes: 14 additions & 7 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -15,7 +15,7 @@ codesearch gives AI agents (OpenCode, Claude Code, Cursor, and any MCP client) d
- **Multi-repo serve mode**: Fan-out queries across repository groups with cross-repo RRF ranking
- **Hybrid retrieval**: Vector embeddings + BM25 full-text search fused with Reciprocal Rank Fusion
- **Symbol navigation**: Jump to definitions, find usages, trace imports and dependents — in the same tool
- **AST-aware chunking**: Tree-sitter parsing for 9 languages — chunks align to functions/classes, not arbitrary line ranges
- **AST-aware chunking**: Tree-sitter parsing for 15 languages — chunks align to functions/classes (and Markdown sections), not arbitrary line ranges
- **Token-efficient**: Returns metadata by default; agents fetch full code only when needed via `get_chunk`
- **Lightweight footprint**: Hundreds of MB on disk, runs on CPU only, no runtime model downloads (works behind enterprise proxies)
- **Zero config for single repos**: `codesearch index && codesearch mcp` — done
Expand Down Expand Up @@ -410,16 +410,23 @@ Tree-sitter AST-aware chunking:
| Language | Extensions |
|----------|-----------|
| Rust | `.rs` |
| Python | `.py` |
| JavaScript | `.js`, `.jsx` |
| TypeScript | `.ts`, `.tsx` |
| Python | `.py`, `.pyw`, `.pyi` |
| JavaScript | `.js`, `.mjs`, `.cjs` |
| TypeScript | `.ts`, `.tsx`, `.jsx`, `.mts`, `.cts` |
| C | `.c`, `.h` |
| C++ | `.cpp`, `.hpp` |
| C++ | `.cpp`, `.cc`, `.cxx`, `.hpp`, `.hxx` |
| C# | `.cs` |
| Go | `.go` |
| Java | `.java` |

All other text files use line-based chunking as fallback.
| Shell | `.sh`, `.bash`, `.zsh` |
| Ruby | `.rb`, `.rake` |
| PHP | `.php` |
| YAML | `.yaml`, `.yml` |
| JSON | `.json` |
| Markdown | `.md`, `.markdown`, `.txt` |

Markdown uses the tree-sitter-md **block** grammar — chunks align to sections,
headings, and code fences. All other text files use line-based chunking as fallback.

## Core Technology

Expand Down
20 changes: 18 additions & 2 deletions src/chunker/grammar.rs
Original file line number Diff line number Diff line change
Expand Up @@ -72,6 +72,11 @@ impl GrammarManager {
Language::Php => Ok(tree_sitter_php::LANGUAGE_PHP.into()),
Language::Yaml => Ok(tree_sitter_yaml::LANGUAGE.into()),
Language::Json => Ok(tree_sitter_json::LANGUAGE.into()),
// Markdown uses the tree-sitter-md *block* grammar (sections, headings,
// code fences). The inline grammar is intentionally not used: chunk
// boundaries only need block structure, and the block grammar runs on a
// plain `Parser` like every other language here.
Language::Markdown => Ok(tree_sitter_md::LANGUAGE.into()),
_ => Err(anyhow!(
"Language {} does not support tree-sitter",
language.name()
Expand All @@ -96,6 +101,7 @@ impl GrammarManager {
Language::Php,
Language::Yaml,
Language::Json,
Language::Markdown,
]
}

Expand Down Expand Up @@ -251,10 +257,19 @@ mod tests {
}

#[test]
fn test_unsupported_language() {
fn test_load_markdown_grammar() {
let manager = GrammarManager::new();
let grammar = manager.get_grammar(Language::Markdown);

assert!(grammar.is_some());
}

#[test]
fn test_unsupported_language() {
let manager = GrammarManager::new();
// Toml has no compiled-in grammar.
let grammar = manager.get_grammar(Language::Toml);

assert!(grammar.is_none());
}

Expand Down Expand Up @@ -304,6 +319,7 @@ mod tests {
assert!(manager.is_supported(Language::Php));
assert!(manager.is_supported(Language::Yaml));
assert!(manager.is_supported(Language::Json));
assert!(!manager.is_supported(Language::Markdown));
assert!(manager.is_supported(Language::Markdown));
assert!(!manager.is_supported(Language::Toml));
}
}
3 changes: 2 additions & 1 deletion src/chunker/parser.rs
Original file line number Diff line number Diff line change
Expand Up @@ -280,7 +280,8 @@ fn baz() {}
let mut parser = CodeParser::new();
let source = "some code";

let result = parser.parse(Language::Markdown, source);
// Toml has no compiled-in grammar, so parsing must fail.
let result = parser.parse(Language::Toml, source);
assert!(result.is_err());
}

Expand Down
Loading
Loading