-
-

-
-
-

-
+
+ {/* ── Manifesto ── */}
+
+
Reason, don't vector.
+
+ Knowing by reasoning, not vectors.
+
-
Knowing by reasoning, not vectors.
-
- Deep and reliable. Vectorless plays nicely with your documents.
- Ask questions in plain language; get answers by reasoning with Vectorless.
-
-
Installation
-
- Install using pip install -U vectorless. For more details,
- see the{' '}
- Installation section in the
- documentation.
-
+ {/* ── The Problem ── */}
+
+
+ Deep and reliable. Vectorless plays nicely with your documents.
+ Ask questions in plain language; get answers by reasoning.
+
+
-
A Simple Example
-
- {CODE_EXAMPLE}
-
+ {/* ── Two Products ── */}
+
+
-
Help
-
- See{' '}
- documentation for more
- details.
-
+
+
Core Engine
+
vectorless
+
+ A reasoning-based document understanding engine for AI.
+ Compile documents into a rich IR, query with an agent that navigates and reasons.
+ Zero embedding dependency.
+
+
+ For AI engineers building retrieval systems.
+
+
+ pip install vectorless
+
+
+ Documentation
+ GitHub
+
+
-
Contributing
-
- Contributions welcome! See{' '}
-
- Contributing
- {' '}
- for setup and guidelines.
-
+
+
Application
+
vectorless-code
+
+ AI code search for your entire codebase.
+ CLI + MCP server that plugs into Cursor, Claude Code, or any AI coding tool.
+ No vector DB, no embedding model — just compile and search.
+
+
+ For developers who search code every day.
+
+
+ pip install vectorless-code
+
+
+ Learn more
+ GitHub
+
+
-
License
-
Apache License 2.0
+
+
+
+ {/* ── How It Works ── */}
+
+ How it works
+
+
+
1
+
+ Compile.{' '}
+ Parse your documents (or codebase) into a rich intermediate representation —
+ a navigable tree with keyword indexes, routing tables, and evidence scores baked in. No LLM required.
+
+
+
+
2
+
+ Reason.{' '}
+ An AI agent navigates the tree like a human expert —
+ ls to explore, cd to dive deeper,
+ cat to read, find to search.
+ It reasons about which path leads to the answer.
+
+
+
+
3
+
+ Answer.{' '}
+ The agent collects evidence with full source attribution —
+ section title, node path, line numbers. Every claim is traceable.
+
+
+
+
+
+ {/* ── Open Source ── */}
+
+
+
+
+ GitHub
+
+
+ Get Started
+
+
+
-
diff --git a/docs/vectorless-code.md b/docs/vectorless-code.md
new file mode 100644
index 0000000..76e1bb0
--- /dev/null
+++ b/docs/vectorless-code.md
@@ -0,0 +1,302 @@
+# vectorless-code: Code Search Based on Tree Traversal
+
+## 1. Analysis of Existing Tools
+
+### cocoindex-code
+
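+A **semantic code search engine** for AI coding assistants.
+
+```
+Source → chunking (~1000 chars) → embedding vectors → sqlite-vec
+Query → query embedding → cosine similarity → top-k code chunks
+```
+
+- Dependencies: an embedding model + a vector database
+- Search latency: ~100 ms
+- Strong at: semantic similarity matching ("login" can match "authenticate")
+- Weak at: queries that need complex reasoning ("how does the authentication flow work?")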
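
The query path above (query embedding → cosine similarity → top-k) can be illustrated with a minimal sketch. The vectors below are made-up toy values, not real embeddings, and the in-memory dict stands in for sqlite-vec. The names `cosine_similarity`, `top_k`, `CHUNKS`, and `QUERY` are hypothetical and are not part of cocoindex-code.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, chunk_vecs, k=2):
    """Return the ids of the k chunks most similar to the query, best first."""
    scored = [(cosine_similarity(query_vec, vec), chunk_id)
              for chunk_id, vec in chunk_vecs.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk_id for _, chunk_id in scored[:k]]


# Toy 3-dimensional "embeddings" (assumed values for illustration only).
CHUNKS = {
    "auth.py:authenticate": [0.9, 0.1, 0.0],
    "db.py:connect": [0.0, 0.2, 0.9],
    "auth.py:login_form": [0.7, 0.6, 0.1],
}
QUERY = [1.0, 0.2, 0.0]  # pretend embedding of the query "login"

if __name__ == "__main__":
    print(top_k(QUERY, CHUNKS, k=2))
```

The ranking only measures closeness in vector space, which is why this style of search handles synonyms well but cannot follow a multi-step question across files.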
+
+### codeindex
+
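+A code search tool that follows the same approach as vectorless (implemented in TypeScript).
+
+```
+Source → parse symbols → build tree (Project>Module>File>Symbol) → LLM generates summaries
+Query → LLM traverses level by level (module→file→symbol, 3 calls) → return code
+```
+
+- Dependencies: an LLM (no embeddings, no vector DB)
+- Indexing speed: medium (the LLM generates a summary for each level)
+- Search latency: ~5-10 s (3 LLM calls)
+- Strong at: precise localization (the LLM selects nodes based on semantic understanding)
+- Demonstrates that the "slow but accurate" approach is viable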
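
As a rough sketch of the level-by-level traversal described above, the snippet below walks a Project>Module>File>Symbol tree and calls a selector once per level. The selector here is a plain word-overlap function standing in for the LLM, and the node layout (`name`, `summary`, `children`) is an assumption made for illustration, not codeindex's actual data model.

```python
def node(name, summary, children=()):
    """Build a tree node; the dict layout is hypothetical."""
    return {"name": name, "summary": summary, "children": list(children)}


def keyword_selector(query, candidates):
    """Stand-in for the LLM: keep candidates whose summary shares a word with the query."""
    words = set(query.lower().split())
    return [c for c in candidates if words & set(c["summary"].lower().split())]


def traverse(root, query, select, levels=3):
    """Narrow the tree one level at a time: module -> file -> symbol."""
    candidates = list(root["children"])
    chosen = []
    for _ in range(levels):
        chosen = select(query, candidates)  # one selector (LLM) call per level
        if not chosen:
            return []
        candidates = [child for n in chosen for child in n["children"]]
        if not candidates:  # reached the leaves
            break
    return [n["name"] for n in chosen]


PROJECT = node("project", "demo project", [
    node("auth", "authentication and session handling", [
        node("auth/login.py", "login and authentication entry points", [
            node("authenticate", "verify credentials for authentication"),
            node("render_form", "draw the html form"),
        ]),
        node("auth/tokens.py", "token signing helpers", [
            node("sign", "sign a token"),
        ]),
    ]),
    node("storage", "database access layer", [
        node("storage/db.py", "connection pool", [node("connect", "open a connection")]),
    ]),
])

if __name__ == "__main__":
    print(traverse(PROJECT, "authentication logic", keyword_selector))
```

Swapping `keyword_selector` for a function that prompts an LLM with the candidate summaries would give the three-call behavior the list above describes.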
+
+---
+
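+## 2. The vectorless-code Design
+
+### Core Idea
+
+Reuse vectorless's compilation pipeline and tree structure to implement a three-tier query strategy:
+
+| Mode | Method | Latency | Use cases |
+|---|---|---|---|
+| **Fast** | ReasoningIndex keyword matching | ~10 ms | Exact queries (function names, variable names) |
+| **Standard** | codeindex-style level-by-level traversal (3 LLM calls) | ~5 s | Semantic queries ("where is the authentication logic?") |
+| **Deep** | Worker Agent reasoning-based navigation | ~30 s | Complex queries ("how does the authentication flow work?") |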
+
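+### Query Flow
+
+```
+Query "authentication logic"
+ │
+ ├─ Step 1: Keyword matching (~10 ms)
+ │ extract_keywords → look up ReasoningIndex
+ │ Hit → return nodes, done
+ │
+ ├─ Step 2: Level-by-level traversal (~5 s, 3 LLM calls)
+ │ Level 1: "Which of these 8 directories are relevant?" → LLM picks 2-3
+ │ Level 2: "Which of these 20 files are relevant?" → LLM picks 3-5
+ │ Level 3: "Which of these code chunks are relevant?" → LLM picks 5-10
+ │ → return, done
+ │
+ └─ Step 3: Worker reasoning (~30 s, 6-15 LLM calls)
+ Full ls/cd/cat/find/grep navigation
+ → return evidence with source attribution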
+```
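
The flow above can be sketched as a simple tiered dispatcher. This is an illustration under stated assumptions, not the vectorless-code implementation: the ReasoningIndex is modeled as a plain dict from keyword to node ids, the `extract_keywords` body and its stop-word list are invented, the Standard and Deep tiers are passed in as callables, and the rule "escalate when the previous tier returns nothing" is an assumption, since the flow does not spell out when Step 3 is triggered.

```python
# Assumed stop-word list; the real extract_keywords may work differently.
STOP_WORDS = {"the", "a", "is", "where", "how", "does", "in"}


def extract_keywords(query):
    """Lowercase the query and drop stop words (hypothetical implementation)."""
    return [w for w in query.lower().split() if w not in STOP_WORDS]


def search(query, keyword_index, traverse_fn, worker_fn):
    """Try the cheapest tier first; return (tier_name, results)."""
    # Tier 1 (Fast): keyword lookup, no LLM involved.
    hits = []
    for keyword in extract_keywords(query):
        for node_id in keyword_index.get(keyword, []):
            if node_id not in hits:
                hits.append(node_id)
    if hits:
        return "fast", hits

    # Tier 2 (Standard): level-by-level traversal, supplied by the caller.
    layered = traverse_fn(query)
    if layered:
        return "standard", layered

    # Tier 3 (Deep): full agent navigation, supplied by the caller.
    return "deep", worker_fn(query)


if __name__ == "__main__":
    index = {"authenticate": ["auth.py::authenticate"]}
    print(search("where is authenticate", index, lambda q: [], lambda q: []))
```

Keeping the two LLM-backed tiers behind callables means the cheap keyword path can be tested and used without any model configured.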
+
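+### Comparison of the Three Tools
+
+| | cocoindex-code | codeindex | vectorless-code |
+|---|---|---|---|
+| **Method** | Embedding vector search | LLM level-by-level traversal | Keywords + level-by-level traversal + Worker |
+| **Dependencies** | Embedding model + vector DB | LLM only | LLM only (Fast mode needs no LLM at all) |
+| **Indexing** | Slow (computes embeddings) | Medium (LLM generates summaries) | Fast (Fast compilation uses 0 LLM calls) |
+| **Search latency** | ~100 ms | ~5-10 s | ~10 ms / ~5 s / ~30 s |
+| **Semantic understanding** | Good (vector semantics) | Good (LLM understanding) | Good (LLM understanding) |
+| **Deep queries** | Not supported | Limited (3-level traversal) | Supported (Worker reasoning) |
+| **Exact matching** | Fair (fuzzy) | Good (LLM selection) | Good (exact keywords + LLM selection) |
+| **Language support** | All languages | 9 languages (via language adapters) | All languages (generic chunking) |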
+
+### Architecture
+
+```
+Source files (*.rs, *.py, *.ts, ...)
+ │
+ ▼
+┌──────────────────────────────────────────┐
+│ Code Parser (generic chunking)           │
+│ file → Vec