Local semantic search for AI agents and humans.
Quarry indexes documents in 20+ formats, embeds them with a local ONNX model (snowflake-arctic-embed-m-v1.5, 768-dim), stores vectors in LanceDB, and serves semantic search to Claude Code, Claude Desktop, and the CLI. Everything runs locally — no API keys, no cloud accounts. The embedding model (~120 MB int8) downloads once on first use. CUDA GPUs are auto-detected for faster inference.
Platforms: macOS, Linux
```sh
curl -fsSL https://raw.githubusercontent.com/punt-labs/quarry/1961675/install.sh | sh
```

Restart Claude Code, then:

```
> /ingest report.pdf                               # index a document (runs in background)
> /quarry status                                   # after a moment, confirm it's there
> /find "what does the report say about margins"   # search by meaning
```
Once installed, a plugin hook auto-indexes your current project directory on every session start — you don't need to /ingest your codebase manually.
Manual install (if you already have uv):

```sh
uv tool install punt-quarry
quarry install
quarry doctor
```

Verify before running:

```sh
curl -fsSL https://raw.githubusercontent.com/punt-labs/quarry/1961675/install.sh -o install.sh
shasum -a 256 install.sh
cat install.sh
sh install.sh
```

Run quarry on a GPU server and connect from any Mac or Linux client over TLS.
Server (GPU host, serves remote clients):

```sh
export QUARRY_API_KEY=$(openssl rand -hex 32)
curl -fsSL https://raw.githubusercontent.com/punt-labs/quarry/1961675/install.sh | sh -s -- --network
```

Generates TLS certificates, binds the daemon to 0.0.0.0, registers a systemd service, and prints a CA fingerprint. NVIDIA GPUs are auto-detected for CUDA inference.
Client (connects to remote server):

```sh
curl -fsSL https://raw.githubusercontent.com/punt-labs/quarry/1961675/install.sh | sh
quarry login <server-hostname> --api-key <token>
```

No special flag is needed: the default install runs a local daemon on localhost, and quarry login redirects queries to the remote server over wss:// with TOFU certificate pinning.
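The TOFU (trust-on-first-use) scheme behind quarry login can be sketched in a few lines of Python. This is an illustration of the general technique, not quarry's actual implementation; the function and pin-store shape here are made up:

```python
import hashlib

def check_pin(pins: dict, host: str, cert_der: bytes) -> bool:
    """Trust-on-first-use: remember the certificate fingerprint seen on
    first contact with a host, then reject any later connection whose
    certificate fingerprint differs."""
    fingerprint = hashlib.sha256(cert_der).hexdigest()
    if host not in pins:
        pins[host] = fingerprint  # first contact: pin this certificate
        return True
    return pins[host] == fingerprint  # later contacts must match the pin

pins = {}
assert check_pin(pins, "okinos.local", b"cert-A")   # first use: pinned
assert check_pin(pins, "okinos.local", b"cert-A")   # same cert: accepted
assert not check_pin(pins, "okinos.local", b"cert-B")  # changed cert: rejected
```

The trade-off is the usual one for TOFU: the first connection is taken on faith, which is why the server install prints a CA fingerprint you can verify out of band.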
Download punt-quarry.mcpb and double-click to install. Alternatively, quarry install configures Claude Desktop automatically.
Note: Uploaded files in Claude Desktop live in a sandbox that quarry cannot access. Use remember for uploaded content, or provide local file paths to ingest.
- 20+ formats — PDFs (with OCR for scanned pages), source code (AST-aware splitting), spreadsheets, presentations, HTML, Markdown, LaTeX, DOCX, images
- Semantic search — retrieval is by meaning, not keyword. A query about "margins" finds passages about profitability even if they never use that word
- Daemon architecture — one `quarry serve` process loads the embedding model once and serves all Claude Code sessions via mcp-proxy over WebSocket
- Passive knowledge capture — SessionStart hook auto-indexes the working directory, PostToolUse hook auto-ingests fetched URLs, PreCompact hook captures transcripts before context compaction
- Named databases — isolated LanceDB directories with independent sync registries. Switch with `use` for work/personal separation
- Research agent — `researcher` subagent combines quarry local search with web research, auto-ingests valuable findings
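"Search by meaning" ultimately reduces to comparing embedding vectors, typically by cosine similarity. A toy sketch, using made-up 3-dimensional vectors in place of the real 768-dimensional embeddings:

```python
import math

def cosine(a, b):
    # dot product divided by the product of the vector norms
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Illustrative vectors only: a query about "margins" lands closer to a
# profitability passage than to an unrelated one, despite sharing no words.
query         = [0.9, 0.1, 0.2]
profitability = [0.8, 0.2, 0.1]
unrelated     = [0.1, 0.9, 0.4]
assert cosine(query, profitability) > cosine(query, unrelated)
```

The similarity scores shown in /find output (e.g. 0.4521) are of this kind: higher means the query and chunk embeddings point in more similar directions.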
```
> /ingest report.pdf
▶ Ingesting report.pdf (background)

> /quarry
▶ Database: default
  Documents: 47
  Chunks: 1,203
  Size: 12.4 MB
  Model: snowflake-arctic-embed-m-v1.5 (768-dim)

> /find "what were the Q3 revenue figures"
▶ [report.pdf p.12 | text/.pdf] (similarity: 0.4521)
  Third quarter revenue reached $142M, up 18% year-over-year,
  driven primarily by expansion in the enterprise segment.
  Gross margins improved to 71% from 68% in Q2.
```
| Command | What it does |
|---|---|
| `/ingest <source>` | Ingest a URL, directory, or file |
| `/remember <name>` | Ingest inline text under a document name |
| `/find <query>` | Semantic search. Questions get synthesized answers; keywords get raw results |
| `/explain <topic>` | Search and synthesize an explanation |
| `/source <claim>` | Find which document a claim comes from |
| `/quarry [sub]` | Manage: status, sync, collections, databases, registrations |
| Tool | Purpose | Execution |
|---|---|---|
| `ingest` | Index a file or URL | Background |
| `remember` | Index inline text | Background |
| `register_directory` | Register directory for sync | Background |
| `sync_all_registrations` | Re-index all registered directories | Background |
| `find` | Semantic search with filters | Sync |
| `show` | Document metadata or page text | Sync |
| `list` | Documents, collections, databases, registrations | Sync |
| `status` | Database statistics | Sync |
| `delete` | Remove document or collection | Background |
| `deregister_directory` | Remove registration | Background |
| `use` | Switch active database | Sync |
```sh
quarry ingest report.pdf                         # index a file
quarry ingest https://example.com                # index a webpage
echo "notes" | quarry remember --name notes.md   # index inline text
quarry find "revenue trends"                     # hybrid search (vector + FTS)
quarry list documents                            # list indexed documents
quarry register ~/Documents/notes                # watch a directory
quarry sync                                      # re-index registered dirs
quarry use work                                  # switch database
quarry status                                    # database dashboard
quarry doctor                                    # health check
quarry serve                                     # start daemon on :8420
quarry install                                   # set up daemon, TLS certs, mcp-proxy

# Remote connections
quarry login okinos.local --api-key <token>      # TOFU login to remote server
quarry logout                                    # disconnect, revert to local daemon
quarry remote list --ping                        # show remote config and health

# Agent memory tagging
quarry ingest notes.md --agent-handle claude --memory-type fact
quarry find "deployment steps" --agent-handle claude
echo "key insight" | quarry remember --name insight.md --agent-handle claude \
  --memory-type observation --summary "Key insight from review"
```

Quarry works with zero configuration. These environment variables are available for customization:
| Variable | Default | Description |
|---|---|---|
| `QUARRY_PROVIDER` | (auto) | ONNX execution provider: cpu, cuda, or unset (auto-detect) |
| `QUARRY_API_KEY` | (none) | Bearer token for quarry serve |
| `QUARRY_ROOT` | `~/.punt-labs/quarry/data` | Base directory for all databases |
| `CHUNK_MAX_CHARS` | 1800 | Max characters per chunk (~450 tokens) |
| `CHUNK_OVERLAP_CHARS` | 200 | Overlap between consecutive chunks |
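How the two chunking variables interact can be sketched with a simple sliding-window splitter. This is an illustration of the max-chars/overlap relationship only; quarry's actual splitter is format-aware (AST-based for code, page-aware for PDFs):

```python
def chunk(text: str, max_chars: int = 1800, overlap: int = 200) -> list[str]:
    """Split text into windows of up to max_chars, with the last `overlap`
    characters of each chunk repeated at the start of the next, so that
    sentences spanning a boundary still appear whole in one chunk."""
    step = max_chars - overlap  # how far each window advances
    return [text[i:i + max_chars]
            for i in range(0, max(len(text) - overlap, 1), step)]

# Small numbers for readability: 50 chars, 20-char windows, 5-char overlap.
chunks = chunk("a" * 50, max_chars=20, overlap=5)
assert len(chunks) == 3 and all(len(c) == 20 for c in chunks)
assert chunk("short text") == ["short text"]  # fits in one chunk
```

Raising CHUNK_OVERLAP_CHARS trades index size for fewer sentences cut at chunk boundaries; raising CHUNK_MAX_CHARS packs more context per embedded chunk.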
For the full configuration reference, see Architecture section 7.
Beyond explicit /ingest and /find commands, quarry runs as a Claude Code plugin with hooks that capture knowledge automatically during your sessions:
| Hook | When it fires | What it does |
|---|---|---|
| Session start | On every session start | Auto-registers your project directory and syncs it in the background. Your codebase is searchable without manual ingestion. |
| Web fetch | After any WebFetch tool call | URLs Claude fetches during research are auto-ingested into a web-captures collection. Reuses already-retrieved content when available, falls back to URL ingest otherwise. |
| Pre-compact | Before context compaction | Captures the conversation transcript into a session-notes collection. Discoveries that would be lost when the context window shrinks are preserved as searchable chunks. |
All hooks are fail-open — failures are ignored and never block Claude Code. Each hook is individually toggleable via .punt-labs/quarry/config.md YAML frontmatter. See AGENTS.md for the full integration model.
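As an illustration only, a per-hook toggle in the `.punt-labs/quarry/config.md` frontmatter might look like the following. The key names here are assumptions, not the documented schema; AGENTS.md defines the actual keys:

```markdown
---
hooks:
  session_start: true   # auto-register and sync the project directory
  web_fetch: false      # disable auto-ingest of fetched URLs
  pre_compact: true     # keep transcript capture before compaction
---
```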
Quarry runs as a daemon. Claude Code sessions connect through mcp-proxy:
```
               stdio                           wss:// (TLS)
Claude Code <-----------------> mcp-proxy <---------------------> quarry serve
            MCP JSON-RPC        (~5 MB Go)    pinned CA cert      (one daemon)
```
Without the proxy, every session spawns a separate Python process, each loading the embedding model into ~200 MB of RAM. With it, startup is instant and state is shared across all sessions. All connections use TLS with a self-signed CA — even on localhost.
quarry install downloads mcp-proxy (SHA256-verified, correct platform) and configures MCP clients.
Architecture | Z Specification | Design | Agents | Changelog
```sh
uv sync       # install dependencies
make check    # run all quality gates (lint, type, test)
make test     # test suite only
make format   # auto-format code
make docs     # build LaTeX documents
```

MIT