Persistent knowledge base for Claude Code. Hooks extract facts, decisions, relationships, and more from conversations, store them in DuckDB with local embeddings, and inject relevant context into future sessions. Knowledge is scoped per git repo with automatic cross-project promotion.
One-command install:
```bash
curl -fsSL https://raw.githubusercontent.com/gilmanb1/ai-memory-db/master/install-remote.sh | bash
```

Or clone and install:

```bash
git clone https://github.com/gilmanb1/ai-memory-db.git
cd ai-memory-db
bash install.sh
```

Prerequisites (install these first):

```bash
curl -LsSf https://astral.sh/uv/install.sh | sh   # Python script runner
ollama pull nomic-embed-text                      # Local embedding model (768-dim)
export ANTHROPIC_API_KEY="sk-ant-..."             # For knowledge extraction
```

Restart Claude Code after installing — memory is now active. No manual steps during normal use.
Platform support: Linux, macOS, Windows (WSL2). Requires bash, python3 >= 3.11, uv, and ollama.
Session Start ──→ Inject long-term facts, guardrails, decisions as system context
Every Prompt ──→ 6-way semantic recall: relevant facts, error solutions, guardrails
Background ──→ Extract knowledge at 40%/70%/90% context usage + session end
Session End ──→ Auto-snapshot DB, final extraction, consolidation, decay pass
The system is fully automatic. Knowledge builds up over time. Short-term facts decay and are forgotten. Items seen across 3+ sessions promote from short → medium → long term. Items seen across 3+ projects auto-promote to global scope.
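The decay and promotion mechanics can be sketched as follows. This is a simplified illustration: the rates match `DECAY_RATES` in `memory/config.py`, but the function names and the exact promotion rule here are hypothetical.

```python
import math

# Per-day exponential decay rates, as documented in memory/config.py
DECAY_RATES = {"short": 0.18, "medium": 0.04, "long": 0.007}
PROMOTE_SESSION_COUNT = 3  # seen in 3+ sessions -> promote one class up

def decayed_score(base_score: float, memory_class: str, days_old: float) -> float:
    """Exponentially decay an item's score according to its memory class."""
    return base_score * math.exp(-DECAY_RATES[memory_class] * days_old)

def next_class(memory_class: str, sessions_seen: int) -> str:
    """Promote short -> medium -> long once an item recurs across sessions."""
    order = ["short", "medium", "long"]
    if sessions_seen >= PROMOTE_SESSION_COUNT:
        idx = order.index(memory_class)
        return order[min(idx + 1, len(order) - 1)]
    return memory_class

# A short-term fact loses most of its weight within a week...
print(round(decayed_score(1.0, "short", 7), 3))   # -> 0.284
# ...while a long-term fact barely moves.
print(round(decayed_score(1.0, "long", 7), 3))    # -> 0.952
print(next_class("short", 3))                      # -> medium
```

The practical effect: one-off observations fade within days, while anything reinforced across sessions is progressively harder to forget.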
| Command | Description |
|---|---|
| `/remember <text>` | Store a fact in current project scope |
| `/remember global: <text>` | Store a fact visible to all projects |
| `/remember decision: <text>` | Store a decision |
| `/remember guardrail: <text>` | Store a guardrail (highest recall priority) |
| `/remember procedure: <text>` | Store a how-to procedure |
| `/remember error: <pattern> -> <fix>` | Store an error→solution pair |
| Command | Description |
|---|---|
| `/knowledge <topic>` | Cross-type search: facts, decisions, guardrails, entities, relationships |
| `/search-memory <query>` | Semantic search with `--type` and `--scope` filters |
| `/reflect <question>` | Agentic Q&A — iterates search/synthesis up to 6 rounds |
| `/recalled` | Show what context was injected for the last prompt |
| `/session-learned [id]` | Show what was extracted from a session |
| Command | Description |
|---|---|
| `/forget <text>` | Search and soft-delete a memory |
| `/review` | List/approve/reject flagged extraction items |
| `/audit-memory` | Quality report: stale facts, contradictions, orphaned entities |
| `/memory-health` | System health: Ollama, DB locks, snapshots, embeddings, disk |
| Command | Description |
|---|---|
| `/memories` | Database statistics |
| `/facts` | List facts (`--class long`, `--limit 20`) |
| `/decisions` | List decisions |
| `/entities` | List known entities |
| `/relationships` | Show entity graph |
| `/sessions` | List sessions with summaries |
| `/scopes` | List project scopes and item counts |
| Command | Description |
|---|---|
| `/snapshots` | List available DB snapshots |
| `/export-memory` | Export to portable JSON (`--output path --scope X`) |
| `/import-memory <path>` | Import from JSON export |
| `/restore-memory <snapshot>` | Roll back to a snapshot |
| Type | Description | Recall Priority |
|---|---|---|
| Guardrails | "Don't modify X without Y" — protective rules | Highest (always surfaced) |
| Error Solutions | Known error→fix pairs | High |
| Procedures | Step-by-step how-to instructions | High |
| Facts | Technical, architectural, operational knowledge | Medium |
| Decisions | Why X was chosen over Y | Medium |
| Observations | Consolidated insights from multiple facts | Medium |
| Entities | Named concepts, people, technologies | Low |
| Relationships | Entity connections (uses, depends_on, etc.) | Low |
Knowledge is isolated per git repository. Facts from project A never leak into project B's context.
- Scope resolution: `git rev-parse --show-toplevel` identifies the project
- Recall priority: project-local items first, global items fill the remaining budget
- Auto-promotion: items seen in 3+ projects promote to `__global__` scope
- Multi-scope: set `MEMORY_ADDITIONAL_SCOPES=/other/repo` for cross-repo sessions (e.g., `--add-dir`)
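Scope resolution can be approximated in a few lines. This is a sketch: the real logic lives in `memory/scope.py`, and the fall-back-to-global behavior outside a git repo is an assumption, not documented behavior.

```python
import subprocess

GLOBAL_SCOPE = "__global__"  # scope name used for cross-project items

def resolve_scope(cwd: str = ".") -> str:
    """Use the git toplevel path as the project scope; fall back to the
    global scope when the directory is not inside a git repository
    (assumed fallback, for illustration only)."""
    try:
        out = subprocess.run(
            ["git", "rev-parse", "--show-toplevel"],
            cwd=cwd, capture_output=True, text=True, check=True,
        )
        return out.stdout.strip()
    except (subprocess.CalledProcessError, FileNotFoundError):
        return GLOBAL_SCOPE
```

Because the scope key is the repo's toplevel path, every checkout of the same repository on one machine shares a knowledge base, while unrelated repos stay isolated.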
Every prompt triggers 6-way parallel retrieval, fused via Reciprocal Rank Fusion:
| Strategy | What It Finds | Speed |
|---|---|---|
| Semantic | Embedding cosine similarity across 9 tables | ~50ms |
| BM25 | Keyword/full-text search | ~10ms |
| Graph | Entity traversal (2-hop BFS through relationships) | ~20ms |
| Temporal | Date-range filtering ("last week", "yesterday") | ~5ms |
| Path | File-path matching via fact_file_links | ~10ms |
| Code | Symbol/dependency graph matching | ~15ms |
Results are fused, ranked, and capped at the token budget (4000 tokens per prompt, 3000 per session).
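Reciprocal Rank Fusion itself is simple; here is a minimal sketch. The production version in `memory/retrieval.py` presumably weights strategies and applies the token budget afterwards, so treat this as the core idea only.

```python
from collections import defaultdict

def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked result lists: each item scores sum(1 / (k + rank))
    over every list it appears in, so items ranked well by multiple
    strategies float to the top. k=60 is the conventional RRF constant."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, item in enumerate(ranking, start=1):
            scores[item] += 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical per-strategy results (item IDs are made up):
semantic = ["fact:12", "fact:7", "guardrail:3"]
bm25     = ["fact:7", "fact:99"]
graph    = ["guardrail:3", "fact:7"]
print(rrf_fuse([semantic, bm25, graph]))
# -> ['fact:7', 'guardrail:3', 'fact:12', 'fact:99']
```

`fact:7` wins despite never ranking first in any single list, which is exactly the behavior you want when fusing six heterogeneous retrieval strategies.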
The memory system includes an MCP (Model Context Protocol) server that gives any MCP-compatible AI tool on-demand read/write access to the knowledge base. This works alongside the hooks — hooks are automatic and passive, the MCP server is on-demand and active.
Tools exposed:
| Tool | Description |
|---|---|
| `memory_search` | Search across facts, guardrails, procedures, error_solutions, and decisions by semantic query. Accepts `types` filter and `limit`. |
| `memory_store` | Store a fact, decision, guardrail, procedure, or error_solution. Accepts `text`, `type`, `importance` (1-10), and `file_paths`. |
| `memory_guardrail` | Create a guardrail with `warning`, `rationale`, `consequence`, and `file_paths`. |
| `memory_check_file` | Get all guardrails, procedures, facts, and error_solutions linked to a specific file path. |
Setup: The installer registers the MCP server in settings.json. It runs via stdio (JSON-RPC 2.0) — no separate process to manage.
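Over stdio, tool invocations are plain JSON-RPC 2.0 frames. The sketch below shows the rough shape of a `tools/call` request as an MCP client would send it; the exact envelope fields come from the MCP specification, not from this project.

```python
import json

# A JSON-RPC 2.0 "tools/call" request, one frame per line on the
# server's stdin (illustrative shape, per the MCP spec).
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "memory_search",
        "arguments": {
            "query": "retry logic",
            "types": ["facts", "guardrails"],
            "limit": 5,
        },
    },
}
line = json.dumps(request)
print(line)
```

The response comes back on stdout as a matching JSON-RPC result frame with the same `id`.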
Compatibility: Works with Claude Code, Cursor, Windsurf, Cline, and any other MCP-compatible coding agent. All agents share the same DuckDB knowledge base.
Example usage (Claude calls these automatically when relevant):
```python
memory_search(query="retry logic", types=["facts", "guardrails"], limit=5)
memory_store(text="API rate limit is 1000 req/min", type="fact", importance=7)
memory_guardrail(warning="Don't modify auth.py without tests", file_paths=["src/auth.py"])
memory_check_file(file_path="src/auth.py")
```
| Feature | Description |
|---|---|
| Extraction validation | Rejects bare URLs, meta-commentary, low-confidence items. Flags borderline items for /review |
| Correction detection | Detects "that's wrong" / "actually it's..." and auto-supersedes bad facts |
| Guardrail enforcement | PostToolUse hook detects edits to guardrailed files, auto-stashes via git |
| Guardrail promotion | Facts with directive language (always/never) + high reinforcement are proposed as guardrails |
| Auto-snapshots | DB snapshot on every session end (rotating last 5) |
| Truncation visibility | Stderr reports when items are truncated; footer in injected context |
| DuckDB concurrency | Retry with exponential backoff, per-process init caching, read-only optimization |
Deploy a read-only demo dashboard with sample data:
```bash
# Docker (local)
python3 demo/seed_demo_db.py          # Generate sample knowledge base
docker build -t ai-memory-db-demo .
docker run -p 8080:8080 ai-memory-db-demo

# Render (one-click)
# Push to GitHub, connect repo in Render dashboard — uses render.yaml
```

The demo uses the test corpus (57 facts, 31 entities, 15 decisions, 8 guardrails across 3 projects) with mock embeddings. No Ollama or API key needed.
A full-featured Next.js dashboard for exploring and managing the knowledge base:
```bash
# Start backend
cd dashboard/backend && uvicorn server:app --port 9111

# Start frontend
cd dashboard/frontend && npm run dev
```

Features:
- Per-scope filtering — dropdown in sidebar filters all pages by project
- Knowledge graph — unified multi-type graph with clustering, heatmap mode, and chat
- Review queue — approve/reject flagged extraction items
- CRUD on all item types — facts, decisions, guardrails, procedures, error solutions
- Entity/relationship graph — colored by type, sized by degree, clustered by edge type
- Code graph — file dependency visualization with symbol details
```bash
curl -fsSL https://raw.githubusercontent.com/gilmanb1/ai-memory-db/master/install-remote.sh | bash
```

This clones the repo to a temp directory, runs the installer, and cleans up. To install a specific version:

```bash
AI_MEMORY_DB_VERSION=v1.0.0 curl -fsSL https://raw.githubusercontent.com/gilmanb1/ai-memory-db/master/install-remote.sh | bash
```

Or clone and install manually:

```bash
git clone https://github.com/gilmanb1/ai-memory-db.git
cd ai-memory-db
bash install.sh             # Install globally (all Claude Code sessions)
bash install.sh --project   # Or install for current project only
```

| Dependency | Install | Purpose |
|---|---|---|
| Python 3.11+ | System package manager | Runtime |
| uv | `curl -LsSf https://astral.sh/uv/install.sh \| sh` | Inline script deps |
| Ollama | `ollama pull nomic-embed-text` | Local embeddings (768-dim) |
| `ANTHROPIC_API_KEY` | console.anthropic.com | Knowledge extraction |
Optional: Install onnxruntime + transformers for in-process embeddings (~3ms, no Ollama needed).
- Copies the `memory/` package to `~/.claude/memory/`
- Copies 15 hook scripts to `~/.claude/hooks/`
- Copies 22 slash commands to `~/.claude/commands/`
- Configures hooks, PostToolUse, and the status line in `settings.json`
- Runs the test suite to verify installation
To update:

```bash
cd ai-memory-db && git pull && bash install.sh
```

Or re-run the one-liner — it always fetches the latest.
- macOS/Linux: Works out of the box
- WSL2: Works as-is (`$HOME/.claude/` maps correctly)
- Windows (native): Not supported — use WSL2
Key settings in `memory/config.py` (83 configurable parameters):

```python
OLLAMA_MODEL = "nomic-embed-text"      # Embedding model (768-dim)
CLAUDE_MODEL = "claude-sonnet-4-6"     # Extraction LLM
DEDUP_THRESHOLD = 0.92                 # Cosine >= this → reinforce, don't duplicate
RECALL_THRESHOLD = 0.60                # Cosine >= this → relevant for recall
SESSION_TOKEN_BUDGET = 3000            # Max tokens for session context
PROMPT_TOKEN_BUDGET = 4000             # Max tokens per-prompt context
DECAY_RATES = {"short": 0.18, "medium": 0.04, "long": 0.007}  # Per-day exponential
AUTO_PROMOTE_PROJECT_COUNT = 3         # Seen in N+ projects → global
EXTRACTION_THRESHOLDS = [40, 70, 90]   # Context % triggers for extraction
GUARDRAIL_ENFORCEMENT_ENABLED = True   # PostToolUse guardrail checking
SNAPSHOT_ON_SESSION_END = True         # Auto-snapshot DB
```

Run the tests:

```bash
python3 test_memory.py                                # Full suite
python3 -m pytest test_memory.py -k "TestCorpus" -v   # Corpus tests only
```

809 tests covering:
- Unit tests with mock embeddings (fast, no external deps)
- Integration tests exercising full hook pipelines
- Realistic corpus tests (hand-crafted 6-week corpus + scaled 1d/1w/1m/1y corpuses)
- Real ONNX embedding tests (semantic similarity verification)
- Cross-repo scope isolation tests (22 tests proving no contamination)
- Concurrency tests (multi-process DuckDB lock handling)
```
memory/                    Python package
  config.py                83 configurable constants
  db.py                    DuckDB schema (11 migrations), CRUD, vector search
  embeddings.py            ONNX (primary) + Ollama (fallback) embeddings
  extract.py               Claude API extraction via tool_use
  ingest.py                Multi-pass extraction pipeline with validation
  recall.py                Session + prompt recall with token budgets
  retrieval.py             6-way parallel retrieval with RRF fusion
  scope.py                 Git-based scope resolution + multi-scope
  decay.py                 Temporal scoring and forgetting
  consolidation.py         Observation synthesis + coherence checking
  communities.py           Entity clustering via union-find
  validation.py            Extraction quality gates + review queue
  corrections.py           User correction detection + auto-supersede
  backup.py                Snapshots, export/import, restore
  guardrail_check.py       File-edit guardrail enforcement
  guardrail_promotion.py   Auto-detect facts that should be guardrails
  cli.py                   CLI (python -m memory)
  code_graph.py            Tree-sitter code parsing + symbol graph
  routing.py               /remember classification + routing
hooks/                     Claude Code hook scripts (15 hooks)
commands/                  Slash command definitions (22 commands)
dashboard/                 Web dashboard (FastAPI backend + Next.js frontend)
test_memory.py             809 tests
test_corpus.py             Hand-crafted 6-week corpus fixture
test_corpus_scaled.py      Scaled corpus generator (1d/1w/1m/1y)
test_embeddings_cache.py   ONNX embedding cache for tests
install.sh                 Installer script
```
| Condition | Behavior |
|---|---|
| Ollama down | Falls back to ONNX embeddings (if installed). If both unavailable, BM25 keyword search only. |
| ONNX unavailable | Falls back to Ollama HTTP. If both down, embedding features disabled. |
| Anthropic API fails | Single retry with 2s delay. Extraction skipped — recall still works. |
| No ANTHROPIC_API_KEY | Extraction disabled. Recall works from existing DB. |
| No database yet | All hooks exit cleanly. DB created on first extraction. |
| DuckDB locked | Retry with exponential backoff (5 attempts, 150ms-2.4s). |
| Token budget exceeded | Lower-priority items truncated. Stderr reports what was dropped. |
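The lock-retry behavior above can be sketched as a generic helper. This is a hypothetical illustration (the real logic is in `memory/db.py`): delays double from 150ms, matching the 150ms-2.4s range in the table, and the real code matches DuckDB lock errors specifically rather than any exception.

```python
import time

def with_db_retry(fn, attempts: int = 5, base_delay: float = 0.15):
    """Retry a DB operation on lock contention with exponential backoff.
    Delays double from base_delay: 0.15s, 0.3s, 0.6s, 1.2s (then give up)."""
    for i in range(attempts):
        try:
            return fn()
        except Exception:  # the real code filters for DuckDB lock errors only
            if i == attempts - 1:
                raise
            time.sleep(base_delay * (2 ** i))

# Simulate a connection that is locked for the first two attempts:
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("database is locked")
    return "ok"

print(with_db_retry(flaky))  # -> ok  (after two simulated lock failures)
```

The same pattern covers the Anthropic API case in the table, just with a single retry and a fixed 2s delay.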