Local-first memory and prompt-cache layer for Claude Code.
Somtum (Thai: ส้มตำ) is named after the vibrant, shredded green papaya salad. Just like its namesake, Somtum blends durable observations from your Claude Code sessions (decisions, bugfixes, learnings, file summaries), stores them in a local SQLite database, and injects the relevant ones back into context the next time you work on the same project.
Zero-config: run `somtum init` once and every session end is captured automatically. No server, no cloud account, no mandatory tuning.
- Why Somtum?
- How it works
- Requirements
- Install
- Quickstart
- Verifying the setup
- Dashboard
- CLI Reference
- MCP Server
- Storage Layout
- Configuration
- Privacy
- Token Accounting
- Performance
- Development
- Troubleshooting
- License
v2.0.0: Global DB (`~/.somtum/global.db`) · cross-project workspace recall · memory deduplication (`superseded_by`) · `--show-superseded` on `somtum list` · stats instrumentation fix · dashboard dark-mode redesign

v1.5.0: Multi-page VitePress docs site · `somtum list` · `somtum reset` · `somtum forget --all` · embeddings timeout safety · config crash-resilience · `injection.max_chars` wired up · warm-start race fix · auth-error hints

v1.3.0: Auto-inject memories on every prompt · `update` MCP tool · warm-start after compaction · false-hit detection · workspace scope · `suggest-claude-md` · stale memory detection in `doctor`
LLM agents like Claude Code start every session with a blank slate. That leads to:
- Repetitive context — re-explaining the same architectural choices every session
- Regressions — Claude suggests a fix you already tried and discarded
- Token waste — reading large files just to "set the scene"
Somtum gives Claude a long-term memory. Once a decision is made or a bug is fixed, it's remembered across all future sessions — without bloating your context window.
| Without Somtum | With Somtum |
|---|---|
| Session 1: "We use pnpm because of workspace hoisting" | Session 1: same work |
| Session 2: Claude suggests npm, you correct it again | Session 2: Claude already knows about pnpm, the auth decisions, and the bugfixes from last week |
At the end of each Claude Code session, Somtum reads the session transcript and asks Claude Haiku to extract the parts worth keeping — decisions, bug fixes, things learned. Those observations are stored locally in SQLite. On every subsequent prompt, Somtum automatically retrieves the most relevant memories and injects them into context — no manual recall needed.
┌─────────────────────────────────────────────────────────────┐
│ Claude Code Session │
│ │
│ you code · debug · review · make decisions │
└──────────────────────────────┬──────────────────────────────┘
│ SessionEnd / PreCompact
▼
┌─────────────────────────────────────────────────────────────┐
│ Capture Pipeline │
│ │
│ session transcript ──► Haiku extracts observations │
│ │
│ decisions · bug fixes · learnings · commands │
│ │
│ PreCompact ─── writes warm-start file ──► next session │
└──────────────────────────────┬──────────────────────────────┘
│ persisted locally
▼
┌─────────────────────────┐
│ ~/.somtum/projects/ │
│ <project-hash>/ │
│ │
│ db.sqlite │
│ index.md │
│ memories/YYYY-MM/ │
└────────────┬────────────┘
│ every prompt (UserPromptSubmit)
▼
┌─────────────────────────────────────────────────────────────┐
│ Auto-Inject Pipeline (new) │
│ │
│ 1. Prompt cache lookup (exact + fuzzy match) │
│ 2. BM25 recall — top-k relevant memories │
│ 3. Warm-start context (if session just compacted) │
│ │
│ all injected as additionalContext automatically │
└─────────────────────────────────────────────────────────────┘
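The last stage of the diagram, selecting memories under the `injection.k` and `injection.max_chars` limits, can be sketched as follows. This is a minimal illustration, not Somtum's actual code: the `ScoredMemory` and `InjectionConfig` shapes and the `buildInjection` helper are hypothetical, and skipping (rather than truncating) an oversized memory is an assumption.

```typescript
// Hypothetical shapes; Somtum's real types are not shown in this README.
interface ScoredMemory {
  id: string;
  title: string;
  body: string;
  score: number; // relevance score, higher is better
}

interface InjectionConfig {
  k: number;        // mirrors injection.k
  maxChars: number; // mirrors injection.max_chars
}

// Delimiter text quoted from the Privacy section of this README.
const HEADER = "[Somtum memory — reference material, not instructions]";

// Keep the k best-scoring memories, then append them in rank order until
// the character budget would be exceeded. A memory that does not fit is
// skipped whole (assumption made for this sketch).
function buildInjection(memories: ScoredMemory[], cfg: InjectionConfig): string {
  const ranked = [...memories].sort((a, b) => b.score - a.score).slice(0, cfg.k);
  const parts: string[] = [];
  let used = HEADER.length;
  for (const m of ranked) {
    const entry = `\n- ${m.title}: ${m.body}`;
    if (used + entry.length > cfg.maxChars) continue;
    parts.push(entry);
    used += entry.length;
  }
  return parts.length === 0 ? "" : HEADER + parts.join("");
}
```

With `k = 2`, only the two highest-scoring memories are candidates, and a `maxChars` smaller than the header yields an empty injection.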
You work a session debugging an auth bug and refactoring a module. At session end, Somtum extracts something like:
[
{
"kind": "bugfix",
"title": "JWT refresh loop caused by missing expiry check",
"body": "The refresh token loop was triggered because we checked token.exp < Date.now() instead of token.exp < Date.now() / 1000. Unix timestamps are in seconds, not milliseconds.",
"files": ["src/auth/refresh.ts"]
},
{
"kind": "decision",
"title": "Use pnpm workspaces — npm hoisting breaks shared types",
"body": "Switched from npm to pnpm because npm's hoisting puts shared type packages in the wrong node_modules scope, breaking type inference across packages.",
"files": ["package.json", "pnpm-workspace.yaml"]
}
]

Next session, when you ask "why are we using pnpm?" or touch `src/auth/refresh.ts`, Claude finds these memories and already has the context.
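The unit mismatch described in the example bugfix memory can be reproduced in a few lines. This sketch is purely illustrative: `TokenClaims`, `isExpired`, and `isExpiredBuggy` are hypothetical names, not code from the project the memory came from.

```typescript
// `exp` follows the JWT convention: seconds since the Unix epoch.
interface TokenClaims {
  exp: number;
}

// Correct check: convert Date.now() (milliseconds) to seconds before comparing.
function isExpired(token: TokenClaims, nowMs: number = Date.now()): boolean {
  return token.exp < nowMs / 1000;
}

// Buggy variant from the memory: compares seconds against milliseconds,
// so any realistic token looks expired and triggers another refresh.
function isExpiredBuggy(token: TokenClaims, nowMs: number = Date.now()): boolean {
  return token.exp < nowMs;
}
```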
┌─────────────────────────────────────────────────────────────┐
│ Claude Code / Agent │
└──────────┬──────────────────────────────┬───────────────────┘
│ hooks │ MCP tools
▼ ▼
┌─────────────────────┐ ┌──────────────────────────┐
│ Hooks │ │ MCP Tools │
│ │ │ │
│ UserPromptSubmit ──┼─cache──▶│ cache_lookup │
│ ──┼─inject─▶│ recall / get │
│ SessionEnd ────────┼─capture▶│ remember / update │
│ PreCompact ────────┼─warmst─▶│ forget │
│ PreToolUse (Read) ─┼─gate───▶│ stats │
│ │ │ report_false_hit │
└──────────┬──────────┘ └────────────┬─────────────┘
│ │
▼ ▼
┌─────────────────────────────────────────────────────────────┐
│ Core (TypeScript) │
│ │
│ ┌──────────────┐ ┌─────────────────┐ ┌───────────────┐ │
│ │ PromptCache │ │ MemoryStore │ │ Retriever │ │
│ │ │ │ │ │ │ │
│ │ exact hash │ │ observations │ │ bm25(default) │ │
│ │ fuzzy embed │ │ scope: project │ │ embeddings │ │
│ │ fingerprint │ │ global │ │ index │ │
│ │ false_hits │ │ workspace │ │ hybrid │ │
│ └──────────────┘ │ last_confirmed │ └───────────────┘ │
│ └─────────────────┘ │
└─────────────────────────────────┬───────────────────────────┘
│
▼
┌─────────────────────────────┐
│ SQLite WAL + ~/.somtum/ │
│ /projects/<hash>/ │
│ db.sqlite │
│ index.md │
│ memories/YYYY-MM/<ulid>.md│
│ /session/lh_<id>.json │
│ /warmstart/ws_<id>.json │
└─────────────────────────────┘
| Strategy | How it works | Best for | Cost |
|---|---|---|---|
| `bm25` | Keyword search over title + body + tags (SQLite FTS5, no external dependencies) | Exact terms, offline setups | Near-zero |
| `embeddings` | Semantic similarity using a 30 MB local model (`bge-small-en-v1.5`, runs fully in-process) | "What did we decide about auth?" style queries | ~5 ms at 10k memories |
| `index` | Sends a compact memory catalog to Haiku; the model picks relevant IDs | Paraphrased or fuzzy queries | 1 Haiku API call |
| `hybrid` | BM25 + embeddings results merged and re-ranked by Haiku | General case (best recall) | BM25 + embeddings + 1 Haiku call |
Default is bm25 — works offline, no setup. Enable hybrid once you have embeddings downloaded.
Caution: Setting `strategy=hybrid` without enabling embeddings causes a silent fallback to BM25 while paying hybrid overhead. Run `somtum doctor`. If it shows `strategy=hybrid` alongside `embeddings: disabled`, fix it:

somtum config set retrieval.strategy bm25   # match what's actually running
# or, to use real hybrid:
somtum config set retrieval.embeddings.enabled true && somtum reindex
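The mismatch that `somtum doctor` flags can be expressed as a small check. This is a sketch under assumptions: `RetrievalConfig` only mirrors the two config keys named above, and `effectiveStrategy` is a hypothetical helper, not a function exported by Somtum.

```typescript
type Strategy = "bm25" | "embeddings" | "index" | "hybrid";

// Mirrors retrieval.strategy and retrieval.embeddings.enabled only.
interface RetrievalConfig {
  strategy: Strategy;
  embeddings: { enabled: boolean };
}

// Hypothetical helper: reports which strategy effectively runs for a config,
// plus a warning when hybrid is requested without embeddings.
function effectiveStrategy(cfg: RetrievalConfig): { strategy: Strategy; warning?: string } {
  if (cfg.strategy === "hybrid" && !cfg.embeddings.enabled) {
    return {
      strategy: "bm25",
      warning: "strategy=hybrid but embeddings: disabled; falling back to bm25",
    };
  }
  return { strategy: cfg.strategy };
}
```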
- Node 20+
- Claude Code: Somtum hooks into Claude Code's `SessionEnd`, `UserPromptSubmit`, and `PreToolUse` events
- `ANTHROPIC_API_KEY` (optional): if set, Somtum uses the Anthropic API directly for extraction. If not set, Somtum falls back to the `claude` CLI that ships with Claude Code, so no separate API key is required for Claude Code subscribers.
npm install -g somtum

pnpm users: `pnpm add -g somtum` works if you have run `pnpm setup` first (sets `PNPM_HOME`). If you haven't, use npm above.

yarn users: `yarn global add` is not supported in Yarn v2+ (Berry). Use npm above.
git clone https://github.com/riz007/somtum
cd somtum
pnpm install
pnpm build
pnpm link --global

Somtum uses `better-sqlite3`, which contains a native C++ addon. On most platforms (macOS, Linux x64/arm64, Windows x64) a prebuilt binary is downloaded automatically. On Alpine Linux / musl or unusual architectures, the addon compiles from source, so python, make, and gcc must be available. If the install fails with a node-gyp error, install those build tools and retry.
Using Claude Code (subscription)?
You do not need an API key. Somtum calls the claude CLI that ships with Claude Code.
Confirm it is installed: which claude
If nothing prints, reinstall Claude Code or add its binary to your PATH.
Using the Anthropic API directly? Add your key to your shell profile (not just the current terminal tab):
# Add to ~/.zshrc or ~/.bashrc
export ANTHROPIC_API_KEY="sk-ant-..."
source ~/.zshrc

Both paths work. Most Claude Code users should use Option A.
Somtum needs to call a Claude model at session end to extract observations. Pick one:
Option A: Claude Code subscription (no extra setup)
If you already have Claude Code installed, you're done. Somtum calls claude --print automatically when no API key is present. Skip to Step 2.
Option B: Direct Anthropic API key (optional — faster, lets you pick the model)
Add to ~/.zshrc (or ~/.bashrc):
export ANTHROPIC_API_KEY="sk-ant-..."

Then reload:

source ~/.zshrc

The key must be in your shell profile, not just exported in an open terminal tab. The `SessionEnd` hook inherits the environment of the shell that started Claude Code, not the current terminal.
Run this from the root of the project you work on with Claude Code:
somtum init

To enable all features at once (recommended):
somtum init --all
# Installs:
# - SessionEnd capture hook (memory extraction)
# - UserPromptSubmit cache hook (prompt cache lookup)
# - PreToolUse file-gating hook (large file summarization)
#   - MCP server in .mcp.json (Claude can call recall/remember tools)

Run `doctor` immediately after init to confirm everything is connected:

somtum doctor

All checks should show ✓. The two most important are:

- `api_key`: confirms Somtum has a way to call Claude for extraction
- `hooks_installed`: confirms the SessionEnd hook is registered
Do not start your first session until all checks pass. Each failing check shows an inline fix command.
Open a Claude Code session from the same directory where you ran somtum init. Work as you normally would. When the session ends, the hook extracts observations automatically in the background (capped at 90 seconds).
# How many observations were captured?
somtum stats
# Search memory
somtum search "auth jwt rotation"
somtum search "why we use pnpm" --strategy hybrid
# Open the visual dashboard
somtum serve

If `somtum stats` shows `memories 0` after a session, see Troubleshooting.
After your first Claude Code session ends:
1. Check the hook log
cat ~/.somtum/hook.log

A successful run:
2026-04-30T10:15:42.123Z [post_session] starting
2026-04-30T10:15:44.891Z [post_session] ok — inserted=4 superseded=1 cache=2 summaries=1
Using the claude CLI fallback (no API key):
2026-04-30T10:15:42.123Z [post_session] starting
2026-04-30T10:15:42.124Z [post_session] ANTHROPIC_API_KEY not set — will use claude CLI fallback
2026-04-30T10:15:44.891Z [post_session] ok — inserted=4 superseded=1 cache=2 summaries=1
Neither backend available:
2026-04-30T10:15:42.123Z [post_session] ERROR: Neither ANTHROPIC_API_KEY nor the claude CLI is available.
2. Check stats
somtum stats

You should see `memories > 0` after a substantive session. Short or trivial sessions (no decisions, no bug fixes) correctly return 0, because the extractor only keeps durable observations.
3. Run doctor
somtum doctor

All checks should show ✓. The `api_key` and `hooks_installed` checks are the two most commonly failing.
somtum serve
# Opens http://localhost:3000

The dashboard has four views:
- Memory browser — searchable, filterable list of all captured observations. Switch between BM25, hybrid, and embeddings strategies live. Click any memory to see its full body, files, and tags.
- Knowledge graph — nodes are memories, edges connect memories that share files or tags. Click a node to open it in the detail panel.
- Analytics — kind breakdown, cache hit rate, retrieval strategy usage, top-referenced files.
- Forget button — soft-delete any memory directly from the browser.
| Flag | Default | Description |
|---|---|---|
| `--port <n>` | 3000 | Listen on a custom port |
| `--no-open` | n/a | Start server without opening the browser |
Press Ctrl-C to stop.
| Command | Description |
|---|---|
| `somtum init` | Install the SessionEnd capture hook |
| `somtum init --cache` | Also install the UserPromptSubmit cache + auto-inject hook |
| `somtum init --file-gating` | Also install the PreToolUse file-gating hook |
| `somtum init --all` | Install all hooks + MCP server |
| `somtum init --force` | Reinstall even if hooks already present |
| `somtum doctor` | Check DB health, migrations, hooks, API key, breakeven ratio, stale memories |
| Command | Description |
|---|---|
| `somtum list` | List stored memories (most recent first, superseded hidden by default) |
| `somtum list --kind decision` | Filter by kind: `decision`, `learning`, `bugfix`, `command`, or `file_summary` |
| `somtum list --limit 20` | Limit to 20 results |
| `somtum list --json` | Machine-readable JSON output |
| `somtum list --show-superseded` | Include memories that have been superseded by a newer duplicate |
| `somtum search <query>` | Search observations (default: bm25 strategy) |
| `somtum search <query> --strategy hybrid` | Force a specific retrieval strategy |
| `somtum search <query> -k 16` | Return more results |
| `somtum show <id>` | Print the full body of an observation |
| `somtum remember` | Manually store an observation |
| `somtum forget <id>` | Soft-delete an observation by id |
| `somtum forget --all` | Soft-delete all observations in the current project |
| `somtum edit <id>` | Open an observation body in `$EDITOR` |
| `somtum rebuild` | Regenerate `index.md` from all observations |
| `somtum reindex` | Recompute embeddings (after enabling embeddings or changing model) |
| `somtum suggest-claude-md` | Suggest CLAUDE.md additions from accumulated observations (interactive) |
| `somtum suggest-claude-md --dry-run` | Preview suggestions without writing |
| `somtum suggest-claude-md --yes --limit 20` | Auto-confirm, limit to top 20 by tokens saved |
| Command | Description |
|---|---|
| `somtum stats` | Tokens saved, cache hit rate, retrieval breakdown |
| `somtum stats --json` | Machine-readable JSON output |
| `somtum serve` | Open the visual dashboard in the browser |
| `somtum serve --port <n>` | Use a custom port (default 3000) |
| `somtum serve --no-open` | Start server without opening the browser |
| Command | Description |
|---|---|
| `somtum export` | Export observations to stdout as JSON |
| `somtum export --format jsonl --output obs.jsonl` | Export as JSONL file |
| `somtum export --format markdown` | Export as readable Markdown |
| `somtum export --include-deleted` | Include soft-deleted entries |
| `somtum import <file>` | Import observations from JSON or JSONL |
| `somtum purge --older-than 30d` | Hard-delete soft-deleted entries older than 30 days |
| `somtum purge --older-than 30d --dry-run` | Preview without deleting |
| `somtum reset` | Permanently wipe all memories for the current project (asks to confirm) |
| `somtum reset --yes` | Skip confirmation (useful in CI or scripts) |
| Command | Description |
|---|---|
| `somtum config get` | Print the full resolved config |
| `somtum config get retrieval.strategy` | Read a single key (dot-separated) |
| `somtum config set retrieval.strategy hybrid` | Write to `.somtum/config.json` |
| `somtum config set retrieval.embeddings.enabled true --global` | Write to `~/.somtum/config.json` |
| Command | Description |
|---|---|
| `somtum sync status` | Compare local vs remote observation count |
| `somtum sync push` | Export and scp observations to remote |
| `somtum sync pull` | scp from remote and merge into local DB |
Set your remote: somtum config set sync.remote "user@host:/path/.somtum/projects/<id>"
Somtum uses hostname-aware syncing — merging observations from multiple machines without data loss.
When you run somtum init --all, Somtum registers an MCP server that Claude can call autonomously during a session:
| Tool | What Claude does with it |
|---|---|
| `recall` | Searches memories when unsure about a project detail. Merges project + global DB results. Accepts `strategy` |
| `get` | Retrieves full observation bodies by ID. Checks global DB if not found in project DB. Bumps `last_confirmed_at` |
| `remember` | Stores an observation manually. `scope='global'` routes to `~/.somtum/global.db`; returns `stored_in` field |
| `update` | Updates an existing observation's title, body, tags, or files. Redaction applied |
| `cache_lookup` | Checks the prompt cache directly |
| `report_false_hit` | Reports that a cached response didn't answer the question (tunes fuzzy threshold data) |
| `forget` | Soft-deletes an observation |
| `stats` | Reports tokens saved, cache hit rate, false-hit count, corpus size, and `global_memories` count |
Every MCP response includes a tokens field so Claude can account for retrieval cost.
Observations now carry a scope field:
| Scope | Meaning | Use it when |
|---|---|---|
| `project` | Default. Visible only in this project. | Most decisions, bugfixes, and learnings. |
| `workspace` | Shared across projects via the `recall` MCP tool. | Team conventions, preferred libraries, global rules. |
| `global` | Same as workspace; reserved for personal preferences that span all your projects. | Your personal coding preferences. |
# Store a workspace-scoped observation from within a session:
remember("Always use pnpm for Node projects", body="...", scope="workspace")
~/.somtum/
├── config.json ← global config (merged with project config)
├── hook.log ← timestamped log of every hook execution
├── global.db ← global-scope memories (scope='global'), queried
│ alongside every project DB on recall + auto-inject
├── session/
│ └── lh_<id>.json ← last cache-hit state per project (false-hit detection)
│ files older than 24 h are evicted automatically
├── warmstart/
│ └── ws_<id>_<timestamp>.json ← warm-start context written after PreCompact (30 min TTL)
│ timestamped so concurrent windows don't clobber each other
└── projects/
└── <project_id>/
├── db.sqlite ← source of truth (SQLite WAL)
├── index.md ← human-readable mirror (regenerated)
└── memories/
└── YYYY-MM/
└── <ulid>.md ← per-observation markdown files
The project ID is derived from the git remote URL (or directory path if no remote). The same project maps to the same ID across machines as long as the remote URL matches.
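The remote-first, path-fallback rule can be sketched as below. The README does not say which hash Somtum uses or how long the ID is, so the FNV-1a hash and the 8-character hex ID here are stand-in assumptions, and `projectId` is a hypothetical helper.

```typescript
// 32-bit FNV-1a, used here only as a dependency-free stand-in hash.
// Somtum's real hashing scheme is not documented in this README.
function fnv1a(input: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16).padStart(8, "0");
}

// Hypothetical helper: prefer the git remote URL, fall back to the
// directory path when the repository has no remote.
function projectId(remoteUrl: string | null, directory: string): string {
  const source = remoteUrl !== null && remoteUrl.trim() !== "" ? remoteUrl.trim() : directory;
  return fnv1a(source);
}
```

Because the directory is ignored whenever a remote exists, two clones of the same repository in different locations resolve to the same ID.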
SQLite is the source of truth. Edit observations with somtum edit <id>, not by hand.
Global config lives at ~/.somtum/config.json. Per-project config at .somtum/config.json overrides it (deep merge).
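A deep merge of this kind can be sketched as follows. This is an illustration rather than the code in `src/config.ts`: `deepMerge` is a hypothetical name, and replacing arrays wholesale (instead of concatenating them) is an assumption.

```typescript
type Json = string | number | boolean | null | Json[] | { [key: string]: Json };
type JsonObject = { [key: string]: Json };

function isObject(value: Json | undefined): value is JsonObject {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Override values win. Nested objects are merged key by key; arrays and
// scalars are replaced wholesale (assumption about array handling).
function deepMerge(base: JsonObject, override: JsonObject): JsonObject {
  const out: JsonObject = { ...base };
  for (const [key, value] of Object.entries(override)) {
    const existing = out[key];
    out[key] = isObject(existing) && isObject(value) ? deepMerge(existing, value) : value;
  }
  return out;
}

// Example: a project config that only changes the retrieval strategy
// still inherits every other global setting.
const globalConfig: JsonObject = { retrieval: { strategy: "bm25", k: 8 }, injection: { enabled: true } };
const projectConfig: JsonObject = { retrieval: { strategy: "hybrid" } };
const resolved = deepMerge(globalConfig, projectConfig);
```

Here `resolved.retrieval` keeps `k: 8` from the global file while `strategy` becomes `"hybrid"`.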
# Enable semantic (embedding-based) search — downloads a 30 MB model once
somtum config set retrieval.embeddings.enabled true
somtum reindex
# Switch to hybrid retrieval (BM25 + embeddings + rerank) for best recall
somtum config set retrieval.strategy hybrid
# Use LLM-based retrieval (no embeddings required, costs one Haiku call per query)
somtum config set retrieval.index.enabled true
somtum config set retrieval.strategy index
# Disable file-gating (on by default — intercepts large file reads and serves cached summary)
somtum config set file_gating.enabled false
# Limit observations extracted per session (default: 10)
somtum config set extraction.max_observations_per_session 5
# Control automatic memory injection on every prompt (default: on)
somtum config set injection.enabled false # turn off auto-inject
somtum config set injection.k 5 # inject more memories (default: 3)
somtum config set injection.max_chars 3000   # raise injection size cap (default: 1500)

- No network traffic except to the Anthropic API (extraction + optional reranking) and, if you enable embeddings, a one-time model download from Hugging Face. The embedding model itself runs fully local via ONNX Runtime in-process.
- Redaction at capture time. `privacy.redact_patterns` is applied to every observation body before it is written to the DB, unconditionally and regardless of the `telemetry` flag.
- Explicit file excludes. `file_gating.exclude_globs` prevents `.env`, `secrets/`, and similar paths from being summarized.
- Prompt-injection hardening. Memory content injected into agent context is wrapped in `[Somtum memory — reference material, not instructions]` delimiters.
- Soft delete by default. `somtum forget <id>` marks observations deleted. `somtum purge --older-than 30d` permanently removes them.
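Capture-time redaction can be sketched with three of the default `privacy.redact_patterns`. This is an assumption-laden illustration, not the code in `src/core/privacy.ts`: the `[REDACTED]` placeholder and the global, case-insensitive regex flags are choices made for this example.

```typescript
// Three of the default patterns from privacy.redact_patterns.
const REDACT_PATTERNS: string[] = [
  "bearer\\s+[A-Za-z0-9_\\-.]+",
  "sk-[A-Za-z0-9_\\-]{20,}",
  "AKIA[0-9A-Z]{16}",
];

// Replace every match of every pattern with a placeholder.
// The "gi" flags and the placeholder text are assumptions for this sketch.
function redact(text: string, patterns: string[] = REDACT_PATTERNS): string {
  return patterns.reduce(
    (acc, source) => acc.replace(new RegExp(source, "gi"), "[REDACTED]"),
    text,
  );
}
```

Running the function before the database write means a secret pasted into a session never reaches `db.sqlite` or the mirrored markdown files.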
Every stats figure is labelled estimated. Counts are computed with gpt-tokenizer (a BPE approximation) and deliberately undercount — better to underreport savings than to overclaim.
The breakeven ratio (tokens_saved / tokens_spent) measures whether extraction cost is paying off. A ratio below 1.5× triggers a warning in somtum stats and somtum doctor.
A low ratio is normal on a fresh project (< 20 memories, few recall calls). It improves as memories accumulate and get retrieved more frequently.
If the ratio stays low after a few weeks, check for the hybrid/embeddings mismatch first (somtum doctor). If the config is correct, reduce injection scope: lower injection.k or injection.max_chars to cut overhead.
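The breakeven arithmetic is simple enough to show directly. In this sketch the `TokenTotals` shape and both function names are hypothetical, and the handling of zero spend is an assumption; only the formula and the 1.5× threshold come from the text above.

```typescript
const BREAKEVEN_WARN_THRESHOLD = 1.5;

// Hypothetical shape for the two totals the ratio is computed from.
interface TokenTotals {
  tokensSaved: number;
  tokensSpent: number;
}

// tokens_saved / tokens_spent. With nothing spent, report Infinity if
// anything was saved, otherwise 0 (assumption for this sketch).
function breakevenRatio(totals: TokenTotals): number {
  if (totals.tokensSpent === 0) return totals.tokensSaved > 0 ? Infinity : 0;
  return totals.tokensSaved / totals.tokensSpent;
}

// Returns a warning string when the ratio falls below the threshold.
function breakevenWarning(totals: TokenTotals): string | null {
  const ratio = breakevenRatio(totals);
  return ratio < BREAKEVEN_WARN_THRESHOLD
    ? `breakeven ratio ${ratio.toFixed(2)}x is below ${BREAKEVEN_WARN_THRESHOLD}x`
    : null;
}
```

For example, 1,200 tokens saved against 1,000 spent gives 1.2×, which is under the threshold, while 3,000 saved against 1,000 spent gives 3× and no warning.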
| Scenario | p95 budget | Actual (benchmark) |
|---|---|---|
| `UserPromptSubmit` hook at 1k memories | 150 ms | < 2 ms (BM25 k=8) |
| `UserPromptSubmit` hook at 10k memories | 300 ms | < 30 ms (BM25 k=8) |
| Exact cache hash lookup | n/a | < 0.1 ms |
| `SessionEnd` hook (extract + embed) | 90 s hard cap | Exits cleanly on timeout |
Run benchmarks yourself:
pnpm test:bench

pnpm install
pnpm typecheck # strict TypeScript check
pnpm test # vitest unit + golden tests
pnpm test:golden # retrieval recall@k per strategy
pnpm test:bench # hot-path latency benchmarks
pnpm lint # eslint
pnpm fmt # prettier
pnpm build         # tsc + copy migrations + copy dashboard → dist/

src/
cli/
index.ts # commander CLI entry point
init.ts # somtum init — installs hooks + MCP config
serve.ts # somtum serve — local dashboard server
stats.ts # somtum stats
doctor.ts # somtum doctor — health checks
hook.ts # internal: dispatches hook events by name
search.ts / show.ts / forget.ts / edit.ts
list.ts # somtum list
reset.ts # somtum reset — wipe project DB
export.ts / import.ts / purge.ts / sync.ts / rebuild.ts / reindex.ts
config_cmd.ts
suggest_claude_md.ts # somtum suggest-claude-md
core/
db.ts # SQLite setup, migration runner
store.ts # MemoryStore — CRUD for observations
cache.ts # PromptCache — exact + fuzzy lookup
retriever/ # bm25, embeddings, hybrid, index, factory
extractor.ts # session transcript → observations (Claude Haiku)
index_gen.ts # renders index.md (incremental past 1k obs)
memory_files.ts # writes memories/<YYYY-MM>/<ulid>.md
retrieval_stats.ts
embeddings.ts # Embedder interface + encode/decode utils
privacy.ts # redact() — runs on every capture
tokens.ts # gpt-tokenizer wrapper
hooks/
post_session.ts # SessionEnd/PreCompact: extract → store → index → warm-start
pre_prompt.ts # UserPromptSubmit: cache lookup + auto-inject + false-hit detection
pre_read.ts # PreToolUse: file gating
mcp/ # MCP server + tool implementations
dashboard/
index.html # single-page dashboard (served by somtum serve)
config.ts # global + project config merge
index.ts # public API for embedding Somtum
src/db/migrations/ # NNN_name.sql migration files
test/
golden/ # per-strategy retrieval golden sets
bench/ # hot-path latency benchmarks
fixtures/ # synthetic session transcripts
- Extend the zod enum in `src/core/schema.ts`
- Update the extractor prompt in `src/core/extractor.ts`
- Add a fixture in `test/fixtures/` and an assertion
- Update `src/core/index_gen.ts` to render the new section
- Define args + response with zod in `src/mcp/tools.ts`
- Register it in `src/mcp/server.ts`
- Response must include a `tokens` field
- Add an integration test in `src/mcp/server.test.ts`
Check the hook log first:
cat ~/.somtum/hook.log

claude CLI not found and no ANTHROPIC_API_KEY set

- If you use Claude Code: run `which claude`. If nothing prints, reinstall Claude Code or add its binary to your `PATH`.
- If you prefer the direct API: add `export ANTHROPIC_API_KEY="sk-ant-..."` to `~/.zshrc` and run `source ~/.zshrc`. The key must be in your profile, not just exported in the current terminal tab.
Run somtum doctor — the api_key check tells you exactly which backend is available.
Hook not installed in the right directory
somtum init writes the hook to .claude/settings.json in the directory where you ran it. If you launch Claude Code from a different directory, it reads a different settings file.
Fix: run somtum init from the same directory you use to launch Claude Code.
cd ~/my-project
somtum init
claude   # must be launched from ~/my-project

Short or trivial session
If the session had no decisions, bug fixes, or learnings (e.g. you just asked Claude to say hello), the extractor correctly returns 0 observations.
This was a bug fixed in v1.1.0. Upgrade:

npm install -g somtum@latest

If you installed from source, rebuild:

pnpm build

If the default dashboard port (3000) is already in use, pick another:

somtum serve --port 3001

The SessionEnd hook has a hard 90-second timeout. If sessions appear stuck, verify you are on v1.1.0+:

somtum --version
tail -20 ~/.somtum/hook.log

Ensure build tools are installed:
- macOS: `xcode-select --install`
- Ubuntu/Debian: `sudo apt-get install build-essential python3`
- Windows: `npm install --global --production windows-build-tools`
The first somtum reindex downloads a ~30 MB ONNX model from Hugging Face. This requires internet access and may be slow. Subsequent runs use the cached model.
On an air-gapped machine or if you prefer not to use embeddings:
somtum config set retrieval.embeddings.enabled false
somtum config set retrieval.strategy bm25

BM25 works fully offline and is fast at any corpus size.
Auto-inject is the first thing to check. Since v1.3.0, Somtum automatically injects top-k memories into every UserPromptSubmit via the cache hook — no manual recall step needed.
- Confirm the cache hook is installed: `somtum doctor` → look for `hooks_installed ✓`
- If not installed: `somtum init --cache` (or `somtum init --all`)
- Confirm injection is enabled: `somtum config get injection.enabled` → should be `true`
- Check that memories actually exist: `somtum stats` → `memories > 0`
Using the MCP server (somtum init --all), Claude can also call recall directly when uncertain. If it's not happening:
- Confirm `.mcp.json` exists: `cat .mcp.json`
- Restart Claude Code to pick up the MCP config
doctor warns when memories are older than 90 days with no confirmed retrievals. These are observations that have never come up in a search. Options:
# Review them before deciding
somtum search "old topic"
# Promote useful ones to workspace scope via MCP
remember("...", scope="workspace")
# Remove irrelevant ones
somtum purge --older-than 90d

To hard-reset a project's memory (irreversible):
somtum reset
# Permanently delete all memories for this project? [y/N] y
# somtum: reset complete — project <id> wiped.

To just clear everything softly (recoverable via `somtum export --include-deleted`):

somtum forget --all

Contributions are welcome! See CONTRIBUTING.md for the guide.
Important: This project uses changesets for versioning. Every PR must include a changeset file generated by running pnpm changeset.
Licensed under the MIT License.

{
  "extraction": {
    "model": "claude-haiku-4-5-20251001",
    "trigger": ["SessionEnd", "PreCompact"],
    "max_observations_per_session": 10,
  },
  "cache": {
    "enabled": true,
    "fuzzy_match": true,
    "fuzzy_threshold": 0.92, // raise to 0.95 once you have false-hit signal
    "max_entries": 10000,
    "ttl_days": 90,
  },
  "retrieval": {
    "strategy": "bm25", // bm25 | embeddings | index | hybrid
    "k": 8,
    "rerank_model": "claude-haiku-4-5-20251001",
    "bm25": { "enabled": true },
    "embeddings": {
      "enabled": false, // set true to download the 30 MB ONNX model
      "model": "Xenova/bge-small-en-v1.5",
    },
    "index": {
      "enabled": false, // set true to use Haiku as the retriever
      "model": "claude-haiku-4-5-20251001",
    },
  },
  // Auto-inject: BM25-retrieved memories prepended to every UserPromptSubmit.
  // Uses the hot path (< 2 ms at 1k memories). Disable if you prefer pull-only.
  "injection": {
    "enabled": true,
    "k": 3, // max memories injected per prompt
    "max_chars": 1500, // hard cap on injected text
    "min_relevance_score": 0, // raise (e.g. 1.0) to only inject high-scoring matches
    "show_budget": true, // prepend "[somtum] injected N/M memories (~X tokens)" line
  },
  "file_gating": {
    "enabled": true, // intercepts large file reads; serves cached summary instead
    "min_file_size_tokens": 300,
    "exclude_globs": ["**/*.env", "**/secrets/**"],
  },
  "privacy": {
    "telemetry": false,
    "redact_patterns": [
      "api[_-]?key\\s*[:=]\\s*[\"']?[A-Za-z0-9_\\-]{8,}[\"']?",
      "bearer\\s+[A-Za-z0-9_\\-.]+",
      "sk-[A-Za-z0-9_\\-]{20,}",
      "xox[baprs]-[A-Za-z0-9-]{10,}",
      "AKIA[0-9A-Z]{16}",
    ],
  },
  "sync": {
    "enabled": false,
    "backend": "ssh",
    "remote": null, // e.g. "user@host:/home/user/.somtum/projects/<id>"
  },
}