⚠️ Disclaimer: This project is not production-ready. It is currently an educational project intended for learning, experimentation, and research purposes only. Do not use it in production environments or for critical workloads.
memory_mcp is a Rust-based Model Context Protocol (MCP) server that gives AI agents a structured long-term memory layer backed by SurrealDB.
It is designed for workflows where agents need more than short-lived chat context: episodic memory, extracted entities and facts, bi-temporal validity, ranked context assembly, and graph-style relationships between people, companies, tasks, and decisions.
- Overview
- What it provides
- Architecture
- Quick start
- Configuration
- MCP tools
- Development
- Testing
- Project layout
- Documentation
- Contributing
- License
Memory MCP implements a memory system for AI agents with a few core goals:
- preserve important source material as episodes
- extract entities, facts, and links in a deterministic way
- track knowledge over both valid time and transaction time
- assemble compact, relevant context for downstream reasoning
- support scope-aware retrieval and access filtering
In practice, that means an agent can ingest content such as emails, notes, or working documents, resolve entities consistently, store facts with provenance, and later ask for ranked context instead of replaying entire histories.
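The bi-temporal idea can be illustrated with a small sketch. It is not the project's actual model: the `Fact` struct, the `t_invalid` and `t_ingested` field names, and the use of integer timestamps are assumptions made for illustration. It shows how a fact can be filtered by both valid time (when it was true) and transaction time (when the system learned it).

```rust
/// Hypothetical bi-temporal fact record; times are Unix seconds for simplicity.
#[derive(Debug, Clone, PartialEq)]
pub struct Fact {
    pub statement: String,
    /// Valid time: when the fact became true in the world.
    pub t_valid: i64,
    /// Valid time: when the fact stopped being true (None = still valid).
    pub t_invalid: Option<i64>,
    /// Transaction time: when the system recorded the fact (assumed field name).
    pub t_ingested: i64,
}

impl Fact {
    /// True if the fact was valid at `valid_at` and already recorded at `known_at`.
    pub fn visible(&self, valid_at: i64, known_at: i64) -> bool {
        let started = self.t_valid <= valid_at;
        let not_ended = self.t_invalid.map_or(true, |end| valid_at < end);
        let recorded = self.t_ingested <= known_at;
        started && not_ended && recorded
    }
}

/// Returns the facts visible for a given (valid time, transaction time) pair.
pub fn as_of(facts: &[Fact], valid_at: i64, known_at: i64) -> Vec<&Fact> {
    facts.iter().filter(|f| f.visible(valid_at, known_at)).collect()
}
```

Asking "what was true at time X, as known at time Y" then becomes a single call to `as_of`, which is the property that lets an agent reproduce an earlier view of its memory.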
- Bi-temporal knowledge model for valid time and ingestion time
- Episode ingestion for storing raw source material
- Entity resolution with alias handling and deterministic IDs
- Fact extraction for metrics, promises, and other structured knowledge
- Context assembly for ranked retrieval by query, scope, and time cutoff
- Graph relationships between episodes, entities, and facts
- Optional semantic retrieval providers, including in-process `local-candle`
- Optional local GLiNER NER for zero-shot entity extraction
- SurrealDB support for embedded and remote deployments
- Optional watch-mode ingestion for filesystem-backed auto-ingest workflows
- MCP-native interface for tool-driven agent workflows
- Structured logging with predictable operational behavior
At a high level, the project follows a layered Rust design:
Agent / MCP client
│
▼
Memory MCP server (`src/mcp/`)
│
▼
Memory service layer (`src/service/`)
│
▼
Storage layer (`src/storage.rs` + SurrealDB)
| Module | Purpose |
|---|---|
| `mcp` | MCP handlers, params, parsers, and tool-facing types |
| `service` | Core business logic for ingest, extract, retrieval, graph operations, and validation |
| `storage` | Database integration and persistence helpers |
| `models` | Shared domain models and request/response types |
| `config` | Environment-driven configuration loading |
| `logging` | Logging setup and log-level utilities |
- Rust 1.85+
- SurrealDB-compatible runtime configuration
cargo build --release
cargo install --path .
cargo run
The binary uses stdio transport, which makes it suitable for local MCP client integration.
SURREALDB_URL=rocksdb://./data/surreal.db \
SURREALDB_DB_NAME=memory \
SURREALDB_NAMESPACES=org,personal \
SURREALDB_USERNAME=root \
SURREALDB_PASSWORD=root \
RUST_LOG=info \
cargo run --quiet --bin memory_mcp
The watch mode turns a directory into a passive memory intake pipe: drop or save files and the server auto-ingests them into memory without any manual tool calls.
Why this exists
In real workflows, important content already lands on disk: email exports (.eml), meeting notes (.md, .docx), requirements specs, sizing documents. Instead of manually calling `ingest` for each file, the watcher monitors a directory and feeds new or changed files through the full extraction pipeline (NER → entity resolution → fact extraction → embedding) automatically.
What it does
- Recursively watches a directory for file create and modify events
- Filters to supported file types only; unsupported files are silently skipped
- Deduplicates rapid successive events per file (coalescing)
- Dispatches qualifying files through the same `ingest` → `extract` pipeline used by MCP tool calls
- Logs every step with structured events (visible at `RUST_LOG=info/debug/trace`)
Supported file types
| Extension | Format | Extracted content |
|---|---|---|
| `.pdf` | PDF | Text content (pages, paragraphs) |
| `.docx` | Word document | Body text, headings, tables |
| `.xlsx` | Spreadsheet | Cell values, sheet structure |
| `.pptx` | Presentation | Slide text, speaker notes |
| `.md`, `.markdown` | Markdown | Headings, lists, code blocks |
| `.txt` | Plain text | Raw text content |
| `.eml` | Email message | Subject, sender, recipients, body, date |
Files with other extensions (.json, .png, .zip, etc.) are silently skipped.
User scenario
Example: auto-ingest a project inbox
# Terminal 1 — start the MCP server (stdio, for VS Code / Copilot)
SURREALDB_URL=rocksdb://./data/surreal.db \
SURREALDB_NAMESPACES=org \
SURREALDB_USERNAME=root \
SURREALDB_PASSWORD=root \
RUST_LOG=info \
cargo run --quiet --bin memory_mcp
# Terminal 2 — start the watcher on a project inbox
cargo run --features cli-watch --quiet -- \
watch ~/projects/atlas/inbox \
--project atlas \
--scope org \
--interval 5
Now any file dropped or saved in ~/projects/atlas/inbox/ is automatically ingested:
# Drop an email export
cp ~/Downloads/kaspersky_july_2025.eml ~/projects/atlas/inbox/
# Save a requirements spec
echo "# Air-gapped deployment requirement..." > ~/projects/atlas/inbox/airgap_req.md
# Drop a sizing document
cp ~/Documents/hw_sizing.xlsx ~/projects/atlas/inbox/
Within `--interval` seconds, each file is:
- Detected by the watcher
- Parsed (format-specific extraction)
- Ingested as an episode with `source_id = "watch:<path>"`
- Available for `extract` and `assemble_context` queries
No manual ingest tool call needed.
How it works internally
Architecture flow
CLI: memory_mcp watch <dir> [--project X] [--scope Y] [--interval Z]
│
▼
FsWatcher::run_with_interval(dir, project, scope, interval, service)
│
├─ Validate: directory must exist and be readable
├─ Initialize: notify::RecommendedWatcher (polling mode)
├─ Watch: dir recursively for filesystem events
│
└─ EVENT LOOP (blocks on rx.recv())
│
├─ Event arrives (Create / Modify / Remove / Access / …)
│
├─ Filter: keep only Create + Modify events
├─ Filter: keep only supported file types (7 formats)
│
├─ Dedup: if same file triggered an ingest within
│ --interval seconds → skip (logged at trace)
│
├─ Determine metadata:
│ • source_id = "watch:<file_path>"
│ • source_type = "email" (.eml) or "document" (all others)
│ • project = CLI flag value
│ • scope = CLI flag value
│
├─ Dispatch: service.ingest(IngestRequest { content: <file_path> })
│ └─ Internally: read file → detect format → extract text → chunk
│
├─ Log: watcher.ingest_complete (with episode_id) at Info
│
└─ On error: log at Error and TERMINATE the watcher (fail-fast)
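The filter and metadata steps in the flow above can be sketched as two small functions. This is only an illustration: the function names are hypothetical, and comparing extensions case-insensitively is an assumption the flow does not state.

```rust
use std::path::Path;

/// Maps a path to the `source_type` from the flow above, or `None` when the
/// extension is not one of the supported formats.
/// Assumption: extensions are compared case-insensitively.
pub fn source_type_for(path: &Path) -> Option<&'static str> {
    let ext = path.extension()?.to_str()?.to_ascii_lowercase();
    match ext.as_str() {
        "eml" => Some("email"),
        "pdf" | "docx" | "xlsx" | "pptx" | "md" | "markdown" | "txt" => Some("document"),
        _ => None, // unsupported types are skipped
    }
}

/// Builds the `source_id` in the "watch:<file_path>" form.
pub fn source_id_for(path: &Path) -> String {
    format!("watch:{}", path.display())
}
```

A caller would skip the event when `source_type_for` returns `None` and otherwise attach both values to the ingest request.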
Deduplication behavior
How rapid saves are handled
When you save a file, editors often fire multiple filesystem events in quick succession (write + metadata + timestamp). The watcher prevents duplicate ingests:
- Each file's canonical path (symlinks resolved, `..` normalized) is tracked in a `HashMap`
- If the same file triggers another event within `--interval` seconds of its last ingest, the new event is skipped
- Skipped events are logged at `trace` level with reason `interval_dedup`
Example with --interval 5:
12:00:00 — note.md modified → ingested ✓
12:00:01 — note.md modified again → skipped (dedup, 1s < 5s)
12:00:02 — note.md modified again → skipped (dedup, 2s < 5s)
12:00:06 — note.md modified again → ingested ✓ (6s ≥ 5s)
The --interval flag serves dual purposes: it controls both the poll frequency (how often notify scans the directory) and the dedup window (minimum time between ingests of the same file).
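The dedup window described above can be sketched with a `HashMap` keyed by path. The `IntervalDedup` type and its method names are hypothetical, and the sketch assumes the caller has already canonicalized the path; it is not the watcher's actual code.

```rust
use std::collections::HashMap;
use std::path::{Path, PathBuf};
use std::time::{Duration, Instant};

/// Hypothetical per-file dedup window keyed by (already canonicalized) path.
pub struct IntervalDedup {
    interval: Duration,
    last_ingest: HashMap<PathBuf, Instant>,
}

impl IntervalDedup {
    pub fn new(interval: Duration) -> Self {
        Self { interval, last_ingest: HashMap::new() }
    }

    /// Returns true when the file should be ingested at `now` and records
    /// that ingest. Returns false when the event falls inside the window.
    pub fn should_ingest(&mut self, path: &Path, now: Instant) -> bool {
        if let Some(last) = self.last_ingest.get(path) {
            if now.duration_since(*last) < self.interval {
                return false; // the real watcher logs this as `interval_dedup`
            }
        }
        self.last_ingest.insert(path.to_path_buf(), now);
        true
    }
}
```

Passing `now` in explicitly keeps the logic easy to check against the timeline above: with a 5-second interval, events at 0 s and 6 s are ingested while events at 1 s and 2 s are skipped.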
Command-line reference
Flags and defaults
memory_mcp watch <dir> [OPTIONS]
Required:
<dir> Directory to watch (must exist and be readable)
Optional:
--project <name> Attach ingested episodes to a project (default: none)
--scope <scope> Scope for namespace resolution (default: "org")
--interval <secs> Poll interval + dedup window in seconds (default: 2, min: 1)
Important notes:
- The `watch` subcommand requires the `cli-watch` feature. Without it, the binary returns an error.
- The watcher is fail-fast: any ingest error terminates the entire watch loop.
- The watcher does not diff content: every qualifying event triggers a full re-ingest of the file.
- `Remove`, `Access`, and `Metadata` change events are ignored.
Logging during watch
What to expect at each log level
| Level | Watch events you'll see |
|---|---|
| `info` | `watcher.ready` (startup), `watcher.ingest_complete` (with `episode_id`) |
| `debug` | `watcher.ingest_dispatch` (file path, source_type, project, scope) |
| `trace` | `watcher.event_skipped` (dedup reason, elapsed vs interval) |
| `warn` | none specific to watch |
| `error` | `watcher.ingest_error` (fatal: watcher terminates) |
Example info-level output for a successful ingest:
[2026-04-13T12:00:00.123Z] INFO req=- op=watcher.ingest_complete episode_id=episode:abc123 path=watch:/inbox/note.md source_type=document
If you run the server directly from this workspace, a stdio host configuration can point at Cargo:
{
"mcpServers": {
"memory-mcp": {
"command": "cargo",
"args": ["run", "--quiet", "--bin", "memory_mcp"],
"cwd": "/path/to/memory_mcp",
"env": {
"SURREALDB_URL": "rocksdb://./data/surreal.db",
"SURREALDB_DB_NAME": "memory",
"SURREALDB_NAMESPACES": "org,personal",
"SURREALDB_USERNAME": "root",
"SURREALDB_PASSWORD": "root",
"RUST_LOG": "info"
}
}
}
}
After `cargo build --release` or `cargo install --path .`, you can switch `command` to `./target/release/memory_mcp` or `memory_mcp` respectively.
Configuration is loaded from environment variables.
| Variable | Required | Description |
|---|---|---|
| `SURREALDB_DB_NAME` | Yes | Database name |
| `SURREALDB_NAMESPACES` | Yes | Comma-separated namespace list |
| `SURREALDB_USERNAME` | Yes | Database username |
| `SURREALDB_PASSWORD` | Yes | Database password |
| `SURREALDB_URL` | Yes for remote mode | SurrealDB connection URL |
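As a rough illustration of how the required variables might be validated, the sketch below loads them through a lookup function that stands in for `std::env::var(..).ok()`. The `DbConfig` struct and `load_db_config` function are hypothetical, and treating "remote mode" as "`SURREALDB_EMBEDDED` is not `true`" is an assumption rather than the project's documented rule.

```rust
/// Hypothetical subset of the settings; field names are illustrative.
#[derive(Debug, PartialEq)]
pub struct DbConfig {
    pub db_name: String,
    pub namespaces: Vec<String>,
    pub username: String,
    pub password: String,
    pub url: Option<String>,
    pub embedded: bool,
}

/// Loads settings through `get`, which stands in for `std::env::var(..).ok()`.
pub fn load_db_config(get: impl Fn(&str) -> Option<String>) -> Result<DbConfig, String> {
    let required =
        |key: &str| get(key).ok_or_else(|| format!("missing required variable {}", key));
    // Assumption: remote mode is any run where SURREALDB_EMBEDDED is not "true".
    let embedded = get("SURREALDB_EMBEDDED").map_or(false, |v| v == "true");
    let url = get("SURREALDB_URL");
    if !embedded && url.is_none() {
        return Err("SURREALDB_URL is required in remote mode".to_string());
    }
    // Split the comma-separated namespace list and drop empty entries.
    let namespaces = required("SURREALDB_NAMESPACES")?
        .split(',')
        .map(|ns| ns.trim().to_string())
        .filter(|ns| !ns.is_empty())
        .collect();
    Ok(DbConfig {
        db_name: required("SURREALDB_DB_NAME")?,
        namespaces,
        username: required("SURREALDB_USERNAME")?,
        password: required("SURREALDB_PASSWORD")?,
        url,
        embedded,
    })
}
```

In a real binary the lookup would be `|key| std::env::var(key).ok()`; injecting it keeps the validation testable without touching the process environment.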
| Variable | Description |
|---|---|
| `SURREALDB_EMBEDDED` | Set to `true` to use embedded mode |
| `SURREALDB_DATA_DIR` | Custom embedded data directory |
| `RUST_LOG` | Logging level such as `trace`, `debug`, `info`, `warn`, or `error` |
| `QUERY_LOGGING_ENABLED` | Set to `true` to persist `assemble_context` analytics rows into `query_log` (default: `false`) |
| `QUERY_LOG_RETENTION_DAYS` | Days to retain persisted `query_log` analytics before best-effort pruning (default: `90`) |
| `LIFECYCLE_ENABLED` | Enable background lifecycle jobs (`true`/`false`, default: `false`) |
| `LIFECYCLE_DECAY_INTERVAL_SECS` | Decay worker interval in seconds (default: `3600`) |
| `LIFECYCLE_ARCHIVAL_INTERVAL_SECS` | Archival worker interval in seconds (default: `86400`) |
| `LIFECYCLE_DECAY_THRESHOLD` | Confidence threshold for fact invalidation (default: `0.3`) |
| `LIFECYCLE_ARCHIVAL_AGE_DAYS` | Days before archiving episodes (default: `90`) |
| `LIFECYCLE_DECAY_HALF_LIFE_DAYS` | Half-life in days for decay computation (default: `365`) |
| `EMBEDDINGS_ENABLED` | Enable semantic retrieval providers (default: `false`) |
| `EMBEDDINGS_PROVIDER` | Embedding backend: `local-candle`, `openai-compatible`, or `ollama` |
| `EMBEDDINGS_MODEL` | Model identifier for the selected embedding provider |
| `EMBEDDINGS_MODEL_DIR` | Optional local cache directory for `local-candle` |
| `EMBEDDINGS_MAX_TOKENS` | Max token budget before `local-candle` chunks long inputs |
| `EMBEDDINGS_TIMEOUT_SECS` | Timeout for remote embedding calls |
| `EMBEDDINGS_SIMILARITY_THRESHOLD` | Minimum cosine similarity for semantic matches |
| `EMBEDDINGS_API_KEY` | Optional bearer token for OpenAI-compatible providers |
| `NER_PROVIDER` | Entity extraction backend: `regex`, `anno`, or `local-gliner` |
| `NER_MODEL` | HuggingFace repo for `local-gliner` |
| `NER_MODEL_DIR` | Optional local cache directory for `local-gliner` |
| `NER_LABELS` | Comma-separated runtime labels for `local-gliner` |
| `NER_THRESHOLD` | Confidence threshold for `local-gliner` acceptance |
| `NER_BATCH_SIZE` | Batch size for `local-gliner` inference |
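To make the decay settings concrete, here is a sketch of how a half-life and a threshold could interact. The exact formula the lifecycle worker uses is not stated here, so the exponential form below and the strict "below threshold" comparison are assumptions, and both function names are hypothetical.

```rust
/// Assumed decay model: confidence halves every `half_life_days`.
pub fn decayed_confidence(initial: f64, age_days: f64, half_life_days: f64) -> f64 {
    initial * 0.5_f64.powf(age_days / half_life_days)
}

/// True when the decayed confidence falls below the invalidation threshold.
pub fn should_invalidate(initial: f64, age_days: f64, half_life_days: f64, threshold: f64) -> bool {
    decayed_confidence(initial, age_days, half_life_days) < threshold
}
```

Under these assumptions and the defaults (`365` days, threshold `0.3`), a fact that starts at confidence 0.8 would still be above the threshold after one year and below it after two.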
SURREALDB_DB_NAME=memory
SURREALDB_NAMESPACES=org,personal
SURREALDB_USERNAME=root
SURREALDB_PASSWORD=root
SURREALDB_URL=ws://127.0.0.1:8000/rpc
SURREALDB_EMBEDDED=false
RUST_LOG=info
QUERY_LOGGING_ENABLED=false
QUERY_LOG_RETENTION_DAYS=90
# Lifecycle background jobs (optional)
LIFECYCLE_ENABLED=true
LIFECYCLE_DECAY_INTERVAL_SECS=3600
LIFECYCLE_ARCHIVAL_INTERVAL_SECS=86400
LIFECYCLE_DECAY_THRESHOLD=0.3
LIFECYCLE_ARCHIVAL_AGE_DAYS=90
# LIFECYCLE_DECAY_HALF_LIFE_DAYS=365
# Optional local model configuration
# EMBEDDINGS_ENABLED=true
# EMBEDDINGS_PROVIDER=local-candle
# EMBEDDINGS_MODEL=intfloat/multilingual-e5-small
# EMBEDDINGS_MODEL_DIR=./data/models/intfloat/multilingual-e5-small
# NER_PROVIDER=local-gliner
# NER_MODEL=urchade/gliner_multi-v2.1
The server supports three embedding backends, controlled by `EMBEDDINGS_PROVIDER`:
| Provider | What it is | Default dimension | Requires network? |
|---|---|---|---|
| `local-candle` | In-process BERT model via Candle (Rust ML) | 384 | Only for first download |
| `openai-compatible` | External OpenAI-compatible HTTP API | 1536 (configurable) | Yes, every call |
| `ollama` | External Ollama HTTP API | 1536 (configurable) | Yes, every call |
- The server resolves a target embedding identity from the configured provider, model, base URL, and effective dimension.
- That identity is persisted per namespace in `embedding_state:fact` as an `active_signature` once the namespace is known to be compatible.
- In normal `serve`/`watch` startup, every configured namespace is checked before semantic retrieval is enabled.
- If a namespace is already marked `ready` for the same signature, semantic retrieval starts normally.
- If a namespace is missing state but is clearly compatible (empty namespace or sampled legacy vectors all match the current dimension), the service bootstraps a `ready` state automatically.
- If any namespace is marked `rebuilding`, `failed`, or has embeddings that do not match the configured target, the service degrades to lexical/graph-only retrieval instead of mixing incompatible vectors.
That last point is the safety rail: after a provider switch, normal MCP traffic keeps working, but semantic retrieval is intentionally disabled until embeddings are rebuilt.
To switch, change the environment variables and restart the process. The server does not silently rewrite old vectors during normal startup.
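The startup check described above amounts to an all-or-nothing gate across namespaces. The sketch below models it with hypothetical types (`NamespaceState`, `RetrievalMode`, `startup_mode`) and a plain string signature; the real state lives in the database and may carry more detail.

```rust
/// Hypothetical per-namespace compatibility state.
#[derive(Debug, Clone, PartialEq)]
pub enum NamespaceState {
    /// Marked ready for the given signature.
    Ready(String),
    /// No stored state; `compatible` is true for an empty namespace or when
    /// sampled legacy vectors all match the current dimension.
    Missing { compatible: bool },
    Rebuilding,
    Failed,
}

#[derive(Debug, PartialEq)]
pub enum RetrievalMode {
    Semantic,
    LexicalGraphOnly,
}

/// Enables semantic retrieval only if every namespace is compatible with `target`.
pub fn startup_mode(states: &[NamespaceState], target: &str) -> RetrievalMode {
    let all_ok = states.iter().all(|state| match state {
        NamespaceState::Ready(sig) => sig == target,
        NamespaceState::Missing { compatible } => *compatible,
        NamespaceState::Rebuilding | NamespaceState::Failed => false,
    });
    if all_ok {
        RetrievalMode::Semantic
    } else {
        RetrievalMode::LexicalGraphOnly
    }
}
```

A single incompatible namespace is enough to fall back to lexical and graph retrieval, which is what prevents vectors from different providers from being mixed.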
Instead, the runtime now separates two modes:
- Normal mode (`memory_mcp` or `memory_mcp watch ...`): safe startup checks run first. If stored embeddings are incompatible with the configured target, semantic retrieval is disabled and the process logs `embedding.rebuild_required`.
- Maintenance mode (`memory_mcp reembed`): a dedicated one-shot command that forces the configured embedding provider on, rewrites every fact embedding, persists progress, and exits when complete.
This keeps the public MCP tool surface unchanged while giving operators a deterministic recovery path after provider changes.
Use the maintenance command after changing any embedding target that should become authoritative for stored facts:
- `EMBEDDINGS_PROVIDER`
- `EMBEDDINGS_MODEL`
- `EMBEDDINGS_BASE_URL`
- effective embedding dimension (including override/probe changes)
Example:
memory_mcp reembed
From the workspace during development:
cargo run --quiet --bin memory_mcp -- reembed
What the command does:
- Resolves the configured target signature and dimension.
- Loads or creates a persisted control-plane job record at `embedding_job:fact_reembed`.
- Marks each namespace as `rebuilding` in `embedding_state`.
- Rewrites all fact embeddings, including invalidated / historical facts.
- Stores fresh metadata on each fact (`embedding_provider`, `embedding_model`, `embedding_dimension`, `embedding_signature`, `embedding_updated_at`).
- Marks namespaces `ready` on success, or `failed` if the job stops on an error.
The job is restart-safe for the same target signature: if the process stops mid-run, invoking memory_mcp reembed again resumes from the persisted per-namespace cursor instead of starting from scratch.
The maintenance flow is designed to be debuggable from logs alone.
Key structured events include:
- `reembed.job_started`
- `reembed.namespace_started`
- `reembed.batch_fetched`
- `reembed.progress`
- `reembed.fact_failed`
- `reembed.job_failed`
- `main.reembed_completed`
Progress logs include counts and throughput data such as:
- processed / succeeded / failed facts
- total facts
- facts per second
- ETA in seconds (when enough progress exists to estimate it)
At the end of a successful run, main.reembed_completed logs a compact summary with totals and elapsed time.
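The throughput and ETA figures in the progress logs can be derived from three counters. The sketch below is a guess at that arithmetic, with a hypothetical `Progress` struct; it returns `None` for the ETA until some work and elapsed time exist, mirroring the "when enough progress exists" caveat.

```rust
/// Hypothetical progress snapshot for a re-embedding run.
pub struct Progress {
    pub processed: u64,
    pub total: u64,
    pub elapsed_secs: f64,
}

impl Progress {
    /// Facts per second, or None before any time or work has been recorded.
    pub fn facts_per_sec(&self) -> Option<f64> {
        if self.processed == 0 || self.elapsed_secs <= 0.0 {
            None
        } else {
            Some(self.processed as f64 / self.elapsed_secs)
        }
    }

    /// Estimated seconds remaining, available only once throughput is known.
    pub fn eta_secs(&self) -> Option<f64> {
        let rate = self.facts_per_sec()?;
        let remaining = self.total.saturating_sub(self.processed);
        Some(remaining as f64 / rate)
    }
}
```

For example, 200 of 1000 facts processed in 10 seconds gives 20 facts per second and an ETA of 40 seconds.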
To restore semantic retrieval safely after a provider change:
1. Change the embedding environment variables.
2. Run `memory_mcp reembed` (or `cargo run --quiet --bin memory_mcp -- reembed`).
3. Wait for the maintenance run to complete successfully.
4. Start the normal MCP server again.
Until step 3 completes, the server may intentionally run with semantic retrieval disabled while lexical and graph-based retrieval continue to work.
For external embedding backends (openai-compatible and ollama), the server now treats transient provider issues differently from hard configuration errors.
Bounded retries with backoff are applied automatically for:
- request timeouts / connect failures
- HTTP `429` rate limits
- retryable upstream statuses such as `408`, `425`, `500`, `502`, `503`, and `504`
If those retries still do not recover the provider:
- write-paths keep the fact write and schedule an in-memory background retry to fill in the missing embedding later;
- query-time semantic retrieval falls back to lexical / graph-only results for the current request and schedules a background warm-up of a short-lived query embedding cache for repeated identical queries;
- `memory_mcp reembed` still stops after bounded retries and keeps the maintenance job in a failed state so operators can fix the provider and rerun it explicitly.
Important limitation: the deferred background path is intentionally in-memory only right now. If the process restarts before a background retry succeeds, those deferred retries are lost and will be attempted again only when a new request hits the same path.
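A bounded retry loop of the kind described above might look like the following sketch. The status list comes from the text (which presents it as examples, not an exhaustive set); the exponential backoff shape, the 100 ms base, the 5 s cap, and all function names are assumptions. Sleeping is delegated to a `wait` callback so the logic can be exercised without real delays.

```rust
use std::time::Duration;

/// Statuses the text names as retryable (429 plus 408, 425, 500, 502, 503, 504).
pub fn is_retryable_status(status: u16) -> bool {
    matches!(status, 408 | 425 | 429 | 500 | 502 | 503 | 504)
}

/// Assumed exponential backoff: base * 2^attempt, capped at `max`.
pub fn backoff_delay(attempt: u32, base: Duration, max: Duration) -> Duration {
    let factor = 2u32.saturating_pow(attempt);
    base.checked_mul(factor).map_or(max, |d| d.min(max))
}

/// Runs `op` up to `max_attempts` times, retrying only retryable statuses.
/// `op` returns Ok(value) or Err(http_status); `wait` receives each delay.
pub fn retry_bounded<T>(
    max_attempts: u32,
    mut op: impl FnMut() -> Result<T, u16>,
    mut wait: impl FnMut(Duration),
) -> Result<T, u16> {
    let mut attempt = 0;
    loop {
        match op() {
            Ok(value) => return Ok(value),
            Err(status) if is_retryable_status(status) && attempt + 1 < max_attempts => {
                wait(backoff_delay(attempt, Duration::from_millis(100), Duration::from_secs(5)));
                attempt += 1;
            }
            // Non-retryable status, or the retry budget is exhausted.
            Err(status) => return Err(status),
        }
    }
}
```

Hard configuration errors such as a 401 return immediately, while transient failures consume the retry budget before the caller falls back to the deferred paths listed above.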
The EMBEDDINGS_SIMILARITY_THRESHOLD (default 0.7) filters semantic search results: only facts with cosine similarity ≥ threshold are returned. After a provider switch, this threshold effectively filters out all old facts because cross-provider similarity scores are meaningless.
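The threshold filter can be pictured as a cosine similarity computation followed by a cutoff. The sketch below uses hypothetical function names and an in-memory slice of `(id, vector)` pairs; the real server queries the database instead.

```rust
/// Cosine similarity of two equal-length vectors; returns 0.0 for a zero vector.
pub fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        0.0
    } else {
        dot / (norm_a * norm_b)
    }
}

/// Keeps (id, score) pairs whose similarity to `query` meets the threshold.
pub fn semantic_matches<'a>(
    query: &[f32],
    facts: &'a [(&'a str, Vec<f32>)],
    threshold: f32,
) -> Vec<(&'a str, f32)> {
    facts
        .iter()
        .map(|(id, vector)| (*id, cosine_similarity(query, vector)))
        .filter(|(_, score)| *score >= threshold)
        .collect()
}
```

Vectors produced by a different provider or model live in a different embedding space, so their scores against a new query vector carry no meaning, which is why they typically fall under the cutoff.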
If you only use lexical (BM25/FTS) retrieval and graph-expanded context assembly, the provider switch has no impact on those retrieval tiers — they do not use embeddings.
assemble_context remains lexical/BM25-first, but now applies deterministic query-mode routing before ranking results:
- explicit `view_mode` still wins;
- temporal-history queries such as `timeline of Atlas changes in Q1 2026` automatically resolve to timeline ordering when `view_mode` is omitted;
- named entity anchors can expand into 1-hop graph context (2 hops for explicit connection/path questions) without requiring semantic retrieval.
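The first two routing rules can be sketched as a small resolver. The two-variant `ViewMode` enum is a simplification, the keyword list is invented for illustration, and mapping any unrecognized explicit value to the default mode is an assumption; the server's actual detection logic is not shown here.

```rust
#[derive(Debug, PartialEq)]
pub enum ViewMode {
    Default,
    Timeline,
}

/// Resolves the view mode: an explicit value wins, otherwise temporal-history
/// wording routes to timeline. The keyword list is an assumption for this sketch.
pub fn resolve_view_mode(explicit: Option<&str>, query: &str) -> ViewMode {
    match explicit {
        Some("timeline") => return ViewMode::Timeline,
        Some(_) => return ViewMode::Default,
        None => {}
    }
    let q = query.to_lowercase();
    let temporal = ["timeline", "history of", "changes in", "over time"];
    if temporal.iter().any(|kw| q.contains(*kw)) {
        ViewMode::Timeline
    } else {
        ViewMode::Default
    }
}
```

Because the resolver is a pure function of its inputs, the same query always routes the same way, which is the deterministic property the text emphasizes.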
Persisted query analytics are optional and disabled by default.
When QUERY_LOGGING_ENABLED=true, successful assemble_context calls write a row to the query_log table with:
- `scope`
- `query`
- `project`
- `view_mode`
- `resolved_view_mode`
- `query_flags`
- `retrieval_tiers`
- `result_count`
- `latency_ms`
- `retrieval_tier`
- `cache_hit`
- `logged_at`
Old query_log rows are pruned with a best-effort retention pass after successful writes. By default, rows older than 90 days are deleted; override this with QUERY_LOG_RETENTION_DAYS=<days>.
This switch only controls database-backed query analytics. Regular runtime logs still follow RUST_LOG.
memory_mcp emits structured logs for these features at the standard levels below:
- `info`: lifecycle milestones and successful high-level operations such as `ingest`, `extract`, `assemble_context`, watcher startup, watcher ingest completion, and community rebuild passes
- `debug`: feature-path decisions such as document ingest transport detection (`file`/`directory`/`url`/`inline`), project/view-mode selection, graph insight assembly, hub/community map building, and successful `query_log` writes when enabled
- `trace`: fine-grained diagnostics such as cache misses/sets, `query_log` skips when disabled, watcher dedup skips, retrieval-tier summaries, appended `experience` facts, and per-namespace community rebuild details
- `warn`: recoverable issues such as unknown `view_mode` fallback, access-heat tracking failures, query analytics write failures, and degraded worker passes
- `error`: terminal failures such as watcher ingest errors or process-level startup/serve failures
Recommended presets:
- `RUST_LOG=info` for normal local/server usage
- `RUST_LOG=debug` when validating new ingest/view-mode/graph behavior
- `RUST_LOG=trace` when debugging retrieval tiers, cache behavior, or watcher dedup decisions
An .env file already exists in the repository root, so you can keep local values there if your MCP host or shell loads it.
The public MCP surface is centered on a small set of high-value operations rather than endpoint-by-endpoint plumbing.
| Tool | Purpose |
|---|---|
| `ingest` | Store an episode with source metadata and timestamps |
| `extract` | Extract entities, facts, and links from an episode or raw content |
| `resolve` | Canonicalize an entity name and aliases into a stable entity record |
| `assemble_context` | Return ranked memory context for a query |
| `explain` | Expand context items with source citations and multi-source provenance |
| `invalidate` | Mark a fact as no longer valid as of a given time |
| `open_app` | Launch an optional MCP app session and return a session-backed resource URI |
| `app_command` | Execute coarse-grained actions against an open MCP app session |
When the MCP host supports resources, the server also exposes app discovery and session resources such as ui://memory/apps and ui://memory/app/{app}/{session_id} for inspector, diff, ingestion review, lifecycle, and graph views.
The explain() operation returns complete provenance lineage for each fact:
- Direct sources — episodes that directly generated the fact
- Linked sources — episodes connected via shared entities
Returns:
- `all_sources`: Array of provenance sources including:
  - `episode_id`: Source episode identifier
  - `episode_content`: Excerpt from the episode
  - `episode_t_ref`: Episode timestamp
  - `relationship`: "direct" (created fact) or "linked" (via entity)
  - `entity_path`: Path from fact to episode via entity (if linked)
This enables full audit trails, understanding of information propagation, and building trust through transparency.
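One way a client might model the `all_sources` entries is shown below. The field names follow the list above, but the Rust types (strings for identifiers and timestamps, an optional list for `entity_path`) and the `split_sources` helper are assumptions made for this sketch.

```rust
/// How a source episode relates to the explained fact.
#[derive(Debug, Clone, PartialEq)]
pub enum Relationship {
    Direct,
    Linked,
}

/// One entry of `all_sources`; field types are assumptions for this sketch.
#[derive(Debug, Clone, PartialEq)]
pub struct ProvenanceSource {
    pub episode_id: String,
    pub episode_content: String,
    pub episode_t_ref: String,
    pub relationship: Relationship,
    /// Present only for linked sources.
    pub entity_path: Option<Vec<String>>,
}

/// Splits sources into (direct, linked) for an audit view.
pub fn split_sources(
    all: &[ProvenanceSource],
) -> (Vec<&ProvenanceSource>, Vec<&ProvenanceSource>) {
    all.iter().partition(|s| s.relationship == Relationship::Direct)
}
```

Separating the two groups lets an audit view show the episodes that created a fact first and the entity-linked episodes as supporting context.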
This design lines up with the intent-driven MCP guidance reflected in the docs: fewer tools, clearer semantics, better outcomes.
As of 2026-03-27, memory_mcp implements adaptive memory alignment with SOTA research:
- Fact-augmented index keys: Entity names, aliases, and temporal markers (month-year, ISO dates) indexed at ingest for enriched BM25 retrieval. FTS matches on both `content` and `index_keys`.
- Heat-aware lifecycle: Recently-accessed facts protected from decay/archival via `access_count` and `last_accessed` fields. Retrieval increments by 1, explain increments by 3 (stronger signal).
- Timeline retrieval: `assemble_context` supports `view_mode=timeline` with optional `window_start`/`window_end` for chronological queries. Results sorted by `t_valid` (oldest first).
- LongMemEval-style acceptance tests: Coverage for multi-session reasoning, temporal reasoning, knowledge update, abstention, and direct fact lookup.
See docs/superpowers/specs/2026-03-27-sota-memory-alignment-design.md for target-state design and docs/MEMORY_SYSTEM_SPEC.md for current runtime contract.
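Two of the behaviors listed above are simple enough to sketch: the access-heat increments and timeline ordering. The `Heat` and `Access` types, the integer timestamps, the `(t_valid, text)` tuple shape, and the inclusive window bounds are all assumptions for illustration; only the +1 / +3 weights and the oldest-first ordering come from the text.

```rust
/// Hypothetical access signals; weights follow the text (retrieval +1, explain +3).
#[derive(Debug, Clone, Copy)]
pub enum Access {
    Retrieval,
    Explain,
}

#[derive(Debug, Default, PartialEq)]
pub struct Heat {
    pub access_count: u64,
    pub last_accessed: Option<i64>,
}

impl Heat {
    /// Records one access at time `now` (Unix seconds in this sketch).
    pub fn record(&mut self, kind: Access, now: i64) {
        self.access_count += match kind {
            Access::Retrieval => 1,
            Access::Explain => 3,
        };
        self.last_accessed = Some(now);
    }
}

/// Filters `(t_valid, text)` pairs to an optional window (assumed inclusive)
/// and sorts them by `t_valid`, oldest first.
pub fn timeline(
    mut facts: Vec<(i64, String)>,
    window_start: Option<i64>,
    window_end: Option<i64>,
) -> Vec<(i64, String)> {
    facts.retain(|(t, _)| {
        window_start.map_or(true, |s| *t >= s) && window_end.map_or(true, |e| *t <= e)
    });
    facts.sort_by_key(|(t, _)| *t);
    facts
}
```

One retrieval followed by one explain call leaves `access_count` at 4, which a lifecycle worker could then weigh when deciding whether a fact is still "hot".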
cargo check
cargo fmt
cargo clippy -- -D warnings
cargo doc --no-deps
- `src/main.rs`: main MCP server binary
MCP input/output schemas are exposed by the server itself through the protocol's tool metadata and remain regression-covered by the schema tests under src/mcp/.
Run the full test suite:
cargo test
Useful narrower runs:
cargo test --test service_integration
cargo test --test service_acceptance
cargo test --test tools_e2e
Verified in this remediation pass:
- `cargo test semantic_scaffolding --test service_integration` → `2 passed; 0 failed`
- `cargo test --test service_acceptance` → `11 passed; 0 failed`
- `cargo test --test service_integration` → `11 passed; 0 failed`
Coverage output is stored under coverage/ when generated with Tarpaulin.
.
├── AGENTS.md
├── Cargo.toml
├── README.md
├── docs/
├── scripts/
├── src/
│ ├── mcp/
│ ├── service/
│ ├── config.rs
│ ├── lib.rs
│ ├── logging.rs
│ ├── main.rs
│ ├── models.rs
│ └── storage.rs
└── tests/
- `docs/MEMORY_SYSTEM_SPEC.md`: full system specification
- `docs/SIMPLIFIED_SEARCH_REDESIGN_SPEC.md`: target-state spec for the upcoming breaking search simplification
- `docs/INTENT_DRIVEN_MCP_DESIGN_GUIDE.md`: curated references for intent- and skills-driven MCP design
- `docs/security-hardening-roadmap.md`: current query-surface inventory, deployment assumptions, and remaining hardening work
This repository follows the conventions in AGENTS.md.
In particular:
- keep public APIs stable unless a change is explicitly requested
- avoid introducing dependencies without approval
- prefer typed errors and deterministic behavior
- run formatting, clippy, and tests before considering work done
This project is licensed under the MIT license. See LICENSE for details.