127 changes: 127 additions & 0 deletions docs/advanced/mcp-server.md
@@ -0,0 +1,127 @@
# MCP Server

memv ships an [MCP](https://modelcontextprotocol.io) server that exposes its memory operations as tools any MCP-compatible client (Claude Desktop, Claude Code, Cursor, custom agents) can call.

## Install

```bash
uv add "memvee[mcp]"
# or
pip install "memvee[mcp]"
```

This pulls in the `mcp` package alongside memv. Combine with other extras as needed, e.g. `memvee[mcp,postgres]`.

## Run

```bash
memv-mcp --db-url memory.db --llm-model openai:gpt-4o-mini
```

By default the server speaks `stdio` — the transport every desktop MCP client expects.
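To check the stdio transport without a desktop client, a small Python client can spawn the server and ask which tools it advertises. This is a minimal sketch, assuming the `mcp` Python SDK's client API (`ClientSession`, `StdioServerParameters`, `stdio_client`); the helper names `server_command` and `list_tool_names` are illustrative and not part of memv.

```python
from typing import List, Optional, Tuple


def server_command(db_url: str, llm_model: Optional[str] = None) -> Tuple[str, List[str]]:
    """Build the command and argument list used to spawn memv-mcp (hypothetical helper)."""
    args = ["--db-url", db_url]
    if llm_model is not None:
        args += ["--llm-model", llm_model]
    return "memv-mcp", args


async def list_tool_names(db_url: str) -> List[str]:
    """Spawn memv-mcp over stdio and return the names of the tools it exposes."""
    # Imported lazily so this module still loads when the `mcp` extra is not installed.
    from mcp import ClientSession, StdioServerParameters
    from mcp.client.stdio import stdio_client

    command, args = server_command(db_url)
    params = StdioServerParameters(command=command, args=args)
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            result = await session.list_tools()
            return [tool.name for tool in result.tools]
```

Running `asyncio.run(list_tool_names("memory.db"))` requires `memv-mcp` on `PATH` and credentials for the chosen embedding provider.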

### CLI options

| Flag | Default | Description |
|------|---------|-------------|
| `--db-url` | *required* | SQLite path or `postgresql://...` URL. |
| `--user-id` | `default` | Default `user_id` applied to every tool call when the caller doesn't pass one. |
| `--embedding-provider` | `openai` | `openai`, `voyage`, `cohere`, or `local` (FastEmbed). |
| `--embedding-model` | provider default | Override the embedding model. |
| `--embedding-dimensions` | provider default | Override vector dimensions. Must match the model. |
| `--llm-model` | *none* | PydanticAI model string (e.g. `openai:gpt-4o-mini`). Without this, knowledge extraction is disabled. |
| `--transport` | `stdio` | `stdio` or `streamable-http`. |

!!! note "LLM is optional"
    Without `--llm-model`, `add_conversation` stores messages but does not extract knowledge. `search_memory` and `add_memory` still work because they don't need an LLM.

!!! warning "add_conversation latency"
    With an LLM configured, `add_conversation` runs segmentation and predict-calibrate extraction inline before returning. This can take 10–30+ seconds on long histories. Raise your MCP client's tool-call timeout accordingly (Claude Desktop defaults to ~60 s).
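For custom agents that call the tool themselves, one way to apply a generous deadline is to wrap the call in `asyncio.wait_for`. The sketch below uses only the standard library; `call_with_timeout` and the 120-second default are illustrative assumptions, not memv settings.

```python
import asyncio


async def call_with_timeout(make_call, timeout_s: float = 120.0):
    """Await make_call() and raise TimeoutError with a clearer message if it runs too long."""
    try:
        return await asyncio.wait_for(make_call(), timeout=timeout_s)
    except asyncio.TimeoutError as exc:
        raise TimeoutError(f"tool call exceeded {timeout_s:.0f}s") from exc
```

With an MCP `ClientSession`, `make_call` could be something like `lambda: session.call_tool("add_conversation", {"user_message": "...", "assistant_message": "..."})`.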

## Tools

| Tool | Purpose |
|------|---------|
| `search_memory(query, user_id?, top_k=10)` | Hybrid retrieval (vector + BM25 + RRF). Returns an LLM-ready prompt block. |
| `add_memory(statement, user_id?)` | Store a fact directly. Deduplicates against existing knowledge. |
| `add_conversation(user_message, assistant_message, user_id?)` | Append an exchange. Triggers extraction when an LLM is configured. |
| `list_memories(user_id?, limit=20, offset=0)` | Page through stored knowledge. |
| `delete_memory(knowledge_id)` | Permanently remove an entry by UUID. |

All `user_id` arguments are optional — the server falls back to the `--user-id` default when omitted.
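The "RRF" in the `search_memory` row stands for reciprocal rank fusion, which merges the vector and BM25 rankings by summing `1 / (k + rank)` for each item across the lists. The sketch below illustrates the general technique only; `k = 60` is the commonly used constant, and memv's actual constant, tie-breaking, and data structures may differ. The memory IDs are made up.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def reciprocal_rank_fusion(rankings: List[List[str]], k: int = 60) -> List[Tuple[str, float]]:
    """Fuse ranked ID lists; each list adds 1 / (k + rank) per item, with rank starting at 1."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, item_id in enumerate(ranking, start=1):
            scores[item_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID so the output is deterministic.
    return sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))


vector_hits = ["m1", "m2", "m3"]  # hypothetical IDs ranked by embedding similarity
bm25_hits = ["m3", "m1", "m4"]    # hypothetical IDs ranked by BM25
fused = reciprocal_rank_fusion([vector_hits, bm25_hits])
print([item_id for item_id, _ in fused])  # ['m1', 'm3', 'm2', 'm4']
```

Items that appear near the top of both lists (`m1`, `m3`) outrank items that appear in only one, which is the reason for fusing the two retrievers.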

## Client setup

=== "Claude Desktop"

    Add to `~/Library/Application Support/Claude/claude_desktop_config.json` (macOS) or the equivalent path on your platform:

    ```json
    {
      "mcpServers": {
        "memv": {
          "command": "memv-mcp",
          "args": [
            "--db-url", "/absolute/path/to/memory.db",
            "--user-id", "your-name",
            "--llm-model", "openai:gpt-4o-mini"
          ],
          "env": {
            "OPENAI_API_KEY": "sk-..."
          }
        }
      }
    }
    ```

=== "Claude Code"

    ```bash
    claude mcp add memv -- memv-mcp \
      --db-url /absolute/path/to/memory.db \
      --user-id your-name \
      --llm-model openai:gpt-4o-mini
    ```

=== "Cursor"

    In `~/.cursor/mcp.json`:

    ```json
    {
      "mcpServers": {
        "memv": {
          "command": "memv-mcp",
          "args": ["--db-url", "/absolute/path/to/memory.db", "--llm-model", "openai:gpt-4o-mini"]
        }
      }
    }
    ```
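If a client config file already lists other servers, the `memv` entry has to be merged in rather than pasted over the file. The following sketch builds the same entry shape as the JSON above and merges it into an existing dictionary; the helper names and the `other` server are hypothetical, and reading or writing the real config file is left out.

```python
import json
from typing import Optional


def memv_server_entry(db_url: str, user_id: str = "default", llm_model: Optional[str] = None) -> dict:
    """Build one mcpServers entry that launches memv-mcp."""
    args = ["--db-url", db_url, "--user-id", user_id]
    if llm_model:
        args += ["--llm-model", llm_model]
    return {"command": "memv-mcp", "args": args}


def merge_into_config(config: dict, name: str, entry: dict) -> dict:
    """Return a copy of config with entry registered under mcpServers[name]."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers[name] = entry
    merged["mcpServers"] = servers
    return merged


# Hypothetical existing config with one unrelated server already registered.
existing = {"mcpServers": {"other": {"command": "other-server", "args": []}}}
updated = merge_into_config(
    existing,
    "memv",
    memv_server_entry("/absolute/path/to/memory.db", "your-name", "openai:gpt-4o-mini"),
)
print(json.dumps(updated, indent=2))
```

The merge copies the `mcpServers` mapping before adding to it, so the original dictionary is left unchanged if writing the file fails later.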

## HTTP transport

For remote agents, run with `--transport streamable-http`:

```bash
memv-mcp --db-url memory.db --llm-model openai:gpt-4o-mini --transport streamable-http
```

The server listens on the default MCP HTTP port. Put it behind your own authentication layer or reverse proxy before exposing it beyond localhost.
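A remote agent written in Python could then reach the server roughly as follows. This is a sketch under assumptions: it relies on `streamablehttp_client` from a recent `mcp` SDK release (older releases that satisfy `mcp>=1.0.0` may not ship it), and the URL you pass, for example `http://127.0.0.1:8000/mcp`, is a guess at the SDK default rather than something this page specifies. `search_arguments` and `search_over_http` are illustrative names.

```python
from typing import Any, Dict, Optional


def search_arguments(query: str, user_id: Optional[str] = None, top_k: int = 10) -> Dict[str, Any]:
    """Build search_memory arguments, omitting user_id so the server-side default applies."""
    arguments: Dict[str, Any] = {"query": query, "top_k": top_k}
    if user_id is not None:
        arguments["user_id"] = user_id
    return arguments


async def search_over_http(url: str, query: str) -> str:
    """Call search_memory on a memv-mcp server reachable at url (assumed endpoint)."""
    # Imported lazily so this module still loads when the `mcp` extra is not installed.
    from mcp import ClientSession
    from mcp.client.streamable_http import streamablehttp_client

    async with streamablehttp_client(url) as (read, write, _get_session_id):
        async with ClientSession(read, write) as session:
            await session.initialize()
            result = await session.call_tool("search_memory", search_arguments(query))
            # Tool results arrive as content blocks; keep only the text ones.
            return "\n".join(block.text for block in result.content if block.type == "text")
```

Leaving `user_id` out of the arguments lets the server fall back to its `--user-id` default, as described in the Tools section.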

## Programmatic use

The server factory is importable, so you can mount it inside an existing process or inject custom clients (e.g. for tests):

```python
from memv.mcp.server import create_server

# `my_embedder` and `my_llm` are placeholders for client objects you construct yourself.
server = create_server(
    db_url="memory.db",
    default_user_id="alice",
    embedding_client=my_embedder,
    llm_client=my_llm,
)
server.run(transport="stdio")
```

The tool implementations are exported as plain `do_*` coroutines (`do_search_memory`, `do_add_memory`, …) so you can unit-test them without an MCP runtime.
1 change: 1 addition & 0 deletions mkdocs.yml
@@ -82,6 +82,7 @@ nav:
- PostgreSQL: advanced/backends/postgres.md
- Custom Providers: advanced/custom-providers.md
- Async Processing: advanced/async-processing.md
- MCP Server: advanced/mcp-server.md
- Examples:
- examples/index.md
- PydanticAI: examples/pydantic-ai.md
4 changes: 4 additions & 0 deletions pyproject.toml
@@ -32,12 +32,14 @@ postgres = [
"asyncpg>=0.30.0",
"pgvector>=0.3.6",
]
mcp = ["mcp>=1.0.0"]
voyage = ["voyageai>=0.3.0"]
cohere = ["cohere>=5.0.0"]
local = ["fastembed>=0.6.0"]

[project.scripts]
memv = "memv:main"
memv-mcp = "memv.mcp.__main__:main"

[project.urls]
Homepage = "https://github.com/vstorm-co/memv"
@@ -59,6 +61,7 @@ dev = [
"cohere>=5.0.0",
"fastembed>=0.6.0",
"ipython>=9.9.0",
"mcp>=1.0.0",
"pgvector>=0.3.6",
"voyageai>=0.3.0",
"pre-commit>=4.5.1",
@@ -71,6 +74,7 @@ docs = [
"mkdocs>=1.6",
"mkdocs-material>=9.6",
"mkdocstrings[python]>=0.28",
"griffe>=1.0,<2",
]
examples = [
"autogen-agentchat>=0.7",
Empty file added src/memv/mcp/__init__.py
Empty file.
47 changes: 47 additions & 0 deletions src/memv/mcp/__main__.py
@@ -0,0 +1,47 @@
"""CLI entry point for the memv MCP server."""

from __future__ import annotations

import argparse


def main() -> None:
    parser = argparse.ArgumentParser(
        prog="memv-mcp",
        description="memv MCP server — expose memory operations to AI agents",
    )
    parser.add_argument("--db-url", required=True, help="Database URL (SQLite file path or postgresql://...)")
    parser.add_argument("--user-id", default="default", help="Default user ID for all operations (default: 'default')")
    parser.add_argument(
        "--embedding-provider",
        default="openai",
        choices=["openai", "voyage", "cohere", "local"],
        help="Embedding provider (default: openai)",
    )
    parser.add_argument("--embedding-model", default=None, help="Override default embedding model for the chosen provider")
    parser.add_argument("--embedding-dimensions", type=int, default=None, help="Override embedding dimensions")
    parser.add_argument(
        "--llm-model",
        default=None,
        help="LLM model for knowledge extraction (PydanticAI model string, e.g. 'openai:gpt-4o-mini'). "
        "Without this, add_conversation stores messages but cannot extract knowledge.",
    )
    parser.add_argument("--transport", default="stdio", choices=["stdio", "streamable-http"], help="MCP transport (default: stdio)")

    args = parser.parse_args()

    from memv.mcp.server import create_server

    server = create_server(
        db_url=args.db_url,
        default_user_id=args.user_id,
        embedding_provider=args.embedding_provider,
        embedding_model=args.embedding_model,
        embedding_dimensions=args.embedding_dimensions,
        llm_model=args.llm_model,
    )
    server.run(transport=args.transport)


if __name__ == "__main__":
    main()
22 changes: 22 additions & 0 deletions src/memv/mcp/dev.py
@@ -0,0 +1,22 @@
"""Dev entry point for `mcp dev` / MCP Inspector.

Usage:
    uv run mcp dev src/memv/mcp/dev.py

Reads config from environment variables:
    MEMV_DB_URL — database path (default: /tmp/memv-dev.db)
    MEMV_USER_ID — default user ID (default: dev)
    MEMV_EMBEDDING — embedding provider (default: openai)
    MEMV_LLM_MODEL — LLM model string (optional)
"""

import os

from memv.mcp.server import create_server

mcp = create_server(
    db_url=os.environ.get("MEMV_DB_URL", "/tmp/memv-dev.db"),
    default_user_id=os.environ.get("MEMV_USER_ID", "dev"),
    embedding_provider=os.environ.get("MEMV_EMBEDDING", "openai"),
    llm_model=os.environ.get("MEMV_LLM_MODEL"),
)