
CLI access to Basic Memory via auto-starting server (Ollama pattern) #683

@bm-clawd

Description

Summary

Enable agents to use Basic Memory through a standard CLI (bm find, bm ls, bm read, etc.) without requiring users to manually manage a daemon process. The server should auto-start on first CLI invocation and stay warm for subsequent calls — the same pattern Ollama uses.

This would make BM skills distributable and usable by any agent framework (Claude Code, Codex, Cursor, etc.) without requiring MCP support or manual server setup.

Problem

The Daemon Dilemma

Basic Memory's data lives in SQLite, which needs a managed process for good performance. A naive CLI approach (open SQLite → check migrations → initialize FTS → query → exit) has unacceptable cold-start overhead — multiple seconds per invocation. An agent making 5 BM calls in a conversation would burn 15+ seconds just on startup.

But requiring users to manually start a daemon (bm server start) kills the skill distribution story. When someone installs a BM skill into Claude Code or Cursor, the skill says to run bm find ..., and then what process answers? The user didn't spin up anything; they just installed a skill.

Tools that work well in the skill ecosystem are daemonless: git, gh, rg, jq. None require "start a server." They just work.

Prior Art: Tools That Solved This

  • Ollama: ollama run llama3 just works. If the server isn't running, the CLI starts it automatically. User never manages a daemon.
  • gpg-agent / ssh-agent: Auto-launch on first use. Subsequent calls find the existing socket.
  • Docker Desktop: Auto-starts dockerd when the CLI needs it.

Proposal: The Ollama Pattern

Auto-Starting Server

$ bm find "auth decisions"
# 1. Check if BM server is running (pidfile/socket check)
# 2. If not → spawn bm-server in background, wait for ready
# 3. Send query to running server
# 4. Print results
# 5. Server stays alive for future calls (idle timeout: configurable)

First call has a ~2s hit to boot. Every subsequent call is fast (~50ms) because the server is warm:

First call:   bm find "stuff" → [no server] → start (2s) → query (50ms) → result
After:        bm find "stuff" → [server hot] → query (50ms) → result
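The check-then-spawn steps above could be sketched like this (a minimal sketch: the socket path, the bm-server binary name, and the readiness loop are illustrative assumptions, not the actual implementation):

```python
import os
import socket
import subprocess
import time

# Hypothetical socket location; the real layout may differ.
SOCK_PATH = os.path.expanduser("~/.basic-memory/server.sock")

def server_alive(path=SOCK_PATH):
    """Step 1: is a server already listening on the Unix socket?"""
    s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        s.connect(path)
        return True
    except OSError:
        return False
    finally:
        s.close()

def ensure_server():
    """Step 2: spawn the server in the background and wait for it to come up."""
    if server_alive():
        return
    subprocess.Popen(["bm-server"], start_new_session=True)  # hypothetical binary name
    deadline = time.monotonic() + 10
    while time.monotonic() < deadline:
        if server_alive():
            return
        time.sleep(0.05)
    raise RuntimeError("bm-server did not become ready in time")
```

After ensure_server() returns, the CLI sends the query over the socket and prints the result; the connect() probe doubles as the pidfile check, since a dead server leaves no listener behind.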

Shared Server Process

The MCP server and CLI server should be the same process, listening on both MCP transport and a local socket/HTTP port. Whoever arrives first starts it:

                        ┌──────────────────────┐
Claude Desktop ──MCP──→ │                      │
Cursor ──────────MCP──→ │   BM Server Process  │──→ SQLite
                        │   (auto-started)     │──→ Notes/
bm find ─────socket───→ │                      │
bm ls ───────socket───→ │                      │
                        └──────────────────────┘
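A shared process like the one in the diagram could listen on both transports from one event loop. A minimal sketch with asyncio, assuming a Unix socket for the CLI and a localhost TCP port standing in for the MCP/HTTP transport (handler and paths are illustrative, not BM's real protocol):

```python
import asyncio
import contextlib
import os

async def handle_query(reader, writer):
    """Serve one request; both transports share this handler (and one index)."""
    line = await reader.readline()
    # A real server would dispatch to the shared SQLite/FTS index here.
    writer.write(b"ok: " + line)
    await writer.drain()
    writer.close()

async def main(sock_path="/tmp/bm-demo.sock", port=8787):
    # One process, two listeners: whoever arrives first (CLI or MCP client)
    # starts it, and everyone talks to the same warm index.
    with contextlib.suppress(FileNotFoundError):
        os.unlink(sock_path)  # clear a stale socket from a previous run
    unix_srv = await asyncio.start_unix_server(handle_query, path=sock_path)
    tcp_srv = await asyncio.start_server(handle_query, "127.0.0.1", port)
    async with unix_srv, tcp_srv:
        await asyncio.gather(unix_srv.serve_forever(), tcp_srv.serve_forever())

if __name__ == "__main__":
    asyncio.run(main())
```

The key design point is that SQLite and the FTS index are opened once, in one process, so every client pays the cold-start cost at most once.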

Server Lifecycle

~/.basic-memory/
├── notes/
├── basic_memory.db
├── server.pid          ← PID of running server
└── server.sock         ← Unix domain socket (or localhost HTTP port)
  • Auto-starts on first CLI or MCP connection
  • Stays alive while clients are connected or until idle timeout
  • Auto-exits after idle period (30 min? configurable?)
  • Platform-appropriate: launchd (macOS), systemd (Linux)
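The idle-timeout behavior could be a small watchdog inside the server: every request touches a timestamp, and a background thread shuts the process down once nothing has arrived for the configured period. A hypothetical sketch (class name and polling interval are assumptions):

```python
import threading
import time

class IdleTimer:
    """Call touch() on every request; shutdown() fires once no request
    has arrived for idle_seconds. Illustrative sketch, not BM's code."""

    def __init__(self, idle_seconds, shutdown):
        self.idle_seconds = idle_seconds
        self.shutdown = shutdown
        self._last = time.monotonic()
        self._lock = threading.Lock()
        threading.Thread(target=self._watch, daemon=True).start()

    def touch(self):
        # Called by the request handler on each CLI/MCP request.
        with self._lock:
            self._last = time.monotonic()

    def _watch(self):
        while True:
            with self._lock:
                idle = time.monotonic() - self._last
            if idle >= self.idle_seconds:
                self.shutdown()  # e.g. stop the event loop and exit
                return
            time.sleep(self.idle_seconds / 10)
```

Under launchd/systemd socket activation the watchdog can simply exit; the init system restarts the server on the next connection.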

CLI Surface: Coarse "Browse" Interface

Rather than mapping each BM operation to a separate MCP tool, consolidate into a filesystem-inspired CLI that agents already understand intuitively:

Browsing (read-only, most common for agents)

bm ls notes/                        # List directory contents
bm ls notes/ --type person          # Filter by note type
bm tree projects/ -L 2              # Directory tree view
bm read "Project X"                 # Read a specific note
bm find "authentication decisions"  # Semantic search
bm graph "Project X" --depth 2      # Traverse relations (BM-unique!)
bm grep "OAuth" --folder projects/  # Content search

Writing

bm write "Meeting Notes" --folder meetings/ --content "..."
bm edit "Project X" --append "New observation..."

As a Single MCP Tool

This could also be exposed as a single coarse MCP tool (reducing tool count in prompts):

{
  "tool": "memory_browse",
  "command": "find 'authentication decisions' --type decision"
}

One tool, many commands. The agent composes naturally because LLMs already know shell semantics.
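Dispatching that single tool could be as simple as parsing the command string with shell quoting rules and routing on the verb. A sketch under assumed handler names (find/ls/read and their outputs are placeholders):

```python
import shlex

# Hypothetical handlers for the single memory_browse tool; a real server
# would call into the search index and note store instead.
HANDLERS = {
    "find": lambda args: f"find: {args[0]}",
    "ls":   lambda args: f"ls: {args[0] if args else '/'}",
    "read": lambda args: f"read: {args[0]}",
}

def memory_browse(command: str) -> str:
    # shlex respects quoting: "find 'auth decisions'" → ['find', 'auth decisions']
    parts = shlex.split(command)
    verb, args = parts[0], parts[1:]
    handler = HANDLERS.get(verb)
    if handler is None:
        return f"unknown command: {verb}"
    return handler(args)
```

Because the argument grammar is plain shell syntax, agents can compose commands without any tool-specific schema in the prompt.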

Inspiration

This was partly inspired by looking at OpenViking's approach, which uses a similar filesystem paradigm (viking:// URIs with ls, tree, find, grep operations) backed by a persistent server. The difference is BM's data is already human-readable markdown on disk — the CLI just needs a fast path to the search index.

Labels: enhancement (New feature or request)