
CLI access to Basic Memory via auto-starting server (Ollama pattern) #683

@bm-clawd

Description

Summary

Enable agents to use Basic Memory through a standard CLI (bm find, bm ls, bm read, etc.) without requiring users to manually manage a daemon process. The server should auto-start on first CLI invocation and stay warm for subsequent calls — the same pattern Ollama uses.

This would make BM skills distributable and usable by any agent framework (Claude Code, Codex, Cursor, etc.) without requiring MCP support or manual server setup.

Problem

The Daemon Dilemma

Basic Memory's data lives in SQLite, which needs a managed process for good performance. A naive CLI approach (open SQLite → check migrations → initialize FTS → query → exit) has unacceptable cold-start overhead — multiple seconds per invocation. An agent making 5 BM calls in a conversation would burn 15+ seconds just on startup.

But requiring users to manually start a daemon (bm server start) kills the skill distribution story. When someone installs a BM skill into Claude Code or Cursor, the skill says to run bm find ..., and then what process answers? The user didn't spin up anything; they just installed a skill.

Tools that work well in the skill ecosystem are daemonless: git, gh, rg, jq. None require "start a server." They just work.

Prior Art: Tools That Solved This

  • Ollama: ollama run llama3 just works. If the server isn't running, the CLI starts it automatically. User never manages a daemon.
  • gpg-agent / ssh-agent: Auto-launch on first use. Subsequent calls find the existing socket.
  • Docker Desktop: Auto-starts dockerd when the CLI needs it.

Proposal: The Ollama Pattern

Auto-Starting Server

$ bm find "auth decisions"
# 1. Check if BM server is running (pidfile/socket check)
# 2. If not → spawn bm-server in background, wait for ready
# 3. Send query to running server
# 4. Print results
# 5. Server stays alive for future calls (idle timeout: configurable)

First call has a ~2s hit to boot. Every subsequent call is fast (~50ms) because the server is warm:

First call:   bm find "stuff" → [no server] → start (2s) → query (50ms) → result
After:        bm find "stuff" → [server hot] → query (50ms) → result
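The check-then-spawn steps above could be sketched like this (a minimal sketch: the socket path, the bm-server binary name, and the readiness loop are illustrative assumptions, not the actual implementation):

```python
import os
import socket
import subprocess
import time

# Hypothetical socket location; the real layout may differ.
SOCK_PATH = os.path.expanduser("~/.basic-memory/server.sock")

def server_alive(path=SOCK_PATH):
    """Step 1: is a server already listening on the Unix socket?"""
    s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        s.connect(path)
        return True
    except OSError:
        return False
    finally:
        s.close()

def ensure_server():
    """Step 2: spawn the server in the background and wait for it to come up."""
    if server_alive():
        return
    subprocess.Popen(["bm-server"], start_new_session=True)  # hypothetical binary name
    deadline = time.monotonic() + 10
    while time.monotonic() < deadline:
        if server_alive():
            return
        time.sleep(0.05)
    raise RuntimeError("bm-server did not become ready in time")
```

After ensure_server() returns, the CLI sends the query over the socket and prints the result; the connect() probe doubles as the pidfile check, since a dead server leaves no listener behind.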

Shared Server Process

The MCP server and CLI server should be the same process, listening on both MCP transport and a local socket/HTTP port. Whoever arrives first starts it:

                        ┌──────────────────────┐
Claude Desktop ──MCP──→ │                      │
Cursor ──────────MCP──→ │   BM Server Process  │──→ SQLite
                        │   (auto-started)     │──→ Notes/
bm find ─────socket───→ │                      │
bm ls ───────socket───→ │                      │
                        └──────────────────────┘
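A shared process like the one in the diagram could listen on both transports from one event loop. A minimal sketch with asyncio, assuming a Unix socket for the CLI and a localhost TCP port standing in for the MCP/HTTP transport (handler and paths are illustrative, not BM's real protocol):

```python
import asyncio
import contextlib
import os

async def handle_query(reader, writer):
    """Serve one request; both transports share this handler (and one index)."""
    line = await reader.readline()
    # A real server would dispatch to the shared SQLite/FTS index here.
    writer.write(b"ok: " + line)
    await writer.drain()
    writer.close()

async def main(sock_path="/tmp/bm-demo.sock", port=8787):
    # One process, two listeners: whoever arrives first (CLI or MCP client)
    # starts it, and everyone talks to the same warm index.
    with contextlib.suppress(FileNotFoundError):
        os.unlink(sock_path)  # clear a stale socket from a previous run
    unix_srv = await asyncio.start_unix_server(handle_query, path=sock_path)
    tcp_srv = await asyncio.start_server(handle_query, "127.0.0.1", port)
    async with unix_srv, tcp_srv:
        await asyncio.gather(unix_srv.serve_forever(), tcp_srv.serve_forever())

if __name__ == "__main__":
    asyncio.run(main())
```

The key design point is that SQLite and the FTS index are opened once, in one process, so every client pays the cold-start cost at most once.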

Server Lifecycle

~/.basic-memory/
├── notes/
├── basic_memory.db
├── server.pid          ← PID of running server
└── server.sock         ← Unix domain socket (or localhost HTTP port)
  • Auto-starts on first CLI or MCP connection
  • Stays alive while clients are connected or until idle timeout
  • Auto-exits after idle period (30 min? configurable?)
  • Platform-appropriate: launchd (macOS), systemd (Linux)
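The idle-timeout behavior could be a small watchdog inside the server: every request touches a timestamp, and a background thread shuts the process down once nothing has arrived for the configured period. A hypothetical sketch (class name and polling interval are assumptions):

```python
import threading
import time

class IdleTimer:
    """Call touch() on every request; shutdown() fires once no request
    has arrived for idle_seconds. Illustrative sketch, not BM's code."""

    def __init__(self, idle_seconds, shutdown):
        self.idle_seconds = idle_seconds
        self.shutdown = shutdown
        self._last = time.monotonic()
        self._lock = threading.Lock()
        threading.Thread(target=self._watch, daemon=True).start()

    def touch(self):
        # Called by the request handler on each CLI/MCP request.
        with self._lock:
            self._last = time.monotonic()

    def _watch(self):
        while True:
            with self._lock:
                idle = time.monotonic() - self._last
            if idle >= self.idle_seconds:
                self.shutdown()  # e.g. stop the event loop and exit
                return
            time.sleep(self.idle_seconds / 10)
```

Under launchd/systemd socket activation the watchdog can simply exit; the init system restarts the server on the next connection.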

CLI Surface: Coarse "Browse" Interface

Rather than mapping each BM operation to a separate MCP tool, consolidate into a filesystem-inspired CLI that agents already understand intuitively:

Browsing (read-only, most common for agents)

bm ls notes/                        # List directory contents
bm ls notes/ --type person          # Filter by note type
bm tree projects/ -L 2              # Directory tree view
bm read "Project X"                 # Read a specific note
bm find "authentication decisions"  # Semantic search
bm graph "Project X" --depth 2      # Traverse relations (BM-unique!)
bm grep "OAuth" --folder projects/  # Content search

Writing

bm write "Meeting Notes" --folder meetings/ --content "..."
bm edit "Project X" --append "New observation..."

As a Single MCP Tool

This could also be exposed as a single coarse MCP tool (reducing tool count in prompts):

{
  "tool": "memory_browse",
  "command": "find 'authentication decisions' --type decision"
}

One tool, many commands. The agent composes naturally because LLMs already know shell semantics.
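Dispatching that single tool could be as simple as parsing the command string with shell quoting rules and routing on the verb. A sketch under assumed handler names (find/ls/read and their outputs are placeholders):

```python
import shlex

# Hypothetical handlers for the single memory_browse tool; a real server
# would call into the search index and note store instead.
HANDLERS = {
    "find": lambda args: f"find: {args[0]}",
    "ls":   lambda args: f"ls: {args[0] if args else '/'}",
    "read": lambda args: f"read: {args[0]}",
}

def memory_browse(command: str) -> str:
    # shlex respects quoting: "find 'auth decisions'" → ['find', 'auth decisions']
    parts = shlex.split(command)
    verb, args = parts[0], parts[1:]
    handler = HANDLERS.get(verb)
    if handler is None:
        return f"unknown command: {verb}"
    return handler(args)
```

Because the argument grammar is plain shell syntax, agents can compose commands without any tool-specific schema in the prompt.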

Inspiration

This was partly inspired by looking at OpenViking's approach, which uses a similar filesystem paradigm (viking:// URIs with ls, tree, find, grep operations) backed by a persistent server. The difference is BM's data is already human-readable markdown on disk — the CLI just needs a fast path to the search index.

Labels: enhancement (New feature or request)