Architecture
The design rationale behind box-memory.
AI agents need persistent memory + persistent file storage. Today's options:
- Camp A, vector-store / RAG tools (Mem0, Supermemory, ChatGPT Memories): chunk files into embeddings, lose provenance, drift across model versions, and treat files as second-class metadata behind chunks. Wrong for anything regulated.
- Camp B, markdown-vault tools (Obsidian + iCloud / Git): keep files whole, but are single-user with no compliance story and no ACLs.
box-memory picks Camp B's data model (whole files, exact recall) and runs it on Box's substrate (compliance, ACLs, retention, durability). The plugin layers the agent-memory primitives (schema, index, companions, recall) on top.
Box is the only major file storage service with all of these properties:
- Certified for regulated workloads — SOC 2 on Business+, HIPAA BAA + FedRAMP Moderate on Enterprise, FedRAMP High + DoD IL4 + ITAR on Enterprise Plus / Government
- API-first with immutable file IDs — Box file IDs never change across renames, moves, or version updates. The plugin keys everything on these IDs.
- Native CAD / Office / image / PDF previews — agents can show "this is the file" without a separate viewer service
- Folder ACLs that actually work — multi-team isolation is a folder permission, not an application convention
- Real Box AI — Ask, Extract, AI Studio, Hubs Q&A. Server-side, tier-gated, but high-quality.
- Working official MCP at mcp.box.com, which Claude can talk to natively
What Box is bad at: free-text search. The Search API has a roughly 10-minute indexing lag. This is the central design constraint of box-memory; the plugin's index files exist specifically to route around it.
Filenames change. Wikilinks rot. Slugs collide. Box file IDs are permanent.
The plugin uses two primary keys:
- memory ID (mem_<ulid>): agent-facing
- Box file ID: substrate-facing
The per-folder _index.json maps between them.
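As a rough illustration of that mapping, here is a minimal sketch. The index layout and the field names (`entries`, `memory_id`, `box_file_id`, `slug`) are assumptions made for this example, and the IDs are shortened placeholders rather than real ULIDs; the actual format is defined on the Schema page.

```python
import json

# Hypothetical _index.json content; the field names are assumptions for illustration.
INDEX_JSON = """
{
  "entries": [
    {"memory_id": "mem_01HZX3V5K8Q2", "box_file_id": "1234567890", "slug": "pricing-decision"},
    {"memory_id": "mem_01HZX4A9M1R7", "box_file_id": "1234567891", "slug": "vendor-shortlist"}
  ]
}
"""

def build_lookups(index_text):
    """Return (memory_id -> box_file_id, box_file_id -> memory_id) dicts."""
    entries = json.loads(index_text)["entries"]
    by_memory = {e["memory_id"]: e["box_file_id"] for e in entries}
    by_file = {e["box_file_id"]: e["memory_id"] for e in entries}
    return by_memory, by_file

by_memory, by_file = build_lookups(INDEX_JSON)
```

Because both keys are stable, a rename or move in Box changes neither side of the mapping.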
Agents don't overwrite. To change a position, write a new memory and mark the old one status: superseded with superseded_by: <new-id>. History comes for free, and it matches how humans think.
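The append-only pattern could look roughly like the sketch below. The `Memory` class, the in-memory `store` dict, and the `active` default status are hypothetical stand-ins; only `status: superseded` and `superseded_by` come from the design above.

```python
from dataclasses import dataclass, replace
from typing import Optional

@dataclass(frozen=True)
class Memory:
    id: str
    body: str
    status: str = "active"            # assumed default; the source only names "superseded"
    superseded_by: Optional[str] = None

def supersede(store, old_id, new_memory):
    """Add new_memory and mark the old one superseded; the old body is kept."""
    old = store[old_id]
    store[new_memory.id] = new_memory
    store[old_id] = replace(old, status="superseded", superseded_by=new_memory.id)
    return store

def current(store, memory_id):
    """Follow superseded_by links until reaching the latest memory."""
    seen = set()
    while store[memory_id].superseded_by is not None:
        if memory_id in seen:
            raise ValueError("cycle in superseded_by chain")
        seen.add(memory_id)
        memory_id = store[memory_id].superseded_by
    return store[memory_id]
```

Recall can then resolve any old memory ID to the current position while the full chain stays readable.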
Box search has a roughly 10-minute indexing lag, so every memory write also updates a per-folder _index.json that keeps recall instant. Read order:
- Known file ID → direct fetch
- Known folder + slug/title/kind/tag → _index.json
- Cross-folder → workspace-root rollup _index.json
- Business+ → Metadata Query via search_files_keyword + mdfilters
- Last resort → Box Search (with a stale-result warning)
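The read order above can be sketched as a simple fallback chain. Everything here is a hypothetical stand-in: the in-memory indexes, the query shape, and the step functions are invented for illustration, and the metadata-query step is stubbed rather than calling the Box MCP.

```python
# Hypothetical in-memory stand-ins for the per-folder and workspace-root indexes.
FOLDER_INDEX = {"decisions": [{"slug": "pricing", "box_file_id": "111"}]}
ROLLUP_INDEX = [
    {"slug": "pricing", "folder": "decisions", "box_file_id": "111"},
    {"slug": "onboarding", "folder": "people", "box_file_id": "222"},
]

def by_file_id(q):
    # Known file ID: fetch directly, no lookup needed.
    return [{"box_file_id": q["file_id"]}] if q.get("file_id") else None

def by_folder_index(q):
    if not (q.get("folder") and q.get("slug")):
        return None
    return [e for e in FOLDER_INDEX.get(q["folder"], []) if e["slug"] == q["slug"]]

def by_rollup_index(q):
    if not q.get("slug"):
        return None
    return [e for e in ROLLUP_INDEX if e["slug"] == q["slug"]]

def by_metadata_query(q):
    # Business+ only; a real version would go through the Box MCP. Stubbed here.
    return None

def by_box_search(q):
    # Last resort: results may lag recent writes, so attach a warning.
    return [{"box_file_id": "999", "warning": "search results may be stale"}]

READ_ORDER = [
    ("direct fetch", by_file_id),
    ("folder index", by_folder_index),
    ("rollup index", by_rollup_index),
    ("metadata query", by_metadata_query),
    ("box search", by_box_search),
]

def recall(query):
    """Return (path_name, hits) from the first path that yields results."""
    for name, step in READ_ORDER:
        hits = step(query)
        if hits:
            return name, hits
    return None, []
```

Each step returns nothing when it does not apply, so the cheapest applicable path always wins and Box Search is reached only when the indexes cannot answer.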
Binaries don't get chunked, embedded, or indexed. They get a paired companion .md written by the agent that last reviewed them.
report.pdf ← binary, never touched
report.pdf.md ← companion, agent-written, describes the binary
Companion frontmatter pins the description to a specific version via companion_for.sha256. If the binary changes, the companion is stale — the plugin detects this.
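A minimal sketch of that staleness check, assuming the companion frontmatter has already been parsed into a dict with a nested `companion_for.sha256` value. The function names are hypothetical; the hashing uses Python's standard `hashlib`.

```python
import hashlib

def sha256_hex(data: bytes) -> str:
    """Hex SHA-256 digest of the binary's bytes."""
    return hashlib.sha256(data).hexdigest()

def companion_is_stale(binary_bytes: bytes, frontmatter: dict) -> bool:
    """True when the binary no longer matches the hash pinned in the companion."""
    pinned = frontmatter["companion_for"]["sha256"]
    return sha256_hex(binary_bytes) != pinned
```

For example, a companion written against the first version of report.pdf is reported stale as soon as the file's bytes change, even if the filename stays the same.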
On Business+, Box AI Extract Structured runs OCR for PDFs / TIFF / PNG / JPEG and fills companion fields automatically. On Personal, agents do their best with what they can read locally.
Every feature has a working path on every Box tier — Business+ just gets faster paths. Personal users still get full memory + recall + companion functionality via the index pattern.
The frontmatter team: field is only a hint; folder ACLs are the enforcement. The plugin creates the folder structure; you set ACLs in Box's web UI.
The plugin doesn't manage Box authentication. It invokes the Box MCP tools the user has connected — typically the official remote MCP at mcp.box.com. If your agent acts as Alice, the plugin acts as Alice. Folder ACLs are enforced naturally.
What the plugin deliberately leaves out:
- Embeddings, RAG, vector stores — Box AI Hubs Q&A does this server-side for us when needed. We don't ship our own.
- Box auth flows — handled by the user's Box MCP.
- Sync to other vaults — Box is the system of record.
- Versioning UI — Box has native version history.
- Box Skills (the framework): treated as a dead end, per Operational Notes.
| Variant | Repo | Backend | Best for |
|---|---|---|---|
| box-memory (this) | mrdulasolutions/BOX | Box MCP / network | Any device with internet; full Box AI access |
| box-memory-onprem | mrdulasolutions/BOX-Onprem | Local Box Drive filesystem | Zero outbound calls to Box; HIPAA / FedRAMP / ITAR data path |
Same workspace schema. A workspace created with one can be opened by the other.
| Failure | Plugin response |
|---|---|
| Box MCP not connected | Clear error directing to Settings → Connectors → Box at mcp.box.com |
| Filename collision (409) | Append -2, -3 etc. to filename; slug stable; index reflects |
| Index drift | box-index-rebuild regenerates from source |
| Wikilink dangling | Treated as forward reference; not an error |
| Companion hash mismatch | Detected on read; mark stale; offer to regenerate |
| Search returns stale | Index lookup is primary; search is fallback |
| Tier downgrade | Detect on next write; switch to index-only |
| Stale OAuth after tier upgrade | Operational Notes Note 2 — reconnect MCP |
| Hub indexing warm-up | Operational Notes Note 7 — fall through to file-set Q&A |
| search_files_metadata broken | Operational Notes Note 1: use search_files_keyword + mdfilters instead |
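The filename-collision row in the table above can be sketched as follows. This is an illustration under assumptions: the function name is hypothetical, the suffix is placed before the final extension (the source only says to append -2, -3, and so on), and the set of existing names stands in for whatever the plugin learns from a 409 response.

```python
import os

def next_free_name(filename, existing):
    """Return filename unchanged if free, else add -2, -3, ... before the extension."""
    if filename not in existing:
        return filename
    stem, ext = os.path.splitext(filename)
    n = 2
    while f"{stem}-{n}{ext}" in existing:
        n += 1
    return f"{stem}-{n}{ext}"
```

Only the filename changes; the slug recorded in the index stays stable, so lookups by slug keep working.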
- Schema — the workspace, memory, companion, and index file formats
- Tier Matrix — what each Box tier unlocks
- Box AI Integration — how the AI-powered skills work
- Operational Notes — eight Box-side quirks the plugin works around
box-memory · MIT · Repo · Latest release · Issues · Air-gapped variant