-
Notifications
You must be signed in to change notification settings - Fork 26
Curator
The Curator is an AI-powered tool that automatically processes raw inputs from your inbox/ folder into structured vault records. It runs as a background daemon, continuously watching for new files and transforming unstructured content into your vault's ontology.
Curator handles the ingestion pipeline for your Obsidian vault. It takes raw inputs like meeting transcripts, emails, voice memos, or rough notes and converts them into structured records with proper frontmatter, entity extraction, and wikilink relationships.
Key capabilities:
- Watches the
inbox/folder for new files - Extracts entities (people, organizations, projects, tasks, decisions, etc.)
- Creates structured vault records with proper templates and frontmatter
- Automatically interlinks related entities
- Enriches entities with contextual information
- Runs as a background daemon with auto-restart
The Curator uses a sophisticated 4-stage pipeline (currently available with the OpenClaw backend; Claude and Zo backends use a single-call legacy mode):
The LLM reads the inbox file and performs initial analysis:
- Creates a comprehensive note record in the vault using
alfred vault create note - Writes a JSON entity manifest to a temporary file listing all discovered entities
- The manifest contains: type, name, description, and fields for each entity
- Includes 3-attempt retry logic if the LLM fails to write the manifest file
Pure Python logic processes the manifest:
- Reads and parses the entity manifest
- Normalizes entity names for consistency
- Checks for existing records in the vault (deduplication)
- Creates new entity records via
vault_create - Each entity receives its type-specific template with base-view embeds (Dataview sections like Assumptions, Decisions, Tasks) automatically appended
Establishes relationships between records:
- Wires up wikilinks between the note and all resolved entities
- Edits the note to add entity links
- Edits each entity to add a backlink to the source note
- Creates a fully connected knowledge graph
Per-entity enrichment calls:
- Makes a focused LLM call for each newly created entity
- Fills in body content with relevant information from the source material
- Populates frontmatter fields specific to the entity type
- Uses context from the original inbox file for accurate enrichment
The Curator extracts the following entity types:
Entities:
- person
- org (organization)
- project
- location
- conversation
- task
- event
Learning Types:
- decision
- assumption
- constraint
The Curator uses intelligent relevance filtering to avoid cluttering your vault:
- Reads
user-profile.mdfrom the vault root to understand your context - Only creates entities you directly interact with
- Skips: media references, celebrities, third-party examples, subjects of analysis
- Focuses on: people you meet, projects you work on, decisions you make
Configure the Curator in the curator section of config.yaml:
curator:
interval: 60 # Polling interval in seconds
agent:
backend: openclaw # claude | zo | openclaw
timeout: 300 # LLM call timeout in seconds| Option | Type | Default | Description |
|---|---|---|---|
curator.interval |
int | 60 | Polling interval for inbox watching (seconds) |
agent.backend |
string | claude | AI backend to use (claude, zo, openclaw) |
agent.timeout |
int | 300 | LLM call timeout in seconds |
alfred curatorStarts the Curator daemon in the foreground. Useful for debugging and development.
alfred up --only curatorStarts only the Curator as a background daemon.
alfred upStarts all Alfred daemons (including Curator) in the background.
alfred processProcesses all files in the inbox folder as a one-time batch operation.
alfred process -j 8Parallel batch processing with 8 concurrent workers. Useful for processing large backlogs.
alfred downStops all running Alfred daemons.
alfred statusShows the status of all Alfred tools, including whether the Curator is running.
The Curator supports three AI backends:
Uses Claude via the claude -p CLI command in subprocess mode.
Configuration:
agent:
backend: claudeMode: Single-call legacy mode
Uses Zo's HTTP API for agent execution.
Configuration:
agent:
backend: zo
zo_api_url: https://api.zo.dev
zo_api_key: ${ZO_API_KEY}Mode: Single-call legacy mode
Uses OpenClaw via the openclaw agent --message CLI command.
Configuration:
agent:
backend: openclawMode: Full 4-stage pipeline (recommended)
Note: The 4-stage pipeline is currently only available with the OpenClaw backend. Claude and Zo backends use a single-call legacy mode where all processing happens in one LLM invocation.
The Curator maintains state in data/curator_state.json:
- Tracks processed file hashes to avoid re-processing
- Records processing history and timestamps
- Can be deleted to force re-processing of all inbox files
All vault mutations are logged to data/vault_audit.log as an append-only JSONL audit trail.
- Drop a file into
inbox/meeting-notes.md - Curator detects the new file
- Stage 1: LLM creates a note record and entity manifest
- Stage 2: Python creates entity records (deduplicating against existing vault)
- Stage 3: Python interlinks the note with all entities
- Stage 4: LLM enriches each entity with contextual information
- Source file is marked as processed in state
- Your vault now contains a structured note with fully populated, interlinked entities
The Curator operates under scope enforcement defined in vault/scope.py:
- Can create new records
- Can edit existing records
- Cannot delete records
- Restricted to entity types defined in the vault schema
This ensures the Curator cannot accidentally destroy vault data.
Getting Started
Architecture
Workers
Reference