Curator

ssd edited this page Feb 24, 2026 · 1 revision

The Curator is an AI-powered tool that automatically processes raw inputs from your inbox/ folder into structured vault records. It runs as a background daemon, continuously watching for new files and transforming unstructured content into your vault's ontology.

Overview

Curator handles the ingestion pipeline for your Obsidian vault. It takes raw inputs like meeting transcripts, emails, voice memos, or rough notes and converts them into structured records with proper frontmatter, entity extraction, and wikilink relationships.

Key capabilities:

  • Watches the inbox/ folder for new files
  • Extracts entities (people, organizations, projects, tasks, decisions, etc.)
  • Creates structured vault records with proper templates and frontmatter
  • Automatically interlinks related entities
  • Enriches entities with contextual information
  • Runs as a background daemon with auto-restart

Architecture

4-Stage Pipeline

The Curator processes each inbox file in four stages. The full pipeline is currently available only with the OpenClaw backend; the Claude and Zo backends use a single-call legacy mode.

Stage 1: Analyze + Create Note (LLM)

The LLM reads the inbox file and performs initial analysis:

  • Creates a comprehensive note record in the vault using alfred vault create note
  • Writes a JSON entity manifest to a temporary file listing all discovered entities
  • The manifest contains: type, name, description, and fields for each entity
  • Includes 3-attempt retry logic if the LLM fails to write the manifest file
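As a rough illustration of the manifest and retry behavior described above, here is a minimal sketch. It assumes the manifest is a JSON list of objects with the four keys named on this page. `ManifestEntity`, `parse_manifest`, `analyze_with_retry`, and the `run_llm` callable are hypothetical names, not Alfred's actual API.

```python
import json
from dataclasses import dataclass, field
from pathlib import Path
from typing import Any, Callable, Optional

MAX_ATTEMPTS = 3  # the page states 3 attempts; all other names are illustrative


@dataclass
class ManifestEntity:
    type: str
    name: str
    description: str = ""
    fields: dict[str, Any] = field(default_factory=dict)


def parse_manifest(text: str) -> list[ManifestEntity]:
    """Parse manifest JSON, assumed here to be a list of entity objects."""
    return [
        ManifestEntity(
            type=item["type"],
            name=item["name"],
            description=item.get("description", ""),
            fields=item.get("fields", {}),
        )
        for item in json.loads(text)
    ]


def analyze_with_retry(
    run_llm: Callable[[], None], manifest_path: Path
) -> list[ManifestEntity]:
    """Invoke the LLM up to MAX_ATTEMPTS times until a parseable manifest exists."""
    last_error: Optional[Exception] = None
    for _ in range(MAX_ATTEMPTS):
        run_llm()  # hypothetical callable, expected to write manifest_path
        try:
            return parse_manifest(manifest_path.read_text(encoding="utf-8"))
        except (OSError, ValueError, KeyError, TypeError) as exc:
            last_error = exc  # manifest missing or malformed: try again
    raise RuntimeError(
        f"no valid manifest after {MAX_ATTEMPTS} attempts"
    ) from last_error
```

Treating a missing file and a malformed file the same way keeps the retry loop simple; the real implementation may distinguish them.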

Stage 2: Entity Resolution (Pure Python)

Pure Python logic processes the manifest:

  • Reads and parses the entity manifest
  • Normalizes entity names for consistency
  • Checks for existing records in the vault (deduplication)
  • Creates new entity records via vault_create
  • Each entity receives its type-specific template with base-view embeds (Dataview sections like Assumptions, Decisions, Tasks) automatically appended
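The normalization and deduplication steps above might look roughly like the following sketch. The exact normalization rules Alfred applies are not documented here, so `normalize_name` and `resolve_entities` are hypothetical, and the rule set (case folding, accent stripping, punctuation collapsing) is an assumption.

```python
import re
import unicodedata


def normalize_name(name: str) -> str:
    """Collapse case, accents, punctuation and whitespace into a comparison key."""
    decomposed = unicodedata.normalize("NFKD", name)
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    return re.sub(r"[^a-z0-9]+", " ", ascii_only.lower()).strip()


def resolve_entities(manifest, existing):
    """Split manifest entries into (matched, to_create).

    manifest: iterable of (type, name) pairs from the entity manifest
    existing: dict mapping (type, normalized name) to an existing record path
    """
    matched, to_create, seen = {}, [], set()
    for etype, name in manifest:
        key = (etype, normalize_name(name))
        if key in existing:
            matched[(etype, name)] = existing[key]  # reuse the vault record
        elif key not in seen:
            seen.add(key)  # also dedupe within the manifest itself
            to_create.append((etype, name))
    return matched, to_create
```

For example, `normalize_name("ACME, Inc.")` returns `"acme inc"`, so "ACME, Inc." and "Acme Inc" resolve to the same org record under these assumed rules.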

Stage 3: Interlinking (Pure Python)

Establishes relationships between records:

  • Wires up wikilinks between the note and all resolved entities
  • Edits the note to add entity links
  • Edits each entity to add a backlink to the source note
  • Leaves the note and every entity it mentions linked in both directions
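A minimal sketch of the two-way linking, operating on record bodies as plain strings. The helper names and the choice to append links as a bullet list at the end of the body are assumptions for illustration; the actual edit format is not specified on this page.

```python
def wikilink(title: str) -> str:
    """Render an Obsidian-style wikilink for a record title."""
    return f"[[{title}]]"


def append_links(body: str, titles: list[str]) -> str:
    """Append any wikilinks not already present; leave the body alone otherwise."""
    missing = [t for t in titles if wikilink(t) not in body]
    if not missing:
        return body
    bullets = "\n".join(f"- {wikilink(t)}" for t in missing)
    return body.rstrip("\n") + "\n\n" + bullets + "\n"


def interlink(note_title: str, note_body: str, entities: dict[str, str]):
    """Link the note to each entity and each entity back to the note.

    entities maps entity title to entity body. Returns the updated
    note body and a new dict of updated entity bodies.
    """
    new_note = append_links(note_body, list(entities))
    new_entities = {
        title: append_links(body, [note_title])
        for title, body in entities.items()
    }
    return new_note, new_entities
```

Because links that already exist are skipped, running the step twice on the same records changes nothing the second time.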

Stage 4: Enrich Entities (LLM, per-entity)

Per-entity enrichment calls:

  • Makes a focused LLM call for each newly created entity
  • Fills in body content with relevant information from the source material
  • Populates frontmatter fields specific to the entity type
  • Uses context from the original inbox file for accurate enrichment
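The per-entity pattern can be sketched as one prompt and one call per new entity. The prompt wording, `build_enrichment_prompt`, `enrich_new_entities`, and the injected `call_llm` callable are all hypothetical; the real prompts and backend interface are not shown on this page.

```python
from typing import Callable


def build_enrichment_prompt(entity_type: str, name: str, source_text: str) -> str:
    """Build a focused prompt for a single entity (wording is illustrative)."""
    return (
        f"Enrich the {entity_type} record '{name}'.\n"
        "Fill in the body and the frontmatter fields for this entity type.\n"
        "Use only facts stated in the source below.\n"
        "--- SOURCE ---\n"
        f"{source_text}\n"
    )


def enrich_new_entities(
    new_entities: list[tuple[str, str]],
    source_text: str,
    call_llm: Callable[[str], str],
) -> dict[tuple[str, str], str]:
    """Make one LLM call per newly created (type, name) entity."""
    return {
        (etype, name): call_llm(build_enrichment_prompt(etype, name, source_text))
        for etype, name in new_entities
    }
```

Passing the original inbox text into every prompt mirrors the last bullet above: each call sees the source material, but is asked about one entity only.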

Entity Extraction

The Curator extracts the following entity types:

Entities:

  • person
  • org (organization)
  • project
  • location
  • conversation
  • task
  • event

Learning Types:

  • decision
  • assumption
  • constraint

Relevance Filtering

The Curator filters extracted entities by relevance to you, so that the vault is not cluttered with incidental mentions:

  • Reads user-profile.md from the vault root to understand your context
  • Only creates entities you directly interact with
  • Skips: media references, celebrities, third-party examples, subjects of analysis
  • Focuses on: people you meet, projects you work on, decisions you make

Configuration

Configure the Curator in the curator and agent sections of config.yaml:

curator:
  interval: 60  # Polling interval in seconds

agent:
  backend: openclaw  # claude | zo | openclaw
  timeout: 300       # LLM call timeout in seconds

Configuration Options

Option | Type | Default | Description
curator.interval | int | 60 | Polling interval for inbox watching (seconds)
agent.backend | string | claude | AI backend to use (claude, zo, openclaw)
agent.timeout | int | 300 | LLM call timeout in seconds
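To make the defaults concrete, the sketch below applies them to an already-parsed config dictionary (for example, the result of loading config.yaml with a YAML parser). `CuratorSettings` and `settings_from_config` are hypothetical names, and the validation shown is an assumption about reasonable behavior rather than a description of what Alfred does.

```python
from dataclasses import dataclass

VALID_BACKENDS = {"claude", "zo", "openclaw"}


@dataclass(frozen=True)
class CuratorSettings:
    interval: int = 60       # curator.interval, seconds
    backend: str = "claude"  # agent.backend
    timeout: int = 300       # agent.timeout, seconds


def settings_from_config(config: dict) -> CuratorSettings:
    """Overlay the parsed config on the documented defaults."""
    curator = config.get("curator") or {}
    agent = config.get("agent") or {}
    defaults = CuratorSettings()
    settings = CuratorSettings(
        interval=int(curator.get("interval", defaults.interval)),
        backend=str(agent.get("backend", defaults.backend)),
        timeout=int(agent.get("timeout", defaults.timeout)),
    )
    if settings.backend not in VALID_BACKENDS:
        raise ValueError(f"unknown backend: {settings.backend!r}")
    if not (settings.interval > 0 and settings.timeout > 0):
        raise ValueError("interval and timeout must be positive")
    return settings
```

With an empty config this yields the defaults from the table; setting only `agent.backend: openclaw` changes the backend and leaves the other two values untouched.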

CLI Usage

Run as Foreground Daemon

alfred curator

Starts the Curator daemon in the foreground. Useful for debugging and development.

Run as Background Daemon

alfred up --only curator

Starts only the Curator as a background daemon.

alfred up

Starts all Alfred daemons (including Curator) in the background.

Batch Processing

alfred process

Processes all files in the inbox folder as a one-time batch operation.

alfred process -j 8

Runs the same batch with 8 concurrent workers. Useful for clearing large backlogs.
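Conceptually, the -j option bounds how many files are in flight at once. The sketch below shows that idea with a thread pool; whether Alfred uses threads, processes, or something else is not stated here, and `process_batch` and `process_one` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def process_batch(files, process_one, jobs=1):
    """Run process_one over files with at most `jobs` running concurrently.

    Returns a dict mapping each file to its result, or to the exception
    it raised, so one bad file does not abort the whole batch.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        futures = {pool.submit(process_one, f): f for f in files}
        for future in as_completed(futures):
            source = futures[future]
            try:
                results[source] = future.result()
            except Exception as exc:  # record the failure and keep going
                results[source] = exc
    return results
```

Threads suit this workload in the sketch because each worker would spend most of its time waiting on an LLM call rather than using the CPU.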

Stop Daemon

alfred down

Stops all running Alfred daemons.

Check Status

alfred status

Shows the status of all Alfred tools, including whether the Curator is running.

Backend Support

The Curator supports three AI backends:

Claude Code (subprocess)

Uses Claude via the claude -p CLI command in subprocess mode.

Configuration:

agent:
  backend: claude

Mode: Single-call legacy mode

Zo Computer (HTTP API)

Uses Zo's HTTP API for agent execution.

Configuration:

agent:
  backend: zo
  zo_api_url: https://api.zo.dev
  zo_api_key: ${ZO_API_KEY}

Mode: Single-call legacy mode

OpenClaw (subprocess)

Uses OpenClaw via the openclaw agent --message CLI command.

Configuration:

agent:
  backend: openclaw

Mode: Full 4-stage pipeline (recommended)

Note: The 4-stage pipeline is currently only available with the OpenClaw backend. Claude and Zo backends use a single-call legacy mode where all processing happens in one LLM invocation.
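For the two subprocess backends, the invocation could be sketched as follows. It uses only the command forms named on this page (`claude -p` and `openclaw agent --message`); any further flags the real integration passes are unknown, and `build_command` and `run_subprocess_backend` are hypothetical helpers. The Zo backend is omitted because it goes over HTTP rather than a subprocess.

```python
import subprocess


def build_command(backend: str, prompt: str) -> list[str]:
    """Build the argv for a subprocess backend, using only documented flags."""
    if backend == "claude":
        return ["claude", "-p", prompt]
    if backend == "openclaw":
        return ["openclaw", "agent", "--message", prompt]
    raise ValueError(f"not a subprocess backend: {backend!r}")


def run_subprocess_backend(backend: str, prompt: str, timeout: int = 300) -> str:
    """Run the backend CLI and return its stdout.

    Raises subprocess.TimeoutExpired once `timeout` seconds elapse
    (compare agent.timeout) and CalledProcessError on a non-zero exit.
    """
    completed = subprocess.run(
        build_command(backend, prompt),
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,
    )
    return completed.stdout
```

Passing the prompt as a list element rather than through a shell string avoids quoting problems when the inbox content contains quotes or newlines.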

State Management

The Curator maintains state in data/curator_state.json:

  • Tracks processed file hashes to avoid re-processing
  • Records processing history and timestamps
  • Can be deleted to force re-processing of all inbox files

All vault mutations are logged to data/vault_audit.log as an append-only JSONL audit trail.
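The hash-based skip logic and the append-only log can be illustrated with a short sketch. The JSON layout, the use of SHA-256, and every function name below are assumptions; the actual structure of data/curator_state.json and data/vault_audit.log may differ.

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path


def file_hash(path: Path) -> str:
    """Content hash used as the 'already processed' key (SHA-256 assumed)."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def load_state(state_path: Path) -> dict:
    """Load state, or start empty if the file is missing or was deleted."""
    if not state_path.exists():
        return {"processed": {}}
    return json.loads(state_path.read_text(encoding="utf-8"))


def pending_files(inbox: Path, state: dict) -> list[Path]:
    """Inbox files whose content hash has not been recorded yet."""
    return [
        p for p in sorted(inbox.iterdir())
        if p.is_file() and file_hash(p) not in state["processed"]
    ]


def mark_processed(path: Path, state: dict, state_path: Path) -> None:
    """Record the file's hash and a timestamp, then persist the state."""
    state["processed"][file_hash(path)] = {
        "file": path.name,
        "processed_at": datetime.now(timezone.utc).isoformat(),
    }
    state_path.write_text(json.dumps(state, indent=2), encoding="utf-8")


def audit(log_path: Path, action: str, record: str) -> None:
    """Append one JSON object per line; existing lines are never rewritten."""
    entry = {
        "ts": datetime.now(timezone.utc).isoformat(),
        "action": action,
        "record": record,
    }
    with log_path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(entry) + "\n")
```

Keying on content rather than filename means a renamed file is not re-processed, while an edited file is. Deleting the state file empties the `processed` map, which is why every inbox file becomes pending again.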

Workflow Example

  1. Drop a file into inbox/meeting-notes.md
  2. Curator detects the new file
  3. Stage 1: LLM creates a note record and entity manifest
  4. Stage 2: Python creates entity records (deduplicating against existing vault)
  5. Stage 3: Python interlinks the note with all entities
  6. Stage 4: LLM enriches each entity with contextual information
  7. Source file is marked as processed in state
  8. Your vault now contains a structured note with fully populated, interlinked entities

Scope Restrictions

The Curator operates under scope enforcement defined in vault/scope.py:

  • Can create new records
  • Can edit existing records
  • Cannot delete records
  • Restricted to entity types defined in the vault schema

Because deletion is outside its scope, the Curator cannot remove records from the vault, even by mistake.
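The rules above amount to an allow-list check before each vault mutation. The following sketch shows one way to express that; it is not the contents of vault/scope.py, and `ScopeError`, `check_scope`, and the example type set are hypothetical.

```python
ALLOWED_OPERATIONS = frozenset({"create", "edit"})  # delete is deliberately absent

# Example schema for illustration, taken from the entity types listed on this page.
EXAMPLE_SCHEMA_TYPES = frozenset({
    "note", "person", "org", "project", "location", "conversation",
    "task", "event", "decision", "assumption", "constraint",
})


class ScopeError(PermissionError):
    """Raised when an operation falls outside the Curator's scope."""


def check_scope(operation: str, record_type: str, schema_types=EXAMPLE_SCHEMA_TYPES) -> None:
    """Raise ScopeError unless the operation and record type are both permitted."""
    if operation not in ALLOWED_OPERATIONS:
        raise ScopeError(f"operation not permitted for curator: {operation}")
    if record_type not in schema_types:
        raise ScopeError(f"type not in vault schema: {record_type}")
```

An allow-list fails closed: an operation added to the vault layer later stays forbidden to the Curator until someone explicitly grants it.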
