Curator

The Curator is an AI-powered tool that automatically processes raw inputs from your inbox/ folder into structured vault records. It runs as a background daemon, polling the inbox for new files (every 60 seconds by default) and converting unstructured content into records that follow your vault's ontology.

Overview

Curator handles the ingestion pipeline for your Obsidian vault. It takes raw inputs like meeting transcripts, emails, voice memos, or rough notes and converts them into structured records with proper frontmatter, entity extraction, and wikilink relationships.

Key capabilities:

Watches the inbox/ folder for new files
Extracts entities (people, organizations, projects, tasks, decisions, etc.)
Creates structured vault records with proper templates and frontmatter
Automatically interlinks related entities
Enriches entities with contextual information
Runs as a background daemon with auto-restart

Architecture

4-Stage Pipeline

The Curator uses a 4-stage pipeline. It is currently available only with the OpenClaw backend; the Claude and Zo backends use a single-call legacy mode.

Stage 1: Analyze + Create Note (LLM)

The LLM reads the inbox file and performs initial analysis:

Creates a comprehensive note record in the vault using alfred vault create note
Writes a JSON entity manifest to a temporary file listing all discovered entities
The manifest contains: type, name, description, and fields for each entity
Retries up to 3 times if the LLM fails to write the manifest file
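
The manifest check and retry loop described above can be sketched roughly as follows. This is an illustration only: the four keys come from the list above, but the top-level `entities` list and the names `load_manifest`, `run_stage1`, and `call_llm` are hypothetical and are not taken from Alfred's code.

```python
import json
from pathlib import Path

# Keys each manifest entry is expected to carry (from the list above).
REQUIRED_KEYS = {"type", "name", "description", "fields"}

def load_manifest(path):
    """Return the list of entities, or None if the manifest is missing or malformed."""
    try:
        data = json.loads(Path(path).read_text(encoding="utf-8"))
    except (OSError, json.JSONDecodeError):
        return None
    # Assumed layout: {"entities": [ {...}, {...} ]}
    entities = data.get("entities") if isinstance(data, dict) else None
    if not isinstance(entities, list):
        return None
    if not all(isinstance(e, dict) and REQUIRED_KEYS.issubset(e) for e in entities):
        return None
    return entities

def run_stage1(call_llm, manifest_path, max_attempts=3):
    """Invoke the LLM until a valid manifest exists, up to max_attempts times."""
    for _ in range(max_attempts):
        call_llm(manifest_path)  # hypothetical callable that asks the LLM to write the file
        entities = load_manifest(manifest_path)
        if entities is not None:
            return entities
    raise RuntimeError(f"no valid manifest after {max_attempts} attempts")
```

Validating the file after every attempt, rather than trusting the LLM's own report of success, keeps a missing or half-written manifest from reaching Stage 2.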

Stage 2: Entity Resolution (Pure Python)

Pure Python logic processes the manifest:

Reads and parses the entity manifest
Normalizes entity names for consistency
Checks for existing records in the vault (deduplication)
Creates new entity records via vault_create
Each entity receives its type-specific template with base-view embeds (Dataview sections like Assumptions, Decisions, Tasks) automatically appended
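
To make the normalization and deduplication steps concrete, here is a minimal sketch. The normalization rules (accent folding, lowercasing, punctuation stripping) and the `existing` lookup keyed by type and normalized name are assumptions for illustration; the rules Alfred actually applies are not documented here.

```python
import re
import unicodedata

def normalize_name(name):
    """Collapse case, accents, punctuation, and whitespace into a comparison key."""
    text = unicodedata.normalize("NFKD", name)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = re.sub(r"[^\w\s-]", "", text.lower())
    return re.sub(r"[\s_-]+", " ", text).strip()

def resolve_entities(manifest, existing):
    """Split manifest entries into records to create and records already in the vault.

    `existing` maps (type, normalized name) to the path of a vault record.
    """
    to_create, matched = [], []
    seen = set()
    for entity in manifest:
        key = (entity["type"], normalize_name(entity["name"]))
        if key in existing:
            matched.append((entity, existing[key]))
        elif key not in seen:
            # Also deduplicate within the manifest itself.
            seen.add(key)
            to_create.append(entity)
    return to_create, matched
```

Keying on both type and name means a person and a project that share a name are treated as different records.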

Stage 3: Interlinking (Pure Python)

Establishes relationships between records:

Wires up wikilinks between the note and all resolved entities
Edits the note to add entity links
Edits each entity to add a backlink to the source note
Leaves the note and its entities linked in both directions
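
The two-way linking can be sketched on plain strings as below. Where the links are written (under a trailing heading here) and the heading names `Related` and `Sources` are assumptions made for illustration; the real implementation may place links in frontmatter instead.

```python
def add_wikilink(text, target, heading="Related"):
    """Append [[target]] at the end of the text unless the link is already present.

    The heading is added once if it is missing; this sketch assumes it is the
    last section of the record.
    """
    link = f"[[{target}]]"
    if link in text:
        return text
    if f"## {heading}" not in text:
        text = text.rstrip("\n") + f"\n\n## {heading}\n"
    return text.rstrip("\n") + f"\n- {link}\n"

def interlink(note_name, note_text, entities):
    """Return the updated note text and a dict of updated entity texts."""
    updated = {}
    for name, body in entities.items():
        note_text = add_wikilink(note_text, name)  # note -> entity
        updated[name] = add_wikilink(body, note_name, heading="Sources")  # entity -> note
    return note_text, updated
```

Because `add_wikilink` returns the text unchanged when the link already exists, running the stage twice on the same records does not duplicate links.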

Stage 4: Enrich Entities (LLM, per-entity)

Per-entity enrichment calls:

Makes a focused LLM call for each newly created entity
Fills in body content with relevant information from the source material
Populates frontmatter fields specific to the entity type
Uses context from the original inbox file for accurate enrichment
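
As a rough sketch of the per-entity loop: one focused prompt is built for each new entity, and a failed call is recorded rather than aborting the remaining entities. The prompt wording, the `max_chars` cutoff, and the `call_llm` callable are all hypothetical; the failure-handling policy is likewise an assumption.

```python
def build_enrichment_prompt(entity, source_text, max_chars=4000):
    """Build a focused prompt for one entity, including an excerpt of the source."""
    excerpt = source_text[:max_chars]
    return (
        f"Enrich the {entity['type']} record named '{entity['name']}'.\n"
        "Use only facts stated in the source below.\n"
        "Fill in the body and the frontmatter fields for this type.\n\n"
        f"--- SOURCE ---\n{excerpt}"
    )

def enrich_all(new_entities, source_text, call_llm):
    """Make one LLM call per newly created entity; collect failures instead of aborting."""
    results, failures = {}, []
    for entity in new_entities:
        try:
            prompt = build_enrichment_prompt(entity, source_text)
            results[entity["name"]] = call_llm(prompt)
        except Exception as exc:  # a single timeout should not stop the other entities
            failures.append((entity["name"], str(exc)))
    return results, failures
```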

Entity Extraction

The Curator extracts the following entity types:

Entities:

person
org (organization)
project
location
conversation
task
event

Learning Types:

decision
assumption
constraint

Relevance Filtering

The Curator filters extracted entities by relevance to avoid cluttering your vault:

Reads user-profile.md from the vault root to understand your context
Only creates entities you directly interact with
Skips: media references, celebrities, third-party examples, subjects of analysis
Focuses on: people you meet, projects you work on, decisions you make

Configuration

Configure the Curator in the curator and agent sections of config.yaml:

curator:
  interval: 60  # Polling interval in seconds

agent:
  backend: openclaw  # claude (default) | zo | openclaw
  timeout: 300       # LLM call timeout in seconds

Configuration Options

Option	Type	Default	Description
`curator.interval`	int	60	Polling interval for inbox watching (seconds)
`agent.backend`	string	claude	AI backend to use (claude, zo, openclaw)
`agent.timeout`	int	300	LLM call timeout in seconds
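
The defaults in the table can be read as a base configuration that user settings override. The sketch below shows that merge on already-parsed dictionaries; `merge_config` is a hypothetical helper, and validating the backend at load time is an assumption about behavior, not documented Alfred functionality.

```python
import copy

# Defaults taken from the options table above.
DEFAULTS = {
    "curator": {"interval": 60},
    "agent": {"backend": "claude", "timeout": 300},
}
VALID_BACKENDS = {"claude", "zo", "openclaw"}

def merge_config(user):
    """Overlay user settings (a parsed config.yaml) on top of the defaults."""
    cfg = copy.deepcopy(DEFAULTS)
    for section, values in (user or {}).items():
        cfg.setdefault(section, {}).update(values or {})
    backend = cfg["agent"]["backend"]
    if backend not in VALID_BACKENDS:
        raise ValueError(f"unknown agent.backend: {backend!r}")
    return cfg
```

With this merge, a config that sets only `agent.backend: openclaw` still ends up with a 60-second interval and a 300-second timeout.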

CLI Usage

Run as Foreground Daemon

alfred curator

Starts the Curator daemon in the foreground. Useful for debugging and development.

Run as Background Daemon

alfred up --only curator

Starts only the Curator as a background daemon.

alfred up

Starts all Alfred daemons (including Curator) in the background.

Batch Processing

alfred process

Processes all files in the inbox folder as a one-time batch operation.

alfred process -j 8

Runs the same batch with 8 concurrent workers, which helps when clearing a large backlog.
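
Conceptually, the -j option maps onto a worker pool. The sketch below uses a thread pool, which suits work that mostly waits on LLM calls; whether Alfred uses threads, processes, or async tasks is not stated here, and `process_inbox` and `process_file` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from pathlib import Path

def process_inbox(inbox, process_file, jobs=1):
    """Run process_file over every file in the inbox with up to `jobs` workers."""
    files = sorted(p for p in Path(inbox).iterdir() if p.is_file())
    results = {}
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        futures = {pool.submit(process_file, p): p for p in files}
        for future in as_completed(futures):
            path = futures[future]
            try:
                results[path.name] = ("ok", future.result())
            except Exception as exc:  # one bad file should not fail the batch
                results[path.name] = ("error", str(exc))
    return results
```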

Stop Daemon

alfred down

Stops all running Alfred daemons.

Check Status

alfred status

Shows the status of all Alfred tools, including whether the Curator is running.

Backend Support

The Curator supports three AI backends:

Claude Code (subprocess)

Uses Claude via the claude -p CLI command in subprocess mode.

Configuration:

agent:
  backend: claude

Mode: Single-call legacy mode

Zo Computer (HTTP API)

Uses Zo's HTTP API for agent execution.

Configuration:

agent:
  backend: zo
  zo_api_url: https://api.zo.dev
  zo_api_key: ${ZO_API_KEY}

Mode: Single-call legacy mode

OpenClaw (subprocess)

Uses OpenClaw via the openclaw agent --message CLI command.

Configuration:

agent:
  backend: openclaw

Mode: Full 4-stage pipeline (recommended)

Note: The 4-stage pipeline is currently only available with the OpenClaw backend. Claude and Zo backends use a single-call legacy mode where all processing happens in one LLM invocation.

State Management

The Curator maintains state in data/curator_state.json:

Tracks processed file hashes to avoid re-processing
Records processing history and timestamps
Can be deleted to force re-processing of all inbox files

All vault mutations are logged to data/vault_audit.log as an append-only JSONL audit trail.
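
A minimal sketch of hash-based state tracking and an append-only JSONL log follows. The JSON layout (a `processed` map keyed by SHA-256 digest), the audit entry fields, and all function names are assumptions; the actual schema of curator_state.json and vault_audit.log may differ.

```python
import hashlib
import json
import time
from pathlib import Path

def file_hash(path):
    """Content hash, so a renamed but unchanged file is not processed twice."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()

def load_state(state_path):
    """Load the state file, or start empty if it was deleted or never created."""
    path = Path(state_path)
    if not path.exists():
        return {"processed": {}}
    return json.loads(path.read_text(encoding="utf-8"))

def needs_processing(state, path):
    return file_hash(path) not in state["processed"]

def mark_processed(state, path, state_path):
    """Record the file's hash with a timestamp and persist the state."""
    state["processed"][file_hash(path)] = {"file": Path(path).name, "at": time.time()}
    Path(state_path).write_text(json.dumps(state, indent=2), encoding="utf-8")

def append_audit(log_path, action, target):
    """Append one JSON object per line; existing lines are never rewritten."""
    entry = {"ts": time.time(), "action": action, "path": target}
    with open(log_path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(entry) + "\n")
```

Since `load_state` starts empty when the file is absent, deleting the state file makes every inbox file look unprocessed again, which matches the re-processing behavior described above.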

Workflow Example

Drop a file into inbox/meeting-notes.md
Curator detects the new file
Stage 1: LLM creates a note record and entity manifest
Stage 2: Python creates entity records (deduplicating against existing vault)
Stage 3: Python interlinks the note with all entities
Stage 4: LLM enriches each entity with contextual information
Source file is marked as processed in state
Your vault now contains a structured note with fully populated, interlinked entities

Scope Restrictions

The Curator operates under scope enforcement defined in vault/scope.py:

Can create new records
Can edit existing records
Cannot delete records
Restricted to entity types defined in the vault schema

Because deletion is outside its scope, the Curator cannot remove records from the vault.
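
The rules above amount to an allow-list check before each vault operation. The sketch below is illustrative and does not reproduce vault/scope.py: `ScopeError`, `CURATOR_SCOPE`, and `check_scope` are hypothetical names, and the type set is assumed from the entity types listed on this page plus note.

```python
class ScopeError(PermissionError):
    """Raised when a tool attempts an operation outside its scope."""

CURATOR_SCOPE = {
    "operations": {"create", "edit"},  # delete is deliberately absent
    "types": {
        "note", "person", "org", "project", "location", "conversation",
        "task", "event", "decision", "assumption", "constraint",
    },
}

def check_scope(operation, record_type, scope=CURATOR_SCOPE):
    """Raise ScopeError unless both the operation and the record type are allowed."""
    if operation not in scope["operations"]:
        raise ScopeError(f"operation not permitted: {operation}")
    if record_type not in scope["types"]:
        raise ScopeError(f"record type not in schema: {record_type}")
```

An allow-list fails closed: any operation added to the vault API later is denied until it is explicitly granted.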
