Problem or Use Case
Agents don't know which masque to don for a given task. Today, masque selection is explicit (/masques:don ), but the agent should be able to infer the right masque from the shape of the task and the masques available.
This depends on accumulated evaluation data from the witness loop (see #1). Without performance history, selection is just keyword matching. With it, selection becomes "which masque has demonstrated competence on tasks that look like this?"
Proposed Solution
- Define task signature extraction — what features of a task/prompt are relevant for masque matching?
- Query masque performance history: "given tasks with signature X, which masques scored well?"
- Suggest or auto-don based on confidence threshold
- Fall back to no selection or explicit selection when confidence is low
- Selection should respect ring boundaries — don't auto-suggest an admin-ring masque for a task unless the context warrants elevated trust.
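To make the steps above concrete, here is a rough sketch of how selection could work. Everything in it is an assumption for discussion, not an existing masques API: the `Observation` and `Masque` shapes, the keyword-based `extract_signature`, the Jaccard similarity weighting, the ring names other than admin, and the threshold values are all hypothetical placeholders.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional

# Hypothetical ring ordering: a lower number means more privilege.
RING_LEVELS = {"admin": 0, "user": 1}

# Hypothetical feature vocabulary; real signature extraction is an open design point.
VOCAB = frozenset({"review", "python", "deploy", "test"})


@dataclass(frozen=True)
class Masque:
    name: str
    ring: str


@dataclass(frozen=True)
class Observation:
    masque: str
    signature: frozenset[str]  # features of the task that was evaluated
    score: float               # 0.0 to 1.0, as recorded by the witness loop


def extract_signature(prompt: str) -> frozenset[str]:
    """Placeholder extraction: keep only words found in VOCAB."""
    words = {w.strip(".,?!").lower() for w in prompt.split()}
    return frozenset(words) & VOCAB


def jaccard(a: frozenset[str], b: frozenset[str]) -> float:
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def select_masque(prompt: str, masques: list[Masque], history: list[Observation],
                  max_ring: str = "user", threshold: float = 0.6,
                  min_similarity: float = 0.5) -> Optional[str]:
    """Return a masque name, or None to fall back to explicit selection."""
    sig = extract_signature(prompt)
    # Ring boundary: only consider masques at or below the allowed privilege.
    allowed = {m.name for m in masques
               if RING_LEVELS[m.ring] >= RING_LEVELS[max_ring]}
    totals: dict[str, tuple[float, float]] = {}
    for obs in history:
        if obs.masque not in allowed:
            continue
        sim = jaccard(sig, obs.signature)
        if sim < min_similarity:
            continue  # this past task does not look like the current one
        weight, weighted_score = totals.get(obs.masque, (0.0, 0.0))
        totals[obs.masque] = (weight + sim, weighted_score + sim * obs.score)
    if not totals:
        return None  # novel task or cold start: no matching history
    # Confidence is the similarity-weighted mean score per masque.
    best, confidence = max(((name, s / w) for name, (w, s) in totals.items()),
                           key=lambda pair: pair[1])
    return best if confidence >= threshold else None


masques = [Masque("reviewer", "user"), Masque("deployer", "admin")]
history = [
    Observation("reviewer", frozenset({"review", "python"}), 0.9),
    Observation("deployer", frozenset({"review", "python"}), 1.0),
    Observation("reviewer", frozenset({"deploy"}), 0.2),
]

print(select_masque("Please review this Python module", masques, history))
print(select_masque("Please review this Python module", masques, history,
                    max_ring="admin"))
print(select_masque("Write a poem", masques, history))
```

In this sketch the admin-ring masque is skipped unless the caller passes `max_ring="admin"`, and a prompt with no similar history returns `None`, which the caller would treat as "ask the user or use explicit selection". Whether a non-`None` result is donned silently or only suggested is left to the caller.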
Open Questions
Should auto-don require user confirmation, or should it happen silently?
How do we handle novel tasks with no matching history? (default masque? ask user?)
What's the cold start path before evaluation data exists?
Dependencies
Witness loop implementation (#1)
Observation storage and query interface