Meta
domain: Specorator Runtime
type: ProductRequirementsDocument
stage: revised
version: 0.4
maturity: L1
created: 2026-05-03
updated: 2026-05-10
owner: Product Owner
related_products:
- Specorator (UI)
- agentic-workflow
- agentonomous
- sim-ecs (third-party simulation kernel, internal-only)
related_issues:
- "#14 Architecture proposal v1 — agent habitat + sim-ecs + workflow package"
design_principle: "Abstract complexity from the consumer. The runtime exposes a small, easy-to-use surface; ECS, definition parsing, permission enforcement, agent lifecycle, and trace correlation are all internal."
This is the kick-off document for the project. Read it first. Everything else derives from it.
v0.3 reframe note (read first)
v0.1 modelled the runtime as a workflow interpreter that resolves a DAG.
v0.2 reframed it as an agent habitat that hosts long-lived agents.
v0.3 adds the missing piece: agentic-workflow content is fed into the runtime as an operational workflow package — agent definitions, skill definitions, command definitions, and TS scripts — that the runtime loads, instantiates, and exposes through a small set of consumer-facing verbs. The runtime is the easy-to-use surface; complexity is internal.
The headline consumer experience:
const kernel = new RuntimeKernel({ logger });
const session = await kernel.startSession({
workflow: agenticWorkflowPackage, // loaded from the user's vault
capabilities: { llm, vaultRead }, // Specorator-provided
});
kernel.bus.subscribe('*', (event) => cockpitStore.append(event));
// One-call high-level UX:
await kernel.runCommand(session.id, '/spec:requirements', { feature: 'X' });
// Or finer-grained when the UI needs it:
const agent = await kernel.spawnAgent(session.id, { definitionId: 'analyst' });
await kernel.submitTask(session.id, { prompt: '...', toAgent: agent.id });
await kernel.runScript(session.id, 'check-traceability', { feature: 'X' });
await kernel.stopSession(session.id);
Architectural detail lives in issue #14. Sections changed materially in v0.3: §1, §3, §4, §5 (added §5.6 Workflow Package), §6 (new FR-09 / FR-10 / FR-11; expanded FR-07), §9, §10, §13, §14, §15, §16. Sections unchanged from v0.2: §7 NFRs (still hold), §8 SRs (with SR-06 from v0.2). North Star unchanged.
1. Executive Summary
The Specorator Runtime is the missing agent habitat in the Specorator ecosystem. It hosts a team of long-lived agentonomous agents, loads an operational workflow package sourced from agentic-workflow (agents, skills, commands, scripts), and exposes a small, easy-to-use API for the Specorator UI:
- start a session with a workflow package and a capability bundle
- submit task requests, run commands, run scripts, spawn or stop agents
- subscribe to a typed event stream
Internally the runtime instantiates agents from definitions, enforces declared per-agent permissions, drives a deterministic per-tick simulation (sim-ecs), routes commands to the right script and agent, captures outputs as proposed artifacts, and emits trace-correlated events. All of that is hidden from the consumer.
The Runtime turns the ecosystem from static methodology + manual agent use into a stateful, observable agent habitat where humans stay in control of every output.
2. Problem Statement
Currently:
- The agentic-workflow methodology is documented as Markdown templates and TS scripts, but there is no runtime that can load it as an operational package and bring agents to life from its definitions
- Agents (agentonomous) are invokable but lack a habitat with shared capabilities, dynamic spawning, declared permissions, and a unified event stream
- The agentic-workflow scripts (validation, status checks, automation) are runnable manually but cannot be invoked uniformly from a UI alongside agent tasks
- There is no cohesive "command" abstraction that ties a slash command, a script, and an agent task into one observable run
- There is no observability layer the Obsidian companion UI can subscribe to in real time
Without the runtime, the ecosystem cannot present itself as a continuous, observable, multi-agent collaboration environment driven by a single workflow package.
3. Goals
Primary Goals
- Load workflow packages — accept an agentic-workflow-shaped package (agent / skill / command definitions + TS scripts + reference content) and make all of it operational
- Host a session-scoped agent habitat — one simulation world per session, with dynamic agent spawning and stopping
- Provide a small, easy-to-use API: startSession, runCommand, submitTask, runScript, spawnAgent, stopAgent, stopSession, plus event subscription
- Enforce per-agent permissions declared in the loaded definitions — agents can only invoke skills / commands / scripts they are allowed to use
- Provide a capability bundle (LLM, vault read, logger, clock, RNG, more as needed) supplied at session start
- Expose a typed event stream so the Specorator UI can render live state without polling
- Stay HITL-safe — runtime emits artifact proposals; never writes to the vault
Secondary Goals
- Replayability via the append-only event log
- Cross-bus traceability via shared traceId (UI ↔ runtime ↔ each agent)
- Extensibility via additive ports (new capabilities, new agent adapters)
4. Non-Goals (V1)
- Distributed execution
- Multi-user collaboration
- Cloud-native scaling
- Persistence of session state (agentonomous handles its own agent-level snapshots)
- Complex scheduling strategies
- Workflow interpretation as a DAG: the runtime does not parse specs// into an execution graph
- Methodology ownership — no types for stages, parallel tracks, immutability rules, EARS notation, traceability rules
- Authoring workflow packages — the runtime loads a package; it does not generate or edit one
- Replacing harness adapters — Claude Code / Codex / Copilot / editor-agents from agentic-workflow #91 remain valid execution paths
- Vault writes — runtime emits artifact proposals; Specorator owns vault I/O after HITL acceptance
- Script sandboxing — scripts run as plain Node code in V1; trust is the user's, not the runtime's, responsibility (deferred to Phase 3+)
- Inventing a permission DSL — permissions are read from the workflow package's existing definition format
- A UI, a CLI, or a server
5. Core Concepts
5.1 Session — the habitat run
A Session represents one habitat run: a simulation world hosting a configured agent population, a loaded workflow package, and a configured capability bundle.
Contains: session id and traceId, agent population (initial team + dynamically spawned), loaded workflow package, capability bundle, task inbox, command/script invocations in flight, artifact set, append-only event log.
Sessions are in-memory and ephemeral in V1.
5.2 Event-Driven Coordination
The system surfaces every habitat lifecycle change as a typed event on the runtime bus.
V1 event taxonomy: session.*, task.*, agent.*, artifact.*, command.*, script.*, llm.*. Examples:
session.started, session.idle, session.stopped
task.received, task.claimed, task.completed
agent.spawned, agent.invoked, agent.completed, agent.stopped
command.invoked, command.completed
script.invoked, script.completed
artifact.created
llm.requested, llm.responded — emitted by the PromptEngineer for every outbound LLM call
The trace invariant is unchanged across the three buses: the runtime bus, Specorator's plugin bus, and each agent's internal bus all share a traceId.
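A minimal sketch of how part of the event taxonomy might look as a TypeScript discriminated union. Only the event type names come from this document; the BaseEvent fields and the payload fields (taskId, artifactId, agentId) are assumptions made for illustration, not the agreed event contract.

```typescript
// Hypothetical event shapes; only the `type` strings come from this PRD.
interface BaseEvent {
  seq: number;       // monotonic within a session
  traceId: string;   // shared across the three buses
  sessionId: string;
}

type RuntimeEvent =
  | (BaseEvent & { type: 'session.started' })
  | (BaseEvent & { type: 'task.received'; taskId: string })
  | (BaseEvent & { type: 'task.completed'; taskId: string })
  | (BaseEvent & { type: 'artifact.created'; artifactId: string; taskId: string })
  | (BaseEvent & { type: 'llm.requested'; agentId: string });

// Narrowing on `type` gives each branch access to its own payload fields.
function describeEvent(event: RuntimeEvent): string {
  switch (event.type) {
    case 'session.started':
      return `session ${event.sessionId} started`;
    case 'task.received':
    case 'task.completed':
      return `${event.type} for task ${event.taskId}`;
    case 'artifact.created':
      return `artifact ${event.artifactId} proposed by task ${event.taskId}`;
    case 'llm.requested':
      return `agent ${event.agentId} requested an LLM call`;
  }
}
```

A UI consumer could use the same narrowing inside a '*' subscription to route events to the right cockpit view.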
5.3 Task Lifecycle
Each Task Request is a UI-issued user request:
received → claimed → in-progress → completed | failed
Multiple tasks may be in flight across the agent team simultaneously.
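The lifecycle above can be sketched as a small transition table. This assumes that only an in-progress task can fail, which is one reading of the lifecycle line; the names TaskState, NEXT, and canTransition are hypothetical.

```typescript
type TaskState = 'received' | 'claimed' | 'in-progress' | 'completed' | 'failed';

// Allowed transitions, read from the lifecycle line.
// Assumption: failure is only reachable from 'in-progress'.
const NEXT: Record<TaskState, readonly TaskState[]> = {
  received: ['claimed'],
  claimed: ['in-progress'],
  'in-progress': ['completed', 'failed'],
  completed: [],
  failed: [],
};

// Returns true when moving a task from `from` to `to` is a legal step.
function canTransition(from: TaskState, to: TaskState): boolean {
  return NEXT[from].indexOf(to) !== -1;
}
```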
5.4 Agent Population and Capabilities
Agents are persistent entities in the session's simulation world. They:
- are instantiated from definitions in the loaded workflow package (not invented by the runtime)
- can be spawned and stopped dynamically during a session
- declare what they're allowed to do (skills / commands / scripts) — runtime enforces
- tick continuously, claim tasks they're suited to, and use capabilities to do their work
Capabilities (V1 set, supplied at session start, available to all agents):
- LLM: provider port shape compatible with agentonomous's LlmProviderPort
- VaultRead — read-only access to a permission-filtered vault context (Specorator-supplied)
- Logger / WallClock / Rng — observability and determinism seams
The capability set is extensible — adding a new capability is additive (no breaking change).
5.5 Task Submission
Tasks enter the habitat through kernel.submitTask(sessionId, { prompt, inputs?, toAgent?, traceId? }). The runtime never generates tasks on its own; every task comes from the consumer.
5.6 Workflow Package (NEW in v0.3)
A Workflow Package is the operational content the consumer feeds into the runtime at session start. Its shape mirrors what agentic-workflow already ships:
- Agent definitions: typed descriptors loaded from .claude/agents/*.md (declared role, allowed skills/commands/scripts, persona, prompt, etc.)
- Skill definitions: loaded from .claude/skills/*.md
- Command definitions: loaded from .claude/commands/*.md (slash commands like /spec:requirements; each declares its dispatch: which script to run, which agent role to engage, what inputs to collect)
- Scripts: TS modules from scripts/ that the runtime can invoke on demand (validation, status checks, automation)
- Reference content: templates (templates/), process docs (docs/), constitution (memory/constitution.md); read-only material agents consult while working
The package is passed as data into the runtime; the runtime does not read the user's filesystem itself. Specorator (or any other consumer) is responsible for assembling the package from the user's vault and handing it to startSession. The runtime does not author, edit, or generate definitions — it consumes a package.
The Workflow Package format is owned by the runtime as a typed contract. Consumers (Specorator) ship adapters that translate between the on-disk shape (agentic-workflow Markdown + TS files) and the typed package the runtime expects. The runtime pins to a specific agentic-workflow major version through the package version field; bumps are coordinated.
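Because the runtime never reads the filesystem, the consumer has to group file contents into a package before calling startSession. The sketch below shows one way a consumer-side adapter might do that from an in-memory path-to-content map. The RawWorkflowPackage shape and the assemblePackage function are hypothetical, and treating every unmatched path as reference content is a simplifying assumption; the real typed contract is still an open question.

```typescript
// Hypothetical intermediate shape: raw file contents grouped by kind.
interface RawWorkflowPackage {
  agents: Record<string, string>;
  skills: Record<string, string>;
  commands: Record<string, string>;
  scripts: Record<string, string>;
  reference: Record<string, string>;
}

// Path prefixes taken from the list above.
const PREFIXES: Array<[keyof RawWorkflowPackage, string]> = [
  ['agents', '.claude/agents/'],
  ['skills', '.claude/skills/'],
  ['commands', '.claude/commands/'],
  ['scripts', 'scripts/'],
];

// Groups files by prefix; anything unmatched is treated as reference content.
function assemblePackage(files: Record<string, string>): RawWorkflowPackage {
  const pkg: RawWorkflowPackage = { agents: {}, skills: {}, commands: {}, scripts: {}, reference: {} };
  for (const [path, content] of Object.entries(files)) {
    const match = PREFIXES.find(([, prefix]) => path.startsWith(prefix));
    const kind: keyof RawWorkflowPackage = match ? match[0] : 'reference';
    pkg[kind][path] = content;
  }
  return pkg;
}
```

Keeping this step on the consumer side is what lets the runtime stay free of filesystem access.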
5.7 Prompt Engineering and LLM Gatekeeping (NEW in v0.4)
The runtime acts as the sole gatekeeper for all outbound LLM calls. Agents never hold the raw LlmCapability. Instead, a PromptEngineer component intercepts every LLM call, forms a typed LLMRequest from the agent / task context, and wraps the raw provider in an LlmGateway interface that agents receive. For each call, in order, the gatekeeper:
- Forms a structured LLMRequest: messages array (system prompt, persona, history, task content), parameters (model, temperature, maxTokens), session / agent / task context, traceId
- Emits llm.requested: the request is observable before it reaches the provider
- Calls LlmCapability.complete(): the actual LLM call
- Captures the LLMResponse: content, usage, model, duration
- Emits llm.responded: the response is observable for tracing, cost attribution, and replay
Agents and scripts interact only through LlmGateway.complete(requestContext). The raw LlmCapability never leaves the runtime's internal component.
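The gatekeeping sequence can be sketched as a wrapper around the raw provider. The type names LLMRequest, LLMResponse, LlmCapability, and LlmGateway come from this document, but their fields here are trimmed-down assumptions (usage and parameters are omitted), and createGateway and LlmEvent are hypothetical names.

```typescript
// Simplified, assumed shapes; the real contracts carry more fields.
interface LLMRequest {
  sessionId: string;
  agentId: string;
  taskId: string;
  traceId: string;
  messages: { role: string; content: string }[];
}
interface LLMResponse { content: string; model: string; durationMs: number }
interface LlmCapability { complete(request: LLMRequest): Promise<LLMResponse> }
interface LlmGateway { complete(context: Omit<LLMRequest, 'traceId'>): Promise<LLMResponse> }

type LlmEvent =
  | { type: 'llm.requested'; request: LLMRequest }
  | { type: 'llm.responded'; traceId: string; response: LLMResponse };

// Wraps the raw capability so callers only ever see the gateway.
function createGateway(
  raw: LlmCapability,
  traceId: string,
  emit: (event: LlmEvent) => void,
): LlmGateway {
  return {
    async complete(context) {
      const request: LLMRequest = { ...context, traceId };
      emit({ type: 'llm.requested', request }); // observable before the provider sees it
      const response = await raw.complete(request);
      emit({ type: 'llm.responded', traceId, response });
      return response;
    },
  };
}
```

Because the raw capability is captured in the closure, an agent holding only the gateway has no path to the provider that bypasses the two events.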
5.8 Request / Response as the Public Contract (NEW in v0.4)
The consumer's mental model is deliberately simple: every verb is a request; every observable output is a response.
submitTask → task.received → … → task.completed + artifact.created
runCommand → command.invoked → … → command.completed + optional artifact
runScript → script.invoked → script.completed + optional artifact
spawnAgent → agent.spawned
The internal complexity — ECS, PromptEngineer, permission guards, tick pipeline — is entirely opaque. A consumer only needs to know: pass a request in; listen for events and artifacts as the response.
6. Functional Requirements
FR-01: Session Management
- System MUST allow creating sessions with a loaded workflow package, a capability bundle, and an optional initial agent team
- System MUST allow stopping a session and verifying clean teardown (no leaked listeners)
- System MUST allow querying current session state via getSession
FR-02: Task Submission and Inbox
- System MUST accept task requests via submitTask(sessionId, taskRequest)
- System MUST place each request in the session's inbox in the received state
- System MUST emit task.received events with monotonic seq ordering
- System MUST tolerate any number of concurrent in-flight tasks
- System MUST support optional toAgent targeting (skip claim matching, dispatch directly)
FR-03: Agent Hosting and Capabilities
- System MUST host agents as long-lived entities for the session's lifetime
- System MUST drive each agent's tick pipeline on a configurable cadence
- System MUST match received tasks to agents based on declared agent skills (when no toAgent is specified)
- System MUST pass typed AgentContext containing inputs and capability handles to claiming agents
- System MUST capture agent outputs as Artifact proposals
- System MUST NOT write artifacts to any persistent store
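A minimal sketch of the claim-matching rule, including the toAgent shortcut from FR-02. It assumes a task names a single required skill, which this document does not specify; AgentInfo, TaskInfo, and pickAgent are hypothetical names.

```typescript
interface AgentInfo { id: string; skills: string[] }
// Assumption: a task carries one required skill used for matching.
interface TaskInfo { id: string; requiredSkill: string; toAgent?: string }

// Picks the agent that should claim the task, or undefined if none qualifies.
function pickAgent(task: TaskInfo, agents: AgentInfo[]): AgentInfo | undefined {
  if (task.toAgent !== undefined) {
    // Direct targeting skips skill matching entirely.
    return agents.find((a) => a.id === task.toAgent);
  }
  return agents.find((a) => a.skills.indexOf(task.requiredSkill) !== -1);
}
```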
FR-04: Event System
- System MUST emit typed events for all habitat lifecycle changes
- System MUST allow subscription via '*' or by type
- System MUST guarantee event ordering within a session via monotonic seq
- System MUST isolate per-listener errors in both sync and async dispatch
- System MUST propagate traceId on every event
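A minimal sketch of a bus that satisfies three of these requirements: wildcard or typed subscription, monotonic seq stamping, and per-listener error isolation. It covers synchronous listeners only; async listeners would also need their rejected promises caught. The EventBus class here is illustrative, not the runtime's actual bus.

```typescript
type Listener<E> = (event: E & { seq: number }) => void;

class EventBus<E extends { type: string }> {
  private seq = 0;
  private listeners = new Map<string, Set<Listener<E>>>();

  // Subscribe by exact type or with '*' for everything; returns an unsubscribe handle.
  subscribe(type: string, listener: Listener<E>): () => void {
    const set = this.listeners.get(type) ?? new Set<Listener<E>>();
    set.add(listener);
    this.listeners.set(type, set);
    return () => { set.delete(listener); };
  }

  // Stamps the event with the next seq and delivers it to matching listeners.
  emit(event: E): void {
    const stamped = { ...event, seq: ++this.seq };
    const exact = this.listeners.get(event.type);
    const wildcard = this.listeners.get('*');
    const targets = [...Array.from(exact ?? []), ...Array.from(wildcard ?? [])];
    for (const listener of targets) {
      try {
        listener(stamped);
      } catch {
        // A failing listener must not block the others.
      }
    }
  }
}
```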
FR-05: State Management
- System MUST maintain runtime state in-memory only (V1)
- System MUST expose state as a JSON-friendly snapshot via getSession
- System MUST track sessions, agents, tasks, command/script invocations, artifacts, and the event log
FR-06: Simulation Kernel
- System MUST execute a deterministic per-tick pipeline driven by an embedded ECS kernel (sim-ecs)
- System MUST support a step-driven mode for tests and a continuous-loop mode for live sessions
- System MUST NOT expose ECS primitives in the public API surface
- System MUST keep per-tick stages stable and documented
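The step-driven mode can be sketched without any ECS library. The stage names follow the Simulation Kernel row in §9 (intake, claim, invoke, capture, emit); the TickPipeline class and StageFn type are hypothetical and do not use or imitate the sim-ecs API.

```typescript
// Stage order is fixed so every tick runs identically.
const STAGES = ['intake', 'claim', 'invoke', 'capture', 'emit'] as const;
type Stage = (typeof STAGES)[number];
type StageFn<W> = (world: W, tick: number) => void;

class TickPipeline<W> {
  private tick = 0;
  private world: W;
  private stages: Record<Stage, StageFn<W>>;

  constructor(world: W, stages: Record<Stage, StageFn<W>>) {
    this.world = world;
    this.stages = stages;
  }

  // Step-driven mode: the caller advances exactly one tick (useful in tests).
  step(): number {
    this.tick += 1;
    for (const stage of STAGES) {
      this.stages[stage](this.world, this.tick);
    }
    return this.tick;
  }
}
```

A continuous-loop mode could call step() on a timer, so both modes share one code path and tests stay representative of live sessions.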
FR-07: Runtime API
- System MUST expose at minimum:
  - startSession(opts) / stopSession(id) / getSession(id)
  - submitTask(id, taskRequest)
  - runCommand(id, commandId, args): high-level abstraction
  - runScript(id, scriptId, args): finer-grained
  - spawnAgent(id, agentSpec) / stopAgent(id, agentId)
  - event subscription via the bus
- System MUST keep all public types re-exported from a single barrel (SR-03)
- System MUST hide ECS primitives, definition parsing, permission enforcement, and lifecycle internals from the public surface
FR-08: Observability
- System MUST provide an append-only, JSON-round-trippable event log per session
- System MUST expose task/agent/command/script states and artifact captures in real time
- System MUST enable a UI consumer to render the entire habitat without polling
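A minimal sketch of an append-only log that round-trips through JSON. The LoggedEvent fields and the EventLog class are assumptions for illustration; the point is that the log only grows, rejects out-of-order seq values, and can be rebuilt from its serialized form.

```typescript
interface LoggedEvent { seq: number; type: string; traceId: string }

class EventLog {
  private entries: LoggedEvent[] = [];

  // Append-only: there is no method to remove or rewrite entries.
  append(event: LoggedEvent): void {
    const last = this.entries[this.entries.length - 1];
    if (last && event.seq <= last.seq) {
      throw new Error(`seq must increase: got ${event.seq} after ${last.seq}`);
    }
    this.entries.push(event);
  }

  get length(): number {
    return this.entries.length;
  }

  serialize(): string {
    return JSON.stringify(this.entries);
  }

  // Rebuilds a log from its serialized form, re-checking seq order on the way in.
  static deserialize(json: string): EventLog {
    const log = new EventLog();
    for (const event of JSON.parse(json) as LoggedEvent[]) {
      log.append(event);
    }
    return log;
  }
}
```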
FR-09: Workflow Package Loading (NEW in v0.3)
- System MUST accept a workflow package as typed input at startSession
- System MUST parse declared agents, skills, commands, scripts, and reference content
- System MUST validate the package on load and surface validation errors as a typed Result
- System MUST NOT read the user's filesystem directly — the consumer supplies the package as data
- System MUST NOT modify or generate package contents
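A minimal sketch of load-time validation that returns a typed Result instead of throwing. The Result shape, the PackageInput fields, and the two cross-reference checks are assumptions; the exact package contract is listed as an open question in §16.

```typescript
// Assumed Result shape: a tagged union the caller must inspect.
type Result<T, E> = { ok: true; value: T } | { ok: false; error: E };

// Hypothetical, heavily simplified package shape.
interface CommandDef { id: string; script?: string; agentRole?: string }
interface PackageInput {
  version: string;
  agents: { role: string }[];
  scripts: string[];
  commands: CommandDef[];
}
interface ValidationError { path: string; message: string }

// Checks that every command points at a script and agent role the package declares.
function validatePackage(pkg: PackageInput): Result<PackageInput, ValidationError[]> {
  const errors: ValidationError[] = [];
  const roles = pkg.agents.map((a) => a.role);
  pkg.commands.forEach((cmd, i) => {
    if (cmd.script !== undefined && pkg.scripts.indexOf(cmd.script) === -1) {
      errors.push({ path: `commands[${i}].script`, message: `unknown script "${cmd.script}"` });
    }
    if (cmd.agentRole !== undefined && roles.indexOf(cmd.agentRole) === -1) {
      errors.push({ path: `commands[${i}].agentRole`, message: `unknown agent role "${cmd.agentRole}"` });
    }
  });
  return errors.length === 0 ? { ok: true, value: pkg } : { ok: false, error: errors };
}
```

Collecting all errors before returning lets the consumer show the user every problem in one pass.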
FR-10: Command Dispatch (NEW in v0.3)
- System MUST expose runCommand(sessionId, commandId, args) as the high-level dispatch verb
- System MUST resolve a command's declared handler from the loaded package (script, agent task, or both)
- System MUST emit command.invoked and command.completed events bracketing the run, with results captured as artifacts where applicable
- System MUST enforce per-agent permission when a command engages an agent; disallowed dispatch returns a typed PermissionDeniedError Result
- System MUST allow the same command to be invoked concurrently with different args
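A minimal sketch of the permission check at dispatch. The name PermissionDeniedError comes from this document, but its fields, the Result shape, and the AgentPermissions type are assumptions for illustration.

```typescript
type Result<T, E> = { ok: true; value: T } | { ok: false; error: E };

// Assumed error payload: enough context to explain the denial in the UI.
interface PermissionDeniedError {
  kind: 'PermissionDeniedError';
  agentId: string;
  commandId: string;
}

// Hypothetical view of what the loaded definition declares for one agent.
interface AgentPermissions { agentId: string; allowedCommands: string[] }

// Returns ok when the agent's definition allows the command, a typed denial otherwise.
function checkCommandPermission(
  agent: AgentPermissions,
  commandId: string,
): Result<true, PermissionDeniedError> {
  if (agent.allowedCommands.indexOf(commandId) !== -1) {
    return { ok: true, value: true };
  }
  return {
    ok: false,
    error: { kind: 'PermissionDeniedError', agentId: agent.agentId, commandId },
  };
}
```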
FR-11: Script Execution (NEW in v0.3)
- System MUST expose runScript(sessionId, scriptId, args) for finer-grained script invocation
- System MUST execute scripts as plain Node modules in V1 (no sandbox)
- System MUST emit script.invoked and script.completed events with structured results
- System MUST capture script output as an Artifact when the script declares it produces one
- System MUST NOT itself write script results to the vault — output flows through events for HITL
FR-12: Prompt Engineering and LLM Gatekeeping (NEW in v0.4)
- System MUST gate all outbound LLM calls through a PromptEngineer component
- System MUST form a typed LLMRequest before every LLM call, capturing sessionId, agentId, taskId, messages, parameters, and traceId
- System MUST capture a typed LLMResponse for every call, including content, model, usage, and durationMs
- System MUST emit llm.requested and llm.responded events for every LLM interaction
- System MUST expose LlmGateway (not raw LlmCapability) to agents and scripts
- System MUST NOT allow agents or scripts to call the LLM provider directly
7. Non-Functional Requirements
(unchanged from v0.2; NFR-01 through NFR-06 still hold)
8. Solution Requirements
(unchanged from v0.2; SR-01 through SR-06 still hold. SR-06 Cross-package shape parity added in v0.2 is retained.)
9. Architecture Overview
| Component | Responsibility |
| --- | --- |
| Runtime Kernel | The verb-bearing public façade: startSession, submitTask, runCommand, runScript, spawnAgent, stopAgent, stopSession, getSession |
| Workflow Package Loader | Validates and registers agent / skill / command / script definitions and reference content from a supplied package |
| Command Registry | Resolves a command id to its declared dispatch (script + agent role + inputs) and orchestrates execution |
| Script Runner | Invokes scripts from the loaded package as plain Node modules; captures structured results |
| Agent Adapter | Bridges agentonomous (and other) agents into ECS entity bundles with declared skill/permission components; runs agent.tick() autonomously and agent.receive() on task claim |
| Permission Guard | Enforces declared per-agent permissions at command/skill/script dispatch |
| Simulation Kernel | sim-ecs-backed per-tick scheduler; runs intake → claim → invoke → capture → emit stages; not exposed in public API |
| Prompt Engineer | Intercepts every outbound LLM call; forms a typed LLMRequest; emits llm.requested / llm.responded; wraps raw LlmCapability as LlmGateway exposed to agents |
| Capability Provider | Hosts capability ports (LLM, VaultRead, Logger, Clock, Rng, ...) as world-global resources |
| Task Inbox | Holds received task requests until claimed |
| Event Bus | Pub/sub for out-of-world observers; trace-correlated with Specorator's bus and each agent's bus |
| State Snapshot | JSON-friendly read of the world for getSession |
v0.3 adds the Loader, Command Registry, Script Runner, and Permission Guard. v0.4 adds the Prompt Engineer. v0.2 retired the literal "Workflow Interpreter" component (no DAG to parse).
10. Example Flows
10.1 Task flow
UI: kernel.submitTask(sessionId, { prompt: "draft requirements for feature X", traceId })
→ task.received (task entity created in the inbox)
→ task.claimed (analyst agent matches the request to its declared skills)
→ agent.invoked
→ (agent uses LLM + reference content from the package)
→ agent.completed
→ artifact.created (proposal captured)
→ task.completed
10.2 Command flow (NEW in v0.3)
UI: kernel.runCommand(sessionId, '/spec:requirements', { feature: 'X' })
→ command.invoked (command resolved from the loaded package)
→ script.invoked (declared dispatch runs scripts/check-feature-readiness.ts)
→ script.completed
→ task.received (declared dispatch creates an analyst-targeted task)
→ task.claimed
→ agent.invoked
→ agent.completed
→ artifact.created
→ task.completed
→ command.completed
10.3 Direct script flow
UI: kernel.runScript(sessionId, 'check-traceability', { feature: 'X' })
→ script.invoked
→ script.completed (structured result on the event payload; optional Artifact)
In every flow, artifacts are proposals. Specorator (not the runtime) decides whether and how to apply them to the vault.
11. Integration Points
Specorator (UI)
- Loads the agentic-workflow package from the user's vault and assembles the typed Workflow Package for startSession
- Calls runCommand / submitTask / runScript / spawnAgent / stopAgent from the cockpit UI
- Subscribes to runtime events for cockpit rendering
- Owns vault I/O after HITL acceptance
- Supplies the VaultRead capability bridge (permission-filtered)
agentic-workflow
- Source of the workflow package's contents — agent / skill / command definitions, scripts, templates, docs
- Consumed as data through Specorator's package adapter
- NOT executed by the runtime as a DAG
agentonomous
- Source of agent shape, skills, ticking, persona, lifecycle, snapshot, LLM provider port
- Hosted by the runtime as ECS entities via a thin adapter
- Instantiated from agent definitions in the loaded workflow package
sim-ecs (third-party, internal-only)
- Embedded simulation kernel
- Not exposed in the public API surface
- Version-coordinated with agentonomous's optional peer dep range
12. UX Entry Points
12.1 Start Session
User opens a workflow context in Specorator → Specorator assembles the Workflow Package and calls startSession
12.2 Run Command (high-level)
User invokes a slash command from the cockpit (e.g., /spec:requirements) → Specorator calls runCommand → runtime orchestrates script + agent task → events stream back
12.3 Submit Task / Run Script (finer-grained)
For scenarios that don't map to a single declared command, the UI uses submitTask (free-form prompt) or runScript (deterministic operation)
12.4 Spawn / Stop Agent
The user (or the cockpit, programmatically) brings agents into and out of the session as the work progresses
12.5 Inspect and Decide (HITL)
User reviews proposed artifacts → accepts (Specorator writes to vault), edits, refines (resubmits), or rejects
12.6 Inspect Results
User browses past sessions: artifacts, command/script logs, full execution trace
13. Acceptance Criteria
14. Risks
(refreshed for v0.3)
- Over-engineering too early
- Tight coupling between runtime and UI
- Unclear event schema (mitigated by typed discriminated union)
- Uncontrolled async behavior (mitigated by per-listener error isolation)
- npm module format incompatibility with Obsidian (mitigated — Vite handles ESM→CJS)
- Public API surface instability
- PRD §9 reinterpretation contested (carried from v0.2)
- sim-ecs version drift with agentonomous (carried from v0.2)
- HITL invariant violation (carried from v0.2)
- Workflow package format coupling: if agentic-workflow's .claude/agents/ Markdown shape changes substantively, Specorator's package adapter (not the runtime) absorbs the change. Mitigated by versioning the package format. (NEW in v0.3)
- Permission semantics drift — if the agent definition Markdown evolves new permission concepts, the runtime's Permission Guard must follow. Mitigated by declaring the runtime's understood permission shape in the package contract. (NEW in v0.3)
- Untrusted scripts — scripts run as plain Node in V1. If the runtime is ever embedded in a multi-user / untrusted context, sandboxing becomes mandatory (Phase 3+ concern). (NEW in v0.3)
- API minimalism vs power: runCommand may be too coarse for some UI flows. Mitigated by also exposing submitTask / runScript / spawnAgent for finer control. (NEW in v0.3)
15. Delivery Plan (V1)
| Phase | Contents |
| --- | --- |
| Phase 1: Core Skeleton | Runtime kernel; embedded sim-ecs world; event bus; session model; capability ports (LLM, VaultRead, Logger, Clock, Rng); workflow package loader (validation + registration); PromptEngineer + LlmGateway (LLM gatekeeper); stub agent adapter; startSession / stopSession / submitTask / getSession |
| Phase 2: Live Agents and Commands | Agentonomous adapter; full task→claim→tick→artifact pipeline; dynamic spawnAgent / stopAgent; runCommand / runScript with declared dispatch; permission enforcement; task.failed / agent.failed / command.failed lifecycles |
| Phase 3: Observability | Trace correlation, structured logging, kernel.replay(eventLog), listener-leak tripwire, session snapshot ergonomics |
| Phase 4: Integration | Connect to Specorator v2.0 and the actual agentic-workflow package adapter Specorator ships; production-ready capability adapters; co-test the full ecosystem story |
16. Open Questions
(most resolved by issue #14; remaining open items below)
Resolved by #14:
Should sessions be persisted to filesystem? → No; agentonomous handles its own snapshots
How to model retries and failures? → Reserved event types; Phase 2
How to represent artifacts (files vs memory)? → In-memory proposals only; Specorator owns vault writes
How strict should event schemas be? → Typed discriminated union; JSON-round-trippable
Module format? → ESM-only
Error handling strategy? → Result at the public API boundary; shape parity with agentonomous + Specorator #104
How does the runtime consume agentic-workflow? → As a typed Workflow Package (operational definitions + scripts + reference content), assembled by the consumer (Specorator) and passed in at startSession
How are permissions modelled? → Read from the existing agent definition Markdown; runtime enforces at dispatch
Are scripts sandboxed? → No (V1); trusted Node execution; sandbox is a Phase 3+ concern
Still open (handed to Workshop A in #4):
- Orchestrator Engine scope — the runtime owns and manages an orchestrator engine that tracks tasks across all running autonomous agents. Spec in design; will be attached as a dedicated issue. The orchestrator engine is in-scope for the runtime but not yet specified.
- The exact typed shape of the Workflow Package contract (agent definition fields, command dispatch declaration, etc.)
- Cross-bus traceId lifecycle confirmation
- Complete capability surface for V1: are LLM / VaultRead / Logger / Clock / Rng sufficient, or does the agentic-workflow content imply we need more (e.g., WebFetch, MemoryStore)?
- agentic-workflow major version pinning for V1
- sim-ecs version coordination with agentonomous
- Cockpit subscription pattern — one bus or two?
Critical North Star
This runtime must remain a knowledge-work agent habitat — not a generic agent platform, not a workflow engine. Every decision should be tested against:
"Does this help humans understand, control, and evolve knowledge work?"
Path Forward
| Step | Issue | What it produces |
| --- | --- | --- |
| 1 | #2: Product presence | README, VISION.md, product page (delivered in PR #13) |
| 2 | #3: Project environment | Repo skeleton, CI, labels, workshops |
| 2b | #14: Architecture proposal | Detailed agent-habitat architecture (v0.7 reflects v0.3 PRD) |
| 3 | #4: Design consolidation | Solution proposal, enriched docs, ADRs (consumes #14) |
| 4 | #5: Baseline v0.0.1 | Frozen pre-engineering document release |
| 5 | #6: Formal initiation | p3.express Group A artifacts, v0.0.2 |
Steps 1, 2, 2b can run in parallel. Steps 3–5 are sequential.
Changelog
- v0.4 (2026-05-10): Added §5.7 Prompt Engineering and LLM Gatekeeping: the runtime is the sole gatekeeper for all outbound LLM calls; PromptEngineer forms LLMRequest, emits llm.requested / llm.responded, and exposes LlmGateway to agents. Added §5.8 Request / Response as the Public Contract mental model. Added FR-12 (Prompt Engineering and LLM Gatekeeping). Added Prompt Engineer row to §9 Architecture Overview. Updated §5.2 event taxonomy to include llm.*. Added PromptEngineer + LlmGateway to Phase 1 in §15. Added Orchestrator Engine open question to §16. Meta: version 0.4, updated 2026-05-10.
- v0.3 (2026-05-04): Added the Workflow Package concept: agentic-workflow content is loaded as an operational package (agent / skill / command definitions + TS scripts + reference content), not just consulted as reference. Added FR-09 (package loading), FR-10 (command dispatch), FR-11 (script execution). Expanded FR-07 Runtime API with runCommand, runScript, spawnAgent, stopAgent. Added §9 components: Workflow Package Loader, Command Registry, Script Runner, Permission Guard. Added §10.2 Command flow and §10.3 Direct script flow. Permissions clarified: read from existing agent definition Markdown, with no new DSL. Scripts trusted in V1, with no sandbox. New design principle codified in the meta block: abstract complexity from the consumer. Acceptance criteria expanded to include public-API surface review.
- v0.2 (2026-05-04): Reframed runtime as agent habitat, not workflow interpreter. Reshaped §1, §3, §4, §5, §6, §9, §10. Added SR-06 cross-package shape parity. Detailed in issue #14.
- v0.1 (2026-05-03): Initial PRD.
Meta
v0.3 reframe note (read first)
v0.1 modelled the runtime as a workflow interpreter that resolves a DAG.
v0.2 reframed it as an agent habitat that hosts long-lived agents.
v0.3 adds the missing piece: agentic-workflow content is fed into the runtime as an operational workflow package — agent definitions, skill definitions, command definitions, and TS scripts — that the runtime loads, instantiates, and exposes through a small set of consumer-facing verbs. The runtime is the easy-to-use surface; complexity is internal.
The headline consumer experience:
Architectural detail lives in issue #14. Sections changed materially in v0.3: §1, §3, §4, §5 (added §5.6 Workflow Package), §6 (new FR-09 / FR-10 / FR-11), §7, §9, §10, §13, §14, §15, §16. Sections unchanged from v0.2: §7 NFRs (still hold), §8 SRs (with SR-06 from v0.2). North Star unchanged.
1. Executive Summary
The Specorator Runtime is the missing agent habitat in the Specorator ecosystem. It hosts a team of long-lived agentonomous agents, loads an operational workflow package sourced from
agentic-workflow(agents, skills, commands, scripts), and exposes a small, easy-to-use API for the Specorator UI:Internally the runtime instantiates agents from definitions, enforces declared per-agent permissions, drives a deterministic per-tick simulation (sim-ecs), routes commands to the right script and agent, captures outputs as proposed artifacts, and emits trace-correlated events. All of that is hidden from the consumer.
The Runtime turns the ecosystem from static methodology + manual agent use into a stateful, observable agent habitat where humans stay in control of every output.
2. Problem Statement
Currently:
3. Goals
Primary Goals
startSession,runCommand,submitTask,runScript,spawnAgent,stopAgent,stopSession, plus event subscriptionSecondary Goals
traceId(UI ↔ runtime ↔ each agent)4. Non-Goals (V1)
specs//into an execution graph5. Core Concepts
5.1 Session — the habitat run
A Session represents one habitat run: a simulation world hosting a configured agent population, a loaded workflow package, and a configured capability bundle.
Contains: session id and
traceId, agent population (initial team + dynamically spawned), loaded workflow package, capability bundle, task inbox, command/script invocations in flight, artifact set, append-only event log.Sessions are in-memory and ephemeral in V1.
5.2 Event-Driven Coordination
The system surfaces every habitat lifecycle change as a typed event on the runtime bus.
V1 event taxonomy:
session.*,task.*,agent.*,artifact.*,command.*,script.*,llm.*. Examples:session.started,session.idle,session.stoppedtask.received,task.claimed,task.completedagent.spawned,agent.invoked,agent.completed,agent.stoppedcommand.invoked,command.completedscript.invoked,script.completedartifact.createdllm.requested,llm.responded— emitted by thePromptEngineerfor every outbound LLM callThree buses, one trace invariant unchanged: the runtime bus, Specorator's plugin bus, and each agent's internal bus all share a
traceId.5.3 Task Lifecycle
Each Task Request is a UI-issued user request:
Multiple tasks may be in flight across the agent team simultaneously.
5.4 Agent Population and Capabilities
Agents are persistent entities in the session's simulation world. They:
Capabilities (V1 set, supplied at session start, available to all agents):
LlmProviderPortThe capability set is extensible — adding a new capability is additive (no breaking change).
5.5 Task Submission
Tasks enter the habitat through kernel.submitTask(sessionId, { prompt, inputs?, toAgent?, traceId? }). The runtime never generates tasks on its own; every task comes from the consumer.
5.6 Workflow Package (NEW in v0.3)
A Workflow Package is the operational content the consumer feeds into the runtime at session start. Its shape mirrors what agentic-workflow already ships:
- .claude/agents/*.md (declared role, allowed skills/commands/scripts, persona, prompt, etc.)
- .claude/skills/*.md
- .claude/commands/*.md (slash commands like /spec:requirements; each declares its dispatch: which script to run, which agent role to engage, what inputs to collect)
- scripts in scripts/ that the runtime can invoke on demand (validation, status checks, automation)
- templates (templates/), process docs (docs/), constitution (memory/constitution.md): read-only material agents consult while working
The package is passed as data into the runtime; the runtime does not read the user's filesystem itself. Specorator (or any other consumer) is responsible for assembling the package from the user's vault and handing it to startSession. The runtime does not author, edit, or generate definitions; it consumes a package.
The Workflow Package format is owned by the runtime as a typed contract. Consumers (Specorator) ship adapters that translate between the on-disk shape (agentic-workflow Markdown + TS files) and the typed package the runtime expects. The runtime pins to a specific agentic-workflow major version through the package version field; bumps are coordinated.
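A typed contract with a pinned major version might look roughly like the sketch below. Everything here is hypothetical: the interface and field names, the `Result` shape, the `SUPPORTED_MAJOR` constant, and the specific checks are illustrative assumptions rather than the runtime's real package format.

```typescript
// Hypothetical package shape, loosely mirroring the on-disk layout described above.
interface AgentDefinition {
  id: string;
  role: string;
  allowedCommands: string[];
  allowedScripts: string[];
  prompt: string;
}

interface CommandDefinition {
  id: string;         // e.g. '/spec:requirements'
  script?: string;    // script id this command dispatches to, if any
  agentRole?: string; // agent role this command engages, if any
}

interface WorkflowPackage {
  version: string;                   // semver-style, e.g. '1.4.0'
  agents: AgentDefinition[];
  commands: CommandDefinition[];
  scripts: Record<string, string>;   // script id -> source text
  reference: Record<string, string>; // path -> read-only content
}

type Result<T, E> = { ok: true; value: T } | { ok: false; error: E };

const SUPPORTED_MAJOR = 1; // assumed pin, for illustration only

// Rejects packages with an unsupported major version or dangling script references.
function validatePackage(pkg: WorkflowPackage): Result<WorkflowPackage, string> {
  const major = Number.parseInt(pkg.version.split('.')[0] ?? '', 10);
  if (major !== SUPPORTED_MAJOR) {
    return { ok: false, error: `unsupported package version: ${pkg.version}` };
  }
  const scriptIds = new Set(Object.keys(pkg.scripts));
  for (const command of pkg.commands) {
    if (command.script !== undefined && !scriptIds.has(command.script)) {
      return { ok: false, error: `command ${command.id} references unknown script ${command.script}` };
    }
  }
  return { ok: true, value: pkg };
}

const examplePackage: WorkflowPackage = {
  version: '1.0.0',
  agents: [{ id: 'analyst', role: 'analyst', allowedCommands: ['/spec:requirements'], allowedScripts: [], prompt: '...' }],
  commands: [{ id: '/spec:requirements', script: 'check-traceability', agentRole: 'analyst' }],
  scripts: { 'check-traceability': 'export default () => ({ ok: true });' },
  reference: {},
};
const exampleResult = validatePackage(examplePackage);
```

Returning a `Result` instead of throwing keeps malformed packages a normal, typed outcome at the API boundary, in line with the error-handling decision recorded in §16.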
5.7 Prompt Engineering and LLM Gatekeeping (NEW in v0.4)
The runtime acts as the sole gatekeeper for all outbound LLM calls. Agents never hold raw LlmCapability; instead, a PromptEngineer component intercepts every LLM call, forms a typed LLMRequest from the agent / task context, and wraps the raw provider in a LlmGateway interface that agents receive. The gatekeeper:
- LLMRequest: messages array (system prompt, persona, history, task content), parameters (model, temperature, maxTokens), session / agent / task context, traceId
- llm.requested: the request is observable before it reaches the provider
- LlmCapability.complete(): the actual LLM call
- LLMResponse: content, usage, model, duration
- llm.responded: the response is observable for tracing, cost attribution, and replay
Agents and scripts interact only through LlmGateway.complete(requestContext). The raw LlmCapability never leaves the runtime's internal component.
5.8 Request / Response as the Public Contract (NEW in v0.4)
The consumer's mental model is deliberately simple: every verb is a request; every observable output is a response.
- submitTask → task.received → … → task.completed + artifact.created
- runCommand → command.invoked → … → command.completed + optional artifact
- runScript → script.invoked → script.completed + optional artifact
- spawnAgent → agent.spawned
The internal complexity (ECS, PromptEngineer, permission guards, tick pipeline) is entirely opaque. A consumer only needs to know: pass a request in; listen for events and artifacts as the response.
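From the consumer's side, "listen for events as the response" could be sketched as a small tracker that collects events for one request until its terminal event arrives. `TinyBus`, `trackRequest`, and the `requestId` field are hypothetical stand-ins; the real bus and its correlation fields may differ, and the position of `artifact.created` relative to `task.completed` in the simulated flow is an assumption.

```typescript
// Hypothetical minimal bus, standing in for the runtime bus in this sketch.
interface RuntimeEvent {
  type: string;
  seq: number;
  traceId: string;
  requestId: string; // assumed correlation field
}
type Listener = (event: RuntimeEvent) => void;

class TinyBus {
  private listeners = new Set<Listener>();
  private seq = 0;

  subscribe(listener: Listener): () => void {
    this.listeners.add(listener);
    return () => {
      this.listeners.delete(listener);
    };
  }

  emit(type: string, traceId: string, requestId: string): void {
    const event: RuntimeEvent = { type, seq: ++this.seq, traceId, requestId };
    // Copy first so listeners may unsubscribe while being notified.
    Array.from(this.listeners).forEach((listener) => listener(event));
  }
}

// Collects every event for one request and stops listening at the terminal event.
function trackRequest(bus: TinyBus, requestId: string, terminalType: string) {
  const events: RuntimeEvent[] = [];
  let done = false;
  const unsubscribe: () => void = bus.subscribe((event) => {
    if (event.requestId !== requestId) return;
    events.push(event);
    if (event.type === terminalType) {
      done = true;
      unsubscribe();
    }
  });
  return { events, isDone: () => done };
}

// Simulated submitTask flow.
const demoBus = new TinyBus();
const demoTracker = trackRequest(demoBus, 'req-1', 'task.completed');
demoBus.emit('task.received', 'trace-1', 'req-1');
demoBus.emit('artifact.created', 'trace-1', 'req-1');
demoBus.emit('task.completed', 'trace-1', 'req-1');
```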
6. Functional Requirements
FR-01: Session Management
getSession
FR-02: Task Submission and Inbox
- submitTask(sessionId, taskRequest)
- received state
- task.received events with monotonic seq ordering
- toAgent targeting (skip claim matching, dispatch directly)
FR-03: Agent Hosting and Capabilities
- toAgent is specified)
- AgentContext containing inputs and capability handles to claiming agents
- Artifact proposals
FR-04: Event System
- '*' or by type
- seq
- traceId on every event
FR-05: State Management
getSession
FR-06: Simulation Kernel
FR-07: Runtime API
- startSession(opts) / stopSession(id) / getSession(id)
- submitTask(id, taskRequest)
- runCommand(id, commandId, args): high-level abstraction
- runScript(id, scriptId, args): finer-grained
- spawnAgent(id, agentSpec) / stopAgent(id, agentId)
FR-08: Observability
FR-09: Workflow Package Loading (NEW in v0.3)
- startSession
- Result
FR-10: Command Dispatch (NEW in v0.3)
- runCommand(sessionId, commandId, args) as the high-level dispatch verb
- command.invoked and command.completed events bracketing the run, with results captured as artifacts where applicable
- PermissionDeniedError Result
FR-11: Script Execution (NEW in v0.3)
- runScript(sessionId, scriptId, args) for finer-grained script invocation
- script.invoked and script.completed events with structured results
- Artifact when the script declares it produces one
FR-12: Prompt Engineering and LLM Gatekeeping (NEW in v0.4)
- PromptEngineer component
- LLMRequest before every LLM call, capturing sessionId, agentId, taskId, messages, parameters, and traceId
- LLMResponse for every call, including content, model, usage, and durationMs
- llm.requested and llm.responded events for every LLM interaction
- LlmGateway (not raw LlmCapability) to agents and scripts
7. Non-Functional Requirements
(unchanged from v0.2; NFR-01 through NFR-06 still hold)
8. Solution Requirements
(unchanged from v0.2; SR-01 through SR-06 still hold. SR-06 Cross-package shape parity added in v0.2 is retained.)
9. Architecture Overview
- startSession, submitTask, runCommand, runScript, spawnAgent, stopAgent, stopSession, getSession
- agent.tick() autonomously and agent.receive() on task claim
- intake → claim → invoke → capture → emit stages; not exposed in public API
- LLMRequest; emits llm.requested / llm.responded; wraps raw LlmCapability as LlmGateway exposed to agents
- getSession
10. Example Flows
10.1 Task flow
10.2 Command flow (NEW in v0.3)
10.3 Direct script flow
In every flow, artifacts are proposals. Specorator (not the runtime) decides whether and how to apply them to the vault.
11. Integration Points
Specorator (UI)
- startSession
- runCommand / submitTask / runScript / spawnAgent / stopAgent from the cockpit UI
- VaultRead capability bridge (permission-filtered)
agentic-workflow
agentonomous
sim-ecs (third-party, internal-only)
12. UX Entry Points
12.1 Start Session
User opens a workflow context in Specorator → Specorator assembles the Workflow Package and calls startSession
12.2 Run Command (high-level)
User invokes a slash command from the cockpit (e.g., /spec:requirements) → Specorator calls runCommand → runtime orchestrates script + agent task → events stream back
12.3 Submit Task / Run Script (finer-grained)
For scenarios that don't map to a single declared command, the UI uses submitTask (free-form prompt) or runScript (deterministic operation)
12.4 Spawn / Stop Agent
The user (or the cockpit, programmatically) brings agents into and out of the session as the work progresses
12.5 Inspect and Decide (HITL)
User reviews proposed artifacts → accepts (Specorator writes to vault), edits, refines (resubmits), or rejects
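On the consumer side, the four review outcomes could be modelled as a discriminated union that maps each decision to a follow-up action. This is a sketch of hypothetical Specorator-side logic, not runtime code: the names `ArtifactProposal`, `Decision`, `Outcome`, and `applyDecision` are invented here, and treating an edit as "write the edited content" is an assumption.

```typescript
// Hypothetical consumer-side types; the runtime only ever produces proposals.
interface ArtifactProposal {
  id: string;
  path: string;
  content: string;
}

type Decision =
  | { kind: 'accept' }
  | { kind: 'edit'; content: string }   // user-edited content replaces the proposal
  | { kind: 'refine'; prompt: string }  // resubmit with a refined prompt
  | { kind: 'reject' };

type Outcome =
  | { action: 'write'; path: string; content: string }
  | { action: 'resubmit'; prompt: string }
  | { action: 'discard' };

// Maps a human decision to what the consumer should do next.
function applyDecision(proposal: ArtifactProposal, decision: Decision): Outcome {
  switch (decision.kind) {
    case 'accept':
      return { action: 'write', path: proposal.path, content: proposal.content };
    case 'edit':
      return { action: 'write', path: proposal.path, content: decision.content };
    case 'refine':
      return { action: 'resubmit', prompt: decision.prompt };
    case 'reject':
      return { action: 'discard' };
  }
}

const exampleProposal: ArtifactProposal = {
  id: 'artifact-1',
  path: 'specs/x/requirements.md', // illustrative path only
  content: '# Requirements',
};
const exampleOutcome = applyDecision(exampleProposal, { kind: 'accept' });
```

Keeping this mapping outside the runtime preserves the rule that vault writes belong to the consumer.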
12.6 Inspect Results
User browses past sessions: artifacts, command/script logs, full execution trace
13. Acceptance Criteria
- submitTask and reach task.received state
- runCommand and resolve to their declared dispatch (script + agent task)
- runScript and emit script.completed with structured results
- spawnAgent and stopped via stopAgent mid-session
- PermissionDeniedError
- Result for malformed packages
- seq and shared traceId
- npm install specorator-runtime installs the package and its TypeScript types
- import { RuntimeKernel } from 'specorator-runtime' works without deep or path-based imports
- npm run verify passes with zero failures and zero any or ts-ignore in production code
- agentic-workflow template
14. Risks
(refreshed for v0.3)
- .claude/agents/ Markdown shape changes substantively, Specorator's package adapter (not the runtime) absorbs the change. Mitigated by versioning the package format. (NEW in v0.3)
- runCommand may be too coarse for some UI flows. Mitigated by also exposing submitTask / runScript / spawnAgent for finer control. (NEW in v0.3)
15. Delivery Plan (V1)
- PromptEngineer + LlmGateway (LLM gatekeeper); stub agent adapter; startSession / stopSession / submitTask / getSession
- spawnAgent / stopAgent; runCommand / runScript with declared dispatch; permission enforcement; task.failed / agent.failed / command.failed lifecycles
- kernel.replay(eventLog), listener-leak tripwire, session snapshot ergonomics
16. Open Questions
(most resolved by issue #14; remaining open items below)
Resolved by #14:
- Should sessions be persisted to filesystem? → No; agentonomous handles its own snapshots
- How to model retries and failures? → Reserved event types; Phase 2
- How to represent artifacts (files vs memory)? → In-memory proposals only; Specorator owns vault writes
- How strict should event schemas be? → Typed discriminated union; JSON-round-trippable
- Module format? → ESM-only
- Error handling strategy? → Result at the public API boundary; shape parity with agentonomous + Specorator #104
- How does the runtime consume agentic-workflow? → As a typed Workflow Package (operational definitions + scripts + reference content), assembled by the consumer (Specorator) and passed in at startSession
- How are permissions modelled? → Read from the existing agent definition Markdown; runtime enforces at dispatch
- Are scripts sandboxed? → No (V1); trusted Node execution; sandbox is a Phase 3+ concern
Still open (handed to Workshop A in #4):
- traceId lifecycle confirmation
- WebFetch, MemoryStore)?
Critical North Star
This runtime must remain a knowledge-work agent habitat — not a generic agent platform, not a workflow engine. Every decision should be tested against:
Path Forward
Steps 1, 2, 2b can run in parallel. Steps 3–5 are sequential.
Changelog
- PromptEngineer forms LLMRequest, emits llm.requested / llm.responded, exposes LlmGateway to agents. Added §5.8 Request / Response as the Public Contract mental model. Added FR-12 (Prompt Engineering and LLM Gatekeeping). Added Prompt Engineer row to §9 Architecture Overview. Updated §5.2 event taxonomy to include llm.*. Added PromptEngineer + LlmGateway to Phase 1 in §15. Added Orchestrator Engine open question to §16. Meta: version 0.4, updated 2026-05-10.
- runCommand, runScript, spawnAgent, stopAgent. Added §9 components: Workflow Package Loader, Command Registry, Script Runner, Permission Guard. Added §10.2 Command flow and §10.3 Direct script flow. Permissions clarified: read from existing agent definition Markdown, no new DSL. Scripts trusted in V1, no sandbox. New design principle codified in the meta block: abstract complexity from the consumer. Acceptance criteria expanded to include public-API surface review.