Phase 0 · PRD — Specorator Runtime

## Meta

```yaml
domain: Specorator Runtime
type: ProductRequirementsDocument
stage: revised
version: 0.4
maturity: L1
created: 2026-05-03
updated: 2026-05-10
owner: Product Owner
related_products:
  - Specorator (UI)
  - agentic-workflow
  - agentonomous
  - sim-ecs (third-party simulation kernel, internal-only)
related_issues:
  - "#14 Architecture proposal v1 — agent habitat + sim-ecs + workflow package"
design_principle: "Abstract complexity from the consumer. The runtime exposes a small, easy-to-use surface; ECS, definition parsing, permission enforcement, agent lifecycle, and trace correlation are all internal."
```

> This is the kick-off document for the project. Read it first. Everything else derives from it.

---

## v0.3 reframe note (read first)

**v0.1** modelled the runtime as a workflow interpreter that resolves a DAG.
**v0.2** reframed it as an agent habitat that hosts long-lived agents.
**v0.3** adds the missing piece: agentic-workflow content is fed into the runtime as an **operational workflow package** — agent definitions, skill definitions, command definitions, and TS scripts — that the runtime loads, instantiates, and exposes through a small set of consumer-facing verbs. The runtime is the easy-to-use surface; complexity is internal.

The headline consumer experience:

```ts
const kernel = new RuntimeKernel({ logger });

const session = await kernel.startSession({
  workflow: agenticWorkflowPackage,      // loaded from the user's vault
  capabilities: { llm, vaultRead },      // Specorator-provided
});

kernel.bus.subscribe('*', (event) => cockpitStore.append(event));

// One-call high-level UX:
await kernel.runCommand(session.id, '/spec:requirements', { feature: 'X' });

// Or finer-grained when the UI needs it:
const agent = await kernel.spawnAgent(session.id, { definitionId: 'analyst' });
await kernel.submitTask(session.id, { prompt: '...', toAgent: agent.id });
await kernel.runScript(session.id, 'check-traceability', { feature: 'X' });

await kernel.stopSession(session.id);
```

Architectural detail lives in [issue #14](https://github.com/Luis85/specorator-runtime/issues/14). Sections changed materially in v0.3: §1, §3, §4, §5 (added §5.6 *Workflow Package*), §6 (new FR-09 / FR-10 / FR-11; expanded FR-07 *Runtime API*), §9, §10, §13, §14, §15, §16. Sections unchanged from v0.2: §7 NFRs (still hold), §8 SRs (with SR-06 from v0.2). North Star unchanged.

---

## 1. Executive Summary

The **Specorator Runtime** is the missing **agent habitat** in the Specorator ecosystem. It hosts a team of long-lived agentonomous agents, loads an operational **workflow package** sourced from `agentic-workflow` (agents, skills, commands, scripts), and exposes a small, easy-to-use API for the Specorator UI:

- start a session with a workflow package and a capability bundle
- submit task requests, run commands, run scripts, spawn or stop agents
- subscribe to a typed event stream

Internally the runtime instantiates agents from definitions, enforces declared per-agent permissions, drives a deterministic per-tick simulation (sim-ecs), routes commands to the right script and agent, captures outputs as proposed artifacts, and emits trace-correlated events. **All of that is hidden from the consumer.**

The Runtime turns the ecosystem from **static methodology + manual agent use** into a **stateful, observable agent habitat where humans stay in control of every output**.

---

## 2. Problem Statement

Currently:

- The agentic-workflow methodology is documented as Markdown templates and TS scripts, but there is no runtime that can load it as an operational package and bring agents to life from its definitions
- Agents (agentonomous) are invokable but lack a habitat with shared capabilities, dynamic spawning, declared permissions, and a unified event stream
- The agentic-workflow scripts (validation, status checks, automation) are runnable manually but cannot be invoked uniformly from a UI alongside agent tasks
- There is no cohesive "command" abstraction that ties a slash command, a script, and an agent task into one observable run
- There is no observability layer the Obsidian companion UI can subscribe to in real time

> Without the runtime, the ecosystem cannot present itself as a continuous, observable, multi-agent collaboration environment driven by a single workflow package.

---

## 3. Goals

### Primary Goals

- **Load workflow packages** — accept an agentic-workflow-shaped package (agent / skill / command definitions + TS scripts + reference content) and make all of it operational
- **Host a session-scoped agent habitat** — one simulation world per session, with dynamic agent spawning and stopping
- **Provide a small, easy-to-use API** — `startSession`, `runCommand`, `submitTask`, `runScript`, `spawnAgent`, `stopAgent`, `stopSession`, plus event subscription
- **Enforce per-agent permissions** declared in the loaded definitions — agents can only invoke skills / commands / scripts they are allowed to use
- **Provide a capability bundle** (LLM, vault read, logger, clock, RNG, more as needed) supplied at session start
- **Expose a typed event stream** so the Specorator UI can render live state without polling
- **Stay HITL-safe** — runtime emits artifact proposals; never writes to the vault

### Secondary Goals

- Replayability via the append-only event log
- Cross-bus traceability via shared `traceId` (UI ↔ runtime ↔ each agent)
- Extensibility via additive ports (new capabilities, new agent adapters)

---

## 4. Non-Goals (V1)

- Distributed execution
- Multi-user collaboration
- Cloud-native scaling
- Persistence of session state (agentonomous handles its own agent-level snapshots)
- Complex scheduling strategies
- **Workflow interpretation as a DAG** — the runtime does not parse `specs//` into an execution graph
- **Methodology ownership** — no types for stages, parallel tracks, immutability rules, EARS notation, traceability rules
- **Authoring** workflow packages — the runtime *loads* a package; it does not generate or edit one
- **Replacing harness adapters** — Claude Code / Codex / Copilot / editor-agents from agentic-workflow #91 remain valid execution paths
- **Vault writes** — runtime emits artifact proposals; Specorator owns vault I/O after HITL acceptance
- **Script sandboxing** — scripts run as plain Node code in V1; trust is the user's, not the runtime's, responsibility (deferred to Phase 3+)
- **Inventing a permission DSL** — permissions are read from the workflow package's existing definition format
- A UI, a CLI, or a server

---

## 5. Core Concepts

### 5.1 Session — the habitat run

A **Session** represents one habitat run: a simulation world hosting a configured agent population, a loaded workflow package, and a configured capability bundle.

Contains: session id and `traceId`, agent population (initial team + dynamically spawned), loaded workflow package, capability bundle, task inbox, command/script invocations in flight, artifact set, append-only event log.

Sessions are in-memory and ephemeral in V1.

### 5.2 Event-Driven Coordination

The system surfaces every habitat lifecycle change as a typed event on the runtime bus.

V1 event taxonomy: `session.*`, `task.*`, `agent.*`, `artifact.*`, `command.*`, `script.*`, `llm.*`. Examples:

- `session.started`, `session.idle`, `session.stopped`
- `task.received`, `task.claimed`, `task.completed`
- `agent.spawned`, `agent.invoked`, `agent.completed`, `agent.stopped`
- `command.invoked`, `command.completed`
- `script.invoked`, `script.completed`
- `artifact.created`
- `llm.requested`, `llm.responded` — emitted by the `PromptEngineer` for every outbound LLM call

The **three buses, one trace** invariant is unchanged: the runtime bus, Specorator's plugin bus, and each agent's internal bus all share a `traceId`.
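
One way to picture the taxonomy is as a discriminated union keyed on `type`. The sketch below is illustrative only: the `EventEnvelope` name and the payload fields (`sessionId`, `taskId`, `agentId`, `artifactId`) are assumptions, and only a handful of the event types listed above are shown.

```ts
// Hypothetical envelope shared by every event. `seq` and `traceId` come from
// the PRD; `sessionId` and the per-event payload fields are assumptions.
interface EventEnvelope {
  seq: number; // monotonic within a session
  traceId: string; // shared across the three buses
  sessionId: string;
}

type RuntimeEvent =
  | (EventEnvelope & { type: "session.started" })
  | (EventEnvelope & { type: "task.received"; taskId: string })
  | (EventEnvelope & { type: "task.claimed"; taskId: string; agentId: string })
  | (EventEnvelope & { type: "artifact.created"; artifactId: string })
  | (EventEnvelope & { type: "llm.requested"; agentId: string });

// Narrowing on `type` gives each branch access to its own payload fields.
function describeEvent(event: RuntimeEvent): string {
  switch (event.type) {
    case "session.started":
      return `session ${event.sessionId} started`;
    case "task.received":
      return `task ${event.taskId} received`;
    case "task.claimed":
      return `task ${event.taskId} claimed by ${event.agentId}`;
    case "artifact.created":
      return `artifact ${event.artifactId} proposed`;
    case "llm.requested":
      return `agent ${event.agentId} requested an LLM call`;
  }
}
```

A consumer that adds a new event type to the union gets a compile-time error in every `switch` that has not handled it yet, which is the main reason to prefer this shape over a loose `{ type: string }` record.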

### 5.3 Task Lifecycle

Each **Task Request** is a UI-issued user request:

```
received → claimed → in-progress → completed | failed
```

Multiple tasks may be in flight across the agent team simultaneously.
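
The lifecycle can be read as a small state machine. A minimal sketch follows, assuming that `failed` is reachable only from `in-progress` (the diagram does not say whether earlier states can fail directly); the `advance` helper is hypothetical.

```ts
type TaskState = "received" | "claimed" | "in-progress" | "completed" | "failed";

// Allowed forward transitions, following the lifecycle diagram.
// Assumption: only an in-progress task can fail.
const TRANSITIONS: Record<TaskState, readonly TaskState[]> = {
  received: ["claimed"],
  claimed: ["in-progress"],
  "in-progress": ["completed", "failed"],
  completed: [],
  failed: [],
};

// Returns the next state, or throws when the transition is not allowed.
function advance(current: TaskState, next: TaskState): TaskState {
  if (!TRANSITIONS[current].includes(next)) {
    throw new Error(`illegal task transition: ${current} -> ${next}`);
  }
  return next;
}
```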

### 5.4 Agent Population and Capabilities

Agents are **persistent entities** in the session's simulation world. They:

- are **instantiated from definitions** in the loaded workflow package (not invented by the runtime)
- can be **spawned and stopped dynamically** during a session
- declare what they're **allowed to do** (skills / commands / scripts) — runtime enforces
- tick continuously, claim tasks they're suited to, and use capabilities to do their work

Capabilities (V1 set, supplied at session start, available to all agents):

- **LLM** — provider port shape compatible with agentonomous's `LlmProviderPort`
- **VaultRead** — read-only access to a permission-filtered vault context (Specorator-supplied)
- **Logger / WallClock / Rng** — observability and determinism seams

The capability set is extensible — adding a new capability is additive (no breaking change).
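
To illustrate why `WallClock` and `Rng` count as determinism seams, here is a sketch of test doubles a consumer might supply. The port names come from the list above; the method names (`now`, `next`, `advance`) and both factory functions are assumptions. The generator is mulberry32, a small public-domain seeded PRNG, used here purely as a stand-in.

```ts
// Port names follow the PRD; method names are assumptions.
interface WallClock {
  now(): number; // milliseconds
}
interface Rng {
  next(): number; // uniform in [0, 1)
}

// mulberry32: a compact 32-bit seeded PRNG. Same seed, same sequence.
function createSeededRng(seed: number): Rng {
  let state = seed >>> 0;
  return {
    next(): number {
      state = (state + 0x6d2b79f5) >>> 0;
      let t = state;
      t = Math.imul(t ^ (t >>> 15), t | 1);
      t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
      return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
    },
  };
}

// A clock that only moves when the test tells it to.
function createManualClock(startMs: number): WallClock & { advance(ms: number): void } {
  let current = startMs;
  return {
    now: () => current,
    advance: (ms: number) => {
      current += ms;
    },
  };
}
```

With both seams injected, two sessions started with the same seed and the same clock script should observe identical random draws and timestamps, which is what makes step-driven tests repeatable.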

### 5.5 Task Submission

Tasks enter the habitat through `kernel.submitTask(sessionId, { prompt, inputs?, toAgent?, traceId? })`. The runtime never generates tasks on its own; every task comes from the consumer.

### 5.6 Workflow Package (NEW in v0.3)

A **Workflow Package** is the operational content the consumer feeds into the runtime at session start. Its shape mirrors what agentic-workflow already ships:

- **Agent definitions** — typed descriptors loaded from `.claude/agents/*.md` (declared role, allowed skills/commands/scripts, persona, prompt, etc.)
- **Skill definitions** — loaded from `.claude/skills/*.md`
- **Command definitions** — loaded from `.claude/commands/*.md` (slash commands like `/spec:requirements`; each declares its dispatch — which script to run, which agent role to engage, what inputs to collect)
- **Scripts** — TS modules from `scripts/` that the runtime can invoke on demand (validation, status checks, automation)
- **Reference content** — templates (`templates/`), process docs (`docs/`), constitution (`memory/constitution.md`) — read-only material agents consult while working

The package is **passed as data into the runtime**; the runtime does not read the user's filesystem itself. Specorator (or any other consumer) is responsible for assembling the package from the user's vault and handing it to `startSession`. The runtime does not author, edit, or generate definitions — it consumes a package.

The **Workflow Package format is owned by the runtime** as a typed contract. Consumers (Specorator) ship adapters that translate between the on-disk shape (agentic-workflow Markdown + TS files) and the typed package the runtime expects. The runtime pins to a specific agentic-workflow major version through the package version field; bumps are coordinated.
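
The exact contract is still an open question (see §16), so the following is only a sketch of what a typed package and a major-version pin check could look like. Every field name here, including `formatVersion`, and the representation of scripts as callable entries are assumptions made for illustration.

```ts
// All shapes below are hypothetical; the real contract is not yet specified.
interface AgentDefinition {
  id: string;
  role: string;
  allowedSkills: string[];
  allowedCommands: string[];
  allowedScripts: string[];
}
interface SkillDefinition {
  id: string;
  description: string;
}
interface CommandDefinition {
  id: string; // e.g. "/spec:requirements"
  script?: string; // script id to run, if any
  agentRole?: string; // agent role to engage, if any
}
type ScriptFn = (args: Record<string, unknown>) => Promise<unknown>;

interface WorkflowPackage {
  formatVersion: string; // semver-like, e.g. "2.3.1"
  agents: AgentDefinition[];
  skills: SkillDefinition[];
  commands: CommandDefinition[];
  scripts: Record<string, ScriptFn>;
  reference: Record<string, string>; // path -> read-only content
}

// Extracts the major component; returns NaN when the version is malformed.
function majorOf(version: string): number {
  return Number.parseInt(version.split(".")[0] ?? "", 10);
}

// True when the package's major version matches the one the runtime pins to.
function isSupportedPackage(pkg: WorkflowPackage, pinnedMajor: number): boolean {
  return majorOf(pkg.formatVersion) === pinnedMajor;
}
```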

### 5.7 Prompt Engineering and LLM Gatekeeping (NEW in v0.4)

The runtime acts as the **sole gatekeeper for all outbound LLM calls**. Agents never hold raw `LlmCapability`; instead, a `PromptEngineer` component intercepts every LLM call, forms a typed `LLMRequest` from the agent / task context, and wraps the raw provider in a `LlmGateway` interface that agents receive. The gatekeeper:

- **Forms a structured `LLMRequest`** — messages array (system prompt, persona, history, task content), parameters (model, temperature, maxTokens), session / agent / task context, `traceId`
- **Emits `llm.requested`** — the request is observable before it reaches the provider
- **Calls `LlmCapability.complete()`** — the actual LLM call
- **Captures the `LLMResponse`** — content, usage, model, duration
- **Emits `llm.responded`** — the response is observable for tracing, cost attribution, and replay

Agents and scripts interact only through `LlmGateway.complete(requestContext)`. The raw `LlmCapability` never leaves the runtime's internal component.
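
The gatekeeping steps above can be sketched as a thin wrapper around the provider. The type names `LLMRequest`, `LLMResponse`, `LlmCapability`, and `LlmGateway` come from this section; their field layouts, the `RequestContext` shape, and the `createLlmGateway` factory are assumptions, not the runtime's actual implementation.

```ts
interface LlmMessage { role: "system" | "user" | "assistant"; content: string }
interface LlmParameters { model: string; temperature: number; maxTokens: number }
interface LlmUsage { inputTokens: number; outputTokens: number }

// Hypothetical context an agent hands to the gateway.
interface RequestContext {
  sessionId: string; agentId: string; taskId: string; traceId: string;
  systemPrompt: string; taskContent: string;
}
interface LLMRequest {
  sessionId: string; agentId: string; taskId: string; traceId: string;
  messages: LlmMessage[]; parameters: LlmParameters;
}
interface LLMResponse { content: string; model: string; usage: LlmUsage; durationMs: number }

interface LlmCapability {
  complete(request: LLMRequest): Promise<{ content: string; model: string; usage: LlmUsage }>;
}
interface LlmGateway {
  complete(context: RequestContext): Promise<LLMResponse>;
}
type LlmEvent =
  | { type: "llm.requested"; traceId: string; request: LLMRequest }
  | { type: "llm.responded"; traceId: string; response: LLMResponse };

function createLlmGateway(
  provider: LlmCapability, // stays inside this closure; agents never see it
  defaults: LlmParameters,
  emit: (event: LlmEvent) => void,
  now: () => number,
): LlmGateway {
  return {
    async complete(context: RequestContext): Promise<LLMResponse> {
      const request: LLMRequest = {
        sessionId: context.sessionId, agentId: context.agentId,
        taskId: context.taskId, traceId: context.traceId,
        messages: [
          { role: "system", content: context.systemPrompt },
          { role: "user", content: context.taskContent },
        ],
        parameters: defaults,
      };
      // Observable before the request reaches the provider.
      emit({ type: "llm.requested", traceId: request.traceId, request });
      const startedAt = now();
      const raw = await provider.complete(request);
      const response: LLMResponse = { ...raw, durationMs: now() - startedAt };
      emit({ type: "llm.responded", traceId: request.traceId, response });
      return response;
    },
  };
}
```

Because the provider is captured in the closure and only the `LlmGateway` object is returned, an agent holding the gateway has no reference through which it could bypass the two events.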

### 5.8 Request / Response as the Public Contract (NEW in v0.4)

The consumer's mental model is deliberately simple: **every verb is a request; every observable output is a response**.

- `submitTask` → `task.received` → … → `task.completed` + `artifact.created`
- `runCommand` → `command.invoked` → … → `command.completed` + optional artifact
- `runScript` → `script.invoked` → `script.completed` + optional artifact
- `spawnAgent` → `agent.spawned`

The internal complexity — ECS, PromptEngineer, permission guards, tick pipeline — is entirely opaque. A consumer only needs to know: *pass a request in; listen for events and artifacts as the response*.

---

## 6. Functional Requirements

### FR-01: Session Management

- System MUST allow creating sessions with a loaded workflow package, a capability bundle, and an optional initial agent team
- System MUST allow stopping a session and verifying clean teardown (no leaked listeners)
- System MUST allow querying current session state via `getSession`

### FR-02: Task Submission and Inbox

- System MUST accept task requests via `submitTask(sessionId, taskRequest)`
- System MUST place each request in the session's inbox in `received` state
- System MUST emit `task.received` events with monotonic `seq` ordering
- System MUST tolerate any number of concurrent in-flight tasks
- System MUST support optional `toAgent` targeting (skip claim matching, dispatch directly)

### FR-03: Agent Hosting and Capabilities

- System MUST host agents as long-lived entities for the session's lifetime
- System MUST drive each agent's tick pipeline on a configurable cadence
- System MUST match received tasks to agents based on declared agent skills (when no `toAgent` is specified)
- System MUST pass typed `AgentContext` containing inputs and capability handles to claiming agents
- System MUST capture agent outputs as `Artifact` proposals
- System MUST NOT write artifacts to any persistent store
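
As a rough illustration of the claim-matching requirement, the sketch below picks a claimant for a task. The `requiredSkills` and `busy` fields, the first-match policy, and the `pickClaimant` name are all assumptions; the PRD only states that matching is based on declared agent skills and that `toAgent` bypasses it.

```ts
interface HostedAgent {
  id: string;
  skills: readonly string[];
  busy: boolean; // hypothetical flag: agent is already working on a task
}
interface InboxTask {
  id: string;
  requiredSkills: readonly string[]; // hypothetical: how a task states its needs
  toAgent?: string;
}

// Returns the id of the agent that should claim the task, or undefined
// when no hosted agent can take it yet.
function pickClaimant(task: InboxTask, agents: readonly HostedAgent[]): string | undefined {
  if (task.toAgent !== undefined) {
    // Direct targeting skips skill matching entirely (FR-02).
    return agents.find((agent) => agent.id === task.toAgent)?.id;
  }
  // Assumed policy: first idle agent whose skills cover every requirement.
  return agents.find(
    (agent) => !agent.busy && task.requiredSkills.every((skill) => agent.skills.includes(skill)),
  )?.id;
}
```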

### FR-04: Event System

- System MUST emit typed events for all habitat lifecycle changes
- System MUST allow subscription via `'*'` or by `type`
- System MUST guarantee event ordering within a session via monotonic `seq`
- System MUST isolate per-listener errors in both sync and async dispatch
- System MUST propagate `traceId` on every event
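
A minimal sketch of a bus that satisfies these points, assuming a simplified event shape and a hypothetical `onListenerError` hook. It is not the runtime's bus, only an illustration of monotonic `seq`, `'*'` subscription, and per-listener error isolation for both synchronous throws and rejected promises.

```ts
interface BusEvent {
  type: string;
  seq: number;
  traceId: string;
}
type Listener = (event: BusEvent) => void | Promise<void>;

class EventBus {
  private seq = 0;
  private readonly listeners = new Map<string, Set<Listener>>();
  private readonly onListenerError: (error: unknown) => void;

  constructor(onListenerError: (error: unknown) => void = () => {}) {
    this.onListenerError = onListenerError;
  }

  // Subscribe to one event type, or to every event with "*".
  // Returns an unsubscribe function so teardown can avoid leaked listeners.
  subscribe(type: string, listener: Listener): () => void {
    const set = this.listeners.get(type) ?? new Set<Listener>();
    set.add(listener);
    this.listeners.set(type, set);
    return () => {
      set.delete(listener);
    };
  }

  emit(type: string, traceId: string): BusEvent {
    const event: BusEvent = { type, seq: ++this.seq, traceId };
    const targets = [
      ...Array.from(this.listeners.get(type) ?? []),
      ...Array.from(this.listeners.get("*") ?? []),
    ];
    for (const listener of targets) {
      try {
        const result = listener(event);
        // Async listeners: route rejections to the hook instead of the caller.
        if (result instanceof Promise) result.catch(this.onListenerError);
      } catch (error) {
        // Sync listeners: one failure must not stop the remaining listeners.
        this.onListenerError(error);
      }
    }
    return event;
  }
}
```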

### FR-05: State Management

- System MUST maintain runtime state in-memory only (V1)
- System MUST expose state as a JSON-friendly snapshot via `getSession`
- System MUST track sessions, agents, tasks, command/script invocations, artifacts, and the event log

### FR-06: Simulation Kernel

- System MUST execute a deterministic per-tick pipeline driven by an embedded ECS kernel (sim-ecs)
- System MUST support a step-driven mode for tests and a continuous-loop mode for live sessions
- System MUST NOT expose ECS primitives in the public API surface
- System MUST keep per-tick stages stable and documented
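
The step-driven mode can be illustrated without any ECS library. In the sketch below a plain loop stands in for the sim-ecs scheduler (no sim-ecs API is used or implied); the stage names follow the `intake → claim → invoke → capture → emit` order given in §9, and the `World`, `System`, and `TickPipeline` names are hypothetical.

```ts
type Stage = "intake" | "claim" | "invoke" | "capture" | "emit";
const STAGES: readonly Stage[] = ["intake", "claim", "invoke", "capture", "emit"];

// Stand-in for the simulation world; a real ECS world would hold entities.
interface World {
  tick: number;
  trace: string[];
}
type System = (world: World) => void;

class TickPipeline {
  private readonly systems: Record<Stage, System[]> = {
    intake: [], claim: [], invoke: [], capture: [], emit: [],
  };

  register(stage: Stage, system: System): void {
    this.systems[stage].push(system);
  }

  // Step-driven mode: advance exactly one tick, running stages in fixed order.
  step(world: World): void {
    world.tick += 1;
    for (const stage of STAGES) {
      for (const system of this.systems[stage]) {
        system(world);
      }
    }
  }
}
```

Stage order is fixed by `STAGES` rather than by registration order, so the same inputs produce the same per-tick sequence. A continuous-loop mode would call `step` repeatedly on whatever cadence the session configures.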

### FR-07: Runtime API

- System MUST expose at minimum:
  - `startSession(opts)` / `stopSession(id)` / `getSession(id)`
  - `submitTask(id, taskRequest)`
  - `runCommand(id, commandId, args)` — high-level abstraction
  - `runScript(id, scriptId, args)` — finer-grained
  - `spawnAgent(id, agentSpec)` / `stopAgent(id, agentId)`
  - event subscription via the bus
- System MUST keep all public types re-exported from a single barrel (SR-03)
- System MUST hide ECS primitives, definition parsing, permission enforcement, and lifecycle internals from the public surface

### FR-08: Observability

- System MUST provide an append-only, JSON-round-trippable event log per session
- System MUST expose task/agent/command/script states and artifact captures in real time
- System MUST enable a UI consumer to render the entire habitat without polling
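
"Append-only" and "JSON-round-trippable" can be made concrete with a small sketch. The `LoggedEvent` shape and the `serialize` / `deserialize` method names are assumptions; the point is only that entries are never rewritten and that a serialized log rebuilds into an equivalent one.

```ts
interface LoggedEvent {
  seq: number;
  type: string;
  traceId: string;
  payload: Record<string, unknown>; // must hold JSON-safe values only
}

class EventLog {
  private readonly entries: LoggedEvent[] = [];

  // Appends an event; rejects anything that would break monotonic ordering.
  append(event: LoggedEvent): void {
    const last = this.entries[this.entries.length - 1];
    if (last !== undefined && event.seq <= last.seq) {
      throw new Error(`seq ${event.seq} is not greater than ${last.seq}`);
    }
    this.entries.push({ ...event });
  }

  get length(): number {
    return this.entries.length;
  }

  serialize(): string {
    return JSON.stringify(this.entries);
  }

  // Rebuilds a log by re-appending, so ordering is validated again on load.
  static deserialize(json: string): EventLog {
    const log = new EventLog();
    for (const event of JSON.parse(json) as LoggedEvent[]) {
      log.append(event);
    }
    return log;
  }
}
```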

### FR-09: Workflow Package Loading (NEW in v0.3)

- System MUST accept a workflow package as typed input at `startSession`
- System MUST parse declared agents, skills, commands, scripts, and reference content
- System MUST validate the package on load and surface validation errors as a typed `Result`
- System MUST NOT read the user's filesystem directly — the consumer supplies the package as data
- System MUST NOT modify or generate package contents
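
Validation on load might look roughly like the sketch below. The `Result` shape, the `PackageValidationError` name, and the two checks shown (duplicate agent ids, references to undeclared commands) are assumptions chosen for illustration; the PRD does not enumerate the validation rules.

```ts
// Hypothetical Result shape; the real one is aligned with agentonomous.
type Result<T, E> = { ok: true; value: T } | { ok: false; error: E };

interface PackageInput {
  agents: { id: string; allowedCommands: string[] }[];
  commands: { id: string }[];
}
interface PackageValidationError {
  kind: "PackageValidationError";
  issues: string[];
}

// Collects every issue instead of stopping at the first, then returns a
// typed Result. The input package is never modified.
function validatePackage(pkg: PackageInput): Result<PackageInput, PackageValidationError> {
  const issues: string[] = [];
  const commandIds = new Set(pkg.commands.map((command) => command.id));
  const seenAgents = new Set<string>();

  for (const agent of pkg.agents) {
    if (seenAgents.has(agent.id)) {
      issues.push(`duplicate agent id: ${agent.id}`);
    }
    seenAgents.add(agent.id);
    for (const commandId of agent.allowedCommands) {
      if (!commandIds.has(commandId)) {
        issues.push(`agent ${agent.id} references unknown command ${commandId}`);
      }
    }
  }

  return issues.length === 0
    ? { ok: true, value: pkg }
    : { ok: false, error: { kind: "PackageValidationError", issues } };
}
```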

### FR-10: Command Dispatch (NEW in v0.3)

- System MUST expose `runCommand(sessionId, commandId, args)` as the high-level dispatch verb
- System MUST resolve a command's declared handler from the loaded package (script, agent task, or both)
- System MUST emit `command.invoked` and `command.completed` events bracketing the run, with results captured as artifacts where applicable
- System MUST enforce per-agent permission when a command engages an agent — disallowed dispatch returns a typed `PermissionDeniedError` Result
- System MUST allow the same command to be invoked concurrently with different args
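
A sketch of how command resolution and the permission check could fit together. `PermissionDeniedError` is named in the requirement above; `UnknownCommandError`, `NoAgentForRoleError`, the `DispatchPlan` shape, and the `resolveDispatch` function are hypothetical and exist only to make the example complete.

```ts
interface CommandSpec {
  id: string;
  script?: string; // script id declared by the command, if any
  agentRole?: string; // agent role the command engages, if any
}
interface AgentSpec {
  id: string;
  role: string;
  allowedCommands: readonly string[];
}
interface DispatchPlan {
  commandId: string;
  script?: string;
  agentId?: string;
}
interface DispatchError {
  name: "PermissionDeniedError" | "UnknownCommandError" | "NoAgentForRoleError";
  message: string;
}
type DispatchResult = { ok: true; value: DispatchPlan } | { ok: false; error: DispatchError };

function resolveDispatch(
  commandId: string,
  commands: readonly CommandSpec[],
  agents: readonly AgentSpec[],
): DispatchResult {
  const command = commands.find((candidate) => candidate.id === commandId);
  if (command === undefined) {
    return { ok: false, error: { name: "UnknownCommandError", message: `no command ${commandId}` } };
  }
  const plan: DispatchPlan = { commandId };
  if (command.script !== undefined) plan.script = command.script;

  if (command.agentRole !== undefined) {
    const agent = agents.find((candidate) => candidate.role === command.agentRole);
    if (agent === undefined) {
      return { ok: false, error: { name: "NoAgentForRoleError", message: `no agent with role ${command.agentRole}` } };
    }
    // Permission guard: the engaged agent must declare this command.
    if (!agent.allowedCommands.includes(commandId)) {
      return { ok: false, error: { name: "PermissionDeniedError", message: `${agent.id} may not run ${commandId}` } };
    }
    plan.agentId = agent.id;
  }
  return { ok: true, value: plan };
}
```

Returning the error as data rather than throwing keeps the public boundary consistent with the `Result` convention mentioned in §16.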

### FR-11: Script Execution (NEW in v0.3)

- System MUST expose `runScript(sessionId, scriptId, args)` for finer-grained script invocation
- System MUST execute scripts as plain Node modules in V1 — no sandbox
- System MUST emit `script.invoked` and `script.completed` events with structured results
- System MUST capture script output as an `Artifact` when the script declares it produces one
- System MUST NOT itself write script results to the vault — output flows through events for HITL

### FR-12: Prompt Engineering and LLM Gatekeeping (NEW in v0.4)

- System MUST gate all outbound LLM calls through a `PromptEngineer` component
- System MUST form a typed `LLMRequest` before every LLM call, capturing sessionId, agentId, taskId, messages, parameters, and traceId
- System MUST capture a typed `LLMResponse` for every call, including content, model, usage, and durationMs
- System MUST emit `llm.requested` and `llm.responded` events for every LLM interaction
- System MUST expose `LlmGateway` (not raw `LlmCapability`) to agents and scripts
- System MUST NOT allow agents or scripts to call the LLM provider directly

---

## 7. Non-Functional Requirements

(unchanged from v0.2; NFR-01 through NFR-06 still hold)

---

## 8. Solution Requirements

(unchanged from v0.2; SR-01 through SR-06 still hold. SR-06 *Cross-package shape parity* added in v0.2 is retained.)

---

## 9. Architecture Overview

| Component | Responsibility |
|---|---|
| **Runtime Kernel** | The verb-bearing public façade — `startSession`, `submitTask`, `runCommand`, `runScript`, `spawnAgent`, `stopAgent`, `stopSession`, `getSession` |
| **Workflow Package Loader** | Validates and registers agent / skill / command / script definitions and reference content from a supplied package |
| **Command Registry** | Resolves a command id to its declared dispatch (script + agent role + inputs) and orchestrates execution |
| **Script Runner** | Invokes scripts from the loaded package as plain Node modules; captures structured results |
| **Agent Adapter** | Bridges agentonomous (and other) agents into ECS entity bundles with declared skill/permission components; runs `agent.tick()` autonomously and `agent.receive()` on task claim |
| **Permission Guard** | Enforces declared per-agent permissions at command/skill/script dispatch |
| **Simulation Kernel** | sim-ecs-backed per-tick scheduler; runs `intake → claim → invoke → capture → emit` stages; not exposed in public API |
| **Prompt Engineer** | Intercepts every outbound LLM call; forms a typed `LLMRequest`; emits `llm.requested` / `llm.responded`; wraps raw `LlmCapability` as `LlmGateway` exposed to agents |
| **Capability Provider** | Hosts capability ports (LLM, VaultRead, Logger, Clock, Rng, ...) as world-global resources |
| **Task Inbox** | Holds received task requests until claimed |
| **Event Bus** | Pub/sub for out-of-world observers; trace-correlated with Specorator's bus and each agent's bus |
| **State Snapshot** | JSON-friendly read of the world for `getSession` |

> v0.3 adds the Loader, Command Registry, Script Runner, and Permission Guard. v0.4 adds the Prompt Engineer. v0.2 retired the literal "Workflow Interpreter" component (no DAG to parse).

---

## 10. Example Flows

### 10.1 Task flow

```
UI: kernel.submitTask(sessionId, { prompt: "draft requirements for feature X", traceId })
  → task.received     (task entity created in the inbox)
    → task.claimed    (analyst agent matches the request to its declared skills)
      → agent.invoked
        → (agent uses LLM + reference content from the package)
        → agent.completed
          → artifact.created (proposal captured)
            → task.completed
```

### 10.2 Command flow (NEW in v0.3)

```
UI: kernel.runCommand(sessionId, '/spec:requirements', { feature: 'X' })
  → command.invoked   (command resolved from the loaded package)
    → script.invoked  (declared dispatch runs scripts/check-feature-readiness.ts)
    → script.completed
    → task.received   (declared dispatch creates an analyst-targeted task)
    → task.claimed
      → agent.invoked
        → agent.completed
          → artifact.created
            → task.completed
  → command.completed
```

### 10.3 Direct script flow

```
UI: kernel.runScript(sessionId, 'check-traceability', { feature: 'X' })
  → script.invoked
  → script.completed   (structured result on the event payload; optional Artifact)
```

In every flow, **artifacts are proposals**. Specorator (not the runtime) decides whether and how to apply them to the vault.

---

## 11. Integration Points

### Specorator (UI)

- Loads the agentic-workflow package from the user's vault and assembles the typed Workflow Package for `startSession`
- Calls `runCommand` / `submitTask` / `runScript` / `spawnAgent` / `stopAgent` from the cockpit UI
- Subscribes to runtime events for cockpit rendering
- Owns vault I/O after HITL acceptance
- Supplies the `VaultRead` capability bridge (permission-filtered)

### agentic-workflow

- Source of the workflow package's contents — agent / skill / command definitions, scripts, templates, docs
- Consumed as data through Specorator's package adapter
- NOT executed by the runtime as a DAG

### agentonomous

- Source of agent shape, skills, ticking, persona, lifecycle, snapshot, LLM provider port
- Hosted by the runtime as ECS entities via a thin adapter
- Instantiated from agent definitions in the loaded workflow package

### sim-ecs (third-party, internal-only)

- Embedded simulation kernel
- Not exposed in the public API surface
- Version-coordinated with agentonomous's optional peer dep range

---

## 12. UX Entry Points

### 12.1 Start Session

User opens a workflow context in Specorator → Specorator assembles the Workflow Package and calls `startSession`

### 12.2 Run Command (high-level)

User invokes a slash command from the cockpit (e.g., `/spec:requirements`) → Specorator calls `runCommand` → runtime orchestrates script + agent task → events stream back

### 12.3 Submit Task / Run Script (finer-grained)

For scenarios that don't map to a single declared command, the UI uses `submitTask` (free-form prompt) or `runScript` (deterministic operation)

### 12.4 Spawn / Stop Agent

The user (or the cockpit, programmatically) brings agents into and out of the session as the work progresses

### 12.5 Inspect and Decide (HITL)

User reviews proposed artifacts → accepts (Specorator writes to vault), edits, refines (resubmits), or rejects

### 12.6 Inspect Results

User browses past sessions: artifacts, command/script logs, full execution trace

---

## 13. Acceptance Criteria

- [ ] A session can be started with a loaded Workflow Package, capability bundle, and optional initial agent team
- [ ] Tasks can be submitted via `submitTask` and reach `task.received` state
- [ ] Commands can be invoked via `runCommand` and resolve to their declared dispatch (script + agent task)
- [ ] Scripts can be invoked via `runScript` and emit `script.completed` with structured results
- [ ] Agents can be spawned via `spawnAgent` and stopped via `stopAgent` mid-session
- [ ] Per-agent permissions are enforced — disallowed dispatch returns `PermissionDeniedError`
- [ ] Workflow package validation fails fast with a typed `Result` for malformed packages
- [ ] Events are emitted for every lifecycle transition with monotonic `seq` and shared `traceId`
- [ ] UI can subscribe to the event stream and render execution without polling
- [ ] Runtime state is queryable via `getSession`
- [ ] Per-listener error isolation holds in both sync and async dispatch
- [ ] `npm install specorator-runtime` installs the package and its TypeScript types
- [ ] `import { RuntimeKernel } from 'specorator-runtime'` works without deep or path-based imports
- [ ] `npm run verify` passes with zero failures and zero `any` or `@ts-ignore` in production code
- [ ] All public API members have JSDoc documentation
- [ ] All architectural decisions are recorded as ADRs following the `agentic-workflow` template
- [ ] Cross-package shape parity (SR-06) verified via CI assignability test
- [ ] Runtime never writes to the vault under any code path — invariant verified by test
- [ ] Public API surface contains no ECS primitives, no definition-parser internals, no permission-enforcement internals — verified by surface review

---

## 14. Risks

(refreshed for v0.3)

- Over-engineering too early
- Tight coupling between runtime and UI
- Unclear event schema (mitigated by typed discriminated union)
- Uncontrolled async behavior (mitigated by per-listener error isolation)
- npm module format incompatibility with Obsidian (mitigated — Vite handles ESM→CJS)
- Public API surface instability
- PRD §9 reinterpretation contested (carried from v0.2)
- sim-ecs version drift with agentonomous (carried from v0.2)
- HITL invariant violation (carried from v0.2)
- **Workflow package format coupling** — if agentic-workflow's `.claude/agents/` Markdown shape changes substantively, Specorator's package adapter (not the runtime) absorbs the change. Mitigated by versioning the package format. (NEW in v0.3)
- **Permission semantics drift** — if the agent definition Markdown evolves new permission concepts, the runtime's Permission Guard must follow. Mitigated by declaring the runtime's understood permission shape in the package contract. (NEW in v0.3)
- **Untrusted scripts** — scripts run as plain Node in V1. If the runtime is ever embedded in a multi-user / untrusted context, sandboxing becomes mandatory (Phase 3+ concern). (NEW in v0.3)
- **API minimalism vs power** — `runCommand` may be too coarse for some UI flows. Mitigated by also exposing `submitTask` / `runScript` / `spawnAgent` for finer control. (NEW in v0.3)

---

## 15. Delivery Plan (V1)

| Phase | Contents |
|---|---|
| **Phase 1 — Core Skeleton** | Runtime kernel; embedded sim-ecs world; event bus; session model; capability ports (LLM, VaultRead, Logger, Clock, Rng); workflow package loader (validation + registration); `PromptEngineer` + `LlmGateway` (LLM gatekeeper); stub agent adapter; `startSession` / `stopSession` / `submitTask` / `getSession` |
| **Phase 2 — Live Agents and Commands** | Agentonomous adapter; full task→claim→tick→artifact pipeline; dynamic `spawnAgent` / `stopAgent`; `runCommand` / `runScript` with declared dispatch; permission enforcement; `task.failed` / `agent.failed` / `command.failed` lifecycles |
| **Phase 3 — Observability** | Trace correlation, structured logging, `kernel.replay(eventLog)`, listener-leak tripwire, session snapshot ergonomics |
| **Phase 4 — Integration** | Connect to Specorator v2.0 and the actual agentic-workflow package adapter Specorator ships; production-ready capability adapters; co-test the full ecosystem story |

---

## 16. Open Questions

(most resolved by issue #14; remaining open items below)

Resolved by #14:

- ~~Should sessions be persisted to filesystem?~~ → No; agentonomous handles its own snapshots
- ~~How to model retries and failures?~~ → Reserved event types; Phase 2
- ~~How to represent artifacts (files vs memory)?~~ → In-memory proposals only; Specorator owns vault writes
- ~~How strict should event schemas be?~~ → Typed discriminated union; JSON-round-trippable
- ~~Module format?~~ → ESM-only
- ~~Error handling strategy?~~ → `Result` at the public API boundary; shape parity with agentonomous + Specorator #104
- ~~How does the runtime consume agentic-workflow?~~ → As a typed Workflow Package (operational definitions + scripts + reference content), assembled by the consumer (Specorator) and passed in at `startSession`
- ~~How are permissions modelled?~~ → Read from the existing agent definition Markdown; runtime enforces at dispatch
- ~~Are scripts sandboxed?~~ → No (V1); trusted Node execution; sandbox is a Phase 3+ concern

Still open (handed to Workshop A in #4):

- **Orchestrator Engine scope** — the runtime owns and manages an orchestrator engine that tracks tasks across all running autonomous agents. Spec in design; will be attached as a dedicated issue. The orchestrator engine is in-scope for the runtime but not yet specified.
- The exact typed shape of the Workflow Package contract (agent definition fields, command dispatch declaration, etc.)
- Cross-bus `traceId` lifecycle confirmation
- Complete capability surface for V1 — LLM / VaultRead / Logger / Clock / Rng sufficient, or does the agentic-workflow content imply we need more (e.g., `WebFetch`, `MemoryStore`)?
- agentic-workflow major version pinning for V1
- sim-ecs version coordination with agentonomous
- Cockpit subscription pattern — one bus or two?

---

## Critical North Star

This runtime must remain a **knowledge-work agent habitat** — not a generic agent platform, not a workflow engine. Every decision should be tested against:

> "Does this help humans understand, control, and evolve knowledge work?"

---

## Path Forward

| Step | Issue | What it produces |
|---|---|---|
| 1 | #2 — Product presence | README, VISION.md, product page (delivered in PR #13) |
| 2 | #3 — Project environment | Repo skeleton, CI, labels, workshops |
| 2b | **#14 — Architecture proposal** | Detailed agent-habitat architecture (v0.7 reflects v0.3 PRD) |
| 3 | #4 — Design consolidation | Solution proposal, enriched docs, ADRs (consumes #14) |
| 4 | #5 — Baseline v0.0.1 | Frozen pre-engineering document release |
| 5 | #6 — Formal initiation | p3.express Group A artifacts, v0.0.2 |

Steps 1, 2, 2b can run in parallel. Steps 3–5 are sequential.

---

## Changelog

- **v0.4** (2026-05-10): Added §5.7 Prompt Engineering and LLM Gatekeeping — runtime is the sole gatekeeper for all outbound LLM calls; `PromptEngineer` forms `LLMRequest`, emits `llm.requested` / `llm.responded`, exposes `LlmGateway` to agents. Added §5.8 Request / Response as the Public Contract mental model. Added FR-12 (Prompt Engineering and LLM Gatekeeping). Added `Prompt Engineer` row to §9 Architecture Overview. Updated §5.2 event taxonomy to include `llm.*`. Added `PromptEngineer` + `LlmGateway` to Phase 1 in §15. Added Orchestrator Engine open question to §16. Meta: version 0.4, updated 2026-05-10.
- **v0.3** (2026-05-04): Added the **Workflow Package** concept: agentic-workflow content is loaded as an operational package (agent / skill / command definitions + TS scripts + reference content), not just consulted as reference. Added FR-09 (package loading), FR-10 (command dispatch), FR-11 (script execution). Expanded FR-07 Runtime API with `runCommand`, `runScript`, `spawnAgent`, `stopAgent`. Added §9 components: Workflow Package Loader, Command Registry, Script Runner, Permission Guard. Added §10.2 Command flow and §10.3 Direct script flow. Permissions clarified: read from existing agent definition Markdown (no new DSL). Scripts trusted in V1 (no sandbox). New design principle codified in the meta block: *abstract complexity from the consumer*. Acceptance criteria expanded to include public-API surface review.
- **v0.2** (2026-05-04): Reframed runtime as **agent habitat**, not workflow interpreter. Reshaped §1, §3, §4, §5, §6, §9, §10. Added SR-06 cross-package shape parity. Detailed in [issue #14](https://github.com/Luis85/specorator-runtime/issues/14).
- **v0.1** (2026-05-03): Initial PRD.

| Component | Responsibility |
| --- | --- |
| Runtime Kernel | The verb-bearing public façade — `startSession`, `submitTask`, `runCommand`, `runScript`, `spawnAgent`, `stopAgent`, `stopSession`, `getSession` |
| Workflow Package Loader | Validates and registers agent / skill / command / script definitions and reference content from a supplied package |
| Command Registry | Resolves a command id to its declared dispatch (script + agent role + inputs) and orchestrates execution |
| Script Runner | Invokes scripts from the loaded package as plain Node modules; captures structured results |
| Agent Adapter | Bridges agentonomous (and other) agents into ECS entity bundles with declared skill/permission components; runs `agent.tick()` autonomously and `agent.receive()` on task claim |
| Permission Guard | Enforces declared per-agent permissions at command/skill/script dispatch |
| Simulation Kernel | sim-ecs-backed per-tick scheduler; runs `intake → claim → invoke → capture → emit` stages; not exposed in public API |
| Prompt Engineer | Intercepts every outbound LLM call; forms a typed `LLMRequest`; emits `llm.requested` / `llm.responded`; wraps raw `LlmCapability` as `LlmGateway` exposed to agents |
| Capability Provider | Hosts capability ports (LLM, VaultRead, Logger, Clock, Rng, ...) as world-global resources |
| Task Inbox | Holds received task requests until claimed |
| Event Bus | Pub/sub for out-of-world observers; trace-correlated with Specorator's bus and each agent's bus |
| State Snapshot | JSON-friendly read of the world for `getSession` |
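
The Prompt Engineer row above describes a gatekeeper pattern: agents never hold the raw LLM capability, only a wrapper that forms a typed request and emits events around the call. A minimal sketch of that wrapping follows. The field names on `LLMRequest`, the function-shaped `LlmCapability`, the `LlmEvent` type, and the `createLlmGateway` helper are all assumptions made for illustration; the PRD names these concepts but does not fix their shapes here.

```ts
// Hypothetical shapes: the PRD names LLMRequest, LlmCapability, and LlmGateway,
// but their fields are not specified in this section, so these are assumed.
interface LLMRequest {
  requestId: string;
  agentId: string;
  prompt: string;
}

// Assumed: the raw capability is a single async function supplied by the host.
type LlmCapability = (request: LLMRequest) => Promise<string>;

// Assumed event payload; only the two event names come from the PRD.
interface LlmEvent {
  type: "llm.requested" | "llm.responded";
  requestId: string;
  agentId: string;
}

// The only LLM surface an agent would see in this sketch.
interface LlmGateway {
  complete(agentId: string, prompt: string): Promise<string>;
}

// Wraps the raw capability so every outbound call is typed and observable.
function createLlmGateway(
  raw: LlmCapability,
  emit: (event: LlmEvent) => void,
): LlmGateway {
  let nextId = 0;
  return {
    async complete(agentId, prompt) {
      const request: LLMRequest = {
        requestId: `llm-${++nextId}`,
        agentId,
        prompt,
      };
      emit({ type: "llm.requested", requestId: request.requestId, agentId });
      const text = await raw(request);
      emit({ type: "llm.responded", requestId: request.requestId, agentId });
      return text;
    },
  };
}
```

In this sketch the `emit` callback stands in for the Event Bus, so an out-of-world observer could correlate each `llm.requested` with its `llm.responded` through the shared `requestId`. Error handling (for example, an event for a failed call) is left out because the PRD does not name one in this section.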

| Phase | Contents |
| --- | --- |
| Phase 1 — Core Skeleton | Runtime kernel; embedded sim-ecs world; event bus; session model; capability ports (LLM, VaultRead, Logger, Clock, Rng); workflow package loader (validation + registration); `PromptEngineer` + `LlmGateway` (LLM gatekeeper); stub agent adapter; `startSession` / `stopSession` / `submitTask` / `getSession` |
| Phase 2 — Live Agents and Commands | Agentonomous adapter; full task→claim→tick→artifact pipeline; dynamic `spawnAgent` / `stopAgent`; `runCommand` / `runScript` with declared dispatch; permission enforcement; `task.failed` / `agent.failed` / `command.failed` lifecycles |
| Phase 3 — Observability | Trace correlation, structured logging, `kernel.replay(eventLog)`, listener-leak tripwire, session snapshot ergonomics |
| Phase 4 — Integration | Connect to Specorator v2.0 and the actual agentic-workflow package adapter Specorator ships; production-ready capability adapters; co-test the full ecosystem story |
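
Phase 2 pairs `runCommand` / `runScript` with permission enforcement at dispatch. One way that check might look is sketched below, assuming the permissions parsed from an agent definition reduce to a set of allowed command, skill, or script ids. `AgentPermissions`, `PermissionDeniedError`, and `guardDispatch` are hypothetical names for illustration, not part of the specified API.

```ts
// Hypothetical parsed form of the permissions declared in an agent definition.
interface AgentPermissions {
  agentId: string;
  allowed: ReadonlySet<string>; // ids of commands, skills, or scripts
}

// Assumed error type; the PRD does not name one in this section.
class PermissionDeniedError extends Error {
  readonly agentId: string;
  readonly targetId: string;

  constructor(agentId: string, targetId: string) {
    super(`agent "${agentId}" may not dispatch "${targetId}"`);
    this.name = "PermissionDeniedError";
    this.agentId = agentId;
    this.targetId = targetId;
  }
}

// Runs `dispatch` only if the agent's declared permissions include `targetId`;
// otherwise throws before any work is done.
function guardDispatch<T>(
  permissions: AgentPermissions,
  targetId: string,
  dispatch: () => T,
): T {
  if (!permissions.allowed.has(targetId)) {
    throw new PermissionDeniedError(permissions.agentId, targetId);
  }
  return dispatch();
}
```

Checking before invoking keeps a denied dispatch free of side effects, and a caller could translate the thrown error into whichever failure event the runtime emits for that path.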

| Step | Issue | What it produces |
| --- | --- | --- |
| 1 | #2 — Product presence | README, VISION.md, product page (delivered in PR #13) |
| 2 | #3 — Project environment | Repo skeleton, CI, labels, workshops |
| 2b | #14 — Architecture proposal | Detailed agent-habitat architecture (v0.7 reflects v0.3 PRD) |
| 3 | #4 — Design consolidation | Solution proposal, enriched docs, ADRs (consumes #14) |
| 4 | #5 — Baseline v0.0.1 | Frozen pre-engineering document release |
| 5 | #6 — Formal initiation | p3.express Group A artifacts, v0.0.2 |
