Phase 0 · PRD — Specorator Runtime #1

@Luis85

Meta

domain: Specorator Runtime
type: ProductRequirementsDocument
stage: revised
version: 0.4
maturity: L1
created: 2026-05-03
updated: 2026-05-10
owner: Product Owner
related_products:
  - Specorator (UI)
  - agentic-workflow
  - agentonomous
  - sim-ecs (third-party simulation kernel, internal-only)
related_issues:
  - "#14 Architecture proposal v1 — agent habitat + sim-ecs + workflow package"
design_principle: "Abstract complexity from the consumer. The runtime exposes a small, easy-to-use surface; ECS, definition parsing, permission enforcement, agent lifecycle, and trace correlation are all internal."

This is the kick-off document for the project. Read it first. Everything else derives from it.


v0.3 reframe note (read first)

v0.1 modelled the runtime as a workflow interpreter that resolves a DAG.
v0.2 reframed it as an agent habitat that hosts long-lived agents.
v0.3 adds the missing piece: agentic-workflow content is fed into the runtime as an operational workflow package — agent definitions, skill definitions, command definitions, and TS scripts — that the runtime loads, instantiates, and exposes through a small set of consumer-facing verbs. The runtime is the easy-to-use surface; complexity is internal.

The headline consumer experience:

const kernel = new RuntimeKernel({ logger });

const session = await kernel.startSession({
  workflow: agenticWorkflowPackage,      // loaded from the user's vault
  capabilities: { llm, vaultRead },      // Specorator-provided
});

kernel.bus.subscribe('*', (event) => cockpitStore.append(event));

// One-call high-level UX:
await kernel.runCommand(session.id, '/spec:requirements', { feature: 'X' });

// Or finer-grained when the UI needs it:
const agent = await kernel.spawnAgent(session.id, { definitionId: 'analyst' });
await kernel.submitTask(session.id, { prompt: '...', toAgent: agent.id });
await kernel.runScript(session.id, 'check-traceability', { feature: 'X' });

await kernel.stopSession(session.id);

Architectural detail lives in issue #14. Sections changed materially in v0.3: §1, §3, §4, §5 (added §5.6 Workflow Package), §6 (new FR-09 / FR-10 / FR-11; expanded FR-07 Runtime API), §9, §10, §13, §14, §15, §16. Sections unchanged from v0.2: §7 NFRs (still hold), §8 SRs (with SR-06 from v0.2). North Star unchanged.


1. Executive Summary

The Specorator Runtime is the missing agent habitat in the Specorator ecosystem. It hosts a team of long-lived agentonomous agents, loads an operational workflow package sourced from agentic-workflow (agents, skills, commands, scripts), and exposes a small, easy-to-use API for the Specorator UI:

  • start a session with a workflow package and a capability bundle
  • submit task requests, run commands, run scripts, spawn or stop agents
  • subscribe to a typed event stream

Internally the runtime instantiates agents from definitions, enforces declared per-agent permissions, drives a deterministic per-tick simulation (sim-ecs), routes commands to the right script and agent, captures outputs as proposed artifacts, and emits trace-correlated events. All of that is hidden from the consumer.

The Runtime turns the ecosystem from static methodology + manual agent use into a stateful, observable agent habitat where humans stay in control of every output.


2. Problem Statement

Currently:

  • The agentic-workflow methodology is documented as Markdown templates and TS scripts, but there is no runtime that can load it as an operational package and bring agents to life from its definitions
  • Agents (agentonomous) are invokable but lack a habitat with shared capabilities, dynamic spawning, declared permissions, and a unified event stream
  • The agentic-workflow scripts (validation, status checks, automation) are runnable manually but cannot be invoked uniformly from a UI alongside agent tasks
  • There is no cohesive "command" abstraction that ties a slash command, a script, and an agent task into one observable run
  • There is no observability layer the Obsidian companion UI can subscribe to in real time

Without the runtime, the ecosystem cannot present itself as a continuous, observable, multi-agent collaboration environment driven by a single workflow package.


3. Goals

Primary Goals

  • Load workflow packages — accept an agentic-workflow-shaped package (agent / skill / command definitions + TS scripts + reference content) and make all of it operational
  • Host a session-scoped agent habitat — one simulation world per session, with dynamic agent spawning and stopping
  • Provide a small, easy-to-use API: startSession, runCommand, submitTask, runScript, spawnAgent, stopAgent, stopSession, plus event subscription
  • Enforce per-agent permissions declared in the loaded definitions — agents can only invoke skills / commands / scripts they are allowed to use
  • Provide a capability bundle (LLM, vault read, logger, clock, RNG, more as needed) supplied at session start
  • Expose a typed event stream so the Specorator UI can render live state without polling
  • Stay HITL-safe — runtime emits artifact proposals; never writes to the vault

Secondary Goals

  • Replayability via the append-only event log
  • Cross-bus traceability via shared traceId (UI ↔ runtime ↔ each agent)
  • Extensibility via additive ports (new capabilities, new agent adapters)

4. Non-Goals (V1)

  • Distributed execution
  • Multi-user collaboration
  • Cloud-native scaling
  • Persistence of session state (agentonomous handles its own agent-level snapshots)
  • Complex scheduling strategies
  • Workflow interpretation as a DAG — the runtime does not parse specs// into an execution graph
  • Methodology ownership — no types for stages, parallel tracks, immutability rules, EARS notation, traceability rules
  • Authoring workflow packages — the runtime loads a package; it does not generate or edit one
  • Replacing harness adapters — Claude Code / Codex / Copilot / editor-agents from agentic-workflow #91 remain valid execution paths
  • Vault writes — runtime emits artifact proposals; Specorator owns vault I/O after HITL acceptance
  • Script sandboxing — scripts run as plain Node code in V1; trust is the user's, not the runtime's, responsibility (deferred to Phase 3+)
  • Inventing a permission DSL — permissions are read from the workflow package's existing definition format
  • A UI, a CLI, or a server

5. Core Concepts

5.1 Session — the habitat run

A Session represents one habitat run: a simulation world hosting a configured agent population, a loaded workflow package, and a configured capability bundle.

Contains: session id and traceId, agent population (initial team + dynamically spawned), loaded workflow package, capability bundle, task inbox, command/script invocations in flight, artifact set, append-only event log.

Sessions are in-memory and ephemeral in V1.

5.2 Event-Driven Coordination

The system surfaces every habitat lifecycle change as a typed event on the runtime bus.

V1 event taxonomy: session.*, task.*, agent.*, artifact.*, command.*, script.*, llm.*. Examples:

  • session.started, session.idle, session.stopped
  • task.received, task.claimed, task.completed
  • agent.spawned, agent.invoked, agent.completed, agent.stopped
  • command.invoked, command.completed
  • script.invoked, script.completed
  • artifact.created
  • llm.requested, llm.responded — emitted by the PromptEngineer for every outbound LLM call

Three buses, one trace invariant unchanged: the runtime bus, Specorator's plugin bus, and each agent's internal bus all share a traceId.
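
The taxonomy above could be typed as a discriminated union, which is what §14 and §16 refer to. The following is a minimal sketch only: the `type` strings come from the list above, while `EventBase`, the payload field names, and `describeEvent` are assumptions made for illustration, not the runtime's actual schema.

```typescript
// Hypothetical event shapes. Only the `type` strings are taken from the PRD.
interface EventBase {
  seq: number;       // monotonic within a session
  sessionId: string;
  traceId: string;   // shared across the runtime bus, plugin bus, and agent buses
}

type RuntimeEvent =
  | (EventBase & { type: 'session.started' })
  | (EventBase & { type: 'task.received'; taskId: string })
  | (EventBase & { type: 'agent.spawned'; agentId: string })
  | (EventBase & { type: 'artifact.created'; artifactId: string; taskId: string })
  | (EventBase & { type: 'llm.requested'; agentId: string; model: string });

// Switching on `type` narrows the union, so each branch sees only its own payload.
function describeEvent(event: RuntimeEvent): string {
  switch (event.type) {
    case 'session.started':
      return `session ${event.sessionId} started`;
    case 'task.received':
      return `task ${event.taskId} received`;
    case 'agent.spawned':
      return `agent ${event.agentId} spawned`;
    case 'artifact.created':
      return `artifact ${event.artifactId} proposed for task ${event.taskId}`;
    case 'llm.requested':
      return `agent ${event.agentId} requested ${event.model}`;
  }
}
```

Because every member carries only plain fields, a union like this stays JSON-round-trippable, which the event log in FR-08 relies on.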

5.3 Task Lifecycle

Each Task Request is a UI-issued user request:

received → claimed → in-progress → completed | failed

Multiple tasks may be in flight across the agent team simultaneously.
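
The lifecycle can be read as a small state machine. The sketch below encodes it as a transition table; it assumes that `failed` is reachable only from `in-progress` (the PRD does not say whether a task can fail earlier), and the names `TRANSITIONS`, `canTransition`, and `advance` are illustrative.

```typescript
type TaskState = 'received' | 'claimed' | 'in-progress' | 'completed' | 'failed';

// Allowed forward transitions. `completed` and `failed` are terminal.
// Assumption: a task can only fail once it is in progress.
const TRANSITIONS: Record<TaskState, readonly TaskState[]> = {
  received: ['claimed'],
  claimed: ['in-progress'],
  'in-progress': ['completed', 'failed'],
  completed: [],
  failed: [],
};

function canTransition(from: TaskState, to: TaskState): boolean {
  return TRANSITIONS[from].indexOf(to) !== -1;
}

// Returns the new state, or throws when the move is not in the table.
function advance(from: TaskState, to: TaskState): TaskState {
  if (!canTransition(from, to)) {
    throw new Error(`illegal task transition: ${from} -> ${to}`);
  }
  return to;
}
```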

5.4 Agent Population and Capabilities

Agents are persistent entities in the session's simulation world. They:

  • are instantiated from definitions in the loaded workflow package (not invented by the runtime)
  • can be spawned and stopped dynamically during a session
  • declare what they're allowed to do (skills / commands / scripts) — runtime enforces
  • tick continuously, claim tasks they're suited to, and use capabilities to do their work

Capabilities (V1 set, supplied at session start, available to all agents):

  • LLM — provider port shape compatible with agentonomous's LlmProviderPort
  • VaultRead — read-only access to a permission-filtered vault context (Specorator-supplied)
  • Logger / WallClock / Rng — observability and determinism seams

The capability set is extensible — adding a new capability is additive (no breaking change).
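
To illustrate why WallClock and Rng are described as determinism seams, here is a minimal sketch of test doubles that make a session reproducible. It covers only the Logger / WallClock / Rng part of the bundle; the interface shapes and the helper names `fixedClock` and `seededRng` are assumptions, not the runtime's published ports.

```typescript
// Hypothetical port shapes for the observability and determinism seams.
interface Logger { info(message: string): void }
interface WallClock { now(): number }      // milliseconds
interface Rng { next(): number }           // value in [0, 1)

interface DeterminismSeams {
  logger: Logger;
  clock: WallClock;
  rng: Rng;
}

// A clock that starts at `start` and advances by `stepMs` on every read.
function fixedClock(start: number, stepMs: number): WallClock {
  let current = start - stepMs;
  return { now: () => (current += stepMs) };
}

// A small linear congruential generator: the same seed yields the same sequence.
function seededRng(seed: number): Rng {
  let state = seed >>> 0;
  return {
    next: () => {
      state = (Math.imul(state, 1664525) + 1013904223) >>> 0;
      return state / 0x100000000;
    },
  };
}

// A bundle like this could be supplied at session start in tests.
const testSeams: DeterminismSeams = {
  logger: { info: () => undefined },
  clock: fixedClock(0, 16),
  rng: seededRng(1),
};
```

Supplying such doubles instead of the real clock and random source is one way replayability (a secondary goal in §3) could be exercised in tests.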

5.5 Task Submission

Tasks enter the habitat through kernel.submitTask(sessionId, { prompt, inputs?, toAgent?, traceId? }). The runtime never generates tasks on its own; every task comes from the consumer.

5.6 Workflow Package (NEW in v0.3)

A Workflow Package is the operational content the consumer feeds into the runtime at session start. Its shape mirrors what agentic-workflow already ships:

  • Agent definitions — typed descriptors loaded from .claude/agents/*.md (declared role, allowed skills/commands/scripts, persona, prompt, etc.)
  • Skill definitions — loaded from .claude/skills/*.md
  • Command definitions — loaded from .claude/commands/*.md (slash commands like /spec:requirements; each declares its dispatch — which script to run, which agent role to engage, what inputs to collect)
  • Scripts — TS modules from scripts/ that the runtime can invoke on demand (validation, status checks, automation)
  • Reference content — templates (templates/), process docs (docs/), constitution (memory/constitution.md) — read-only material agents consult while working

The package is passed as data into the runtime; the runtime does not read the user's filesystem itself. Specorator (or any other consumer) is responsible for assembling the package from the user's vault and handing it to startSession. The runtime does not author, edit, or generate definitions — it consumes a package.

The Workflow Package format is owned by the runtime as a typed contract. Consumers (Specorator) ship adapters that translate between the on-disk shape (agentic-workflow Markdown + TS files) and the typed package the runtime expects. The runtime pins to a specific agentic-workflow major version through the package version field; bumps are coordinated.
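
The exact typed shape of the package is still an open question (§16), so the following is only a sketch of what the contract might look like, together with a major-version pin check. Every field name here is an assumption; the only things taken from the text are the five content categories and the existence of a version field.

```typescript
// Hypothetical package contract. Field names are illustrative only.
interface AgentDefinition {
  id: string;
  role: string;
  allowedSkills: string[];
  allowedCommands: string[];
  allowedScripts: string[];
}

interface CommandDefinition {
  id: string;          // e.g. '/spec:requirements'
  script?: string;     // script id to run, if the command declares one
  agentRole?: string;  // agent role to engage, if the command declares one
}

interface WorkflowPackage {
  version: string;                     // assumed to be a semver-style string
  agents: AgentDefinition[];
  skills: { id: string }[];
  commands: CommandDefinition[];
  scripts: { id: string }[];
  reference: Record<string, string>;   // path -> read-only content
}

// True when the package's major version matches the major the runtime pins to.
function isCompatible(pkg: WorkflowPackage, pinnedMajor: number): boolean {
  const major = parseInt(pkg.version, 10); // parses the leading integer of "1.4.0"
  return !isNaN(major) && major === pinnedMajor;
}
```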

5.7 Prompt Engineering and LLM Gatekeeping (NEW in v0.4)

The runtime acts as the sole gatekeeper for all outbound LLM calls. Agents never hold the raw LlmCapability; instead, a PromptEngineer component intercepts every LLM call, forms a typed LLMRequest from the agent / task context, and wraps the raw provider in an LlmGateway interface that agents receive. For each call, the gatekeeper performs these steps in order:

  • Forms a structured LLMRequest — messages array (system prompt, persona, history, task content), parameters (model, temperature, maxTokens), session / agent / task context, traceId
  • Emits llm.requested — the request is observable before it reaches the provider
  • Calls LlmCapability.complete() — the actual LLM call
  • Captures the LLMResponse — content, usage, model, duration
  • Emits llm.responded — the response is observable for tracing, cost attribution, and replay

Agents and scripts interact only through LlmGateway.complete(requestContext). The raw LlmCapability never leaves the runtime's internal component.
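
The five gatekeeper steps above can be sketched as a wrapper around the raw capability. This is an illustration under stated assumptions: the type and event names come from this section, but the field layouts, the `RequestContext` shape, the `createGateway` helper, and the default parameters (including the `example-model` placeholder) are hypothetical.

```typescript
interface LlmMessage { role: 'system' | 'user' | 'assistant'; content: string }
interface LlmUsage { inputTokens: number; outputTokens: number }

interface LLMRequest {
  sessionId: string; agentId: string; taskId: string; traceId: string;
  messages: LlmMessage[];
  parameters: { model: string; temperature: number; maxTokens: number };
}
interface LLMResponse { content: string; model: string; usage: LlmUsage; durationMs: number }

// Raw provider port. It stays inside the runtime and is never handed to agents.
interface LlmCapability {
  complete(request: LLMRequest): Promise<{ content: string; model: string; usage: LlmUsage }>;
}

// What agents and scripts receive instead (assumed context shape).
interface RequestContext {
  sessionId: string; agentId: string; taskId: string; traceId: string;
  persona: string; prompt: string;
}
interface LlmGateway { complete(context: RequestContext): Promise<LLMResponse> }

type LlmEvent =
  | { type: 'llm.requested'; request: LLMRequest }
  | { type: 'llm.responded'; traceId: string; response: LLMResponse };

function createGateway(
  raw: LlmCapability,
  emit: (event: LlmEvent) => void,
  now: () => number,
): LlmGateway {
  return {
    async complete(context) {
      // 1. Form the structured request from the agent / task context.
      const request: LLMRequest = {
        sessionId: context.sessionId, agentId: context.agentId,
        taskId: context.taskId, traceId: context.traceId,
        messages: [
          { role: 'system', content: context.persona },
          { role: 'user', content: context.prompt },
        ],
        parameters: { model: 'example-model', temperature: 0, maxTokens: 1024 }, // placeholder defaults
      };
      // 2. Make the request observable before it reaches the provider.
      emit({ type: 'llm.requested', request });
      // 3. Perform the actual call.
      const startedAt = now();
      const result = await raw.complete(request);
      // 4. Capture the response, adding the measured duration.
      const response: LLMResponse = { ...result, durationMs: now() - startedAt };
      // 5. Make the response observable.
      emit({ type: 'llm.responded', traceId: request.traceId, response });
      return response;
    },
  };
}
```

Passing `now` in, rather than reading the system clock directly, keeps the duration measurement compatible with the WallClock seam from §5.4.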

5.8 Request / Response as the Public Contract (NEW in v0.4)

The consumer's mental model is deliberately simple: every verb is a request; every observable output is a response.

  • submitTask → task.received → … → task.completed + artifact.created
  • runCommand → command.invoked → … → command.completed + optional artifact
  • runScript → script.invoked → script.completed + optional artifact
  • spawnAgent → agent.spawned

The internal complexity — ECS, PromptEngineer, permission guards, tick pipeline — is entirely opaque. A consumer only needs to know: pass a request in; listen for events and artifacts as the response.


6. Functional Requirements

FR-01: Session Management

  • System MUST allow creating sessions with a loaded workflow package, a capability bundle, and an optional initial agent team
  • System MUST allow stopping a session and verifying clean teardown (no leaked listeners)
  • System MUST allow querying current session state via getSession

FR-02: Task Submission and Inbox

  • System MUST accept task requests via submitTask(sessionId, taskRequest)
  • System MUST place each request in the session's inbox in received state
  • System MUST emit task.received events with monotonic seq ordering
  • System MUST tolerate any number of concurrent in-flight tasks
  • System MUST support optional toAgent targeting (skip claim matching, dispatch directly)

FR-03: Agent Hosting and Capabilities

  • System MUST host agents as long-lived entities for the session's lifetime
  • System MUST drive each agent's tick pipeline on a configurable cadence
  • System MUST match received tasks to agents based on declared agent skills (when no toAgent is specified)
  • System MUST pass typed AgentContext containing inputs and capability handles to claiming agents
  • System MUST capture agent outputs as Artifact proposals
  • System MUST NOT write artifacts to any persistent store
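
How a task is matched to an agent is not specified here beyond "declared agent skills". The sketch below shows one possible reading, combined with the toAgent bypass from FR-02. It assumes each task carries a single required skill and that busy agents are skipped; both assumptions, and all names, are illustrative.

```typescript
interface HostedAgent { id: string; skills: readonly string[]; busy: boolean }

// Assumption: a task names one required skill. The PRD does not define this field.
interface InboxTask { id: string; skill: string; toAgent?: string }

// Returns the id of the agent that should claim the task, or undefined if none can.
function selectAgent(task: InboxTask, agents: readonly HostedAgent[]): string | undefined {
  if (task.toAgent !== undefined) {
    // Direct targeting skips claim matching, but the agent must exist in the session.
    const target = task.toAgent;
    return agents.some((agent) => agent.id === target) ? target : undefined;
  }
  for (const agent of agents) {
    if (!agent.busy && agent.skills.indexOf(task.skill) !== -1) {
      return agent.id; // first idle agent that declares the skill
    }
  }
  return undefined; // task stays in the inbox in `received` state
}
```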

FR-04: Event System

  • System MUST emit typed events for all habitat lifecycle changes
  • System MUST allow subscription via '*' or by type
  • System MUST guarantee event ordering within a session via monotonic seq
  • System MUST isolate per-listener errors in both sync and async dispatch
  • System MUST propagate traceId on every event
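
A minimal sketch of a bus that satisfies the requirements above: wildcard and per-type subscription, a monotonic `seq`, and per-listener error isolation for both synchronous throws and rejected promises. The class and member names (`EventBus`, `listenerErrors`) are assumptions for illustration; the real bus is exposed as `kernel.bus`.

```typescript
interface BusEvent { type: string; seq: number; traceId: string }
type Listener = (event: BusEvent) => void | Promise<void>;

class EventBus {
  private lastSeq = 0;
  private readonly listeners = new Map<string, Set<Listener>>();
  // Errors raised by listeners are collected here instead of propagating to the emitter.
  readonly listenerErrors: unknown[] = [];

  // Subscribe by exact type or with '*'. The returned function unsubscribes.
  subscribe(type: string, listener: Listener): () => void {
    const set = this.listeners.get(type) ?? new Set<Listener>();
    set.add(listener);
    this.listeners.set(type, set);
    return () => { set.delete(listener); };
  }

  emit(type: string, traceId: string): BusEvent {
    const event: BusEvent = { type, seq: ++this.lastSeq, traceId };
    const targets = [...this.listenersFor(type), ...this.listenersFor('*')];
    for (const listener of targets) {
      try {
        const result = listener(event);
        if (result instanceof Promise) {
          // Async isolation: a rejected listener does not affect the others.
          result.catch((error) => { this.listenerErrors.push(error); });
        }
      } catch (error) {
        // Sync isolation: a throwing listener does not stop dispatch.
        this.listenerErrors.push(error);
      }
    }
    return event;
  }

  private listenersFor(type: string): Listener[] {
    const set = this.listeners.get(type);
    return set ? Array.from(set) : [];
  }
}
```

Returning an unsubscribe function from `subscribe` is one way to make the "no leaked listeners" check in FR-01 testable.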

FR-05: State Management

  • System MUST maintain runtime state in-memory only (V1)
  • System MUST expose state as a JSON-friendly snapshot via getSession
  • System MUST track sessions, agents, tasks, command/script invocations, artifacts, and the event log

FR-06: Simulation Kernel

  • System MUST execute a deterministic per-tick pipeline driven by an embedded ECS kernel (sim-ecs)
  • System MUST support a step-driven mode for tests and a continuous-loop mode for live sessions
  • System MUST NOT expose ECS primitives in the public API surface
  • System MUST keep per-tick stages stable and documented
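
The step-driven mode can be illustrated without any ECS library. The sketch below uses the stage names from §9 (intake, claim, invoke, capture, emit) and runs registered systems in that fixed order on each `step()`. It does not use or imitate the sim-ecs API; `TickPipeline`, `World`, and `System` are hypothetical names.

```typescript
type Stage = 'intake' | 'claim' | 'invoke' | 'capture' | 'emit';

// Stage order is fixed, which is what makes a tick deterministic.
const STAGES: readonly Stage[] = ['intake', 'claim', 'invoke', 'capture', 'emit'];

interface World { tick: number; trace: string[] }
type System = (world: World) => void;

class TickPipeline {
  private readonly systems = new Map<Stage, System[]>();
  private readonly world: World;

  constructor(world: World) {
    this.world = world;
  }

  register(stage: Stage, system: System): void {
    const list = this.systems.get(stage) ?? [];
    list.push(system);
    this.systems.set(stage, list);
  }

  // Step-driven mode: run every stage once, in order, then advance the tick counter.
  // A continuous-loop mode would call step() on a configurable cadence.
  step(): void {
    for (const stage of STAGES) {
      for (const system of this.systems.get(stage) ?? []) {
        system(this.world);
      }
    }
    this.world.tick += 1;
  }
}
```

Registration order across stages does not matter in this sketch; only the stage a system is attached to decides when it runs within a tick.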

FR-07: Runtime API

  • System MUST expose at minimum:
    • startSession(opts) / stopSession(id) / getSession(id)
    • submitTask(id, taskRequest)
    • runCommand(id, commandId, args) — high-level abstraction
    • runScript(id, scriptId, args) — finer-grained
    • spawnAgent(id, agentSpec) / stopAgent(id, agentId)
    • event subscription via the bus
  • System MUST keep all public types re-exported from a single barrel (SR-03)
  • System MUST hide ECS primitives, definition parsing, permission enforcement, and lifecycle internals from the public surface

FR-08: Observability

  • System MUST provide an append-only, JSON-round-trippable event log per session
  • System MUST expose task/agent/command/script states and artifact captures in real time
  • System MUST enable a UI consumer to render the entire habitat without polling

FR-09: Workflow Package Loading (NEW in v0.3)

  • System MUST accept a workflow package as typed input at startSession
  • System MUST parse declared agents, skills, commands, scripts, and reference content
  • System MUST validate the package on load and surface validation errors as a typed Result
  • System MUST NOT read the user's filesystem directly — the consumer supplies the package as data
  • System MUST NOT modify or generate package contents
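
As an illustration of "validate on load and surface errors as a typed Result", here is a minimal sketch. The `Result` and `ValidationError` shapes, the reduced `PackageInput` type, and the two checks performed (duplicate agent ids, references to unknown commands) are all assumptions; the real contract is still open (§16).

```typescript
type Result<T, E> = { ok: true; value: T } | { ok: false; error: E };

// Reduced, hypothetical view of a package: just enough to show cross-reference checks.
interface PackageInput {
  version: string;
  agents: { id: string; allowedCommands: string[] }[];
  commands: { id: string }[];
}

interface ValidationError { kind: 'ValidationError'; issues: string[] }

// Collects every issue rather than stopping at the first, then returns a typed Result.
// The input is never modified.
function validatePackage(pkg: PackageInput): Result<PackageInput, ValidationError> {
  const issues: string[] = [];
  const commandIds = new Set(pkg.commands.map((command) => command.id));
  const seenAgents = new Set<string>();

  for (const agent of pkg.agents) {
    if (seenAgents.has(agent.id)) {
      issues.push(`duplicate agent id: ${agent.id}`);
    }
    seenAgents.add(agent.id);
    for (const commandId of agent.allowedCommands) {
      if (!commandIds.has(commandId)) {
        issues.push(`agent ${agent.id} references unknown command ${commandId}`);
      }
    }
  }

  return issues.length === 0
    ? { ok: true, value: pkg }
    : { ok: false, error: { kind: 'ValidationError', issues } };
}
```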

FR-10: Command Dispatch (NEW in v0.3)

  • System MUST expose runCommand(sessionId, commandId, args) as the high-level dispatch verb
  • System MUST resolve a command's declared handler from the loaded package (script, agent task, or both)
  • System MUST emit command.invoked and command.completed events bracketing the run, with results captured as artifacts where applicable
  • System MUST enforce per-agent permission when a command engages an agent — disallowed dispatch returns a typed PermissionDeniedError Result
  • System MUST allow the same command to be invoked concurrently with different args
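
The permission check at dispatch could look roughly like the following. `PermissionDeniedError` is named in this section, but its fields, the `AgentPermissions` shape, and the `authorizeCommand` helper are assumptions made for this sketch.

```typescript
type Result<T, E> = { ok: true; value: T } | { ok: false; error: E };

// Hypothetical error payload. Only the name comes from the PRD.
interface PermissionDeniedError {
  kind: 'PermissionDeniedError';
  agentId: string;
  commandId: string;
}

// Permissions as read from the agent definition in the loaded package.
interface AgentPermissions {
  agentId: string;
  allowedCommands: readonly string[];
}

// Returns a typed Result instead of throwing, matching the public API boundary style.
function authorizeCommand(
  agent: AgentPermissions,
  commandId: string,
): Result<{ agentId: string; commandId: string }, PermissionDeniedError> {
  if (agent.allowedCommands.indexOf(commandId) === -1) {
    return {
      ok: false,
      error: { kind: 'PermissionDeniedError', agentId: agent.agentId, commandId },
    };
  }
  return { ok: true, value: { agentId: agent.agentId, commandId } };
}
```

A caller would branch on `ok` before engaging the agent, so a denied dispatch never reaches the agent's inbox.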

FR-11: Script Execution (NEW in v0.3)

  • System MUST expose runScript(sessionId, scriptId, args) for finer-grained script invocation
  • System MUST execute scripts as plain Node modules in V1 — no sandbox
  • System MUST emit script.invoked and script.completed events with structured results
  • System MUST capture script output as an Artifact when the script declares it produces one
  • System MUST NOT itself write script results to the vault — output flows through events for HITL

FR-12: Prompt Engineering and LLM Gatekeeping (NEW in v0.4)

  • System MUST gate all outbound LLM calls through a PromptEngineer component
  • System MUST form a typed LLMRequest before every LLM call, capturing sessionId, agentId, taskId, messages, parameters, and traceId
  • System MUST capture a typed LLMResponse for every call, including content, model, usage, and durationMs
  • System MUST emit llm.requested and llm.responded events for every LLM interaction
  • System MUST expose LlmGateway (not raw LlmCapability) to agents and scripts
  • System MUST NOT allow agents or scripts to call the LLM provider directly

7. Non-Functional Requirements

(unchanged from v0.2; NFR-01 through NFR-06 still hold)


8. Solution Requirements

(unchanged from v0.2; SR-01 through SR-06 still hold. SR-06 Cross-package shape parity added in v0.2 is retained.)


9. Architecture Overview

| Component | Responsibility |
|---|---|
| Runtime Kernel | The verb-bearing public façade — startSession, submitTask, runCommand, runScript, spawnAgent, stopAgent, stopSession, getSession |
| Workflow Package Loader | Validates and registers agent / skill / command / script definitions and reference content from a supplied package |
| Command Registry | Resolves a command id to its declared dispatch (script + agent role + inputs) and orchestrates execution |
| Script Runner | Invokes scripts from the loaded package as plain Node modules; captures structured results |
| Agent Adapter | Bridges agentonomous (and other) agents into ECS entity bundles with declared skill/permission components; runs agent.tick() autonomously and agent.receive() on task claim |
| Permission Guard | Enforces declared per-agent permissions at command/skill/script dispatch |
| Simulation Kernel | sim-ecs-backed per-tick scheduler; runs intake → claim → invoke → capture → emit stages; not exposed in public API |
| Prompt Engineer | Intercepts every outbound LLM call; forms a typed LLMRequest; emits llm.requested / llm.responded; wraps raw LlmCapability as LlmGateway exposed to agents |
| Capability Provider | Hosts capability ports (LLM, VaultRead, Logger, Clock, Rng, ...) as world-global resources |
| Task Inbox | Holds received task requests until claimed |
| Event Bus | Pub/sub for out-of-world observers; trace-correlated with Specorator's bus and each agent's bus |
| State Snapshot | JSON-friendly read of the world for getSession |

v0.3 adds the Loader, Command Registry, Script Runner, and Permission Guard. v0.4 adds the Prompt Engineer. v0.2 retired the literal "Workflow Interpreter" component (no DAG to parse).


10. Example Flows

10.1 Task flow

UI: kernel.submitTask(sessionId, { prompt: "draft requirements for feature X", traceId })
  → task.received     (task entity created in the inbox)
    → task.claimed    (analyst agent matches the request to its declared skills)
      → agent.invoked
        → (agent uses LLM + reference content from the package)
        → agent.completed
          → artifact.created (proposal captured)
            → task.completed

10.2 Command flow (NEW in v0.3)

UI: kernel.runCommand(sessionId, '/spec:requirements', { feature: 'X' })
  → command.invoked   (command resolved from the loaded package)
    → script.invoked  (declared dispatch runs scripts/check-feature-readiness.ts)
    → script.completed
    → task.received   (declared dispatch creates an analyst-targeted task)
    → task.claimed
      → agent.invoked
        → agent.completed
          → artifact.created
            → task.completed
  → command.completed

10.3 Direct script flow

UI: kernel.runScript(sessionId, 'check-traceability', { feature: 'X' })
  → script.invoked
  → script.completed   (structured result on the event payload; optional Artifact)

In every flow, artifacts are proposals. Specorator (not the runtime) decides whether and how to apply them to the vault.
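
The command flow in 10.2 can be sketched as a small orchestration function that brackets the script and agent work with command.invoked and command.completed. This is a synchronous simplification for readability (the real verbs are awaited), and `CommandDispatch`, `DispatchDeps`, and `runCommandSketch` are hypothetical names.

```typescript
// Hypothetical declared dispatch for a command.
interface CommandDispatch {
  script?: string;     // script id to run first, if declared
  agentRole?: string;  // agent role to engage afterwards, if declared
}

interface DispatchDeps {
  commands: Map<string, CommandDispatch>;
  runScript: (scriptId: string) => void;  // expected to emit script.* events
  runAgentTask: (role: string) => void;   // expected to emit task.* / agent.* / artifact.* events
  emit: (type: string) => void;
}

// Returns false (and emits nothing) when the command is not in the loaded package.
function runCommandSketch(commandId: string, deps: DispatchDeps): boolean {
  const dispatch = deps.commands.get(commandId);
  if (dispatch === undefined) {
    return false;
  }
  deps.emit('command.invoked');
  if (dispatch.script !== undefined) {
    deps.runScript(dispatch.script);
  }
  if (dispatch.agentRole !== undefined) {
    deps.runAgentTask(dispatch.agentRole);
  }
  deps.emit('command.completed'); // closes the bracket after all declared work
  return true;
}
```

Nothing in this sketch writes anywhere: any artifact produced by the script or the agent task would surface only as an event for the consumer to accept or reject.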


11. Integration Points

Specorator (UI)

  • Loads the agentic-workflow package from the user's vault and assembles the typed Workflow Package for startSession
  • Calls runCommand / submitTask / runScript / spawnAgent / stopAgent from the cockpit UI
  • Subscribes to runtime events for cockpit rendering
  • Owns vault I/O after HITL acceptance
  • Supplies the VaultRead capability bridge (permission-filtered)

agentic-workflow

  • Source of the workflow package's contents — agent / skill / command definitions, scripts, templates, docs
  • Consumed as data through Specorator's package adapter
  • NOT executed by the runtime as a DAG

agentonomous

  • Source of agent shape, skills, ticking, persona, lifecycle, snapshot, LLM provider port
  • Hosted by the runtime as ECS entities via a thin adapter
  • Instantiated from agent definitions in the loaded workflow package

sim-ecs (third-party, internal-only)

  • Embedded simulation kernel
  • Not exposed in the public API surface
  • Version-coordinated with agentonomous's optional peer dep range

12. UX Entry Points

12.1 Start Session

User opens a workflow context in Specorator → Specorator assembles the Workflow Package and calls startSession

12.2 Run Command (high-level)

User invokes a slash command from the cockpit (e.g., /spec:requirements) → Specorator calls runCommand → runtime orchestrates script + agent task → events stream back

12.3 Submit Task / Run Script (finer-grained)

For scenarios that don't map to a single declared command, the UI uses submitTask (free-form prompt) or runScript (deterministic operation)

12.4 Spawn / Stop Agent

The user (or the cockpit, programmatically) brings agents into and out of the session as the work progresses

12.5 Inspect and Decide (HITL)

User reviews proposed artifacts → accepts (Specorator writes to vault), edits, refines (resubmits), or rejects

12.6 Inspect Results

User browses past sessions: artifacts, command/script logs, full execution trace


13. Acceptance Criteria

  • A session can be started with a loaded Workflow Package, capability bundle, and optional initial agent team
  • Tasks can be submitted via submitTask and reach task.received state
  • Commands can be invoked via runCommand and resolve to their declared dispatch (script + agent task)
  • Scripts can be invoked via runScript and emit script.completed with structured results
  • Agents can be spawned via spawnAgent and stopped via stopAgent mid-session
  • Per-agent permissions are enforced — disallowed dispatch returns PermissionDeniedError
  • Workflow package validation fails fast with a typed Result for malformed packages
  • Events are emitted for every lifecycle transition with monotonic seq and shared traceId
  • UI can subscribe and render execution
  • Runtime state is queryable
  • Per-listener error isolation holds in both sync and async dispatch
  • npm install specorator-runtime installs the package and its TypeScript types
  • import { RuntimeKernel } from 'specorator-runtime' works without deep or path-based imports
  • npm run verify passes with zero failures and zero any or ts-ignore in production code
  • All public API members have JSDoc documentation
  • All architectural decisions are recorded as ADRs following the agentic-workflow template
  • Cross-package shape parity (SR-06) verified via CI assignability test
  • Runtime never writes to the vault under any code path — invariant verified by test
  • Public API surface contains no ECS primitives, no definition-parser internals, no permission-enforcement internals — verified by surface review

14. Risks

(refreshed for v0.3)

  • Over-engineering too early
  • Tight coupling between runtime and UI
  • Unclear event schema (mitigated by typed discriminated union)
  • Uncontrolled async behavior (mitigated by per-listener error isolation)
  • npm module format incompatibility with Obsidian (mitigated — Vite handles ESM→CJS)
  • Public API surface instability
  • PRD §9 reinterpretation contested (carried from v0.2)
  • sim-ecs version drift with agentonomous (carried from v0.2)
  • HITL invariant violation (carried from v0.2)
  • Workflow package format coupling — if agentic-workflow's .claude/agents/ Markdown shape changes substantively, Specorator's package adapter (not the runtime) absorbs the change. Mitigated by versioning the package format. (NEW in v0.3)
  • Permission semantics drift — if the agent definition Markdown evolves new permission concepts, the runtime's Permission Guard must follow. Mitigated by declaring the runtime's understood permission shape in the package contract. (NEW in v0.3)
  • Untrusted scripts — scripts run as plain Node in V1. If the runtime is ever embedded in a multi-user / untrusted context, sandboxing becomes mandatory (Phase 3+ concern). (NEW in v0.3)
  • API minimalism vs power: runCommand may be too coarse for some UI flows. Mitigated by also exposing submitTask / runScript / spawnAgent for finer control. (NEW in v0.3)

15. Delivery Plan (V1)

| Phase | Contents |
|---|---|
| Phase 1 — Core Skeleton | Runtime kernel; embedded sim-ecs world; event bus; session model; capability ports (LLM, VaultRead, Logger, Clock, Rng); workflow package loader (validation + registration); PromptEngineer + LlmGateway (LLM gatekeeper); stub agent adapter; startSession / stopSession / submitTask / getSession |
| Phase 2 — Live Agents and Commands | Agentonomous adapter; full task→claim→tick→artifact pipeline; dynamic spawnAgent / stopAgent; runCommand / runScript with declared dispatch; permission enforcement; task.failed / agent.failed / command.failed lifecycles |
| Phase 3 — Observability | Trace correlation, structured logging, kernel.replay(eventLog), listener-leak tripwire, session snapshot ergonomics |
| Phase 4 — Integration | Connect to Specorator v2.0 and the actual agentic-workflow package adapter Specorator ships; production-ready capability adapters; co-test the full ecosystem story |

16. Open Questions

(most resolved by issue #14; remaining open items below)

Resolved by #14:

  • Should sessions be persisted to filesystem? → No; agentonomous handles its own snapshots
  • How to model retries and failures? → Reserved event types; Phase 2
  • How to represent artifacts (files vs memory)? → In-memory proposals only; Specorator owns vault writes
  • How strict should event schemas be? → Typed discriminated union; JSON-round-trippable
  • Module format? → ESM-only
  • Error handling strategy? → Result at the public API boundary; shape parity with agentonomous + Specorator #104
  • How does the runtime consume agentic-workflow? → As a typed Workflow Package (operational definitions + scripts + reference content), assembled by the consumer (Specorator) and passed in at startSession
  • How are permissions modelled? → Read from the existing agent definition Markdown; runtime enforces at dispatch
  • Are scripts sandboxed? → No (V1); trusted Node execution; sandbox is a Phase 3+ concern

Still open (handed to Workshop A in #4):

  • Orchestrator Engine scope — the runtime will own an orchestrator engine that tracks tasks across all running autonomous agents. It is in scope for the runtime but not yet specified; the spec is in design and will be attached as a dedicated issue.
  • The exact typed shape of the Workflow Package contract (agent definition fields, command dispatch declaration, etc.)
  • Cross-bus traceId lifecycle confirmation
  • Complete capability surface for V1 — LLM / VaultRead / Logger / Clock / Rng sufficient, or does the agentic-workflow content imply we need more (e.g., WebFetch, MemoryStore)?
  • agentic-workflow major version pinning for V1
  • sim-ecs version coordination with agentonomous
  • Cockpit subscription pattern — one bus or two?

Critical North Star

This runtime must remain a knowledge-work agent habitat — not a generic agent platform, not a workflow engine. Every decision should be tested against:

"Does this help humans understand, control, and evolve knowledge work?"


Path Forward

| Step | Issue | What it produces |
|---|---|---|
| 1 | #2 — Product presence | README, VISION.md, product page (delivered in PR #13) |
| 2 | #3 — Project environment | Repo skeleton, CI, labels, workshops |
| 2b | #14 — Architecture proposal | Detailed agent-habitat architecture (v0.7 reflects v0.3 PRD) |
| 3 | #4 — Design consolidation | Solution proposal, enriched docs, ADRs (consumes #14) |
| 4 | #5 — Baseline v0.0.1 | Frozen pre-engineering document release |
| 5 | #6 — Formal initiation | p3.express Group A artifacts, v0.0.2 |

Steps 1, 2, 2b can run in parallel. Steps 3–5 are sequential.


Changelog

  • v0.4 (2026-05-10): Added §5.7 Prompt Engineering and LLM Gatekeeping — runtime is the sole gatekeeper for all outbound LLM calls; PromptEngineer forms LLMRequest, emits llm.requested / llm.responded, exposes LlmGateway to agents. Added §5.8 Request / Response as the Public Contract mental model. Added FR-12 (Prompt Engineering and LLM Gatekeeping). Added Prompt Engineer row to §9 Architecture Overview. Updated §5.2 event taxonomy to include llm.*. Added PromptEngineer + LlmGateway to Phase 1 in §15. Added Orchestrator Engine open question to §16. Meta: version 0.4, updated 2026-05-10.
  • v0.3 (2026-05-04): Added the Workflow Package concept — agentic-workflow content is loaded as an operational package (agent / skill / command definitions + TS scripts + reference content), not just consulted as reference. Added FR-09 (package loading), FR-10 (command dispatch), FR-11 (script execution). Expanded FR-07 Runtime API with runCommand, runScript, spawnAgent, stopAgent. Added §9 components: Workflow Package Loader, Command Registry, Script Runner, Permission Guard. Added §10.2 Command flow and §10.3 Direct script flow. Permissions clarified: read from existing agent definition Markdown — no new DSL. Scripts trusted in V1 — no sandbox. New design principle codified in the meta block: abstract complexity from the consumer. Acceptance criteria expanded to include public-API surface review.
  • v0.2 (2026-05-04): Reframed runtime as agent habitat, not workflow interpreter. Reshaped §1, §3, §4, §5, §6, §9, §10. Added SR-06 cross-package shape parity. Detailed in issue #14.
  • v0.1 (2026-05-03): Initial PRD.
