cyrusagents · Connoropolous · May 15, 2026
diff --git a/CHANGELOG.internal.md b/CHANGELOG.internal.md
@@ -4,6 +4,13 @@ This changelog documents internal development changes, refactors, tooling update
 
 ## [Unreleased]
 
+### Added
+- Added `cyrus-agent-runtime`, a standalone experimental TypeScript package for unified agent session orchestration across harnesses and sandbox providers. It includes normalized session config, transcript envelopes, local and ComputeSDK-backed sandbox abstractions, harness adapters for Claude/Codex/Cursor/Gemini plus provisional PI/OpenCode adapters, and focused tests for config, runtime lifecycle, sandbox execution, and transcript parsing.
+- Added live process streaming to `cyrus-agent-runtime`. New optional `RunnerSandbox.streamCommand(command, options)` capability surfaces stdout/stderr chunks to callbacks as they arrive, with `signal: AbortSignal` for cancellation and `input: AsyncIterable<string>` for live stdin. Implemented natively in `LocalSandboxProvider` (via `child_process.spawn`) and for Daytona inside `ComputeSdkSandboxProvider` via a pluggable `NativeStreamAdapter` registry that reaches the underlying `@daytonaio/sdk` Sandbox through ComputeSDK's `ProviderSandbox.getInstance()` escape hatch, using async sessions + `getSessionCommandLogs(onStdout, onStderr)`. User-supplied adapters can be registered via `ComputeSdkSandboxProviderOptions.nativeStreamAdapters` for ComputeSDK providers we don't bundle (E2B, Vercel, Blaxel, Modal, Railway, Runloop, Cloudflare, Codesandbox). `RuntimeAgentSession.start()` now prefers `streamCommand` when `capabilities.streamingProcess` is true, line-buffers chunks across packet boundaries, and emits `TranscriptEvent`s live as the harness CLI produces them. New `CreateAgentSessionConfig.interactiveInput` opt-in flag routes `addMessage()` chunks into the running process's stdin (most one-shot CLIs hang on piped-but-never-closed stdin, so this defaults off). Verified end-to-end against real `codex exec` (events emitted ~8.6s before turn end), the local `child_process.spawn` path (chunks landed at the exact 400ms cadence the child produced them), and real Daytona Claude `stream-json` (system event landed 1.7s before result event over a remote sandbox).
+- Added `folders` and `repositories` to the `cyrus-agent-runtime` session config — two new materialization concepts that are deliberately distinct from existing `volumes`. `RuntimeFolderConfig` exposes a host filesystem folder inside the sandbox (walks the host tree, uploads each file via `SandboxFilesystem.writeFile`, supports an `exclude` glob list) and with `access: "readwrite"` syncs sandbox edits and any newly-created files back to the host folder after the harness command completes. `RuntimeRepositoryConfig` runs `git clone` inside the sandbox at `mountPath` with optional `branch` checkout and `depth` shallow-clone; local-path sources are converted to `file://...` to preserve git semantics, and shallow clones with a branch use `--branch` on the clone itself (since `git checkout` of a non-default branch fails on a shallow clone). Both emit lifecycle transcript events (`folder.materialize.started/completed/failed`, `folder.syncback.started/completed/failed`, `repository.materialize.started/completed/failed`) and run before the package setup commands so any setup that depends on the cloned tree or the mounted folder sees them ready.
+- Added `destroy()` to `AgentSessionResult` in `cyrus-agent-runtime`. It is equivalent to ComputeSDK's `ProviderSandbox.destroy()` for ComputeSDK-backed providers (deletes the remote sandbox, releases compute resources) and is a no-op for the local provider. It is idempotent, and it lets consumers hold only the result, read the events and result, and tear down without keeping a session reference.
+- Decoupled `AgentSession.stop()` from sandbox destruction. `stop()` now cancels the in-flight harness only — aborts the running process, closes the live event stream, closes the input pipe — and leaves the sandbox alive. Sandbox teardown is the sole responsibility of the new `destroy()` method, which exists symmetrically on both `AgentSession` and `AgentSessionResult` (sharing a one-shot internal teardown promise). `AgentSession.destroy()` also implicitly cancels an in-flight run via `stop()` before releasing the sandbox, so callers don't need a two-step. Decoupling enables future workflows that reuse a warm sandbox across runs (per CYPACK-1209) — a single run's `stop()` no longer destroys shared compute.
+
 ## [0.2.50] - 2026-04-30
 
 ### Added
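
Review note on the streaming entry above: the changelog says `RuntimeAgentSession.start()` line-buffers chunks across packet boundaries before emitting `TranscriptEvent`s. The package's own implementation is not part of this excerpt, so the following is only a minimal sketch of that technique. `createLineBuffer`, `push`, and `flush` are hypothetical names chosen for illustration.

```typescript
// Hypothetical helper: names are illustrative, not taken from cyrus-agent-runtime.
type LineHandler = (line: string) => void;

function createLineBuffer(onLine: LineHandler) {
  // Holds the tail of the stream that has not been closed by a newline yet.
  let pending = "";
  return {
    // Feed one raw stdout chunk; emits every complete line the chunk closes.
    push(chunk: string): void {
      pending += chunk;
      let newline = pending.indexOf("\n");
      while (newline !== -1) {
        const line = pending.slice(0, newline).replace(/\r$/, "");
        pending = pending.slice(newline + 1);
        if (line.length > 0) onLine(line);
        newline = pending.indexOf("\n");
      }
    },
    // Call once the process exits to emit a trailing line that has no newline.
    flush(): void {
      const line = pending.replace(/\r$/, "");
      pending = "";
      if (line.length > 0) onLine(line);
    },
  };
}

// Usage: three chunks that split JSON lines at arbitrary points.
const lines: string[] = [];
const buffer = createLineBuffer((line) => lines.push(line));
buffer.push('{"type":"sys');
buffer.push('tem"}\n{"type":"assistant"}\n{"type":"res');
buffer.push('ult"}');
buffer.flush();
console.log(lines);
```

Each emitted line can then be handed to the harness adapter's transcript parser, so a JSON object split across two chunks is only parsed once it is complete.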

diff --git a/package.json b/package.json
@@ -53,7 +53,7 @@
 			"qs": ">=6.14.2",
 			"vite": ">=7.1.11",
 			"zod": "4.3.6",
-			"hono": ">=4.12.7",
+			"hono": ">=4.12.18",
 			"@hono/node-server": ">=1.19.10",
 			"rollup": ">=4.59.0",
 			"flatted": ">=3.4.0",
@@ -67,7 +67,14 @@
 			"diff": ">=8.0.3",
 			"@tootallnate/once": ">=3.0.1",
 			"@isaacs/brace-expansion": ">=5.0.1",
-			"tar": ">=7.5.11"
+			"tar": ">=7.5.11",
+			"fast-uri": ">=3.1.2",
+			"ip-address": ">=10.1.1",
+			"@opentelemetry/sdk-node": ">=0.217.0",
+			"@opentelemetry/exporter-prometheus": ">=0.217.0",
+			"@opentelemetry/otlp-transformer>protobufjs": ">=8.0.2",
+			"@anthropic-ai/sdk": ">=0.91.1",
+			"@daytonaio/sdk": ">=0.175.0"
 		}
 	},
 	"lint-staged": {

diff --git a/packages/agent-runtime/ASSUMPTIONS.md b/packages/agent-runtime/ASSUMPTIONS.md
@@ -0,0 +1,44 @@
+# Agent Runtime Assumptions
+
+This package is intentionally built as a new standalone runtime layer with minimal dependency on the existing Cyrus runner packages.
+
+## Product Contract
+
+- The package exposes a TypeScript library API first. It does not ship a daemon or CLI in this iteration.
+- A session has one Cyrus-owned `sessionId`. Harness-native session identifiers are represented as transcript metadata when a harness emits them.
+- Transcript events preserve raw harness JSON whenever possible and wrap it in a stable runtime envelope.
+- `addMessage()` queues messages for harnesses that do not support interactive stdin yet. The queue is visible and testable, but delivery is capability-gated.
+- `interrupt()` is a soft user-message interruption when supported. `stop()` is lifecycle cancellation and attempts to terminate the running process.
+
+## Harness Contract
+
+- Claude, Codex, Cursor, Gemini, PI, and OpenCode are represented as harness adapters.
+- Claude, Codex, Cursor, and Gemini command-line conventions are modeled from locally available CLIs and existing public behavior.
+- PI and OpenCode are provisional adapters. Their commands and JSON formats are assumptions until real CLI transcripts are supplied.
+- Harness adapters own command construction and transcript parsing. They do not own sandbox provisioning.
+
+## Sandbox Contract
+
+- Local execution is modeled as a sandbox provider. This keeps local and remote execution behind the same conceptual interface.
+- ComputeSDK is the vendor abstraction for remote sandbox providers.
+- The common ComputeSDK `runCommand()` API is treated as sufficient for one-shot harness runs.
+- Streaming process execution is modeled as a capability, but is not assumed for every ComputeSDK provider. Full interactive harness support requires a provider-specific streaming process implementation.
+- Volumes, FUSE mounts, snapshots, ports, and network egress are represented in config types even when a provider cannot enforce them yet.
+- Daytona's ComputeSDK provider was smoke-tested with a remote working directory of `/home/daytona`; `/workspace` should not be assumed portable across providers.
+- Cursor Agent was smoke-tested inside Daytona by installing the CLI with `curl https://cursor.com/install -fsS | bash` and running `/home/daytona/.local/bin/cursor-agent` with `CURSOR_API_KEY` provided as a secret environment variable.
+- Codex Agent was smoke-tested inside Daytona far enough to authenticate and start a turn by materializing `~/.codex/auth.json` as a sensitive runtime file. Passing only `OPENAI_API_KEY` from the local Codex auth file produced a remote 401. The authenticated Codex turn later hit the account usage limit.
+- Claude Code was smoke-tested inside Daytona by installing the CLI with a user-local npm prefix and running `/home/daytona/.npm-global/bin/claude` with `CLAUDE_CODE_OAUTH_TOKEN` provided as a secret environment variable. The remote session emitted `system`/`assistant`/`result` events and completed successfully.
+
+## Security Contract
+
+- `env` is safe-to-log configuration. `secrets` must be redacted from transcript and error metadata.
+- Secrets are passed into process environments only at execution time.
+- Tool permissions are represented as declarative runtime config and translated into harness-native flags where currently known.
+- Network egress policy is a declarative provider option in this iteration. Enforcement depends on the selected sandbox provider.
+
+## Feedback Loops
+
+- Config schema tests prove the public contract accepts and rejects expected shapes.
+- Local sandbox tests prove the local provider can write files and execute commands.
+- Harness adapter tests prove command construction and transcript parsing.
+- Session runtime tests prove event emission, queueing, stop behavior, and result propagation.
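
Review note on the Security Contract above: it requires that `secrets` be redacted from transcript and error metadata. The diff excerpt does not show how the package does this, so the sketch below is only one plausible approach under that assumption. `redactSecrets` and the `[REDACTED]` placeholder are hypothetical.

```typescript
// Hypothetical helper; the package's real redaction logic is not shown in this diff.
function redactSecrets(
  text: string,
  secrets: Record<string, string | undefined>,
): string {
  // Mask longer values first so a secret that contains another is fully covered.
  const values = Object.values(secrets)
    .filter((value): value is string => typeof value === "string" && value.length > 0)
    .sort((a, b) => b.length - a.length);
  let redacted = text;
  for (const value of values) {
    // split/join replaces every occurrence without regex escaping concerns.
    redacted = redacted.split(value).join("[REDACTED]");
  }
  return redacted;
}

// Usage: an error message that accidentally echoes a secret value.
const secrets = { CURSOR_API_KEY: "sk-demo-123", UNSET: undefined };
const message = "spawn failed: CURSOR_API_KEY=sk-demo-123 rejected";
const safeMessage = redactSecrets(message, secrets);
console.log(safeMessage);
```

Matching on values rather than on key names means the secret is masked wherever it appears, including inside raw harness JSON that the runtime did not produce itself.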
diff --git a/packages/agent-runtime/VALIDATION.md b/packages/agent-runtime/VALIDATION.md
@@ -0,0 +1,223 @@
+# Agent Runtime Validation
+
+## Automated Checks
+
+Run from the repository root:
+
+```bash
+pnpm --filter cyrus-agent-runtime typecheck
+pnpm --filter cyrus-agent-runtime test:run
+pnpm --filter cyrus-agent-runtime build
+```
+
+Current coverage:
+
+- Harness command construction and transcript parsing for Claude, Codex, Cursor, Gemini, PI, and OpenCode.
+- Local sandbox filesystem and command execution.
+- ComputeSDK sandbox wrapper with fake provider.
+- Session lifecycle, queued messages, setup commands, transcript events, and result extraction.
+
+## Real Local Harness Smoke
+
+This validates `AgentRuntime`, the local sandbox provider, real `codex exec --json`, transcript event parsing, and result extraction.
+
+```bash
+node --input-type=module -e "
+  import { createAgentSession } from './packages/agent-runtime/dist/index.js';
+  const session = await createAgentSession({
+    sessionId: 'smoke-codex',
+    harness: { kind: 'codex', model: 'gpt-5.2' },
+    userPrompt: 'Reply exactly: runtime smoke ok',
+    sandbox: { provider: 'local', workingDirectory: process.cwd() }
+  });
+  const result = await session.start();
+  console.log(JSON.stringify({
+    success: result.success,
+    result: result.result,
+    eventCount: result.events.length
+  }));
+"
+```
+
+Observed result:
+
+```json
+{"success":true,"result":"runtime smoke ok","eventCount":4}
+```
+
+## Real Daytona Harness Smoke
+
+This validates the full remote path: `AgentRuntime`, real ComputeSDK Daytona provider, remote sandbox create/destroy, declarative setup commands inside the sandbox, remote Cursor Agent install, real `cursor-agent --print --output-format stream-json`, transcript events emitted by the agent session running inside Daytona, and result extraction.
+
+Prerequisites:
+
+- `DAYTONA_API_KEY` in the environment.
+- `CURSOR_API_KEY` in the environment.
+- The package has been built with `pnpm --filter cyrus-agent-runtime build`.
+
+Run from `packages/agent-runtime`:
+
+```bash
+node --input-type=module - <<'JS'
+import { daytona } from '@computesdk/daytona';
+import { createAgentSession } from './dist/index.js';
+import { createComputeSdkSandboxProvider } from './dist/sandbox/compute-sdk.js';
+
+const provider = createComputeSdkSandboxProvider({
+  compute: daytona({ apiKey: process.env.DAYTONA_API_KEY, timeout: 300000 }),
+});
+const transcriptKinds = [];
+const transcriptRawTypes = [];
+let sandboxToDestroy;
+const trackingProvider = {
+  provider: 'daytona',
+  async create(config) {
+    const sandbox = await provider.create(config);
+    sandboxToDestroy = sandbox;
+    return sandbox;
+  },
+};
+
+try {
+  const session = await createAgentSession(
+    {
+      sessionId: 'daytona-cursor-smoke',
+      harness: {
+        kind: 'cursor',
+        command: '/home/daytona/.local/bin/cursor-agent',
+      },
+      userPrompt: 'Reply exactly: daytona cursor event smoke ok',
+      env: {
+        PATH: '/home/daytona/.local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin',
+      },
+      secrets: {
+        CURSOR_API_KEY: process.env.CURSOR_API_KEY,
+      },
+      packages: {
+        commands: [
+          'curl https://cursor.com/install -fsS | bash',
+          '/home/daytona/.local/bin/cursor-agent --version',
+        ],
+      },
+      sandbox: {
+        provider: 'daytona',
+        name: `agent-runtime-cursor-${Date.now()}`,
+        workingDirectory: '/home/daytona',
+        timeoutMs: 300000,
+        metadata: { purpose: 'agent-runtime-cursor-event-smoke' },
+      },
+    },
+    {
+      sandboxProviders: { daytona: trackingProvider },
+      callbacks: {
+        onTranscriptEvent(event) {
+          transcriptKinds.push(event.kind);
+          if (event.raw && typeof event.raw === 'object' && 'type' in event.raw) {
+            transcriptRawTypes.push(event.raw.type);
+          }
+        },
+      },
+    },
+  );
+
+  const result = await session.start();
+  console.log(JSON.stringify({
+    success: result.success,
+    result: result.result,
+    eventCount: result.events.length,
+    transcriptKinds,
+    transcriptRawTypes,
+    sandboxId: sandboxToDestroy?.sandboxId,
+  }));
+} finally {
+  if (sandboxToDestroy) {
+    await sandboxToDestroy.destroy();
+  }
+}
+JS
+```
+
+Observed result:
+
+```json
+{
+  "success": true,
+  "result": "daytona cursor event smoke ok",
+  "eventCount": 8,
+  "transcriptKinds": [
+    "setup.started",
+    "setup.completed",
+    "setup.started",
+    "setup.completed",
+    "system",
+    "user",
+    "assistant",
+    "result"
+  ],
+  "transcriptRawTypes": ["system", "user", "assistant", "result"]
+}
+```
+
+## Real Daytona Codex Auth Probe
+
+Codex was validated inside Daytona through runtime-managed sensitive file materialization:
+
+- `~/.codex/auth.json` was written with `sensitive: true`, and transcript events redacted the content.
+- `@openai/codex` installed successfully inside Daytona.
+- `codex exec --json --skip-git-repo-check` emitted `thread.started` and `turn.started`.
+- Passing only `OPENAI_API_KEY` from local Codex auth produced a remote 401.
+- Using `~/.codex/auth.json` authenticated, but the turn hit the account usage limit before completion.
+
+Observed authenticated-but-limited result:
+
+```json
+{
+  "success": false,
+  "exitCode": 1,
+  "events": [
+    {
+      "kind": "error",
+      "raw": {
+        "type": "error",
+        "message": "You've hit your usage limit..."
+      }
+    },
+    {
+      "kind": "turn.failed"
+    }
+  ]
+}
+```
+
+## Real Daytona Claude Smoke
+
+Claude Code was validated inside Daytona with an explicit portable Claude Code OAuth token provided as a secret environment variable:
+
+- `@anthropic-ai/claude-code` installed successfully with a user-local npm prefix.
+- `claude --version` returned `2.1.142 (Claude Code)`.
+- `claude -p ... --output-format stream-json --verbose` emitted `system`, `assistant`, and `result` events inside Daytona.
+- The remote Claude session completed successfully with the exact requested result.
+
+Observed runtime result:
+
+```json
+{
+  "success": true,
+  "exitCode": 0,
+  "result": "daytona claude event smoke ok",
+  "eventCount": 9,
+  "eventKinds": [
+    "setup.started",
+    "setup.completed",
+    "setup.started",
+    "setup.completed",
+    "setup.started",
+    "setup.completed",
+    "system",
+    "assistant",
+    "result",
+    "stop.requested"
+  ],
+  "transcriptKinds": ["system", "assistant", "result"]
+}
+```
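
Review note on teardown: the Daytona smoke script in this file destroys the sandbox by hand in a `finally` block, while the changelog in this PR describes `destroy()` as idempotent and shared between `AgentSession` and `AgentSessionResult` through a one-shot internal teardown promise. A minimal sketch of that pattern, assuming nothing about the real internals, might look like this. `DestroyableSandbox` and `createTeardown` are hypothetical names.

```typescript
// Hypothetical shapes; only the behaviour described in the changelog is modelled.
interface DestroyableSandbox {
  destroy(): Promise<void>;
}

function createTeardown(sandbox: DestroyableSandbox): () => Promise<void> {
  let teardown: Promise<void> | undefined;
  // Every caller shares the first promise, so the sandbox is destroyed at most once.
  return () => {
    if (!teardown) {
      teardown = sandbox.destroy();
    }
    return teardown;
  };
}

// Usage with a fake sandbox that counts how often it is destroyed.
let destroyCalls = 0;
const fakeSandbox: DestroyableSandbox = {
  async destroy() {
    destroyCalls += 1;
  },
};

// A session handle and a result handle could both hold this same function.
const destroy = createTeardown(fakeSandbox);
const first = destroy();
const second = destroy();
console.log(destroyCalls, first === second);
```

Because both handles return the same promise, a caller that awaits `destroy()` on the result after the session has already torn down simply waits on the completed teardown instead of issuing a second delete.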
diff --git a/packages/agent-runtime/package.json b/packages/agent-runtime/package.json
@@ -0,0 +1,33 @@
+{
+	"name": "cyrus-agent-runtime",
+	"version": "0.2.51",
+	"description": "Unified agent harness runtime with pluggable sandbox providers",
+	"type": "module",
+	"main": "dist/index.js",
+	"types": "dist/index.d.ts",
+	"files": [
+		"dist",
+		"ASSUMPTIONS.md",
+		"VALIDATION.md"
+	],
+	"scripts": {
+		"build": "tsc",
+		"dev": "tsc --watch",
+		"test": "vitest",
+		"test:run": "vitest run --passWithNoTests",
+		"typecheck": "tsc --noEmit"
+	},
+	"dependencies": {
+		"@computesdk/daytona": "^1.7.26",
+		"computesdk": "^4.0.0",
+		"zod": "^4.3.6"
+	},
+	"devDependencies": {
+		"@types/node": "^20.0.0",
+		"typescript": "^5.3.3",
+		"vitest": "^3.1.4"
+	},
+	"publishConfig": {
+		"access": "public"
+	}
+}