cyrusagents · Connoropolous · May 15, 2026
diff --git a/CHANGELOG.internal.md b/CHANGELOG.internal.md
@@ -4,6 +4,19 @@ This changelog documents internal development changes, refactors, tooling update
 
 ## [Unreleased]
 
+### Added
+- Added `cyrus-agent-runtime`, a standalone experimental TypeScript package for unified agent session orchestration across harnesses and sandbox providers. It includes normalized session config, transcript envelopes, local and ComputeSDK-backed sandbox abstractions, harness adapters for Claude/Codex/Cursor/Gemini/OpenCode, and focused tests for config, runtime lifecycle, sandbox execution, and transcript parsing.
+- Added live process streaming to `cyrus-agent-runtime`. New optional `RunnerSandbox.streamCommand(command, options)` capability surfaces stdout/stderr chunks to callbacks as they arrive, with `signal: AbortSignal` for cancellation and `input: AsyncIterable<string>` for live stdin. Implemented natively in `LocalSandboxProvider` (via `child_process.spawn`) and for Daytona inside `ComputeSdkSandboxProvider` via a pluggable `NativeStreamAdapter` registry that reaches the underlying `@daytonaio/sdk` Sandbox through ComputeSDK's `ProviderSandbox.getInstance()` escape hatch, using async sessions + `getSessionCommandLogs(onStdout, onStderr)`. User-supplied adapters can be registered via `ComputeSdkSandboxProviderOptions.nativeStreamAdapters` for ComputeSDK providers we don't bundle (E2B, Vercel, Blaxel, Modal, Railway, Runloop, Cloudflare, Codesandbox). `RuntimeAgentSession.start()` now prefers `streamCommand` when `capabilities.streamingProcess` is true, line-buffers chunks across packet boundaries, and emits `TranscriptEvent`s live as the harness CLI produces them. New `CreateAgentSessionConfig.interactiveInput` opt-in flag routes `addMessage()` chunks into the running process's stdin (most one-shot CLIs hang on piped-but-never-closed stdin, so this defaults off). Verified end-to-end against real `codex exec` (events emitted ~8.6s before turn end), the local `child_process.spawn` path (chunks landed at the exact 400ms cadence the child produced them), and real Daytona Claude `stream-json` (system event landed 1.7s before result event over a remote sandbox).
+- Added `folders` and `repositories` to the `cyrus-agent-runtime` session config — two new materialization concepts that are deliberately distinct from existing `volumes`. `RuntimeFolderConfig` exposes a host filesystem folder inside the sandbox (walks the host tree, uploads each file via `SandboxFilesystem.writeFile`, supports an `exclude` glob list) and with `access: "readwrite"` syncs sandbox edits and any newly-created files back to the host folder after the harness command completes. `RuntimeRepositoryConfig` runs `git clone` inside the sandbox at `mountPath` with optional `branch` checkout and `depth` shallow-clone; local-path sources are converted to `file://...` to preserve git semantics, and shallow clones with a branch use `--branch` on the clone itself (since `git checkout` of a non-default branch fails on a shallow clone). Both emit lifecycle transcript events (`folder.materialize.started/completed/failed`, `folder.syncback.started/completed/failed`, `repository.materialize.started/completed/failed`) and run before the package setup commands so any setup that depends on the cloned tree or the mounted folder sees them ready.
+- Added `destroy()` to `AgentSessionResult` in `cyrus-agent-runtime`. It is equivalent to ComputeSDK's `ProviderSandbox.destroy()` for ComputeSDK-backed providers (deletes the remote sandbox, releases compute resources) and is a no-op for the local provider. It is idempotent, and it lets consumers hold only the result, consume the events/result, and tear down without keeping a session reference.
+- Decoupled `AgentSession.stop()` from sandbox destruction. `stop()` now cancels only the in-flight harness run (it aborts the running process, closes the live event stream, and closes the input pipe) and leaves the sandbox alive. Sandbox teardown is the sole responsibility of the new `destroy()` method, which exists symmetrically on both `AgentSession` and `AgentSessionResult` (sharing a one-shot internal teardown promise). `AgentSession.destroy()` also implicitly cancels an in-flight run via `stop()` before releasing the sandbox, so callers don't need a separate stop-then-destroy sequence. The decoupling enables future workflows that reuse a warm sandbox across runs (per CYPACK-1209): a single run's `stop()` no longer destroys shared compute.
+- Added session resume primitives to `cyrus-agent-runtime`. `CreateAgentSessionConfig.resumeHarnessSessionId` is caller-supplied — Claude adapter translates it into `--resume <id>`; Cursor adapter translates it into `--agent-id <id>` for `@cyrus-ai/cursor-runner`. `AgentSessionResult.harnessSessionId` is the new harness-native id observed in this run, captured by `HarnessAdapter.extractSessionId(events)` (implemented for Claude against `system.init.session_id`, for Cursor against `SDKMessage.agent_id`) and surfaced for callers to persist. The caller owns the mapping between its session records and harness-native ids; the runtime does not persist transcripts itself.
+- Added `sandbox.persistentState: { volume, bindingId }` to `cyrus-agent-runtime` — caller-facing abstraction that hides the per-harness state-env-var math. The runtime mounts the caller's volume at a fixed internal path with `bindingId` as the subpath, then calls each adapter's new `buildStateEnv(mountPath)` hook to inject the right env vars so the harness writes its state-dir there. Verified upstream per harness: Claude → `CLAUDE_CONFIG_DIR=${m}/.claude`; Cursor → `CURSOR_DATA_DIR=${m}/.cursor`; Codex → `CODEX_HOME=${m}/.codex`; Gemini → `GEMINI_CLI_HOME=${m}` (CLI appends `.gemini` itself); OpenCode → all four `XDG_*_HOME` dirs under `${m}/.opencode-xdg/{config,data,state,cache}` (no app-specific override exists).
+- Extracted `@cyrus-ai/cursor-runner` as a publishable package — a thin CLI wrapper around `@cursor/sdk` that emits `SDKMessage` JSONL. Lets the agent-runtime cursor adapter consume a typed, version-pinned wire format that we own (no schema drift vs `cursor-agent`).
+- Added `RuntimePlugin` to `cyrus-agent-runtime` — bundles MCP servers + hooks + skills with per-harness materializers translating one declaration into Claude / Cursor / Codex native filesystem state inside the sandbox. The bundled MCP-config path replaces the old standalone `mcps` field on session config.
+- Added Daytona volume mounting (`RuntimeVolumeConfig` with provider-driven `kind: "bind" | "fuse" | "provider"` and `subpath` for per-binding isolation), `destroyWhileInactive` (pauses the underlying sandbox between `run()` calls — Daytona stop/start preserves on-disk state at a few-second resume cost), and Daytona base-snapshot harness binaries (`harness.command` lets adapters spawn snapshot-resident binaries directly; Cursor uses this for `cursor-runner`).
+- Wired `cyrus-edge-worker` `AgentChatSessionHandler` to the agent-runtime: provider chosen from `EdgeConfig.defaultProvider`, MCP servers forwarded via the new plugin shape, Daytona chat sandbox snapshot + custom layout + bypass perms threaded through, `ANTHROPIC_API_KEY` vs `ANTHROPIC_AUTH_TOKEN` precedence fixed.
+
 ## [0.2.52] - 2026-05-13
 
 _No internal-only changes._

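Note on the streaming entry above: "line-buffers chunks across packet boundaries" can be illustrated with a minimal sketch. The `LineBuffer` class and its method names are hypothetical and are not taken from `cyrus-agent-runtime`. The sketch only assumes that the harness CLI writes newline-delimited JSON and that stdout chunks may split a line at any byte.

```typescript
// Hypothetical helper for illustration; not the package's actual implementation.
class LineBuffer {
  private pending = "";

  // Feed one stdout/stderr chunk and get back every line it completed.
  push(chunk: string): string[] {
    this.pending += chunk;
    const parts = this.pending.split("\n");
    // The last element is "" if the chunk ended on a newline,
    // otherwise it is a partial line that must wait for the next chunk.
    this.pending = parts.pop() ?? "";
    return parts
      .map((line) => line.replace(/\r$/, ""))
      .filter((line) => line.length > 0);
  }

  // Call once after the process exits to emit a trailing line with no newline.
  flush(): string[] {
    const rest = this.pending;
    this.pending = "";
    return rest.length > 0 ? [rest] : [];
  }
}
```

Each returned line can then be handed to the adapter's transcript parser, so an event is emitted as soon as its line is complete rather than at the end of the turn.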
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -12,13 +12,15 @@ All notable changes to this project will be documented in this file.
 ### Added
 - **User skills can now be scoped to specific repositories, Linear teams, or Linear labels** — Skills synced from cyrus-hosted with `repositoryIds`, `linearTeamIds`, or `linearLabelIds` are only loaded into sessions whose context matches every populated dimension (AND across dimensions, OR within each list). Unscoped skills continue to load for every session, and old payloads without scope fields keep working as global. Scope is persisted as a `scope.json` sidecar alongside `SKILL.md` and enforced at runtime via the Claude Agent SDK's `skills` option so the model can't see or invoke out-of-scope skills. ([CYPACK-1156](https://linear.app/ceedar/issue/CYPACK-1156), [#1205](https://github.com/cyrusagents/cyrus/pull/1205))
 - **Shared auto-memory across Slack chat sessions** — Slack-triggered chat sessions now share a persistent Claude auto-memory directory at `<cyrusHome>/slack-memory/`, so memory built up in one Slack thread carries over to every other Slack thread. ([CYPACK-1190](https://linear.app/ceedar/issue/CYPACK-1190), [#1199](https://github.com/cyrusagents/cyrus/pull/1199))
+- **Base snapshot for Daytona chat sandboxes via `DAYTONA_SNAPSHOT`** — When `DAYTONA_SNAPSHOT` is set in the environment, Daytona-backed chat sessions seed their sandboxes from that pre-built Daytona snapshot instead of the default base image. When a snapshot is in use the npm-install bootstrap is skipped (the snapshot is expected to ship Claude Code preinstalled) and the CLI defaults to `claude` on `PATH`. Two companion overrides let snapshots use any home layout: `DAYTONA_WORKING_DIR` (default `/home/daytona`) sets the in-sandbox working directory, and `DAYTONA_CLAUDE_CLI_PATH` overrides the `claude` binary path.
 
 ### Fixed
 - **Session Stop hook now actually reminds the agent to ship before stopping** — Replaced the broken Stop-hook return shape (`additionalContext` + `continue: true`, which the Claude Agent SDK silently drops) with the SDK's documented `decision: "block"` + `reason` form. The first stop attempt now blocks and feeds the commit/push/PR reminder back into the next turn; a second stop (with `stop_hook_active === true`) proceeds, preventing infinite loops. ([CYPACK-1204](https://linear.app/ceedar/issue/CYPACK-1204), [#1210](https://github.com/cyrusagents/cyrus/pull/1210))
 - **Slack chat sessions can now read and edit their shared auto-memory** — The shared auto-memory directory (`<cyrusHome>/slack-memory/`) is now included in `allowedDirectories` for chat sessions. Previously, sessions could create new memory files via shell redirects, but `Read`/`Edit`/`Glob` against existing memory files (including `MEMORY.md`) were denied by the home-directory restriction rules, leaving the auto-memory feature half-working. ([CYPACK-1197](https://linear.app/ceedar/issue/CYPACK-1197), [#1206](https://github.com/cyrusagents/cyrus/pull/1206))
 
 ### Changed
 - **Slack mention prompt nudges agents toward `linear_agent_give_feedback` for live child sessions** — When responding in Slack, Cyrus is now told to send mid-flight corrections to a running child agent session via `mcp__cyrus-tools__linear_agent_give_feedback` instead of falling back to `mcp__linear__save_comment`. Produces a stronger signal when correcting work that is already in progress. ([CYPACK-1189](https://linear.app/ceedar/issue/CYPACK-1189), [#1198](https://github.com/cyrusagents/cyrus/pull/1198))
+- **Daytona chat sessions now bypass Claude's permission prompts** — Since the Daytona sandbox is itself the isolation boundary, blocking on per-tool prompts (which no user can answer) was preventing the agent from running shell commands inside the sandbox. Local sessions still prompt as before.
 
 ### Packages
 

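Note on the `DAYTONA_SNAPSHOT` entry above: the described precedence can be sketched as a small resolver. The function name, the result shape, and the treatment of empty strings as unset are assumptions made for illustration. Only the three environment variable names, the `/home/daytona` default, and the "`claude` on `PATH`" default come from the changelog text. The CLI path used when no snapshot is set is not stated there, so the sketch leaves it undefined.

```typescript
// Hypothetical result shape; field names are illustrative only.
interface DaytonaChatSettings {
  snapshot: string | undefined;
  workingDir: string;
  claudeCliPath: string | undefined;
  skipNpmBootstrap: boolean;
}

function resolveDaytonaChatSettings(
  env: Record<string, string | undefined>,
): DaytonaChatSettings {
  // Assumption: an empty string counts as "not set".
  const read = (key: string): string | undefined => {
    const value = env[key];
    return value !== undefined && value.length > 0 ? value : undefined;
  };

  const snapshot = read("DAYTONA_SNAPSHOT");
  const cliOverride = read("DAYTONA_CLAUDE_CLI_PATH");

  return {
    snapshot,
    workingDir: read("DAYTONA_WORKING_DIR") ?? "/home/daytona",
    // With a snapshot the CLI is expected to be preinstalled, so default to PATH lookup.
    claudeCliPath: cliOverride ?? (snapshot !== undefined ? "claude" : undefined),
    // The snapshot ships Claude Code, so the npm-install bootstrap is skipped.
    skipNpmBootstrap: snapshot !== undefined,
  };
}
```

In a Node process this would be called as `resolveDaytonaChatSettings(process.env)`.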
diff --git a/apps/cli/src/services/WorkerService.ts b/apps/cli/src/services/WorkerService.ts
@@ -222,6 +222,7 @@ export class WorkerService {
 					| "codex"
 					| "cursor"
 					| undefined) || edgeConfig.defaultRunner,
+			defaultProvider: edgeConfig.defaultProvider,
 			issueUpdateTrigger: edgeConfig.issueUpdateTrigger,
 			promptDefaults: edgeConfig.promptDefaults,
 			linearWorkspaces: edgeConfig.linearWorkspaces,

diff --git a/package.json b/package.json
@@ -46,6 +46,13 @@
 		"onlyBuiltDependencies": [
 			"sqlite3"
 		],
+		"packageExtensions": {
+			"@daytonaio/sdk": {
+				"dependencies": {
+					"tslib": "^2"
+				}
+			}
+		},
 		"overrides": {
 			"jws": ">=4.0.1",
 			"@modelcontextprotocol/sdk": ">=1.26.0",
@@ -54,11 +61,6 @@
 			"vite": ">=7.1.11",
 			"zod": "4.3.6",
 			"hono": ">=4.12.18",
-			"fast-uri": ">=3.1.2",
-			"ip-address": ">=10.1.1",
-			"@anthropic-ai/sdk": ">=0.91.1",
-			"@opentelemetry/sdk-node": ">=0.217.0",
-			"@opentelemetry/exporter-prometheus": ">=0.217.0",
 			"@hono/node-server": ">=1.19.10",
 			"rollup": ">=4.59.0",
 			"flatted": ">=3.4.0",
@@ -72,7 +74,20 @@
 			"diff": ">=8.0.3",
 			"@tootallnate/once": ">=3.0.1",
 			"@isaacs/brace-expansion": ">=5.0.1",
-			"tar": ">=7.5.11"
+			"tar": ">=7.5.11",
+			"fast-uri": ">=3.1.2",
+			"ip-address": ">=10.1.1",
+			"@opentelemetry/sdk-node": ">=0.217.0",
+			"@opentelemetry/exporter-prometheus": ">=0.217.0",
+			"@opentelemetry/otlp-transformer>protobufjs": ">=8.0.2",
+			"protobufjs": ">=7.5.8",
+			"ws": ">=8.20.1",
+			"brace-expansion": ">=5.0.6",
+			"@anthropic-ai/sdk": ">=0.91.1",
+			"@daytonaio/sdk": ">=0.175.0"
+		},
+		"patchedDependencies": {
+			"@daytonaio/sdk@0.175.0": "patches/@daytonaio__sdk@0.175.0.patch"
 		}
 	},
 	"lint-staged": {

diff --git a/packages/agent-runtime/ASSUMPTIONS.md b/packages/agent-runtime/ASSUMPTIONS.md
@@ -0,0 +1,53 @@
+# Agent Runtime Assumptions
+
+This package is intentionally built as a new standalone runtime layer with minimal dependency on the existing Cyrus runner packages.
+
+## Product Contract
+
+- The package exposes a TypeScript library API first. It does not ship a daemon or CLI in this iteration.
+- A session has one Cyrus-owned `sessionId`. Harness-native session identifiers are represented as transcript metadata when a harness emits them.
+- Transcript events preserve raw harness JSON whenever possible and wrap it in a stable runtime envelope.
+- `addMessage()` queues messages for harnesses that do not support interactive stdin yet. The queue is visible and testable, but delivery is capability-gated.
+- `interrupt()` is a soft user-message interruption when supported. `stop()` is lifecycle cancellation and attempts to terminate the running process.
+
+## Harness Contract
+
+- Claude, Codex, Cursor, Gemini, PI, and OpenCode are represented as harness adapters.
+- Claude, Codex, Cursor, and Gemini command-line conventions are modeled from locally available CLIs and existing public behavior.
+- PI and OpenCode are provisional adapters. Their commands and JSON formats are assumptions until real CLI transcripts are supplied.
+- Harness adapters own command construction and transcript parsing. They do not own sandbox provisioning.
+
+## Sandbox Contract
+
+- Local execution is modeled as a sandbox provider. This keeps local and remote execution behind the same conceptual interface.
+- ComputeSDK is the vendor abstraction for remote sandbox providers.
+- The common ComputeSDK `runCommand()` API is treated as sufficient for one-shot harness runs.
+- Streaming process execution is modeled as a capability, but is not assumed for every ComputeSDK provider. Full interactive harness support requires a provider-specific streaming process implementation.
+- Volumes, FUSE mounts, snapshots, ports, and network egress are represented in config types even when a provider cannot enforce them yet.
+- `RuntimeVolumeConfig.subpath` carries the provider-defined prefix used to scope a shared volume. The Daytona Volumes pattern is the reference use case; other providers map `subpath` as appropriate.
+
+## Session Resume Contract
+
+- The runtime exposes two resume primitives. The caller (Cyrus's `AgentSessionManager`) owns the mapping between its session records and harness-native session ids.
+  - `CreateAgentSessionConfig.resumeHarnessSessionId`: caller-supplied prior id. Harness adapters translate it into the right CLI flag (e.g. `--resume <id>` for Claude).
+  - `AgentSessionResult.harnessSessionId`: the new harness-native id observed in this run's transcript, surfaced for the caller to persist for next time.
+- Harness adapters extract the harness-native session id from transcript events via `extractSessionId(events)`. Claude's `system.init.session_id` is the canonical example.
+- The runtime does not persist transcripts itself. For the harness to actually see prior conversation on resume, the caller must arrange durable storage for the harness's config dir — for example by attaching a `RuntimeVolumeConfig` (Daytona Volumes are the reference) mounted at the harness's config path and setting the matching env var (`CLAUDE_CONFIG_DIR` for Claude).
+- Daytona's ComputeSDK provider was smoke-tested with a remote working directory of `/home/daytona`; `/workspace` should not be assumed portable across providers.
+- Cursor Agent was smoke-tested inside Daytona by installing the CLI with `curl https://cursor.com/install -fsS | bash` and running `/home/daytona/.local/bin/cursor-agent` with `CURSOR_API_KEY` provided as a secret environment variable.
+- Codex Agent was smoke-tested inside Daytona far enough to authenticate and start a turn by materializing `~/.codex/auth.json` as a sensitive runtime file. Passing only `OPENAI_API_KEY` from the local Codex auth file produced a remote 401. The authenticated Codex turn later hit the account usage limit.
+- Claude Code was smoke-tested inside Daytona by installing the CLI with a user-local npm prefix and running `/home/daytona/.npm-global/bin/claude` with `CLAUDE_CODE_OAUTH_TOKEN` provided as a secret environment variable. The remote session emitted `system`/`assistant`/`result` events and completed successfully.
+
+## Security Contract
+
+- `env` is safe-to-log configuration. `secrets` must be redacted from transcript and error metadata.
+- Secrets are passed into process environments only at execution time.
+- Tool permissions are represented as declarative runtime config and translated into harness-native flags where currently known.
+- Network egress policy is a declarative provider option in this iteration. Enforcement depends on the selected sandbox provider.
+
+## Feedback Loops
+
+- Config schema tests prove the public contract accepts and rejects expected shapes.
+- Local sandbox tests prove the local provider can write files and execute commands.
+- Harness adapter tests prove command construction and transcript parsing.
+- Session runtime tests prove event emission, queueing, stop behavior, and result propagation.
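
Note on the Security Contract above: the rule that `secrets` must be redacted from transcript and error metadata can be sketched as a plain string scrub. The function name and the `[REDACTED]` placeholder are hypothetical and are not taken from the package. The sketch assumes secrets appear verbatim in the text being logged.

```typescript
// Hypothetical helper for illustration; the package's real redaction may differ.
function redactSecrets(text: string, secrets: Record<string, string>): string {
  // Mask longer values first so a secret that contains another one is fully covered.
  const values = Object.values(secrets)
    .filter((value) => value.length > 0)
    .sort((a, b) => b.length - a.length);

  let output = text;
  for (const value of values) {
    // split/join replaces every occurrence without regex-escaping concerns.
    output = output.split(value).join("[REDACTED]");
  }
  return output;
}
```

Values from `env` are deliberately not passed to this helper, since the contract treats `env` as safe to log.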