Add sandboxed agent runtime package #1220
Closed
Connoropolous wants to merge 6 commits into
…dboxes

Add an optional `streamCommand(command, options)` capability to `RunnerSandbox`, with `onStdout` / `onStderr` chunk callbacks, an `AbortSignal` for cancellation, and an `AsyncIterable<string> input` option for live stdin. Local provider implements it via `child_process.spawn`; Daytona is reached through a pluggable `NativeStreamAdapter` registry that unwraps ComputeSDK's `ProviderSandbox.getInstance()` to the native `@daytonaio/sdk` Sandbox and uses async sessions + `getSessionCommandLogs(onStdout, onStderr)`.

`RuntimeAgentSession.start()` now prefers `streamCommand` when `capabilities.streamingProcess` is true, line-buffers chunks across packet boundaries, and emits `TranscriptEvent`s as the harness CLI produces them. New `interactiveInput` opt-in routes `addMessage()` into the running process's stdin (default off, since most one-shot CLIs block on a piped-but-never-closed stdin).

Verified end-to-end:
- local `spawn`: chunks land at the exact 400ms cadence the child emits
- real `codex exec` via `createAgentSession`: events emitted ~8.6s before turn end
- real Daytona Claude `stream-json`: system event landed 1.7s before result event over a remote sandbox
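The line-buffering step described above can be sketched roughly as follows. This is an illustrative sketch, not the package's implementation: `LineBuffer`, `push`, and `flush` are hypothetical names, and it assumes the harness CLI emits one JSON object per line (as a `stream-json` style output would).

```typescript
// Hypothetical helper: reassembles complete lines from arbitrary stdout chunks.
class LineBuffer {
  private pending = "";
  private readonly onLine: (line: string) => void;

  constructor(onLine: (line: string) => void) {
    this.onLine = onLine;
  }

  // Feed one chunk; emit every complete line that the chunk finishes.
  push(chunk: string): void {
    this.pending += chunk;
    let newline = this.pending.indexOf("\n");
    while (newline !== -1) {
      const line = this.pending.slice(0, newline).replace(/\r$/, "");
      this.pending = this.pending.slice(newline + 1);
      if (line.length > 0) this.onLine(line);
      newline = this.pending.indexOf("\n");
    }
  }

  // Call once the process exits so a final unterminated line is not lost.
  flush(): void {
    const rest = this.pending.replace(/\r$/, "");
    this.pending = "";
    if (rest.length > 0) this.onLine(rest);
  }
}

// Usage: a JSON line split across three chunks still parses as two events.
const events: { type: string }[] = [];
const buffer = new LineBuffer((line) => events.push(JSON.parse(line)));
buffer.push('{"type":"sys');
buffer.push('tem"}\n{"type":"res');
buffer.push('ult"}');
buffer.flush();
```

Keeping the partial tail in `pending` is what lets an `onStdout` callback be invoked with chunks of any size without ever handing a half-written line to the parser.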
…config

Add two materialization concepts to `CreateAgentSessionConfig`, deliberately distinct from the existing `volumes` (provider-attached persistent storage):
- `RuntimeFolderConfig`: exposes a host filesystem folder inside the sandbox. Walks the host tree and uploads each file via `SandboxFilesystem.writeFile`. Supports `exclude` globs. With `access: "readwrite"` the runtime syncs sandbox edits and any newly-created files back to the host folder after the harness command completes.
- `RuntimeRepositoryConfig`: runs `git clone` inside the sandbox at `mountPath` with optional `branch` checkout and `depth` shallow-clone. Local-path sources are rewritten to `file://...` to preserve git semantics. Shallow clones with a branch use `--branch` on the clone itself, since `git checkout` of a non-default branch fails after a shallow clone.

Both emit lifecycle transcript events (`folder.materialize.*`, `folder.syncback.*`, `repository.materialize.*`) and run after files but before package setup commands, so setup steps that depend on the cloned tree or the mounted folder see them ready.

27 tests pass (5 new): one materializer unit test per concept and one runtime-level integration test verifying that the session wires each through to the right sandbox calls and emits the right events.
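The clone rules described above (shallow clones take `--branch` on the clone itself, full clones check out afterwards, local paths become `file://` URLs) could be expressed along these lines. This is a sketch under assumptions: `RepoSpec` and `buildCloneCommands` are hypothetical names, the field names mirror the ones mentioned in the commit message, and local-path detection is simplified to POSIX absolute paths.

```typescript
// Hypothetical shape; the real RuntimeRepositoryConfig may differ.
interface RepoSpec {
  source: string; // remote URL or absolute local path
  mountPath: string; // where the tree should appear inside the sandbox
  branch?: string;
  depth?: number;
}

// Returns the argv lists to run, in order, inside the sandbox.
function buildCloneCommands(repo: RepoSpec): string[][] {
  // Simplification: treat a leading "/" as a local path and rewrite it to a file:// URL.
  const source = repo.source.startsWith("/") ? "file://" + repo.source : repo.source;
  const shallow = repo.depth !== undefined;

  const clone = ["git", "clone"];
  if (shallow) clone.push("--depth", String(repo.depth));
  // A shallow clone only fetches one branch, so the branch must be chosen at clone time.
  if (shallow && repo.branch) clone.push("--branch", repo.branch);
  clone.push(source, repo.mountPath);

  const commands = [clone];
  // A full clone has every ref, so a separate checkout also covers tags and SHAs.
  if (!shallow && repo.branch) {
    commands.push(["git", "-C", repo.mountPath, "checkout", repo.branch]);
  }
  return commands;
}

// Usage: a shallow clone of a local repository on the "dev" branch.
const shallowCommands = buildCloneCommands({
  source: "/srv/repos/app",
  mountPath: "/workspace/app",
  branch: "dev",
  depth: 1,
});
```

Returning argv arrays rather than a single shell string avoids quoting problems when a branch name or path contains spaces.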
Equates to ComputeSDK's `ProviderSandbox.destroy()` for ComputeSDK-backed providers (deletes the remote sandbox, releases compute resources) and is a no-op for the local provider. Lets a caller hold only the result object, consume events/result, then tear down without keeping a reference to the session.

Idempotent: backed by a one-shot destroy promise on the session that both `AgentSession.stop()` and `AgentSessionResult.destroy()` share, so callers can call either or both in any order without double-destroying the underlying ComputeSDK / local sandbox.

Verified with a new test that asserts:
- the returned result exposes `destroy()`
- calling `result.destroy()` invokes `sandbox.destroy()` exactly once
- calling `result.destroy()` twice is a no-op the second time
- calling `session.stop()` after `result.destroy()` does not double-destroy
`stop()` and `destroy()` were doing two unrelated things bundled into one method. Split them. `stop()` now cancels the in-flight run only — aborts the harness process, closes the live event stream, closes the input pipe — and leaves the sandbox alive. This enables future workflows that reuse a warm sandbox across runs (per CYPACK-1209): a single run's `stop()` no longer destroys shared compute. `destroy()` is the sole sandbox-release path. It exists symmetrically on both `AgentSession` and `AgentSessionResult` (sharing a one-shot internal teardown promise). `AgentSession.destroy()` also implicitly cancels an in-flight run via `stop()` before releasing the sandbox, so callers don't need a two-step. Pre-1.0 package, clean break — no consumers to migrate.
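One way to picture the split is a session that keeps an `AbortController` for the run and a lazily created promise for teardown. The sketch below is illustrative only: `SandboxLike` and `SessionSketch` are hypothetical stand-ins, not the package's `AgentSession`, and the real implementation also closes the event stream and input pipe in `stop()`.

```typescript
// Hypothetical stand-in for whatever sandbox handle the session holds.
interface SandboxLike {
  destroy(): Promise<void>;
}

class SessionSketch {
  private readonly abort = new AbortController();
  private readonly sandbox: SandboxLike;
  private teardown: Promise<void> | undefined;

  constructor(sandbox: SandboxLike) {
    this.sandbox = sandbox;
  }

  // The harness process would listen on this signal to know when to exit.
  get signal(): AbortSignal {
    return this.abort.signal;
  }

  // Cancels the in-flight run only; the sandbox stays alive for reuse.
  stop(): void {
    this.abort.abort(); // calling abort() again later has no further effect
  }

  // Releases the sandbox at most once, however many callers ask.
  destroy(): Promise<void> {
    if (!this.teardown) {
      this.stop(); // cancel any in-flight run before releasing compute
      this.teardown = this.sandbox.destroy();
    }
    return this.teardown;
  }

  // The result object delegates to the same one-shot teardown promise.
  result(): { destroy(): Promise<void> } {
    return { destroy: () => this.destroy() };
  }
}

// Usage: stop() leaves the sandbox alone; destroy() from either surface runs once.
let destroyCalls = 0;
const session = new SessionSketch({ destroy: async () => { destroyCalls += 1; } });
session.stop();
const callsAfterStop = destroyCalls;
const first = session.result().destroy();
const second = session.destroy();
```

Caching the promise rather than a boolean flag means a second caller that arrives while teardown is still in progress awaits the same completion instead of returning early.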
Superseded by #1229: same foundational … Closing here, please review at #1229.
Summary
- `streamCommand` capability on `RunnerSandbox` with `onStdout` / `onStderr` callbacks, `AbortSignal` cancellation, and `AsyncIterable<string>` stdin. Local provider streams natively via `child_process.spawn`; Daytona is reached through a pluggable `NativeStreamAdapter` registry that unwraps ComputeSDK's `ProviderSandbox.getInstance()` to drive `@daytonaio/sdk` async sessions + `getSessionCommandLogs(onStdout, onStderr)`. `RuntimeAgentSession.start()` now prefers streaming when available, line-buffers across packet boundaries, and emits `TranscriptEvent`s live. New opt-in `interactiveInput` flag routes `addMessage()` chunks into the running process's stdin.
- `RuntimeFolderConfig` uploads a host folder into the sandbox and (with `access: "readwrite"`) syncs sandbox edits and newly-created files back to the host after the harness completes.
- `RuntimeRepositoryConfig` runs `git clone` inside the sandbox at `mountPath` with optional `branch` checkout and `depth` shallow-clone; local-path sources are rewritten to `file://...` to preserve git semantics, and shallow clones with a branch use `--branch` on the clone itself.
- `stop()` and `destroy()` decoupled. `stop()` cancels the in-flight run only and does NOT release the sandbox. `destroy()` is the sole sandbox-release path, exposed symmetrically on `AgentSession` and `AgentSessionResult` (sharing one internal one-shot teardown). `AgentSession.destroy()` implicitly cancels an in-flight run via `stop()` first. Equates to ComputeSDK's `ProviderSandbox.destroy()` for ComputeSDK-backed providers. Lets future workflows reuse a warm sandbox across runs (a single run's `stop()` no longer destroys shared compute).

Original Criteria Coverage
Session:
Config:
Interface / Runtime abstractions:
- … `AsyncIterable<string>` (opt-in `interactiveInput`)
- `destroy()` on both `AgentSession` and `AgentSessionResult` (ComputeSDK-aligned), the sole sandbox-release path

Purpose:
Infra:
- `NativeStreamAdapter` registry lets each ComputeSDK provider reach its native streaming primitives without coupling agent-runtime to any specific provider SDK

Transcript Event Flow

- `runCommand(...)` is the buffered one-shot path; `streamCommand(...)` is the live chunk-delivery path. Adapters check `capabilities.streamingProcess` to pick.
- … `TranscriptEvent` envelopes.
- `RuntimeAgentSession` owns emission: it emits `setup.*`, `file.write.*`, `folder.materialize.*` / `folder.syncback.*`, `repository.materialize.*`, parsed assistant/tool/result events, and the final `result` event, as they arrive when the sandbox supports streaming.
- … `@daytonaio/sdk` Sandbox through the ComputeSDK `ProviderSandbox.getInstance()` escape hatch and uses async sessions + `getSessionCommandLogs(onStdout, onStderr)` callbacks. Other ComputeSDK providers can ship their own `NativeStreamAdapter` via `ComputeSdkSandboxProviderOptions.nativeStreamAdapters`.

Filesystem Concepts at the Session Level
Three deliberately distinct concepts, each with its own materializer + transcript lifecycle:
- `RuntimeFileConfig[]`: small one-off inline strings written into the sandbox before setup. Supports `sensitive: true` for redacted transcript logging.
- `RuntimeFolderConfig[]`: host filesystem folders bind/copy-synced into the sandbox; host is source of truth. `access: "read"` copies in only; `access: "readwrite"` syncs sandbox edits and newly-created files back to the host after the harness completes. Supports `exclude` globs.
- `RuntimeRepositoryConfig[]`: git-driven trees materialized via `git clone` inside the sandbox at `mountPath`. Supports optional `branch` checkout (including SHAs/tags on full clones) and `depth` shallow-clone (auto-`1` for `access: "read"`). Local-path sources are converted to `file://...` to preserve git semantics.
- `RuntimeVolumeConfig[]`: provider-attached persistent storage with a lifecycle independent of the sandbox (Docker volumes, EBS volumes, FUSE mounts). Distinct from folders because the source of truth is the provider, not the host filesystem.

Lifecycle: stop vs. destroy
Two single-purpose operations:
- `stop(reason?)`: cancel the in-flight run. Aborts the running harness process, closes the live event stream, closes the input pipe. Sandbox stays alive. Idempotent. Available on `AgentSession` only (no meaning post-run).
- `destroy()`: release the sandbox (ComputeSDK `ProviderSandbox.destroy()` for remote, no-op for local). If a run is in flight, calls `stop()` first so the harness terminates cleanly before teardown. Idempotent. Available on both `AgentSession` and `AgentSessionResult`, sharing the same one-shot internal teardown promise, so calling either or both in any order is safe.

This decoupling enables the future EnvironmentFactory model (CYPACK-1209) where multiple runs share one warm sandbox: a single run's `stop()` no longer destroys shared compute.

Validation
- (… `streamCommand`): installed Claude Code remotely, captured `system` / `assistant` / `result` transcript events, received `daytona claude event smoke ok`.
- `createAgentSession` + `codex exec`: `thread.started` / `turn.started` emitted at ~172ms; `turn.completed` at ~8861ms (8.6s spread).
- … `stream-json` `system` event landed 1.7s before `result` event over a remote sandbox.
- … `createAgentSession`: 3 local notes uploaded to `/home/daytona/notes` via `folder.materialize.*`, Claude Code installed via 3 setup commands, `claude` invoked with stream-json output, 20 transcript events emitted live over a 7.2s harness window, final extracted result `"alpha=Alpha Note, beta=Beta Note, gamma=Gamma Note"`, `result.destroy()` invoked twice (first destroyed the Daytona sandbox, second was a clean no-op).
- … `exclude` globs honored; read-write sync-back picks up both edits and newly-created files.
- … `git clone` runs inside the sandbox, branch checkout works on both full and shallow clones.
- `stop()` / `destroy()` decoupling tests: `stop()` provably does NOT destroy the sandbox; `destroy()` is the only release path; both surfaces (`AgentSession.destroy()` and `AgentSessionResult.destroy()`) share a one-shot; `AgentSession.destroy()` mid-run cancels the in-flight harness before releasing.

Follow-ups (not in this PR)
- (… `rootPath` and inline `content` supplied, runtime transfers local paths to remote sandbox, our own `cyrus-plugin.json` manifest format). `RuntimePlugin` will bundle MCP servers + skills + hooks + commands + agents + contextFile + permissions. Per-harness `PluginMaterializer`s translate one declaration into Claude / Cursor / Codex / Gemini native filesystem state inside the sandbox. To be implemented on this branch in a follow-up commit.
- `EnvironmentFactory` for run/environment split: captured as a deferred speculation in CYPACK-1209. Splits `CreateAgentSessionConfig` into a per-call `RunConfig` and a cached-by-hash `EnvironmentConfig`, amortizing expensive remote-sandbox spin-up across runs.
- … `env` apply to setup commands or only to the harness? Found while writing the Daytona+Claude runtime spike: overriding `PATH` in session env breaks setup commands that need to find a binary via the container's default `PATH`. Today `env` applies to both; two reasonable alternatives are (a) setup gets only sandbox-default env, (b) split into `setupEnv` and `harnessEnv`. To be filed as a Linear ticket.