canyonroad/agentsh-opencomputer

agentsh + OpenComputer

Runtime security governance for AI agents using agentsh v0.19.0 with OpenComputer sandboxes.

Status: Build pipeline complete. 27/27 tests pass against a real OC checkpoint. Runtime command/network policy enforcement does NOT yet work on OC — see Findings for the structural reason and what would need to change. This repo ships the working pieces and documents the gap so it's actionable.

TL;DR

  • What works: Image build (declarative Image API → oc-agentsh checkpoint with on-fork patch). Static policy evaluation (agentsh debug policy-test). Capability detection (agentsh detect reports 100/100 on OC, with caveats below). Kernel-level file enforcement (FUSE + landlock). Fork-on-demand in ~5 seconds.
  • What doesn't work: Runtime command and network policy enforcement. agentsh's shell-shim model assumes commands traverse /bin/bash; the OC SDK's sbx.exec.run execs commands directly via the OC daemon and bypasses the shim entirely. Setting AGENTSH_SHIM_FORCE=1 does not change this because sbx.exec.run never invokes bash.
  • The contribution of this repo is the working build pipeline plus a precise reproduction of the gap. Closing it requires either (a) an OC SDK opt-in that routes exec through /bin/bash, or (b) a non-shim enforcement path in agentsh that fits OC's exec model (e.g., the ptrace attach_mode: "pid" path described in Findings, or a server-side seccomp filter installed by the patch, independent of the shim).

Why agentsh + OpenComputer?

OpenComputer provides persistent VMs that hibernate when idle and wake in seconds — a real Linux machine per agent, with checkpoints, forks, and preview URLs. agentsh would add the governance layer (command rules, network policy, file-I/O policy, secrets redaction, audit logging) on top.

+---------------------------------------------------------+
|  OpenComputer VM (real kernel, persistent)              |
|  +---------------------------------------------------+  |
|  |  agentsh (governance layer — partial on OC)       |  |
|  |  +---------------------------------------------+  |  |
|  |  |  AI Agent                                   |  |  |
|  |  |  - Static policy: ✓                         |  |  |
|  |  |  - File rules (FUSE + landlock): ✓          |  |  |
|  |  |  - Command rules: ✗ (shim bypassed)         |  |  |
|  |  |  - Network rules: ✗ (shim bypassed)         |  |  |
|  |  |  - Audit logging: ✓ (for what reaches it)   |  |  |
|  |  +---------------------------------------------+  |  |
|  +---------------------------------------------------+  |
+---------------------------------------------------------+

Quick Start

Prerequisites

  • Node 18+
  • OPENCOMPUTER_API_KEY in .env (see .env.example)

Build

cd ts
npm install
set -a; source ../.env; set +a
npm run build

Build phase (≈3 min on first run, ≈30 s on cached image):

  1. Image.base() → apt-install ca-certs/curl/jq/libseccomp2/sudo/fuse3/python3
  2. Download + dpkg -i agentsh v0.19.0 .deb from canyonroad/agentsh GitHub releases
  3. Copy default.yaml, config.yaml, agentsh-startup.sh into the image
  4. Plant /etc/sudoers.d/agentsh (passwordless sudo for agentsh + chmod + tee)
  5. Sandbox.create({ image }) → run startup script → wait for agentsh server health
  6. sandbox.createCheckpoint('oc-agentsh') → poll status === 'ready'
  7. Sandbox.createCheckpointPatch(id, { script }) → patch runs agentsh-startup.sh on every fork

The build ends by printing the checkpoint id. Export it to skip the slow lookup in subsequent runs:

export OC_AGENTSH_CHECKPOINT_ID=<uuid-from-build>
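
The lookup that this export skips can be sketched as follows. This is a minimal sketch, not the contents of ts/lib.ts: the CheckpointInfo fields (id, name, status) and the helper names are assumptions for illustration, and the list response is passed in as plain data so that no SDK call signature is assumed. It accepts both the bare array that the SDK types advertise and the { checkpoints: [...] } object that the API actually returns (see Findings).

```typescript
// Hypothetical shape: the real CheckpointInfo may carry different fields.
interface CheckpointInfo {
  id: string;
  name: string;
  status: string;
}

type ListResponse = CheckpointInfo[] | { checkpoints: CheckpointInfo[] };

// Accept either response shape and always hand back a plain array.
function normalizeCheckpoints(res: ListResponse): CheckpointInfo[] {
  return Array.isArray(res) ? res : res.checkpoints;
}

// Prefer the exported id; otherwise fall back to a lookup by name.
function resolveCheckpointId(
  env: Record<string, string | undefined>,
  res: ListResponse,
  name = "oc-agentsh",
): string {
  const fromEnv = env["OC_AGENTSH_CHECKPOINT_ID"]?.trim();
  if (fromEnv) return fromEnv;

  const match = normalizeCheckpoints(res).find(
    (c) => c.name === name && c.status === "ready",
  );
  if (!match) throw new Error(`no ready checkpoint named "${name}"`);
  return match.id;
}
```

Passing the environment in as an argument (for example process.env in Node) keeps the helper free of globals and easy to exercise with fixture data.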

Run the test suite

npm test

Expected: 27 passed, 0 failed. The suite covers installation, server health, shell shim install, static policy evaluation (10 cases), and capability detection.

How it works

+-------------------------------------------------------------+
|  Build phase (one-time)                                     |
|                                                             |
|  ts/image.ts          declarative Image manifest            |
|        |                                                    |
|        v                                                    |
|  ts/build.prod.ts     Sandbox.create({ image })             |
|        |              -> exec.run agentsh-startup.sh        |
|        |              -> sandbox.createCheckpoint('oc-...') |
|        |              -> wait status === 'ready'            |
|        |              -> Sandbox.createCheckpointPatch(...) |
|        v                                                    |
|  oc-agentsh checkpoint  +  on-fork patch                    |
+-------------------------------------------------------------+
                     |
                     v
+-------------------------------------------------------------+
|  Demo phase                                                 |
|                                                             |
|  Sandbox.createFromCheckpoint(<id>)                         |
|     -> patch runs agentsh-startup.sh on the new fork        |
|     -> agentsh server up on 127.0.0.1:18080                 |
|     -> shell shim installed (/bin/bash -> agentsh shim)     |
|     -> ready in ~5s                                         |
+-------------------------------------------------------------+

OC's Image builder has no startup-command primitive (no setStartCmd/CMD/systemd hooks). The natural OC primitive for "run something on every fresh fork" is a patch attached to a checkpoint. The patch body is agentsh-startup.sh, the same script the build phase runs once before snapshotting; it is kept idempotent so it is safe to run twice.

Hibernation preserves full VM memory state, so once agentsh is running, hibernate→wake is transparent at the agentsh layer.

Project structure

agentsh-opencomputer/
├── README.md                      this file
├── LICENSE                        MIT
├── default.yaml                   agentsh policy rules (workspace, network, command, env)
├── config.yaml                    agentsh server config (FUSE, seccomp, ptrace, cgroups, ...)
├── agentsh-startup.sh             idempotent — runs once during build AND as patch on every fork
│
└── ts/
    ├── package.json               @opencomputer/sdk + tsx + dotenv
    ├── tsconfig.json
    ├── image.ts                   declarative Image manifest
    ├── lib.ts                     findCheckpointId(name) — the SDK identifies by UUID, not alias
    ├── build.prod.ts              build orchestrator
    ├── test-checkpoint.ts         27-test suite (5 categories — see below)
    └── .api-notes.md              SDK v0.5.16 surface notes captured during build

Test coverage

=== Installation ===
  agentsh installed                              ✓
  seccomp support (symbols present)              ✓

=== Server & Configuration ===
  server healthy                                 ✓
  server process running                         ✓
  policy file exists                             ✓
  config file exists                             ✓
  FUSE eager-enabled in config                   ✓
  seccomp enabled in config                      ✓

=== Shell Shim ===
  shim installed (/bin/bash statically linked)   ✓
  real bash preserved (/bin/bash.real)           ✓
  echo through shim                              ✓
  Python through shim                            ✓

=== Policy Evaluation (static) ===
  policy-test: sudo denied                       ✓
  policy-test: echo allowed                      ✓
  policy-test: workspace write allowed           ✓
  policy-test: workspace read allowed            ✓
  policy-test: tmp write allowed                 ✓
  policy-test: workspace delete is soft-delete   ✓
  policy-test: SSH key access requires approval  ✓
  policy-test: AWS credentials require approval  ✓
  policy-test: system path write denied          ✓
  policy-test: /etc write denied                 ✓

=== Security Diagnostics ===
  detect: seccomp_basic available                ✓
  detect: landlock available                     ✓
  detect: cgroups_v2 available (OC uplift)       ✓
  detect: ebpf unavailable (covered by landlock) ✓
  detect: protection score                       ✓ 100/100

agentsh debug policy-test evaluates a hypothetical operation against the loaded policy and prints a JSON decision — these tests validate that the policy is loaded and reasoning correctly. They do NOT validate that runtime enforcement actually fires — see Findings.
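
To make the static-versus-runtime distinction concrete, here is a toy first-match evaluator. It is not agentsh's policy format, rule syntax, or JSON output; the rule shape, the decision names, and the paths are all assumptions chosen to echo the test names above. The point it illustrates is that static evaluation is a pure function: it returns a decision for a described operation without any process being intercepted.

```typescript
type Decision = "allow" | "deny" | "approve" | "soft_delete";

interface Operation {
  kind: "exec" | "read" | "write" | "delete";
  target: string; // command name for exec, path otherwise
}

interface Rule {
  kind: Operation["kind"];
  prefix: string; // matches when the target starts with this prefix
  decision: Decision;
}

// Illustrative rules only; order matters because the first match wins.
const toyRules: Rule[] = [
  { kind: "exec", prefix: "sudo", decision: "deny" },
  { kind: "exec", prefix: "echo", decision: "allow" },
  { kind: "read", prefix: "/home/user/.ssh/", decision: "approve" },
  { kind: "delete", prefix: "/workspace/", decision: "soft_delete" },
  { kind: "write", prefix: "/workspace/", decision: "allow" },
  { kind: "write", prefix: "/etc/", decision: "deny" },
];

// Return the decision of the first matching rule, denying by default.
function evaluate(op: Operation, policy: Rule[] = toyRules): Decision {
  const hit = policy.find(
    (r) => r.kind === op.kind && op.target.startsWith(r.prefix),
  );
  return hit ? hit.decision : "deny";
}
```

A call such as evaluate({ kind: "exec", target: "sudo" }) yields a deny decision whether or not anything on the machine would actually stop sudo from running, which is exactly the gap described in Findings.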

Findings

1. agentsh on OC scores 100/100 on agentsh detect — but the score is partly cosmetic

Full breakdown:

FILE PROTECTION       25/25  fuse ✓  landlock ✓ (ABI v5)  seccomp-notify ✓
COMMAND CONTROL       25/25  seccomp-execve ✓  ptrace ✓
NETWORK               20/20  ebpf ✗  landlock-network ✓
RESOURCE LIMITS       15/15  cgroups-v2 ✓ (nested, delegated)
ISOLATION             15/15  capability-drop ✓ (41/41)  pid-namespace ✗

The headline win vs. E2B (which scored 85/100) is cgroups-v2 — OC delegates a writable cgroup subtree, while E2B doesn't. That accounts for +15 points.
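
The arithmetic behind the headline number, using only the category scores listed above. The 85 is reproduced by zeroing the RESOURCE LIMITS category, which follows this README's explanation of the gap; E2B's per-category scores are not listed in this document, so that line is an assumption.

```typescript
// Category scores copied from the agentsh detect breakdown above.
const ocScores: Record<string, number> = {
  "FILE PROTECTION": 25,
  "COMMAND CONTROL": 25,
  "NETWORK": 20,
  "RESOURCE LIMITS": 15,
  "ISOLATION": 15,
};

function total(scores: Record<string, number>): number {
  return Object.values(scores).reduce((sum, n) => sum + n, 0);
}

// Assumption: E2B differs only in RESOURCE LIMITS (no delegated cgroup).
const e2bScores: Record<string, number> = { ...ocScores, "RESOURCE LIMITS": 0 };

// prints: 100 85 15
console.log(total(ocScores), total(e2bScores), total(ocScores) - total(e2bScores));
```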

Caveats:

  • cgroups-v2 ✓ reports a delegated cgroup, but agentsh's per-session mkdir /sys/fs/cgroup/agentsh-session-... fails with permission denied. So per-command resource isolation isn't actually working — the cgroups score reflects "feature available" not "feature applied."
  • ebpf ✗: OC does not grant CAP_BPF/CAP_SYS_ADMIN. The NETWORK category still scores 20/20 because landlock-network covers it, but eBPF-based enforcement is unavailable.
  • pid-namespace ✗: host PID namespace. capability-drop carries the ISOLATION category; PID isolation is not enforced.

2. Runtime command/network enforcement does not engage on OC

This is the substantive gap. agentsh has two enforcement paths it can use, the shell shim and ptrace, and both have issues on OC.

a. Shell shim path (default)

  • During build, agentsh shim install-shell replaces /bin/bash with a static shim binary.
  • At runtime, every shell-spawned command goes through /bin/bash → shim → agentsh policy → real bash.

This works on E2B because E2B's exec API invokes /bin/bash. On OC it doesn't:

| Path | Result on OC |
|---|---|
| sbx.exec.run('sudo whoami') | Returns root, exit 0. Shim never fires: the OC daemon execs sudo directly. |
| sbx.exec.run('curl https://evil.com/') | Returns 200. Network policy bypassed. |
| sbx.exec.run('/bin/bash -c "sudo whoami"') (with AGENTSH_SHIM_FORCE=1) | Blocked with rule=shellc-opaque-script, exit 126. But this rule blocks ALL /bin/bash -c invocations, not just denied ones, so allowed commands also fail. Not a usable enforcement path. |
| agentsh /api/v1/sessions/.../exec (session API) | agentsh-unixwrap exec.LookPath cannot find bare command names: exec "echo": executable file not found in $PATH. Absolute paths trip landlock-execve with permission denied. Server returns empty for /usr/bin/echo. Side-event: cgroup_apply_failed: mkdir /sys/fs/cgroup/...: permission denied on every command. |

b. ptrace path (the natural fix — but unimplemented for the case we need)

agentsh has a ptrace-based enforcement path for hosts where the shim isn't in the command path. This is what E2B ALSO uses (see e2b-agentsh/config.yaml's ptrace.enabled: true block). It works by ptrace-attaching to processes and intercepting execve/connect/bind syscalls at the kernel level — independent of any shell.

agentsh supports two attach modes:

  • attach_mode: "children" — attaches to processes spawned as children of the agentsh server. Works on E2B because the e2b-orchestrator routes commands through paths that descend from agentsh's process tree.
  • attach_mode: "pid" with target_pid: N — attaches to a specific PID and uses PTRACE_O_TRACEFORK to catch all its descendants. Documented in internal/config/ptrace.go.

On OC, "children" mode has nothing to attach to because:

PID 1: osb-agent (the OC daemon) — spawns ALL commands from sbx.exec.run
  └── PID 739: agentsh server — has NO children spawned via the SDK

sbx.exec.run commands become children of osb-agent (PID 1), not children of agentsh. So attach_mode: "children" sees nothing.
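
A small model of why "children" mode finds nothing. The parent table mirrors the tree above; PID 1201 is a hypothetical PID standing in for a command launched by sbx.exec.run, and isDescendant is an illustrative helper, not agentsh code.

```typescript
// child PID -> parent PID. PID 1 (osb-agent) has no parent in this model.
const parentOf = new Map<number, number>([
  [739, 1],  // agentsh server
  [1201, 1], // hypothetical: a command spawned by sbx.exec.run
]);

// Walk up the parent chain from pid and report whether ancestor is on it.
function isDescendant(
  pid: number,
  ancestor: number,
  parents: Map<number, number>,
): boolean {
  const seen = new Set<number>(); // guards against a malformed, cyclic table
  let current = parents.get(pid);
  while (current !== undefined && !seen.has(current)) {
    if (current === ancestor) return true;
    seen.add(current);
    current = parents.get(current);
  }
  return false;
}

console.log(isDescendant(1201, 739, parentOf)); // false: invisible to "children" mode
console.log(isDescendant(1201, 1, parentOf));   // true: reachable by tracing PID 1
```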

attach_mode: "pid" with target_pid: 1 would work — agentsh would attach to osb-agent and catch every command via TRACEFORK. Plus Yama ptrace_scope=1 is no obstacle if agentsh runs as root (CAP_SYS_PTRACE bypasses Yama).

But: attach_mode: "pid" is parsed and validated, and the agentsh server logs ptrace tracer started attach_mode=pid — yet tr.AttachPID(targetPID) is never called. Reading internal/api/app_ptrace_linux.go:initPtraceTracer():

go func() { tr.Run(ctx) }()  // Tracer starts...
slog.Info("ptrace tracer started", "attach_mode", cfg.AttachMode)
// ↑ The function returns here. cfg.TargetPID was parsed from YAML
// but the tracer is never told to attach to it.

AttachPID is invoked from exec_ptrace_linux.go:48 and wrap_linux.go:316 — both during agentsh's own command-wrap path. Neither fires from the attach_mode: "pid" config. Tests that exercise pid-mode attach (internal/ptrace/integration_test.go:1098, 1158, ...) call tr.AttachPID(pid) manually from test code.

This is the actionable agentsh bug: the target_pid config field exists but its runtime path is missing. A ~15-line fix in initPtraceTracer() to call tr.AttachPID(cfg.TargetPID) after the tracer starts (with TargetPIDFile support if non-empty) would close the gap on OC and on any other shim-bypassing host.

3. SDK API surface notes (recorded in ts/.api-notes.md)

Several differences from what the demo plan initially assumed:

| Plan assumed | Actual (SDK v0.5.16) |
|---|---|
| import { Image } from '@opencomputer/sdk' | Must be @opencomputer/sdk/node (the runtime Image class lives only there). |
| Image() factory | Image.base() (constructor is private). |
| runCommands([cmds]) array param | runCommands(...cmds) rest param. |
| Sandbox.attachPatch(...) | Sandbox.createCheckpointPatch(checkpointId, { script, description }) (static, no strategy). |
| Sandbox.createFromCheckpoint(alias) | Takes a UUID, not the user-friendly name. Need listCheckpoints lookup. |
| sbx.createPreviewURL returns { url } | Returns { hostname, port, ... }; the caller composes the URL. |
| OPENCOMPUTER_API_KEY env | Confirmed. Optional OPENCOMPUTER_API_URL defaults to https://app.opencomputer.dev. |
| Image-build runCommands run as root (Docker-style) | Run as non-root. apt-install runs as root, runCommands does not. Need sudo in every privileged command. |
| Runtime sandbox is "real Linux machine with root access" | Runtime is non-root too. Sudo is available passwordlessly when configured via /etc/sudoers.d/. |
| sandbox.listCheckpoints() returns CheckpointInfo[] | Returns { checkpoints: CheckpointInfo[] } despite the SDK's type annotation. |
| createCheckpoint returns when checkpoint is usable | Returns before S3 upload completes. Must poll status === 'ready' before createCheckpointPatch (otherwise 400 "checkpoint is not ready"). |

These divergences are documented inline in commits and in ts/.api-notes.md. They will affect any future OC SDK consumer; worth raising upstream.
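
The readiness poll required between createCheckpoint and createCheckpointPatch can be sketched generically. This is a minimal sketch that assumes nothing about the SDK beyond the status === 'ready' condition: getStatus is a caller-supplied function (how a checkpoint's status is fetched is left open), and the helper name, option names, and defaults are assumptions.

```typescript
interface PollOptions {
  intervalMs?: number;
  timeoutMs?: number;
  sleep?: (ms: number) => Promise<void>; // injectable so tests need not wait
}

const defaultSleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Resolves with the number of polls once getStatus() reports "ready";
// rejects once the attempt budget derived from timeoutMs is used up.
async function waitUntilReady(
  getStatus: () => Promise<string>,
  { intervalMs = 2000, timeoutMs = 300000, sleep = defaultSleep }: PollOptions = {},
): Promise<number> {
  const maxAttempts = Math.max(1, Math.ceil(timeoutMs / intervalMs));
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    if ((await getStatus()) === "ready") return attempt;
    if (attempt < maxAttempts) await sleep(intervalMs);
  }
  throw new Error(`checkpoint not ready after ${timeoutMs} ms`);
}
```

Bounding the loop by attempt count rather than wall-clock time keeps the behavior deterministic; the trade-off is that slow getStatus calls can stretch the real elapsed time past timeoutMs.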

Recommendations

For the agentsh team

  1. Wire attach_mode: "pid" to actually attach. This is the highest-leverage fix. internal/config/ptrace.go parses target_pid and target_pid_file and validates them. internal/api/app_ptrace_linux.go:initPtraceTracer() starts the tracer but never calls tr.AttachPID(cfg.TargetPID). The minimal patch: read cfg.TargetPID (or cfg.TargetPIDFile), wait for the tracer to be ready, call tr.AttachPID(targetPID), log success or failure, and fail closed per cfg.OnAttachFailure. Once that lands, target_pid: 1 + attach_mode: "pid" + agentsh-as-root works on OC and on any other host where commands are spawned by a non-agentsh ancestor (Daytona, generic Docker exec, etc.).
  2. Soften shellc-opaque-script so simple bash -c "..." invocations are analyzable. Currently any -c invocation trips a blanket deny when AGENTSH_SHIM_FORCE=1 is set. The shim already has a shell parser (internal/shellparse); using it for -c content would let allowed commands through and only deny truly opaque scripts (eval, $(...) chains, etc.). This unblocks the sbx.exec.run('/bin/bash -c "..."') workaround as a stopgap until #1 lands.
  3. agentsh-unixwrap's exec.LookPath failure inside the session API is reproducible on OC with the same image that works on E2B. The wrapper at cmd/agentsh-unixwrap/main.go:220 calls exec.LookPath(cmd) which uses os.Getenv("PATH") — on OC the wrapper's runtime PATH appears empty or /usr/bin traversal is restricted. Worth a few hours of investigation.

For the OpenComputer team

  1. sbx.exec.run should optionally route through /bin/bash. A RunOpts.shell?: boolean flag would let consumers opt into shell semantics and engage installed shims. This unblocks every shim-based runtime governance tool, not just agentsh.
  2. Consider documenting the non-root reality. Several pages in https://docs.opencomputer.dev call OC sandboxes "real Linux machine with root access," but both image-build runCommands and the runtime user are non-root by default. Sudo is available, but it needs /etc/sudoers.d/ setup. Worth a note.
  3. createCheckpoint returning before S3 upload completes is surprising. Either delay the resolution until status is ready, or document the readiness poll prominently.
  4. listCheckpoints() returns { checkpoints: [...] } but the SDK types it as CheckpointInfo[]. Minor SDK type bug.
  5. pid-namespace and CAP_BPF/CAP_SYS_ADMIN are unavailable in the runtime sandbox. The first matters for process isolation; the second for eBPF-based network and tracing tools. Both are low-impact for the typical agent workload but worth flagging in a "what isn't available" section of the docs.
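
Recommendation 1 in this list could take roughly the following shape. Everything here is hypothetical: RunOpts.shell does not exist in the SDK today, buildArgv is an illustrative name, and the whitespace split in the direct branch is a placeholder, since this README does not establish how the OC daemon tokenizes command strings. The sketch only shows the routing decision that would let an installed shim see the command.

```typescript
interface RunOpts {
  shell?: boolean; // proposed opt-in; not part of the current SDK
}

// Decide what the daemon would exec for a given command string.
function buildArgv(command: string, opts: RunOpts = {}): string[] {
  if (opts.shell) {
    // Routed through /bin/bash, so a shim installed at that path runs first.
    return ["/bin/bash", "-c", command];
  }
  // Direct exec: argv[0] is the program itself and /bin/bash is never involved.
  return command.trim().split(/\s+/);
}

// Direct exec: argv is sudo, whoami
console.log(buildArgv("sudo whoami"));
// Shell-routed: argv is /bin/bash, -c, "sudo whoami"
console.log(buildArgv("sudo whoami", { shell: true }));
```

As noted in Findings, /bin/bash -c invocations currently trip shellc-opaque-script when AGENTSH_SHIM_FORCE=1 is set, so this routing is only useful together with the second agentsh recommendation.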

Sibling demos

This repo intentionally mirrors the structure of:

  • e2b-agentsh — agentsh on E2B (TypeScript, 76+ tests, full runtime enforcement working)
  • daytona-test — agentsh on Daytona (Python, snapshot-based, 30+ tests)

The 27-test subset here is the portion of the e2b-agentsh suite that doesn't depend on shim-based runtime enforcement. The e2b version's command-blocking, network-blocking, and multi-context tests are intentionally not ported.

License

MIT

About

agentsh + OpenComputer demo: build pipeline, capability detection, and reproducer for canyonroad/agentsh PR #269
