Summary
Port agent/skyhook-agent/src/skyhook_agent/chroot_exec.py to Go as a subcommand of the agent binary (agent chroot-exec <control-file> <chroot-dir>). Today this Python script is invoked as a child process by the agent's main controller — keeping it as a subcommand of the same Go binary preserves that two-process model (so the chroot only persists for the duration of the step) without shipping two binaries.
Depends on #213 (the cmd/agent skeleton).
Motivation
The chroot is intentionally per-step, not per-agent: the parent process reads config.json and orchestrates retries / interrupts in the container's normal filesystem; the child process chroots into the host root and exec's the step. If the agent itself chrooted, every subsequent step (and the schema files embedded in the binary, and the /etc/resolv.conf already copied over) would have to be re-set-up in the host root. The current split is the right design — we just need the same split in Go.
This PR also extracts the env-merging logic into a pure function that #217 (the runner) can unit-test without actually chrooting.
Feature description
Add a chroot-exec subcommand to cmd/agent that:
- Reads a JSON control file from the first positional argument (argv[1] in the Python script) containing {cmd, no_chmod, env, copy_dir}.
- Captures the parent process's env (the container env).
- If chroot_dir != "local", calls syscall.Chroot(chrootDir) and os.Chdir("/").
- Unless no_chmod is set, runs the equivalent of chmod +x on cmd[0] (Python adds S_IXGRP|S_IXUSR|S_IXOTH).
- Reads the chroot's env via subprocess.run(["env"]) (in Go: exec.Command("env").Output()) and merges in the order container_env ← chroot_env ← skyhook_env.
- Validates that copy_dir is absolute, then exec's cmd with Dir: copy_dir and the merged env.
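Both env sources in the list above arrive as KEY=VALUE lines (os.Environ() for the container, the output of env for the chroot), so the merge needs them as maps first. A minimal sketch of that conversion follows; ParseEnv is a hypothetical helper name, and line-based parsing assumes no variable value contains a newline, which may or may not match what the Python script tolerates.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// ParseEnv converts "KEY=VALUE" entries (as returned by os.Environ or
// printed line by line by `env`) into a map. Entries without '=' or with
// an empty key are skipped. Hypothetical helper, not part of the Python agent.
func ParseEnv(entries []string) map[string]string {
	out := make(map[string]string, len(entries))
	for _, entry := range entries {
		// Cut splits at the first '=', so values may themselves contain '='.
		key, value, ok := strings.Cut(entry, "=")
		if !ok || key == "" {
			continue
		}
		out[key] = value
	}
	return out
}

func main() {
	// Capture the container env; this would happen before any chroot call.
	containerEnv := ParseEnv(os.Environ())
	fmt.Printf("captured %d variables; PATH=%q\n", len(containerEnv), containerEnv["PATH"])
}
```

The same function could be fed strings.Split(string(output), "\n") once the chroot's env output has been read.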
Proposed direction
1. Subcommand wiring
agent chroot-exec <control-file> <chroot-dir>
Use the subcommand router #213 picked. No new dep if flag is already enough.
2. Env merge — pure function
In internal/chroot/env.go:
func ProcessEnv(containerEnv, skyhookEnv, chrootEnv map[string]string) map[string]string
Order matters and must match Python _get_process_env:
- Start from containerEnv.
- Overwrite with chrootEnv (so things like PATH resolve against the host).
- Overwrite with skyhookEnv (so package-author env wins last).
This function is unit-tested without any chroot. The chroot subcommand main is just glue.
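The three-layer precedence described above can be sketched as a plain map merge. ProcessEnv keeps the signature proposed in this issue; Environ is a hypothetical companion that flattens the result for exec.Cmd.Env, and parity with Python's _get_process_env still has to be confirmed against the original tests.

```go
package main

import (
	"fmt"
	"sort"
)

// ProcessEnv merges the three env layers. Later layers overwrite earlier
// ones: containerEnv, then chrootEnv, then skyhookEnv. Inputs are not mutated.
func ProcessEnv(containerEnv, skyhookEnv, chrootEnv map[string]string) map[string]string {
	merged := make(map[string]string, len(containerEnv)+len(chrootEnv)+len(skyhookEnv))
	for _, layer := range []map[string]string{containerEnv, chrootEnv, skyhookEnv} {
		for key, value := range layer {
			merged[key] = value
		}
	}
	return merged
}

// Environ flattens a map into sorted "KEY=VALUE" entries, the form that
// exec.Cmd.Env expects. Hypothetical helper; sorting keeps output stable.
func Environ(env map[string]string) []string {
	out := make([]string, 0, len(env))
	for key, value := range env {
		out = append(out, key+"="+value)
	}
	sort.Strings(out)
	return out
}

func main() {
	container := map[string]string{"PATH": "/container/bin", "HOME": "/root", "ONLY_CONTAINER": "1"}
	chroot := map[string]string{"PATH": "/usr/sbin:/usr/bin", "HOME": "/host/root"}
	skyhook := map[string]string{"HOME": "/pkg/home"}

	for _, entry := range Environ(ProcessEnv(container, skyhook, chroot)) {
		fmt.Println(entry)
	}
}
```

Note that the parameter order (container, skyhook, chroot) differs from the merge order (container, chroot, skyhook), so the table-driven test should pin the precedence explicitly.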
3. Chroot mechanics
syscall.Chroot requires CAP_SYS_CHROOT. The agent already runs as root in the container (see USER 0:0 in containers/agent.Dockerfile). Document the requirement in a // why: comment.
- After Chroot, immediately Chdir("/") (matches Python).
- The "local" sentinel skips the chroot entirely; it is used by tests and as a debug escape hatch. Preserve it.
4. Control file
JSON shape (matches Python):
{
"cmd": ["/path/to/step.sh", "arg1"],
"no_chmod": false,
"env": {"FOO": "bar"},
"copy_dir": "/etc/skyhook/.../skyhook_dir"
}
The parent (#217) writes the control file to a temporary file (tempfile.NamedTemporaryFile in the Python agent) and deletes it when the parent returns. Do not change the file format: #217 will write it.
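A Go struct mirroring that JSON shape, plus the copy_dir check, might look like the sketch below. The type and function names are hypothetical, the sample path in main is made up for the demo, and rejecting an empty cmd is an assumption that should be checked against what the Python script does.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"path/filepath"
)

// ControlFile mirrors the JSON written by the parent process.
type ControlFile struct {
	Cmd     []string          `json:"cmd"`
	NoChmod bool              `json:"no_chmod"`
	Env     map[string]string `json:"env"`
	CopyDir string            `json:"copy_dir"`
}

// ParseControl decodes and validates a control file.
func ParseControl(data []byte) (ControlFile, error) {
	var c ControlFile
	if err := json.Unmarshal(data, &c); err != nil {
		return ControlFile{}, fmt.Errorf("parse control file: %w", err)
	}
	// Assumption: an empty cmd is treated as an error here.
	if len(c.Cmd) == 0 {
		return ControlFile{}, errors.New("control file: cmd must not be empty")
	}
	if !filepath.IsAbs(c.CopyDir) {
		return ControlFile{}, fmt.Errorf("control file: copy_dir %q is not absolute", c.CopyDir)
	}
	return c, nil
}

func main() {
	// Hypothetical sample; the copy_dir path is invented for this demo.
	raw := []byte(`{"cmd": ["/path/to/step.sh", "arg1"], "no_chmod": false,
		"env": {"FOO": "bar"}, "copy_dir": "/tmp/example/skyhook_dir"}`)
	c, err := ParseControl(raw)
	fmt.Printf("%+v %v\n", c, err)

	_, err = ParseControl([]byte(`{"cmd": ["step.sh"], "copy_dir": "relative/dir"}`))
	fmt.Println(err)
}
```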
5. Tests
Port agent/skyhook-agent/tests/test_chroot_exec.py:
- ProcessEnv precedence test (table-driven, no I/O).
- Control file parse error test.
- copy_dir not absolute → error.
- no_chmod=true skips the chmod call (mock the chmod function).
- "local" chroot mode: no syscall.Chroot call, env merged, command exec'd in copy_dir.
The actual syscall.Chroot path is not unit-testable without root + a real filesystem. Cover it in #221's chainsaw e2e tests against a real kind cluster.
Scope boundaries
In scope:
- The chroot-exec subcommand and its env-merging helper.
Out of scope:
- tee streaming and run_step (#217).
Acceptance criteria
- agent chroot-exec --help documents the two positional args.
- Env merge precedence matches Python on all permutations.
- Absolute-path validation on copy_dir matches Python.
- "local" chroot mode runs end-to-end in tests without root.
- The Python tests' assertions all have a Go equivalent.
- Binary stays single-artifact — no new executables shipped.
Open questions
- Should we use unix.Chroot from golang.org/x/sys/unix (already in operator vendor) or stdlib syscall.Chroot? Both work on Linux; pick whichever adds fewer imports.
- The Python version captures os.environ before chroot. Go's os.Environ() can be called at the same point; confirm via test that env captured pre-chroot still propagates post-chroot.
- Cross-platform: the Python agent only runs on Linux (Distroless). Should we put //go:build linux on the chroot file, or add a stub for darwin to keep go test working on developer Macs? Recommend //go:build linux + a darwin stub that returns errors.New("not supported").
References (codebase)
Alternatives considered
- Ship chroot-exec as a standalone binary (mirroring Python's two-script layout). Rejected: single Go binary is simpler to ship and version.
- Re-exec the agent binary with a special env var instead of a subcommand. Rejected — explicit subcommand is more discoverable and easier to test.