v0.1.1: interactive exec (PTY) for kubectl-exec parity #1

@CMGS

Goal

Make kubectl exec -it pod -- vim (or top, htop, an interactive shell, etc.) work end-to-end against a Cocoon-managed VM. The current pipe-only path can't run TUI programs because they require a real TTY (isatty(1), ioctl(TIOCGWINSZ), tcsetattr, SIGWINCH, controlling terminal for signals).

This issue tracks the cocoon-agent v0.1.1 work. Downstream patches in cocoonv2 and vk-cocoon are in scope as follow-ups (see §6).

Linux only. Windows guests stay on RDP (already stubbed in vk-cocoon); see the Windows item under Non-goals below.

Non-goals

  • Windows guest support — out of scope. Windows would require ConPTY + virtio-vsock-with-driver-or-virtio-serial + Windows Service registration + image-builder changes (~3-5 weeks). Existing RDP fallback in vk-cocoon is acceptable for now.
  • Binary wire format. JSON+base64 has ~33% overhead but is fine for v1; revisit if a TUI bandwidth complaint surfaces.
  • Out-of-band signal forwarding. With PTY mode, raw-mode bytes flow straight through to the agent's PTY master; the kernel's termios ISIG translates 0x03 into SIGINT to the foreground pgid. Ctrl-C and Ctrl-Z work for free.
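The ~33% figure for JSON+base64 follows from base64 turning every 3 raw bytes into 4 characters. A minimal sketch that measures it, assuming a simplified frame type (the `frame` struct, its JSON keys, and the "stdout" type string are illustrative, not the agent's actual definitions):

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"fmt"
)

// frame is a hypothetical stand-in for the agent's Message type; only the
// fields needed to measure encoding overhead are included.
type frame struct {
	Type string `json:"type"`
	Data []byte `json:"data,omitempty"` // encoding/json base64-encodes []byte
}

// overheadPercent reports how much larger the base64 form of n raw bytes is.
func overheadPercent(n int) float64 {
	enc := base64.StdEncoding.EncodedLen(n)
	return float64(enc-n) / float64(n) * 100
}

func main() {
	payload := make([]byte, 3000) // arbitrary size, roughly one screen redraw
	wire, err := json.Marshal(frame{Type: "stdout", Data: payload})
	if err != nil {
		panic(err)
	}
	fmt.Printf("raw=%d base64=%d json=%d overhead=%.1f%%\n",
		len(payload), base64.StdEncoding.EncodedLen(len(payload)),
		len(wire), overheadPercent(len(payload)))
}
```

The JSON envelope adds a small constant per frame on top of the base64 growth, so the relative cost is highest for very small frames such as single keystrokes.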

1. Wire protocol additions (backward compatible)

agent/protocol.go:

const MsgWinResize = "winresize" // NEW: client → agent during PTY session

type Message struct {
    // existing fields ...
    Tty  bool   `json:"tty,omitempty"`  // MsgExec: opt into PTY mode
    Rows uint16 `json:"rows,omitempty"` // MsgExec init / MsgWinResize payload
    Cols uint16 `json:"cols,omitempty"` // MsgExec init / MsgWinResize payload
}

A MsgExec without Tty continues to take the existing pipe path; pre-v0.1.1 clients keep working unchanged.
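The backward-compatibility claim rests on `omitempty`: a pipe-mode frame serializes without any of the new keys, and a frame from an older client decodes with the new fields at their zero values. A small sketch of that behavior, assuming a reduced `Message` (the `Type` and `Argv` fields, their JSON keys, and the "exec" type string are assumptions for illustration):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Message mirrors the fields named in this issue. Type, Argv, and every JSON
// key other than tty/rows/cols are assumptions made for this sketch.
type Message struct {
	Type string   `json:"type"`
	Argv []string `json:"argv,omitempty"`
	Tty  bool     `json:"tty,omitempty"`
	Rows uint16   `json:"rows,omitempty"`
	Cols uint16   `json:"cols,omitempty"`
}

// encode marshals a message, panicking on error to keep the sketch short.
func encode(m Message) string {
	b, err := json.Marshal(m)
	if err != nil {
		panic(err)
	}
	return string(b)
}

// decode parses one frame into a Message.
func decode(s string) (Message, error) {
	var m Message
	err := json.Unmarshal([]byte(s), &m)
	return m, err
}

func main() {
	// Pipe mode: the new keys are omitted, so the frame is byte-identical
	// to what a pre-v0.1.1 client would send.
	fmt.Println(encode(Message{Type: "exec", Argv: []string{"ls"}}))
	// prints {"type":"exec","argv":["ls"]}

	// A frame from an older client decodes with Tty == false.
	old, _ := decode(`{"type":"exec","argv":["ls"]}`)
	fmt.Println(old.Tty, old.Rows, old.Cols) // prints false 0 0

	// PTY mode: the new keys appear only when set.
	fmt.Println(encode(Message{Type: "exec", Argv: []string{"vim"}, Tty: true, Rows: 24, Cols: 80}))
	// prints {"type":"exec","argv":["vim"],"tty":true,"rows":24,"cols":80}
}
```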

2. Agent server: split runners by mode

New file agent/pty_unix.go (build tag !windows) with runExecPTY:

  • pty.StartWithSize(cmd, &pty.Winsize{Rows: rows, Cols: cols}) (creack/pty) sets Setsid + Setctty and wires the slave fd to the child's stdin/stdout/stderr. Do not call the existing setProcessGroup: it would conflict with the session/ctty setup the pty library already performs.
  • One goroutine: io.Copy(framedWriter{MsgStdout}, master). PTY mode never emits MsgStderr (slave merges both onto a single fd, by design).
  • One goroutine drains the inbound frame channel and dispatches by type:
    • MsgStdin → master.Write(frame.Data)
    • MsgStdinClose → no-op (closing master would deliver SIGHUP; an interactive client doesn't have stdin-EOF semantics)
    • MsgWinResize → pty.Setsize(master, ...); the kernel auto-delivers SIGWINCH to the foreground pgid.
  • On cmd.Wait(): cancel ctx, close master (drains the read pump), wait for I/O goroutines, send MsgExit.
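The inbound dispatch described above can be sketched without a real PTY by abstracting the master as an `io.Writer` and the resize call as a function. Everything here is illustrative: the `Message` shape, the "stdin" and "stdin_close" type strings, and the `dispatchFrames` name are assumptions; in the agent, `resize` would wrap a call such as pty.Setsize.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

// Only "winresize" is given in this issue; the other two values are assumed.
const (
	MsgStdin      = "stdin"
	MsgStdinClose = "stdin_close"
	MsgWinResize  = "winresize"
)

// Message is a reduced, hypothetical frame type for this sketch.
type Message struct {
	Type       string
	Data       []byte
	Rows, Cols uint16
}

// dispatchFrames drains frames until the channel closes. master stands in
// for the PTY master fd; resize stands in for the window-size ioctl wrapper.
func dispatchFrames(frames <-chan Message, master io.Writer, resize func(rows, cols uint16) error) error {
	for f := range frames {
		switch f.Type {
		case MsgStdin:
			if _, err := master.Write(f.Data); err != nil {
				return err
			}
		case MsgStdinClose:
			// Deliberately ignored: closing the master would hang up the child.
		case MsgWinResize:
			if err := resize(f.Rows, f.Cols); err != nil {
				return err
			}
		}
	}
	return nil
}

// demo feeds three frames through dispatchFrames and reports what arrived.
func demo() (written string, rows, cols uint16, err error) {
	frames := make(chan Message, 3)
	frames <- Message{Type: MsgStdin, Data: []byte("ls\n")}
	frames <- Message{Type: MsgStdinClose}
	frames <- Message{Type: MsgWinResize, Rows: 40, Cols: 120}
	close(frames)

	var master bytes.Buffer
	err = dispatchFrames(frames, &master, func(r, c uint16) error {
		rows, cols = r, c
		return nil
	})
	return master.String(), rows, cols, err
}

func main() {
	w, r, c, err := demo()
	fmt.Printf("master got %q, size %dx%d, err=%v\n", w, r, c, err)
}
```

Keeping the dispatch free of direct PTY calls also makes it testable on platforms where no PTY is available.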

agent/pty_other.go (windows): stub returning MsgError "pty not supported on this platform".

agent/agent.go handleConn changes:

  • Decoder goroutine no longer stops on MsgStdinClose; it forwards every frame until EOF. Stop-on-stdin-close logic moves into the pipe runner's pumpStdin (where it already lives — just don't duplicate it in the decoder).
  • After the initial MsgExec, dispatch:
if first.Tty {
    runErr = runExecPTY(execCtx, first.Argv, first.Env, first.Rows, first.Cols, stdinFrames, enc, &encMu)
} else {
    runErr = runExec(execCtx, first.Argv, first.Env, stdinFrames, enc, &encMu)
}
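
The decoder change (forward every frame until EOF instead of stopping at MsgStdinClose) might look roughly like the following. This is a sketch under assumptions: newline-delimited JSON framing via `json.Decoder`, a reduced `Message` type, and the hypothetical names `forwardFrames` and `forwardedTypes`.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"io"
	"strings"
)

// Message is a reduced, hypothetical frame type for this sketch.
type Message struct {
	Type string `json:"type"`
	Data []byte `json:"data,omitempty"`
	Rows uint16 `json:"rows,omitempty"`
	Cols uint16 `json:"cols,omitempty"`
}

// forwardFrames decodes frames from r and forwards each one to out. It does
// not special-case any frame type; it stops only on EOF or a decode error,
// and closes out on return so the consumer can range over it.
func forwardFrames(r io.Reader, out chan<- Message) error {
	defer close(out)
	dec := json.NewDecoder(r)
	for {
		var m Message
		if err := dec.Decode(&m); err != nil {
			if errors.Is(err, io.EOF) {
				return nil
			}
			return err
		}
		out <- m
	}
}

// forwardedTypes runs forwardFrames over input and lists the frame types seen.
func forwardedTypes(input string) ([]string, error) {
	out := make(chan Message)
	errCh := make(chan error, 1)
	go func() { errCh <- forwardFrames(strings.NewReader(input), out) }()
	var types []string
	for m := range out {
		types = append(types, m.Type)
	}
	return types, <-errCh
}

func main() {
	input := `{"type":"stdin","data":"aGk="}
{"type":"stdin_close"}
{"type":"winresize","rows":40,"cols":120}
`
	types, err := forwardedTypes(input)
	fmt.Println(types, err) // prints [stdin stdin_close winresize] <nil>
}
```

The winresize frame after stdin_close still reaches the runner, which is the behavior the PTY path depends on.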

3. Client library: additive API

client/client.go:

type TermSize struct{ Rows, Cols uint16 }

type RunOpts struct {
    Tty         bool
    InitialRows uint16
    InitialCols uint16
    Resize      <-chan TermSize  // optional, caller-owned
}

// Run is unchanged in signature; calls RunWithOpts with zero opts.
func Run(...) (int, error) { return RunWithOpts(..., RunOpts{}) }

func RunWithOpts(ctx, conn, argv, env, stdin, stdout, stderr, opts) (int, error)

Implementation notes:

  • Send MsgExec{Tty, Rows, Cols, ...} as the first frame.
  • If opts.Tty && opts.Resize != nil, spawn a goroutine that forwards each TermSize event as a MsgWinResize frame.
  • The encoder gains two writers post-handshake (stdin pump + winresize forwarder). Add encMu *sync.Mutex to serialize Encode calls — small lock, no real cost, matches the agent-side pattern.
  • In TTY mode, stderr io.Writer is accepted but never written (the agent never emits MsgStderr). Document this clearly in the godoc.
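To illustrate why the mutex matters once two goroutines share one encoder, here is a sketch that wraps `json.Encoder` in a locked type and checks that every frame on the wire still decodes. The issue proposes passing `encMu *sync.Mutex` alongside the encoder; the `lockedEncoder` wrapper below is a different packaging of the same idea, and all names and type strings in it are hypothetical.

```go
package main

import (
	"bytes"
	"encoding/json"
	"errors"
	"fmt"
	"io"
	"sync"
)

// Message is a reduced, hypothetical frame type for this sketch.
type Message struct {
	Type string `json:"type"`
	Data []byte `json:"data,omitempty"`
}

// lockedEncoder serializes Encode calls so frames from different goroutines
// never interleave on the underlying writer.
type lockedEncoder struct {
	mu  sync.Mutex
	enc *json.Encoder
}

func (l *lockedEncoder) Encode(m Message) error {
	l.mu.Lock()
	defer l.mu.Unlock()
	return l.enc.Encode(m)
}

// sendConcurrently simulates the stdin pump and the winresize forwarder
// writing at the same time, then counts how many intact frames were written.
func sendConcurrently(perWriter int) (int, error) {
	var wire bytes.Buffer
	le := &lockedEncoder{enc: json.NewEncoder(&wire)}

	var wg sync.WaitGroup
	for _, typ := range []string{"stdin", "winresize"} {
		wg.Add(1)
		go func(typ string) {
			defer wg.Done()
			for i := 0; i < perWriter; i++ {
				_ = le.Encode(Message{Type: typ, Data: []byte("x")})
			}
		}(typ)
	}
	wg.Wait()

	// Decode everything back; a torn frame would show up as a decode error.
	dec := json.NewDecoder(&wire)
	count := 0
	for {
		var m Message
		if err := dec.Decode(&m); err != nil {
			if errors.Is(err, io.EOF) {
				return count, nil
			}
			return count, err
		}
		count++
	}
}

func main() {
	n, err := sendConcurrently(500)
	fmt.Println(n, err) // prints 1000 <nil>
}
```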

New file client/pty_unix.go (build tag linux || darwin): convenience helper for "I have a local terminal fd":

func RunInteractive(ctx context.Context, conn io.ReadWriteCloser,
    argv []string, env map[string]string, tty, out *os.File) (int, error)

Responsibilities of RunInteractive:

  1. term.IsTerminal(fd) validation.
  2. term.GetSize(fd) for initial rows/cols.
  3. term.MakeRaw(fd) + defer term.Restore(fd, state).
  4. signal.Notify(SIGWINCH) + goroutine that reads new size and pushes to a local Resize channel.
  5. Call RunWithOpts with Tty: true, InitialRows, InitialCols, Resize: localCh.
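Step 4 above (turning SIGWINCH into resize events) can be sketched separately from the terminal handling. The `watchResize` helper and its `getSize` callback are hypothetical; in RunInteractive the signal channel would presumably be registered with signal.Notify(sigCh, syscall.SIGWINCH) and `getSize` would wrap term.GetSize, which returns width before height, so the values need swapping into Rows/Cols order. The sketch is Unix-only because it references syscall.SIGWINCH.

```go
package main

import (
	"context"
	"fmt"
	"os"
	"syscall"
)

type TermSize struct{ Rows, Cols uint16 }

// watchResize emits a fresh TermSize each time a signal arrives on sigCh.
// It owns the returned channel and closes it when ctx is cancelled or sigCh
// is closed. getSize failures are skipped rather than treated as fatal.
func watchResize(ctx context.Context, sigCh <-chan os.Signal, getSize func() (TermSize, error)) <-chan TermSize {
	out := make(chan TermSize, 1)
	go func() {
		defer close(out)
		for {
			select {
			case <-ctx.Done():
				return
			case _, ok := <-sigCh:
				if !ok {
					return
				}
				sz, err := getSize()
				if err != nil {
					continue
				}
				select {
				case out <- sz:
				case <-ctx.Done():
					return
				}
			}
		}
	}()
	return out
}

// demo injects one fake SIGWINCH and returns the size that was forwarded.
func demo() TermSize {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	sigCh := make(chan os.Signal, 1)
	resize := watchResize(ctx, sigCh, func() (TermSize, error) {
		return TermSize{Rows: 50, Cols: 132}, nil // stand-in for a real size query
	})

	sigCh <- syscall.SIGWINCH
	return <-resize
}

func main() {
	fmt.Printf("%+v\n", demo()) // prints {Rows:50 Cols:132}
}
```

Passing the signal channel in, rather than calling signal.Notify inside the helper, keeps the forwarding logic testable without a real terminal.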

client/pty_other.go (windows): stub returning a clear "linux/darwin only" error.

4. Agent's debug client (cmd/client.go)

Add --tty/-t flag and branch:

if tty {
    exitCode, err = client.RunInteractive(ctx, conn, args, nil, os.Stdin, os.Stdout)
} else {
    exitCode, err = client.Run(ctx, conn, args, nil, os.Stdin, os.Stdout, os.Stderr)
}

Smoke test once shipped: cocoon-agent client -t --cid 3 -- vim.

5. Tests

agent/agent_test.go — new tests over loopback TCP (creack/pty opens local /dev/ptmx regardless of transport, so loopback works fine):

  • TestServerPTYIsATTY: sh -c 'tty -s && echo TTY || echo NOTTY'
  • TestServerPTYWinResize: sh -c 'stty size; trap "stty size; exit 0" WINCH; while true; do sleep 0.05; done'
  • TestServerPTYStdinEcho: cat round-trip (cooked-mode echo verifies bidirectional flow)
  • TestServerPTYExitCode: sh -c 'exit 5' with Tty: true

agent/protocol_test.go — round-trip cases for MsgExec with new fields and a MsgWinResize frame.

Coverage target: keep agent package ≥ 88% (current).

6. Downstream follow-ups (separate releases)

6.1 cocoonv2 (~30 lines, ~0.75 day)

cmd/vm/commands.go:

execCmd.Flags().BoolP("tty", "t", false, "allocate a TTY (vim/top/interactive shells)")
execCmd.Flags().BoolP("stdin", "i", false, "keep stdin open (kubectl parity)")

cmd/vm/exec.go — branch on --tty:

if tty {
    if !term.IsTerminal(int(os.Stdin.Fd())) {
        return errors.New("exec: -t requires stdin to be a terminal")
    }
    code, err = client.RunInteractive(ctx, conn, argv, env, os.Stdin, os.Stdout)
} else {
    code, err = client.Run(ctx, conn, argv, env, stdinR, os.Stdout, os.Stderr)
}

Bump cocoon-agent dep to v0.1.1, add golang.org/x/term. Add a unit test for "--tty errors when stdin is not a terminal".

6.2 vk-cocoon (~80 lines, ~1.5 days)

Plumb attach.TTY() and attach.Resize() through Runtime.Exec (interface signature widens with an ExecOpts{Tty, Resize} struct). Provider:

return p.Runtime.Exec(ctx, v.ID, cmd, nil,
    attach.Stdin(), attach.Stdout(), attach.Stderr(),
    vm.ExecOpts{Tty: attach.TTY(), Resize: attach.Resize()})

In cocoon_cli.go, when Tty: true, allocate a local PTY around the cocoon vm exec -t -i ... shell-out (pty.Start(cmd)); copy stdin/stdout via the master; forward attach.Resize() events to pty.Setsize(master, ...). The kernel sends SIGWINCH to the cocoon CLI process, which then re-reads its size and forwards MsgWinResize over vsock to the agent. This is a three-layer PTY chain (vk → CLI → agent), and each layer relies only on standard Unix behavior.

This keeps the rest of vk-cocoon's shell-out runtime intact. A direct cocoon-agent/client import would be cleaner long-term but requires moving dialHybridVsock into a public package and is a separate refactor.

7. Release order

  1. cocoon-agent PR → review → merge → tag v0.1.1 → goreleaser. Verify with cocoon-agent client -t --cid X -- vim.
  2. cocoonv2 patch (bumps cocoon-agent dep). Verify with cocoon vm exec -t <vm> -- vim.
  3. vk-cocoon patch. Verify with kubectl exec -it pod -- vim against a real cluster.

If step 3 fails, steps 1-2 have already independently validated the PTY chain, so the bug is isolated to the vk-cocoon PTY-shell-out wrapper.

8. Open questions

  1. Resize channel ownership — caller-owned (lib does not close). Confirm.
  2. Stderr writer in TTY mode — accepted but ignored, with godoc note. Confirm.
  3. cocoon vm exec -t against non-TTY stdin — error out (kubectl style), do not silently degrade. Confirm.
  4. vk-cocoon path — local-PTY-around-shell-out (recommended, ~80 lines) vs direct lib import (~200 lines + cross-repo refactor). Confirm.

9. Effort estimate

Project              Coding  Tests  Verify+docs  Total
cocoon-agent v0.1.1  1d      0.5d   0.5d         2d
cocoonv2 patch       0.25d   0.25d  0.25d        0.75d
vk-cocoon patch      0.5d    0.5d   0.5d         1.5d

Total: ~4.25 engineer-days (excluding cross-repo review wait time).

10. Dependencies

  • github.com/creack/pty — server-side PTY allocation (no cgo)
  • golang.org/x/term — client-side raw mode + GetSize

11. Acceptance criteria

  • cocoon-agent client -t --cid X -- vim opens vim, ESC + :q quits cleanly, terminal is restored.
  • cocoon-agent client -t --cid X -- top shows live updates, resizing the host terminal redraws correctly.
  • make lint (linux + darwin) zero issues; make test -race passes; agent package coverage ≥ 88%.
  • MsgExec without Tty still takes the pipe path (regression test for the existing exec session covers this).
  • Released as v0.1.1 with goreleaser artifacts.
