From e3fdc90ed5b3de2db39b786ee779918922762919 Mon Sep 17 00:00:00 2001 From: JohnnyVicious Date: Mon, 13 Apr 2026 07:43:08 +0200 Subject: [PATCH 1/3] fix(rescue): inline forwarding into the slash command (closes #17) The /opencode:rescue --background path was broken because plugin-scoped subagents cannot use permissionMode: bypassPermissions, and Claude Code does not propagate the command's allowed-tools grant to a subagent it spawns. Background-mode subagents have no way to prompt for Bash permission, so the forwarder was auto-denied and returned an error instead of OpenCode's output. Eliminate the subagent layer entirely. The command's own context already has Bash(node:*), so commands/rescue.md now invokes opencode-companion.mjs task ... directly. --background maps 1:1 to the companion's existing detach-a-worker mode, which removes the ambiguity between "Claude backgrounds the subagent" and "the companion detaches a worker" that has plagued every related issue (#16, #37). The command keeps context: fork so OpenCode stdout still lands in a forked context rather than the parent thread. Deletes: - plugins/opencode/agents/opencode-rescue.md (sole-purpose forwarder) - plugins/opencode/skills/opencode-runtime/ (only used by the deleted subagent; marked user-invocable: false) README.md updated to drop the subagent reference from the slash-command list and the project structure tree. All 172 tests pass. --- README.md | 3 +- plugins/opencode/agents/opencode-rescue.md | 45 ----------------- plugins/opencode/commands/rescue.md | 48 +++++++++---------- .../opencode/skills/opencode-runtime/SKILL.md | 42 ---------------- 4 files changed, 25 insertions(+), 113 deletions(-) delete mode 100644 plugins/opencode/agents/opencode-rescue.md delete mode 100644 plugins/opencode/skills/opencode-runtime/SKILL.md diff --git a/README.md b/README.md index d48ff80..a8130fa 100644 --- a/README.md +++ b/README.md @@ -99,7 +99,7 @@ To check your configured providers: - `/opencode:review` -- Normal OpenCode code review (read-only). Supports `--base `, `--pr `, `--model `, `--free`, `--wait`, and `--background`. - `/opencode:adversarial-review` -- Steerable review that challenges implementation and design decisions. Supports `--base `, `--pr `, `--model `, `--free`, `--wait`, `--background`, and custom focus text. -- `/opencode:rescue` -- Delegates a task to OpenCode via the `opencode:opencode-rescue` subagent. Supports `--model`, `--free`, `--agent`, `--resume`, `--fresh`, `--worktree`, `--wait`, and `--background`. +- `/opencode:rescue` -- Delegates a task to OpenCode via a direct call to the companion's `task` runtime. Supports `--model`, `--free`, `--agent`, `--resume`, `--fresh`, `--worktree`, and `--background`. Foreground is the default; `--background` detaches a worker and returns a job id you can poll with `/opencode:status`. - `/opencode:status` -- Shows running/recent OpenCode jobs for the current repo. - `/opencode:result` -- Shows final output for a finished job, including OpenCode session ID for resuming. - `/opencode:cancel` -- Cancels an active OpenCode job. @@ -185,7 +185,6 @@ opencode-plugin-cc/ ├── .github/workflows/ci.yml # GitHub Actions CI ├── plugins/opencode/ │ ├── .claude-plugin/plugin.json # Plugin metadata -│ ├── agents/opencode-rescue.md # Rescue subagent definition │ ├── commands/ # 7 slash commands │ │ ├── review.md │ │ ├── adversarial-review.md diff --git a/plugins/opencode/agents/opencode-rescue.md b/plugins/opencode/agents/opencode-rescue.md deleted file mode 100644 index a4701a2..0000000 --- a/plugins/opencode/agents/opencode-rescue.md +++ /dev/null @@ -1,45 +0,0 @@ ---- -name: opencode-rescue -description: Proactively use when Claude Code is stuck, wants a second implementation or diagnosis pass, needs a deeper root-cause investigation, or should hand a substantial coding task to OpenCode through the shared runtime -model: sonnet -tools: Bash -skills: - - opencode-runtime - - opencode-prompting ---- - -You are a thin forwarding wrapper around the OpenCode companion task runtime. - -Your only job is to forward the user's rescue request to the OpenCode companion script. Do not do anything else. - -Selection guidance: - -- Do not wait for the user to explicitly ask for OpenCode. Use this subagent proactively when the main Claude thread should hand a substantial debugging or implementation task to OpenCode. -- Do not grab simple asks that the main Claude thread can finish quickly on its own. - -Forwarding rules: - -- Use exactly one `Bash` call to invoke `node "${CLAUDE_PLUGIN_ROOT}/scripts/opencode-companion.mjs" task ...`. -- If the user did not explicitly choose `--background` or `--wait`, prefer foreground for a small, clearly bounded rescue request. -- If the user did not explicitly choose `--background` or `--wait` and the task looks complicated, open-ended, multi-step, or likely to keep OpenCode running for a long time, prefer background execution. -- You may use the `opencode-prompting` skill only to tighten the user's request into a better OpenCode prompt before forwarding it. -- Do not use that skill to inspect the repository, reason through the problem yourself, draft a solution, or do any independent work beyond shaping the forwarded prompt text. -- Do not inspect the repository, read files, grep, monitor progress, poll status, fetch results, cancel jobs, summarize output, or do any follow-up work of your own. -- Do not call `review`, `adversarial-review`, `status`, `result`, or `cancel`. This subagent only forwards to `task`. -- Leave `--agent` unset unless the user explicitly requests a specific agent (build or plan). -- Leave model unset by default. Only add `--model` or `--free` when the user explicitly asks for a specific model or a free-tier pick. `--free` and `--model` are mutually exclusive. -- Treat `--agent `, `--model `, and `--free` as runtime controls and do not include them in the task text you pass through. -- If the request includes `--worktree`, pass `--worktree` through to `task`. This runs OpenCode in an isolated git worktree instead of editing the working directory in-place. -- Default to a write-capable OpenCode run by adding `--write` unless the user explicitly asks for read-only behavior or only wants review, diagnosis, or research without edits. -- Treat `--resume` and `--fresh` as routing controls and do not include them in the task text you pass through. -- `--resume` means add `--resume-last`. -- `--fresh` means do not add `--resume-last`. -- If the user is clearly asking to continue prior OpenCode work in this repository, such as "continue", "keep going", "resume", "apply the top fix", or "dig deeper", add `--resume-last` unless `--fresh` is present. -- Otherwise forward the task as a fresh `task` run. -- Preserve the user's task text as-is apart from stripping routing flags. -- Return the stdout of the `opencode-companion` command exactly as-is. -- If the Bash call fails or OpenCode cannot be invoked, return nothing. - -Response style: - -- Do not add commentary before or after the forwarded `opencode-companion` output. diff --git a/plugins/opencode/commands/rescue.md b/plugins/opencode/commands/rescue.md index 2545680..ff8de44 100644 --- a/plugins/opencode/commands/rescue.md +++ b/plugins/opencode/commands/rescue.md @@ -1,27 +1,25 @@ --- -description: Delegate investigation, an explicit fix request, or follow-up rescue work to the OpenCode rescue subagent -argument-hint: "[--background|--wait] [--worktree] [--resume|--fresh] [--model | --free] [--agent ] [what OpenCode should investigate, solve, or continue]" +description: Delegate investigation, an explicit fix request, or follow-up rescue work to OpenCode +argument-hint: "[--background] [--worktree] [--resume|--fresh] [--model | --free] [--agent ] [what OpenCode should investigate, solve, or continue]" context: fork allowed-tools: Bash(node:*), AskUserQuestion --- -Route this request to the `opencode:opencode-rescue` subagent. -The final user-visible response must be OpenCode's output verbatim. +Forward this request to the OpenCode companion's `task` runtime via a single Bash call. +The final user-visible response must be the companion's stdout verbatim — no commentary, summary, or paraphrase before or after it. Raw user request: $ARGUMENTS Execution mode: -- If the request includes `--background`, run the `opencode:opencode-rescue` subagent in the background. -- If the request includes `--wait`, run the `opencode:opencode-rescue` subagent in the foreground. -- If neither flag is present, default to foreground. -- `--background` and `--wait` are execution flags for Claude Code. Do not forward them to `task`, and do not treat them as part of the natural-language task text. -- `--model`, `--free`, and `--agent` are runtime-selection flags. Preserve them for the forwarded `task` call, but do not treat them as part of the natural-language task text. `--free` tells the companion to pick a random first-party `opencode/*` free-tier model from `opencode models`; it is restricted to `opencode/*` because OpenRouter free models have inconsistent tool-use support. `--free` is mutually exclusive with `--model`. -- `--worktree` is an isolation flag. Preserve it for the forwarded `task` call, but do not treat it as part of the natural-language task text. When present, OpenCode runs in an isolated git worktree instead of editing the working directory in-place. -- If the request includes `--resume`, do not ask whether to continue. The user already chose. -- If the request includes `--fresh`, do not ask whether to continue. The user already chose. -- Otherwise, before starting OpenCode, check for a resumable rescue session from this Claude session by running: +- Default to foreground: run `node "${CLAUDE_PLUGIN_ROOT}/scripts/opencode-companion.mjs" task ...` and wait for it to finish. The Bash call must run in the foreground — never set `run_in_background: true` on the Bash tool, even when the user asks for `--background`. +- If the request includes `--background`, append `--background` to the `task` invocation. The companion will spawn a detached worker and return a job id immediately; you then return that job id (with the `Check /opencode:status for progress.` line the companion prints) verbatim. +- `--background` is the only execution-mode flag. Strip it from the natural-language task text before forwarding. `--wait` is accepted as a no-op alias for the default (foreground) mode and must also be stripped from the task text — do not forward it to `task`. +- `--model`, `--free`, and `--agent` are runtime-selection flags. Strip them from the task text and forward each to `task` unchanged. `--free` tells the companion to pick a random first-party `opencode/*` free-tier model from `opencode models`; it is restricted to `opencode/*` because OpenRouter free models have inconsistent tool-use support. `--free` is mutually exclusive with `--model`. +- `--worktree` is an isolation flag. Strip it from the task text and forward it to `task`. When present, OpenCode runs in an isolated git worktree instead of editing the working directory in-place. +- `--resume` and `--fresh` are routing controls. Strip them from the task text. `--resume` becomes `--resume-last` on the forwarded `task` call. `--fresh` means do not add `--resume-last`, regardless of how the natural-language request reads. +- If neither `--resume` nor `--fresh` is present, before starting OpenCode, check for a resumable rescue session from this Claude session by running: ```bash node "${CLAUDE_PLUGIN_ROOT}/scripts/opencode-companion.mjs" task-resume-candidate --json @@ -33,20 +31,22 @@ node "${CLAUDE_PLUGIN_ROOT}/scripts/opencode-companion.mjs" task-resume-candidat - `Start a new OpenCode session` - If the user is clearly giving a follow-up instruction such as "continue", "keep going", "resume", "apply the top fix", or "dig deeper", put `Continue current OpenCode session (Recommended)` first. - Otherwise put `Start a new OpenCode session (Recommended)` first. -- If the user chooses continue, add `--resume` before routing to the subagent. -- If the user chooses a new session, add `--fresh` before routing to the subagent. -- If the helper reports `available: false`, do not ask. Route normally. +- If the user chooses continue, add `--resume-last` to the forwarded `task` call. +- If the user chooses a new session, do not add `--resume-last`. +- If the helper reports `available: false`, do not ask. Forward normally. -Operating rules: +Forwarding rules: -- The subagent is a thin forwarder only. It should use one `Bash` call to invoke `node "${CLAUDE_PLUGIN_ROOT}/scripts/opencode-companion.mjs" task ...` and return that command's stdout as-is. -- Return the OpenCode companion stdout verbatim to the user. -- Do not paraphrase, summarize, rewrite, or add commentary before or after it. -- Do not ask the subagent to inspect files, monitor progress, poll `/opencode:status`, fetch `/opencode:result`, call `/opencode:cancel`, summarize output, or do follow-up work of its own. +- Use exactly one foreground Bash call to invoke `node "${CLAUDE_PLUGIN_ROOT}/scripts/opencode-companion.mjs" task ...`. +- Pass the user's task text as a single positional argument to `task`, after all flags. Preserve it as-is apart from stripping the routing flags listed above. +- If the user is clearly asking to continue prior OpenCode work in this repository (such as "continue", "keep going", "resume", "apply the top fix", or "dig deeper") and `--fresh` is not present, add `--resume-last` even if neither `--resume` nor the resume prompt above produced one. +- Rescue is always write-capable. The companion's `task` runtime defaults to write mode and has no read-only switch. If the user wants a read-only diagnosis, point them at `/opencode:review` or `/opencode:adversarial-review` instead. - Leave `--agent` unset unless the user explicitly asks for a specific agent (build or plan). -- Leave the model unset unless the user explicitly asks for a specific model or `--free`. -- Leave `--resume` and `--fresh` in the forwarded request. The subagent handles that routing when it builds the `task` command. -- If the helper reports that OpenCode is missing or unauthenticated, stop and tell the user to run `/opencode:setup`. +- Leave the model unset unless the user explicitly asks for a specific model or `--free`. `--free` and `--model` are mutually exclusive — if both are present, return nothing and tell the user to pick one. +- Return the stdout of the `task` command exactly as-is. +- Do not paraphrase, summarize, rewrite, or add commentary before or after it. +- Do not inspect the repository, read files, grep, monitor progress, poll `/opencode:status`, fetch `/opencode:result`, call `/opencode:cancel`, summarize output, or do any follow-up work of your own. +- If the Bash call fails or OpenCode cannot be invoked, return the failure stderr verbatim. If the helper reports that OpenCode is missing or unauthenticated, stop and tell the user to run `/opencode:setup`. - If the user did not supply a request, check for a saved review from `/opencode:review` or `/opencode:adversarial-review`: ```bash diff --git a/plugins/opencode/skills/opencode-runtime/SKILL.md b/plugins/opencode/skills/opencode-runtime/SKILL.md deleted file mode 100644 index 192caa3..0000000 --- a/plugins/opencode/skills/opencode-runtime/SKILL.md +++ /dev/null @@ -1,42 +0,0 @@ ---- -name: opencode-runtime -description: Internal helper contract for calling the opencode-companion runtime from Claude Code -user-invocable: false ---- - -# OpenCode Runtime - -Use this skill only inside the `opencode:opencode-rescue` subagent. - -Primary helper: -- `node "${CLAUDE_PLUGIN_ROOT}/scripts/opencode-companion.mjs" task ""` - -Execution rules: -- The rescue subagent is a forwarder, not an orchestrator. Its only job is to invoke `task` once and return that stdout unchanged. -- Prefer the helper over hand-rolled `git`, direct OpenCode CLI strings, or any other Bash activity. -- Do not call `setup`, `review`, `adversarial-review`, `status`, `result`, or `cancel` from `opencode:opencode-rescue`. -- Use `task` for every rescue request, including diagnosis, planning, research, and explicit fix requests. -- You may use the `opencode-prompting` skill to rewrite the user's request into a tighter OpenCode prompt before the single `task` call. -- That prompt drafting is the only Claude-side work allowed. Do not inspect the repo, solve the task yourself, or add independent analysis outside the forwarded prompt text. -- Leave `--agent` unset unless the user explicitly requests a specific agent (build or plan). -- Leave model unset by default. Add `--model` only when the user explicitly asks for one, or `--free` when they explicitly ask for a free-tier pick. `--free` and `--model` are mutually exclusive. - -Command selection: -- Use exactly one `task` invocation per rescue handoff. -- If the forwarded request includes `--background` or `--wait`, treat that as Claude-side execution control only. Strip it before calling `task`, and do not treat it as part of the natural-language task text. -- If the forwarded request includes `--model`, pass it through to `task`. -- If the forwarded request includes `--free`, pass it through to `task`. The companion will shell out to `opencode models`, filter for first-party `opencode/*` free-tier entries (`:free` or `-free`), and pick one at random. `--free` is restricted to `opencode/*` because OpenRouter free models have inconsistent tool-use support. -- If the forwarded request includes both `--free` and `--model`, do not invoke `task` — return nothing, because the companion will reject the combination. -- If the forwarded request includes `--agent`, pass it through to `task`. -- If the forwarded request includes `--worktree`, pass it through to `task`. This runs OpenCode in a disposable git worktree so the user can keep or discard the change after the run finishes. -- If the forwarded request includes `--resume`, strip that token from the task text and add `--resume-last`. -- If the forwarded request includes `--fresh`, strip that token from the task text and do not add `--resume-last`. -- `--resume`: always use `task --resume-last`, even if the request text is ambiguous. -- `--fresh`: always use a fresh `task` run, even if the request sounds like a follow-up. - -Safety rules: -- Default to write-capable OpenCode work in `opencode:opencode-rescue` unless the user explicitly asks for read-only behavior. -- Preserve the user's task text as-is apart from stripping routing flags. -- Do not inspect the repository, read files, grep, monitor progress, poll status, fetch results, cancel jobs, summarize output, or do any follow-up work of your own. -- Return the stdout of the `task` command exactly as-is. -- If the Bash call fails or OpenCode cannot be invoked, return nothing. From 9a177ee94f160d2daf2a09f195fdd9b2e9c17911 Mon Sep 17 00:00:00 2001 From: JohnnyVicious Date: Mon, 13 Apr 2026 10:11:06 +0200 Subject: [PATCH 2/3] fix(reaper): skip SIGTERM when tracked jobs are in-flight MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit reapServerIfOurs previously killed the plugin-spawned OpenCode server purely on uptime (`now - startedAt >= 5 min`), with no check for in-flight tracked jobs. If a SessionEnd fired while a long rescue or review was mid-`sendPrompt`, the reaper SIGTERM'd the server, every open socket closed, and the caller saw bare `terminated` from undici with no matching timeout error in our own code path. Fix: - Rename the `idleMs` local to `uptimeMs` to match what the code actually computes (the original name was misleading — it was never true idle time). - Before SIGTERM, scan state.jobs for any `running` entry whose companion PID is still alive (via the shared token-aware isProcessAlive). If any exist, skip the reap. - Stale `running` entries from a crashed companion are filtered out automatically (their PID is dead), so they cannot permanently block reaping. - If loading state.jobs fails, fail safe: assume in-flight work may exist and leave the server alive. The next SessionEnd will retry. session-lifecycle-hook.mjs: reorder SessionEnd steps so orphaned-job cleanup runs BEFORE the reap. Without this, a stale `running` entry whose companion process died abnormally would not get marked failed until after the reaper had already looked at it, and the reaper would skip on the first SessionEnd post-crash. Orphan cleanup first, then reap, means the first SessionEnd both cleans and reaps. Tests: new tests/reaper.test.mjs covers four scenarios by spawning a throwaway `sleep` child as the "server" PID and driving reapServerIfOurs directly: - live in-flight job → reap skipped, sleeper still alive, state intact - no running jobs → reap proceeds, sleeper SIGTERM'd, state cleared - stale running job (dead PID) → reap proceeds (no permanent block) - uptime below threshold → reap skipped, sleeper alive 176/176 tests pass. Discovered while investigating the failure mode behind the #17 inlining work: a /opencode:adversarial-review background run died at 4m35s with the exact `terminated` error this bug would produce. The specific failure was ruled out by PID evidence (the running OpenCode server was not plugin-tracked), but the same code path is a latent footgun for any user whose server IS plugin-spawned. --- .../opencode/scripts/lib/opencode-server.mjs | 59 ++++- .../scripts/session-lifecycle-hook.mjs | 21 +- tests/reaper.test.mjs | 232 ++++++++++++++++++ 3 files changed, 292 insertions(+), 20 deletions(-) create mode 100644 tests/reaper.test.mjs diff --git a/plugins/opencode/scripts/lib/opencode-server.mjs b/plugins/opencode/scripts/lib/opencode-server.mjs index 456d37f..c2bb308 100644 --- a/plugins/opencode/scripts/lib/opencode-server.mjs +++ b/plugins/opencode/scripts/lib/opencode-server.mjs @@ -24,7 +24,8 @@ import os from "node:os"; import { spawn } from "node:child_process"; import fs from "node:fs"; import path from "node:path"; -import { platformShellOption } from "./process.mjs"; +import { platformShellOption, isProcessAlive as isProcessAliveWithToken } from "./process.mjs"; +import { loadState } from "./state.mjs"; const DEFAULT_PORT = 4096; const DEFAULT_HOST = "127.0.0.1"; @@ -330,16 +331,17 @@ export async function connect(opts = {}) { /** * Reap the plugin-spawned OpenCode server on SessionEnd. * - * Only kills what we started (tracked via server.json `lastServerPid`). - * Guards against killing a server the user started independently by - * re-checking isServerRunning after the kill attempt — if something - * else owns the port, leave it alone. + * Only kills what we started (tracked via server.json `lastServerPid`), + * and only when the plugin has no in-flight tracked-jobs. The previous + * implementation gated solely on `now - startedAt > 5 min`, which would + * SIGTERM the OpenCode server out from under an actively-streaming + * `sendPrompt` call if a SessionEnd happened to fire during a long + * rescue or review. Callers would see the socket drop as `terminated` + * with no timeout error in our own code path. * - * Defense-in-depth: if the server has been running longer than - * SERVER_REAP_IDLE_TIMEOUT (5 min) without a successful health check - * from a plugin client, it is also considered orphanable even if the - * PID is still alive (user may have killed the Claude session while the - * server sat idle). + * The guard is now two conditions: + * 1. server uptime is above SERVER_REAP_IDLE_TIMEOUT, and + * 2. no tracked job is in `running` state with a live companion PID. * * @param {string} workspacePath * @param {{ port?: number, host?: string }} [opts] @@ -360,8 +362,14 @@ export async function reapServerIfOurs(workspacePath, opts = {}) { return false; } - const idleMs = startedAt ? Date.now() - startedAt : Infinity; - if (idleMs < SERVER_REAP_IDLE_TIMEOUT) return false; + const uptimeMs = startedAt ? Date.now() - startedAt : Infinity; + if (uptimeMs < SERVER_REAP_IDLE_TIMEOUT) return false; + + // Do not kill the server if any tracked job is still running against + // it. Orphaned `running` jobs (process crashed without marking failed) + // are filtered out via the shared token-aware liveness check, so a + // stale state entry cannot permanently block reaping. + if (hasInFlightTrackedJob(workspacePath)) return false; try { process.kill(pid, "SIGTERM"); @@ -379,3 +387,30 @@ export async function reapServerIfOurs(workspacePath, opts = {}) { return false; } + +/** + * Returns true if any tracked job is in `running` state with a + * companion PID that is still alive. Stale `running` entries from a + * crashed companion are treated as not-in-flight so the reaper can make + * progress. + * @param {string} workspacePath + * @returns {boolean} + */ +function hasInFlightTrackedJob(workspacePath) { + let jobsState; + try { + jobsState = loadState(workspacePath); + } catch { + // If the job state file is unreadable, fail safe: assume in-flight + // work may exist and keep the server alive. The next SessionEnd + // will retry. + return true; + } + const jobs = Array.isArray(jobsState?.jobs) ? jobsState.jobs : []; + for (const job of jobs) { + if (job?.status !== "running") continue; + if (!job.pid) continue; + if (isProcessAliveWithToken(job.pid, job.pidStartToken)) return true; + } + return false; +} diff --git a/plugins/opencode/scripts/session-lifecycle-hook.mjs b/plugins/opencode/scripts/session-lifecycle-hook.mjs index 7312fd2..9c4a592 100644 --- a/plugins/opencode/scripts/session-lifecycle-hook.mjs +++ b/plugins/opencode/scripts/session-lifecycle-hook.mjs @@ -22,14 +22,11 @@ async function main() { } if (event === "SessionEnd") { - // Reap the plugin-spawned OpenCode server (if idle > 5 min). - try { - await reapServerIfOurs(workspace); - } catch { - // best-effort - } - - // Clean up: check for any orphaned running jobs and mark them as failed + // Clean up orphaned running jobs first. This has to run before + // reapServerIfOurs because the reaper now checks for in-flight + // tracked jobs before killing the server, and stale `running` + // entries from a crashed companion would otherwise keep the server + // alive forever. const state = loadState(workspace); const runningJobs = (state.jobs ?? []).filter((j) => j.status === "running"); @@ -50,6 +47,14 @@ async function main() { } } } + + // Then reap the plugin-spawned OpenCode server (if uptime > 5 min + // AND no in-flight tracked jobs are running against it). + try { + await reapServerIfOurs(workspace); + } catch { + // best-effort + } } } diff --git a/tests/reaper.test.mjs b/tests/reaper.test.mjs new file mode 100644 index 0000000..e791550 --- /dev/null +++ b/tests/reaper.test.mjs @@ -0,0 +1,232 @@ +// Tests for reapServerIfOurs — specifically the in-flight tracked-job +// guard that was added after a background review was killed mid-stream +// by a SessionEnd reap (issue: foreground/background SIGTERM race that +// surfaced as bare `terminated` from undici). +// +// These tests do not touch a real OpenCode server. Instead they: +// +// 1. Spawn a throwaway `sleep` child as the "server" PID so the +// reaper has a live but kill-safe process to target. +// 2. Write server.json and state.json directly under +// CLAUDE_PLUGIN_DATA to set up the tracked state. +// 3. Use a free high port so `isServerRunning` returns false on the +// post-kill verification step. +// +// Tests the matrix: +// - in-flight running job with a live companion PID → reap skipped, +// sleeper still alive +// - no running jobs → reap proceeds, sleeper got SIGTERM +// - stale running job whose companion PID is dead → reap proceeds +// (the stale entry must not block reaping forever) + +import { describe, it, beforeEach, afterEach } from "node:test"; +import assert from "node:assert/strict"; +import fs from "node:fs"; +import os from "node:os"; +import path from "node:path"; +import crypto from "node:crypto"; +import { spawn } from "node:child_process"; + +import { reapServerIfOurs } from "../plugins/opencode/scripts/lib/opencode-server.mjs"; +import { saveState } from "../plugins/opencode/scripts/lib/state.mjs"; + +// PID 999999 is well above pid_max on typical Linux/macOS and is a +// reliably-dead sentinel. +const DEAD_PID = 999_999; + +// Free high port nothing listens on. isServerRunning(TEST_HOST, this) +// should return false so the reaper's post-kill verification can +// conclude the port is empty. +const UNUSED_PORT = 24_099; +const TEST_HOST = "127.0.0.1"; + +function workspaceHash(workspacePath) { + return crypto.createHash("sha256").update(workspacePath).digest("hex").slice(0, 16); +} + +function serverStatePathFor(pluginData, workspace) { + return path.join(pluginData, "state", workspaceHash(workspace), "server.json"); +} + +function writeServerState(pluginData, workspace, state) { + const p = serverStatePathFor(pluginData, workspace); + fs.mkdirSync(path.dirname(p), { recursive: true }); + fs.writeFileSync(p, JSON.stringify(state, null, 2)); +} + +function readServerState(pluginData, workspace) { + const p = serverStatePathFor(pluginData, workspace); + if (!fs.existsSync(p)) return {}; + return JSON.parse(fs.readFileSync(p, "utf8")); +} + +function spawnSleeper() { + // 30s is long enough that the reaper's 1s post-kill poll will catch + // a live process in the in-flight test, but if the kill fires the + // sleeper should exit well within that window. + const proc = spawn("sleep", ["30"], { stdio: "ignore", detached: true }); + proc.unref(); + return proc; +} + +function isPidAlive(pid) { + try { + process.kill(pid, 0); + return true; + } catch { + return false; + } +} + +async function waitUntilDead(pid, timeoutMs = 3_000) { + const deadline = Date.now() + timeoutMs; + while (Date.now() < deadline) { + if (!isPidAlive(pid)) return true; + await new Promise((r) => setTimeout(r, 50)); + } + return !isPidAlive(pid); +} + +describe("reapServerIfOurs in-flight guard", () => { + let workspace; + let pluginData; + let previousPluginData; + let sleeper; + + beforeEach(() => { + workspace = fs.mkdtempSync(path.join(os.tmpdir(), "opencode-reaper-ws-")); + pluginData = fs.mkdtempSync(path.join(os.tmpdir(), "opencode-reaper-data-")); + previousPluginData = process.env.CLAUDE_PLUGIN_DATA; + process.env.CLAUDE_PLUGIN_DATA = pluginData; + sleeper = spawnSleeper(); + }); + + afterEach(async () => { + if (sleeper && sleeper.pid && isPidAlive(sleeper.pid)) { + try { process.kill(sleeper.pid, "SIGKILL"); } catch {} + } + if (previousPluginData == null) delete process.env.CLAUDE_PLUGIN_DATA; + else process.env.CLAUDE_PLUGIN_DATA = previousPluginData; + try { fs.rmSync(workspace, { recursive: true, force: true }); } catch {} + try { fs.rmSync(pluginData, { recursive: true, force: true }); } catch {} + }); + + it("skips reap when a live tracked job is still running", async () => { + // Server tracked, uptime well over the 5-min threshold. + writeServerState(pluginData, workspace, { + lastServerPid: sleeper.pid, + lastServerStartedAt: Date.now() - 10 * 60 * 1000, + }); + // In-flight job pointing at the current test process (which is + // definitely alive) so the in-flight guard triggers. + saveState(workspace, { + config: {}, + jobs: [ + { + id: "inflight-1", + type: "task", + status: "running", + pid: process.pid, + createdAt: new Date().toISOString(), + updatedAt: new Date().toISOString(), + }, + ], + }); + + const reaped = await reapServerIfOurs(workspace, { + host: TEST_HOST, + port: UNUSED_PORT, + }); + + assert.equal(reaped, false, "reaper should skip when in-flight jobs exist"); + assert.equal(isPidAlive(sleeper.pid), true, "sleeper must still be alive"); + + // Server state must not be cleared when reap is skipped. + const after = readServerState(pluginData, workspace); + assert.equal(after.lastServerPid, sleeper.pid); + }); + + it("proceeds when no running jobs reference the server", async () => { + writeServerState(pluginData, workspace, { + lastServerPid: sleeper.pid, + lastServerStartedAt: Date.now() - 10 * 60 * 1000, + }); + // No running jobs — only a completed one, which must not block. + saveState(workspace, { + config: {}, + jobs: [ + { + id: "done-1", + type: "task", + status: "completed", + pid: null, + createdAt: new Date().toISOString(), + updatedAt: new Date().toISOString(), + }, + ], + }); + + const reaped = await reapServerIfOurs(workspace, { + host: TEST_HOST, + port: UNUSED_PORT, + }); + + // We asked the reaper to SIGTERM the sleeper. Give it a moment to die. + const died = await waitUntilDead(sleeper.pid); + assert.equal(died, true, "sleeper must have been killed by the reaper"); + assert.equal(reaped, true, "reaper must report success when nothing was in-flight"); + + // Server state must be cleared after a successful reap. + const after = readServerState(pluginData, workspace); + assert.equal(after.lastServerPid, null); + assert.equal(after.lastServerStartedAt, null); + }); + + it("treats a stale `running` job with a dead PID as not-in-flight", async () => { + writeServerState(pluginData, workspace, { + lastServerPid: sleeper.pid, + lastServerStartedAt: Date.now() - 10 * 60 * 1000, + }); + // A previous companion crashed without marking its job failed. + // The stale entry must not permanently block reaping. + saveState(workspace, { + config: {}, + jobs: [ + { + id: "stale-1", + type: "task", + status: "running", + pid: DEAD_PID, + createdAt: new Date().toISOString(), + updatedAt: new Date().toISOString(), + }, + ], + }); + + const reaped = await reapServerIfOurs(workspace, { + host: TEST_HOST, + port: UNUSED_PORT, + }); + + const died = await waitUntilDead(sleeper.pid); + assert.equal(died, true, "sleeper must be killed — the stale job must not block"); + assert.equal(reaped, true); + }); + + it("does not reap when uptime is below the 5-minute threshold", async () => { + writeServerState(pluginData, workspace, { + lastServerPid: sleeper.pid, + // Fresh server — uptime < 5 min. + lastServerStartedAt: Date.now() - 30 * 1000, + }); + saveState(workspace, { config: {}, jobs: [] }); + + const reaped = await reapServerIfOurs(workspace, { + host: TEST_HOST, + port: UNUSED_PORT, + }); + + assert.equal(reaped, false); + assert.equal(isPidAlive(sleeper.pid), true, "fresh server must not be killed"); + }); +}); From 7e0f235cfcb1e479b092b9a46c4ea138595626be Mon Sep 17 00:00:00 2001 From: JohnnyVicious Date: Mon, 13 Apr 2026 14:31:13 +0200 Subject: [PATCH 3/3] fix(rescue): route task payload through safe-command.mjs bridge MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Addresses three review comments on PR #58 that blocked the rescue inlining from merging. 1. Shell-injection concern (rescue.md:41). The previous inlined command asked Claude to construct a Bash(node:*) invocation directly from user-controlled task text with no escaping rules. Even though Claude was told to pass the text as a single positional argument, nothing prevented shell metacharacters in the text from being interpreted before opencode-companion.mjs received them. Fix: extend safe-command.mjs with a new `task` branch, and rewrite rescue.md to pipe the raw $ARGUMENTS through a single-quoted heredoc into that bridge. The heredoc is the security boundary — no byte inside can be expanded by the shell. safe-command.mjs then: - Peels recognized `--flag [value]` tokens off the front of the payload one at a time (using a whitespace regex, NOT the existing shell-style tokenizer, because natural-language task text with apostrophes like "what's broken" would throw under shell rules). - Validates each value-flag against a strict character class or allowlist (--model: [A-Za-z0-9._/:-]+; --agent: build|plan). - Translates user-facing `--resume`/`--resume-last` to the companion-native `--resume-last`, and drops `--wait`/`--fresh` which are documented no-ops at this layer. - Rejects `--free` combined with `--model` (mutually exclusive). - Rejects any other `--`-prefixed token as an unknown flag. - Forwards the remaining untouched payload as a single verbatim task text positional argument. The existing setup/cancel/result/status branches are unchanged. 2. "Exactly one Bash call" rule conflicted with helper calls (rescue.md:40). The old wording said "use exactly one Bash call" when the command file still needs to call `task-resume-candidate`, `last-review`, and `last-review --content` before the final task invocation. Rewrote rescue.md to split helper calls (preliminary information probes, separately allowed) from the final bridged `safe-command.mjs task` call (exactly one, always foreground). 3. --wait dropped from argument-hint and README (rescue.md:3). The earlier commit removed --wait from both, but the command contract still accepts it as a documented no-op alias. Restored --wait in argument-hint, README slash-command description, and the bridge's recognized-flags list so the compatibility behavior stays discoverable. safe-command.mjs is also now testable: buildForwardArgs and helpers are exported, and the script body is gated behind a `fileURLToPath(import.meta.url) === process.argv[1]` main-guard so importing the module from a test does not swallow stdin or exit. Tests: new tests/safe-command-task.test.mjs (25 cases) exercises every branch of parseTaskArgs directly through buildForwardArgs: - happy paths for every recognized flag and combinations - task text shell-insulation (apostrophes, backticks, $(), |, ;, &, quotes, newlines, literal `--verbose` in middle of text) - internal-whitespace preservation - rejection paths (unknown flags, missing values, invalid model/agent values, --free + --model) - empty-payload passthrough (companion rejects empty task text at its own layer) 201/201 tests pass. --- README.md | 2 +- plugins/opencode/commands/rescue.md | 100 ++++++----- plugins/opencode/scripts/safe-command.mjs | 162 +++++++++++++++-- tests/safe-command-task.test.mjs | 205 ++++++++++++++++++++++ 4 files changed, 411 insertions(+), 58 deletions(-) create mode 100644 tests/safe-command-task.test.mjs diff --git a/README.md b/README.md index a8130fa..07d1484 100644 --- a/README.md +++ b/README.md @@ -99,7 +99,7 @@ To check your configured providers: - `/opencode:review` -- Normal OpenCode code review (read-only). Supports `--base `, `--pr `, `--model `, `--free`, `--wait`, and `--background`. - `/opencode:adversarial-review` -- Steerable review that challenges implementation and design decisions. Supports `--base `, `--pr `, `--model `, `--free`, `--wait`, `--background`, and custom focus text. -- `/opencode:rescue` -- Delegates a task to OpenCode via a direct call to the companion's `task` runtime. Supports `--model`, `--free`, `--agent`, `--resume`, `--fresh`, `--worktree`, and `--background`. Foreground is the default; `--background` detaches a worker and returns a job id you can poll with `/opencode:status`. +- `/opencode:rescue` -- Delegates a task to OpenCode via the `safe-command.mjs` bridge, which validates flags and feeds the task text through a shell-insulated heredoc. Supports `--model`, `--free`, `--agent`, `--resume`, `--fresh`, `--worktree`, `--wait`, and `--background`. Foreground is the default; `--wait` is an explicit no-op alias for foreground; `--background` detaches a worker and returns a job id you can poll with `/opencode:status`. - `/opencode:status` -- Shows running/recent OpenCode jobs for the current repo. - `/opencode:result` -- Shows final output for a finished job, including OpenCode session ID for resuming. - `/opencode:cancel` -- Cancels an active OpenCode job. diff --git a/plugins/opencode/commands/rescue.md b/plugins/opencode/commands/rescue.md index ff8de44..1058a63 100644 --- a/plugins/opencode/commands/rescue.md +++ b/plugins/opencode/commands/rescue.md @@ -1,69 +1,87 @@ --- description: Delegate investigation, an explicit fix request, or follow-up rescue work to OpenCode -argument-hint: "[--background] [--worktree] [--resume|--fresh] [--model | --free] [--agent ] [what OpenCode should investigate, solve, or continue]" +argument-hint: "[--background|--wait] [--worktree] [--resume|--fresh] [--model | --free] [--agent ] [what OpenCode should investigate, solve, or continue]" context: fork allowed-tools: Bash(node:*), AskUserQuestion --- -Forward this request to the OpenCode companion's `task` runtime via a single Bash call. -The final user-visible response must be the companion's stdout verbatim — no commentary, summary, or paraphrase before or after it. +Forward this request to the OpenCode companion's `task` runtime through the `safe-command.mjs` bridge. +The final user-visible response must be the bridge's stdout verbatim — no commentary, summary, or paraphrase before or after it. Raw user request: $ARGUMENTS +Helper calls the command may make BEFORE the final bridged `task` invocation: + +- `node "${CLAUDE_PLUGIN_ROOT}/scripts/opencode-companion.mjs" task-resume-candidate --json` — resume-detection helper +- `node "${CLAUDE_PLUGIN_ROOT}/scripts/opencode-companion.mjs" last-review` — last-review presence check +- `node "${CLAUDE_PLUGIN_ROOT}/scripts/opencode-companion.mjs" last-review --content` — last-review full text + +Each of these is a separate Bash call that runs in the foreground and returns quickly. They are NOT the final task invocation — they are just information probes that inform how you build the final bridged call. Do not background them, and do not forward their output to the user. + Execution mode: -- Default to foreground: run `node "${CLAUDE_PLUGIN_ROOT}/scripts/opencode-companion.mjs" task ...` and wait for it to finish. The Bash call must run in the foreground — never set `run_in_background: true` on the Bash tool, even when the user asks for `--background`. -- If the request includes `--background`, append `--background` to the `task` invocation. The companion will spawn a detached worker and return a job id immediately; you then return that job id (with the `Check /opencode:status for progress.` line the companion prints) verbatim. -- `--background` is the only execution-mode flag. Strip it from the natural-language task text before forwarding. `--wait` is accepted as a no-op alias for the default (foreground) mode and must also be stripped from the task text — do not forward it to `task`. -- `--model`, `--free`, and `--agent` are runtime-selection flags. Strip them from the task text and forward each to `task` unchanged. `--free` tells the companion to pick a random first-party `opencode/*` free-tier model from `opencode models`; it is restricted to `opencode/*` because OpenRouter free models have inconsistent tool-use support. `--free` is mutually exclusive with `--model`. -- `--worktree` is an isolation flag. Strip it from the task text and forward it to `task`. When present, OpenCode runs in an isolated git worktree instead of editing the working directory in-place. -- `--resume` and `--fresh` are routing controls. Strip them from the task text. `--resume` becomes `--resume-last` on the forwarded `task` call. `--fresh` means do not add `--resume-last`, regardless of how the natural-language request reads. -- If neither `--resume` nor `--fresh` is present, before starting OpenCode, check for a resumable rescue session from this Claude session by running: +- Default is foreground: the bridged Bash call runs synchronously and the user sees the companion's full output when the task finishes. +- If the request includes `--background`, the companion detaches a worker and the bridge returns a job id immediately (`OpenCode task started in background: ` plus a `Check /opencode:status for progress.` line). Return that stdout verbatim. +- The Bash call that runs the bridge must always run in the foreground — never set `run_in_background: true` on the Bash tool, even when the user passes `--background`. Companion-level backgrounding is what `--background` forwards to. +- `--wait` is accepted as a documented no-op alias for the default foreground mode. The bridge strips it. +- `--background` and `--wait` are the only execution-mode flags. Every other flag is forwarded to the companion. -```bash -node "${CLAUDE_PLUGIN_ROOT}/scripts/opencode-companion.mjs" task-resume-candidate --json -``` +Flag handling (all of these are recognized, validated, and forwarded by `safe-command.mjs`): + +- `--background` — companion detaches a worker and returns a job id. +- `--wait` — no-op alias for foreground; bridge strips it. +- `--worktree` — run OpenCode in an isolated git worktree instead of editing the working directory in-place. +- `--resume` (or `--resume-last`) — continue the most recent OpenCode session from this Claude session. The bridge translates `--resume` into the companion-native `--resume-last`. +- `--fresh` — explicit marker that the task must NOT resume. The bridge strips it (the absence of `--resume-last` already conveys "fresh"). +- `--model ` — override OpenCode's default model for this single task. Value must match `[A-Za-z0-9._/:-]+`. +- `--free` — tells the companion to pick a random first-party `opencode/*` free-tier model from `opencode models`. Restricted to `opencode/*` because OpenRouter free models have inconsistent tool-use support. +- `--agent ` — override the OpenCode agent. Value must be `build` or `plan`. +- `--free` and `--model` are mutually exclusive — the bridge rejects payloads that include both. If the user supplies both, return the bridge's error verbatim and stop. +Resume detection (runs before the final bridged call, only when neither `--resume` nor `--fresh` is in the raw user request): + +- Run `node "${CLAUDE_PLUGIN_ROOT}/scripts/opencode-companion.mjs" task-resume-candidate --json` in the foreground. - If that helper reports `available: true`, use `AskUserQuestion` exactly once to ask whether to continue the current OpenCode session or start a new one. - The two choices must be: - `Continue current OpenCode session` - `Start a new OpenCode session` - If the user is clearly giving a follow-up instruction such as "continue", "keep going", "resume", "apply the top fix", or "dig deeper", put `Continue current OpenCode session (Recommended)` first. - Otherwise put `Start a new OpenCode session (Recommended)` first. -- If the user chooses continue, add `--resume-last` to the forwarded `task` call. -- If the user chooses a new session, do not add `--resume-last`. -- If the helper reports `available: false`, do not ask. Forward normally. - -Forwarding rules: - -- Use exactly one foreground Bash call to invoke `node "${CLAUDE_PLUGIN_ROOT}/scripts/opencode-companion.mjs" task ...`. -- Pass the user's task text as a single positional argument to `task`, after all flags. Preserve it as-is apart from stripping the routing flags listed above. -- If the user is clearly asking to continue prior OpenCode work in this repository (such as "continue", "keep going", "resume", "apply the top fix", or "dig deeper") and `--fresh` is not present, add `--resume-last` even if neither `--resume` nor the resume prompt above produced one. -- Rescue is always write-capable. The companion's `task` runtime defaults to write mode and has no read-only switch. If the user wants a read-only diagnosis, point them at `/opencode:review` or `/opencode:adversarial-review` instead. -- Leave `--agent` unset unless the user explicitly asks for a specific agent (build or plan). -- Leave the model unset unless the user explicitly asks for a specific model or `--free`. `--free` and `--model` are mutually exclusive — if both are present, return nothing and tell the user to pick one. -- Return the stdout of the `task` command exactly as-is. -- Do not paraphrase, summarize, rewrite, or add commentary before or after it. -- Do not inspect the repository, read files, grep, monitor progress, poll `/opencode:status`, fetch `/opencode:result`, call `/opencode:cancel`, summarize output, or do any follow-up work of your own. -- If the Bash call fails or OpenCode cannot be invoked, return the failure stderr verbatim. If the helper reports that OpenCode is missing or unauthenticated, stop and tell the user to run `/opencode:setup`. -- If the user did not supply a request, check for a saved review from `/opencode:review` or `/opencode:adversarial-review`: +- If the user chooses continue, add `--resume` at the start of the payload you feed to the bridge heredoc. +- If the user chooses a new session, do not add `--resume`. +- If the helper reports `available: false`, do not ask. Proceed without `--resume`. +- If the user is clearly asking to continue prior OpenCode work (such as "continue", "keep going", "resume", "apply the top fix", or "dig deeper") and `--fresh` is not in the raw request, add `--resume` to the payload even when the resume-detection helper was skipped. -```bash -node "${CLAUDE_PLUGIN_ROOT}/scripts/opencode-companion.mjs" last-review -``` +Empty-request / last-review branch (runs when the raw user request contains no task text): + +- Run `node "${CLAUDE_PLUGIN_ROOT}/scripts/opencode-companion.mjs" last-review` in the foreground. +- If stdout is `LAST_REVIEW_AVAILABLE`, use `AskUserQuestion` exactly once with two options: + - `Fix issues from last review (Recommended)` — prepend the saved review content as context for the rescue task + - `Describe a new task` — ask what OpenCode should investigate or fix +- If the user chooses to fix from last review, read the saved review via `node "${CLAUDE_PLUGIN_ROOT}/scripts/opencode-companion.mjs" last-review --content` and include its stdout verbatim inside the final bridged payload, prefixed with the literal line `The following issues were found in a prior OpenCode review. Please fix them:` followed by a blank line and then the review content. +- If stdout is `NO_LAST_REVIEW`, ask the user what OpenCode should investigate or fix and use their reply as the task text. - - If stdout is `LAST_REVIEW_AVAILABLE`, use `AskUserQuestion` exactly once with two options: - - `Fix issues from last review (Recommended)` — prepend the saved review content as context for the rescue task - - `Describe a new task` — ask what OpenCode should investigate or fix - - If the user chooses to fix from last review, read the saved review via: +Final bridged call (exactly one foreground Bash invocation of `safe-command.mjs`, after any helper calls and user questions above have settled): ```bash -node "${CLAUDE_PLUGIN_ROOT}/scripts/opencode-companion.mjs" last-review --content +node "${CLAUDE_PLUGIN_ROOT}/scripts/safe-command.mjs" task <<'OPENCODE_TASK' +$ARGUMENTS +OPENCODE_TASK ``` - and include its stdout verbatim in the forwarded task text, prefixed with: +Payload rules: + +- The body of the heredoc above is the bridge's stdin. Replace `$ARGUMENTS` with the adjusted payload you have decided on: any recognized flags first, then the natural-language task text. Preserve the user's task text byte-for-byte; do not paraphrase, re-quote, or escape it. +- If you added `--resume` via the resume-detection branch, the first token of the payload must be `--resume`. +- If the user chose "Fix issues from last review", prepend the literal header line and the review content before the task text, inside the same heredoc body. +- Rescue is always write-capable. The companion's `task` runtime defaults to write mode and has no read-only switch. If the user asks for read-only, point them at `/opencode:review` or `/opencode:adversarial-review` instead. +- The single-quoted heredoc delimiter (`<<'OPENCODE_TASK'`) prevents shell expansion of anything inside the body, so apostrophes, quotes, `$()`, backticks, `;`, `&`, `|`, `<`, `>`, and newlines in the task text are all safe. Do not try to escape them yourself. - `The following issues were found in a prior OpenCode review. Please fix them:\n\n` +Return rules: - - If stdout is `NO_LAST_REVIEW`, ask what OpenCode should investigate or fix. +- Return the bridge stdout exactly as-is. +- Do not paraphrase, summarize, rewrite, or add commentary before or after it. +- Do not inspect the repository, read files, grep, monitor progress, poll `/opencode:status`, fetch `/opencode:result`, call `/opencode:cancel`, summarize output, or do any follow-up work of your own. +- If the bridge rejects the payload (unknown flag, invalid `--model` value, `--free` + `--model`, etc.), return the bridge's stderr verbatim and stop. +- If the helper reports that OpenCode is missing or unauthenticated, stop and tell the user to run `/opencode:setup`. diff --git a/plugins/opencode/scripts/safe-command.mjs b/plugins/opencode/scripts/safe-command.mjs index 0aeae1a..f9174dd 100644 --- a/plugins/opencode/scripts/safe-command.mjs +++ b/plugins/opencode/scripts/safe-command.mjs @@ -11,26 +11,36 @@ import { spawnSync } from "node:child_process"; import fs from "node:fs"; import path from "node:path"; import process from "node:process"; +import { fileURLToPath } from "node:url"; const companionScript = path.join(import.meta.dirname, "opencode-companion.mjs"); -const subcommand = process.argv[2]; -const raw = fs.readFileSync(0, "utf8").trim(); - -try { - const args = buildForwardArgs(subcommand, raw); - const result = spawnSync(process.execPath, [companionScript, ...args], { - cwd: process.cwd(), - env: process.env, - stdio: "inherit", - }); - if (result.error) throw result.error; - process.exit(result.status ?? 1); -} catch (err) { - console.error(err.message); - process.exit(1); + +function main() { + const subcommand = process.argv[2]; + const raw = fs.readFileSync(0, "utf8").trim(); + + try { + const args = buildForwardArgs(subcommand, raw); + const result = spawnSync(process.execPath, [companionScript, ...args], { + cwd: process.cwd(), + env: process.env, + stdio: "inherit", + }); + if (result.error) throw result.error; + process.exit(result.status ?? 1); + } catch (err) { + console.error(err.message); + process.exit(1); + } } -function buildForwardArgs(command, text) { +// Only run the script body when invoked directly. Importing this file +// from a test module must not swallow stdin or exit the process. +if (process.argv[1] && fileURLToPath(import.meta.url) === process.argv[1]) { + main(); +} + +export function buildForwardArgs(command, text) { if (command === "status") { if (text) throw new Error("status does not accept arguments."); return ["status"]; @@ -48,9 +58,129 @@ function buildForwardArgs(command, text) { return ["setup", "--json", ...parseSetupArgs(text)]; } + if (command === "task") { + return ["task", ...parseTaskArgs(text)]; + } + throw new Error(`Unsupported safe command: ${command}`); } +// Parse the rescue payload: a sequence of recognized `--flag [value]` +// tokens at the front, followed by opaque natural-language task text +// that can contain anything (apostrophes, quotes, shell metacharacters, +// newlines) and is forwarded verbatim as a single positional argument. +// +// We deliberately do NOT run splitShellLike across the whole payload +// because the task text is not shell syntax — an apostrophe like +// "what's broken" would throw "Unterminated quoted argument" under a +// shell-style tokenizer. Instead we peel off one flag token at a time +// from the front, stop at the first non-flag token, and treat the rest +// of the stdin blob as a literal string. +// +// Security model: stdin is already shell-insulated by the command +// file's single-quoted heredoc, so no byte in `text` can be executed +// as shell. This parser's only job is to: +// 1. Reject unknown flags so a malicious or buggy caller cannot smuggle +// unsafe arguments through to opencode-companion.mjs. +// 2. Validate the values of value-flags (`--model`, `--agent`) against +// strict character classes / allowlists. +// 3. Translate user-facing routing flags (`--resume`, `--wait`, `--fresh`) +// into companion-native equivalents (or drop them when they are +// documented no-ops). +function parseTaskArgs(text) { + const out = []; + let remaining = text; + let sawModel = false; + let sawFree = false; + + // Consume one whitespace-separated token from the front of `remaining`. + // Returns [token, rest] or null when `remaining` is empty. + function peelToken(source) { + const m = source.match(/^(\S+)(?:\s+([\s\S]*))?$/); + if (!m) return null; + return [m[1], m[2] ?? ""]; + } + + while (remaining.length > 0) { + const peeled = peelToken(remaining); + if (!peeled) break; + const [token, rest] = peeled; + + // --- Boolean flags forwarded as-is --- + if (token === "--background" || token === "--worktree" || token === "--free") { + if (token === "--free") sawFree = true; + out.push(token); + remaining = rest; + continue; + } + + // --- Documented no-ops / routing markers --- + // --wait is a documented alias for "default foreground". The + // companion has no --wait flag, so we drop it here. + // --fresh means "do not add --resume-last", which at this layer is + // also a no-op (we only emit --resume-last when --resume is present). + if (token === "--wait" || token === "--fresh") { + remaining = rest; + continue; + } + + // --- User-facing --resume → companion-native --resume-last --- + if (token === "--resume" || token === "--resume-last") { + out.push("--resume-last"); + remaining = rest; + continue; + } + + // --- Value flag: --model --- + if (token === "--model") { + const valuePeeled = peelToken(rest); + if (!valuePeeled) throw new Error("--model requires a value."); + const [value, afterValue] = valuePeeled; + if (!/^[A-Za-z0-9._/:-]+$/.test(value)) { + throw new Error( + "--model value must match [A-Za-z0-9._/:-]+ (e.g. openrouter/anthropic/claude-haiku-4.5)." + ); + } + sawModel = true; + out.push("--model", value); + remaining = afterValue; + continue; + } + + // --- Value flag: --agent (only build|plan) --- + if (token === "--agent") { + const valuePeeled = peelToken(rest); + if (!valuePeeled) throw new Error("--agent requires a value."); + const [value, afterValue] = valuePeeled; + if (value !== "build" && value !== "plan") { + throw new Error("--agent value must be 'build' or 'plan'."); + } + out.push("--agent", value); + remaining = afterValue; + continue; + } + + // Any other `--`-prefixed token is an unknown flag. Reject rather + // than silently passing it through to opencode-companion.mjs. + if (token.startsWith("--")) { + throw new Error(`Unsupported rescue flag: ${token}`); + } + + // First non-flag token encountered. Everything from here on — this + // token PLUS the rest of the untokenized string — is the task text. + break; + } + + if (sawFree && sawModel) { + throw new Error("--free and --model are mutually exclusive; pick one."); + } + + const taskText = remaining.trim(); + if (taskText) out.push(taskText); + + return out; +} + function parseSetupArgs(text) { const tokens = splitShellLike(text); const out = []; diff --git a/tests/safe-command-task.test.mjs b/tests/safe-command-task.test.mjs new file mode 100644 index 0000000..323e0ce --- /dev/null +++ b/tests/safe-command-task.test.mjs @@ -0,0 +1,205 @@ +// Unit tests for safe-command.mjs `task` bridge parsing. +// +// Only exercises the pure argv-building layer (`buildForwardArgs`), not +// the full spawn chain. The bridge's security contract is: given raw +// stdin text, it (1) rejects unknown flags, (2) validates value-flags, +// (3) translates routing aliases, and (4) forwards everything else as a +// single verbatim task-text positional argument. If the output argv +// matches expectations for a given input, the bridge is safe — no shell +// layer exists between here and opencode-companion.mjs. + +import { describe, it } from "node:test"; +import assert from "node:assert/strict"; +import { buildForwardArgs } from "../plugins/opencode/scripts/safe-command.mjs"; + +function task(input) { + return buildForwardArgs("task", input); +} + +describe("safe-command task bridge — happy paths", () => { + it("forwards plain task text as a single positional arg", () => { + assert.deepEqual(task("fix the bug in src/foo.js"), ["task", "fix the bug in src/foo.js"]); + }); + + it("peels --background off the front and keeps task text literal", () => { + assert.deepEqual( + task("--background investigate why sessions leak"), + ["task", "--background", "investigate why sessions leak"] + ); + }); + + it("peels multiple boolean flags in order", () => { + assert.deepEqual( + task("--background --worktree --free do the thing"), + ["task", "--background", "--worktree", "--free", "do the thing"] + ); + }); + + it("translates --resume to --resume-last", () => { + assert.deepEqual( + task("--resume keep going"), + ["task", "--resume-last", "keep going"] + ); + }); + + it("accepts --resume-last directly and emits it once", () => { + assert.deepEqual( + task("--resume-last continue from where we left off"), + ["task", "--resume-last", "continue from where we left off"] + ); + }); + + it("strips --wait (no-op alias for foreground)", () => { + assert.deepEqual( + task("--wait diagnose this"), + ["task", "diagnose this"] + ); + }); + + it("strips --fresh (routing marker — absence of --resume-last already conveys 'fresh')", () => { + assert.deepEqual( + task("--fresh start clean"), + ["task", "start clean"] + ); + }); + + it("forwards --model value after basic validation", () => { + assert.deepEqual( + task("--model openrouter/anthropic/claude-haiku-4.5 fix the bug"), + ["task", "--model", "openrouter/anthropic/claude-haiku-4.5", "fix the bug"] + ); + }); + + it("forwards --agent build and --agent plan", () => { + assert.deepEqual( + task("--agent build implement feature"), + ["task", "--agent", "build", "implement feature"] + ); + assert.deepEqual( + task("--agent plan outline the approach"), + ["task", "--agent", "plan", "outline the approach"] + ); + }); + + it("combines many flags and preserves task text literally", () => { + assert.deepEqual( + task("--background --worktree --resume --model provider/model-x --agent build refactor the worker"), + [ + "task", + "--background", + "--worktree", + "--resume-last", + "--model", + "provider/model-x", + "--agent", + "build", + "refactor the worker", + ] + ); + }); +}); + +describe("safe-command task bridge — task text is shell-insulated", () => { + it("passes apostrophes through literally", () => { + assert.deepEqual( + task("--background what's broken here"), + ["task", "--background", "what's broken here"] + ); + }); + + it("passes shell metacharacters through literally as one string", () => { + const taskText = "fix `rm -rf /tmp/x`; echo $(whoami) > /etc/passwd | tee /dev/null & \"oops\""; + assert.deepEqual(task(taskText), ["task", taskText]); + }); + + it("passes newlines through literally", () => { + const taskText = "line one\nline two\nline three"; + assert.deepEqual(task(taskText), ["task", taskText]); + }); + + it("passes leading task text that contains quotes without tokenizing", () => { + const taskText = `he said "this is broken" and she said 'also broken'`; + assert.deepEqual(task(taskText), ["task", taskText]); + }); + + it("preserves internal whitespace in task text after the first non-flag token", () => { + assert.deepEqual( + task("--worktree fix spacing here"), + ["task", "--worktree", "fix spacing here"] + ); + }); + + it("does not re-tokenize dashes in the middle of task text", () => { + assert.deepEqual( + task("check --verbose flag handling in the cli"), + ["task", "check --verbose flag handling in the cli"] + ); + }); + + it("handles task text that starts with a hyphen after a known flag", () => { + // `-x` isn't a recognized flag (not `--`-prefixed) so it becomes + // part of the task text verbatim. + assert.deepEqual( + task("--background -x is not a flag, keep it in the task text"), + ["task", "--background", "-x is not a flag, keep it in the task text"] + ); + }); +}); + +describe("safe-command task bridge — rejection paths", () => { + it("rejects unknown double-dash flags", () => { + assert.throws( + () => task("--evil-flag do something"), + /Unsupported rescue flag: --evil-flag/ + ); + }); + + it("rejects --model without a value", () => { + assert.throws(() => task("--model"), /--model requires a value/); + }); + + it("rejects --model with an invalid value", () => { + assert.throws( + () => task("--model bad;value do things"), + /--model value must match/ + ); + }); + + it("rejects --agent without a value", () => { + assert.throws(() => task("--agent"), /--agent requires a value/); + }); + + it("rejects --agent with a disallowed value", () => { + assert.throws( + () => task("--agent evil run unsafe things"), + /--agent value must be 'build' or 'plan'/ + ); + }); + + it("rejects --free combined with --model", () => { + assert.throws( + () => task("--free --model provider/model fix it"), + /--free and --model are mutually exclusive/ + ); + assert.throws( + () => task("--model provider/model --free fix it"), + /--free and --model are mutually exclusive/ + ); + }); + + it("returns just the subcommand when stdin is empty", () => { + // A completely empty payload is allowed at the bridge layer. + // opencode-companion.mjs's handleTask is responsible for rejecting + // no-task-text invocations; that's covered by its own tests. + assert.deepEqual(task(""), ["task"]); + }); +}); + +describe("safe-command task bridge — command dispatch", () => { + it("still rejects unknown top-level subcommands", () => { + assert.throws( + () => buildForwardArgs("not-a-command", ""), + /Unsupported safe command: not-a-command/ + ); + }); +});