diff --git a/README.md b/README.md index d48ff80..07d1484 100644 --- a/README.md +++ b/README.md @@ -99,7 +99,7 @@ To check your configured providers: - `/opencode:review` -- Normal OpenCode code review (read-only). Supports `--base `, `--pr `, `--model `, `--free`, `--wait`, and `--background`. - `/opencode:adversarial-review` -- Steerable review that challenges implementation and design decisions. Supports `--base `, `--pr `, `--model `, `--free`, `--wait`, `--background`, and custom focus text. -- `/opencode:rescue` -- Delegates a task to OpenCode via the `opencode:opencode-rescue` subagent. Supports `--model`, `--free`, `--agent`, `--resume`, `--fresh`, `--worktree`, `--wait`, and `--background`. +- `/opencode:rescue` -- Delegates a task to OpenCode via the `safe-command.mjs` bridge, which validates flags and feeds the task text through a shell-insulated heredoc. Supports `--model`, `--free`, `--agent`, `--resume`, `--fresh`, `--worktree`, `--wait`, and `--background`. Foreground is the default; `--wait` is an explicit no-op alias for foreground; `--background` detaches a worker and returns a job id you can poll with `/opencode:status`. - `/opencode:status` -- Shows running/recent OpenCode jobs for the current repo. - `/opencode:result` -- Shows final output for a finished job, including OpenCode session ID for resuming. - `/opencode:cancel` -- Cancels an active OpenCode job. @@ -185,7 +185,6 @@ opencode-plugin-cc/ ├── .github/workflows/ci.yml # GitHub Actions CI ├── plugins/opencode/ │ ├── .claude-plugin/plugin.json # Plugin metadata -│ ├── agents/opencode-rescue.md # Rescue subagent definition │ ├── commands/ # 7 slash commands │ │ ├── review.md │ │ ├── adversarial-review.md diff --git a/plugins/opencode/agents/opencode-rescue.md b/plugins/opencode/agents/opencode-rescue.md deleted file mode 100644 index a4701a2..0000000 --- a/plugins/opencode/agents/opencode-rescue.md +++ /dev/null @@ -1,45 +0,0 @@ ---- -name: opencode-rescue -description: Proactively use when Claude Code is stuck, wants a second implementation or diagnosis pass, needs a deeper root-cause investigation, or should hand a substantial coding task to OpenCode through the shared runtime -model: sonnet -tools: Bash -skills: - - opencode-runtime - - opencode-prompting ---- - -You are a thin forwarding wrapper around the OpenCode companion task runtime. - -Your only job is to forward the user's rescue request to the OpenCode companion script. Do not do anything else. - -Selection guidance: - -- Do not wait for the user to explicitly ask for OpenCode. Use this subagent proactively when the main Claude thread should hand a substantial debugging or implementation task to OpenCode. -- Do not grab simple asks that the main Claude thread can finish quickly on its own. - -Forwarding rules: - -- Use exactly one `Bash` call to invoke `node "${CLAUDE_PLUGIN_ROOT}/scripts/opencode-companion.mjs" task ...`. -- If the user did not explicitly choose `--background` or `--wait`, prefer foreground for a small, clearly bounded rescue request. -- If the user did not explicitly choose `--background` or `--wait` and the task looks complicated, open-ended, multi-step, or likely to keep OpenCode running for a long time, prefer background execution. -- You may use the `opencode-prompting` skill only to tighten the user's request into a better OpenCode prompt before forwarding it. -- Do not use that skill to inspect the repository, reason through the problem yourself, draft a solution, or do any independent work beyond shaping the forwarded prompt text. -- Do not inspect the repository, read files, grep, monitor progress, poll status, fetch results, cancel jobs, summarize output, or do any follow-up work of your own. -- Do not call `review`, `adversarial-review`, `status`, `result`, or `cancel`. This subagent only forwards to `task`. -- Leave `--agent` unset unless the user explicitly requests a specific agent (build or plan). -- Leave model unset by default. Only add `--model` or `--free` when the user explicitly asks for a specific model or a free-tier pick. `--free` and `--model` are mutually exclusive. -- Treat `--agent `, `--model `, and `--free` as runtime controls and do not include them in the task text you pass through. -- If the request includes `--worktree`, pass `--worktree` through to `task`. This runs OpenCode in an isolated git worktree instead of editing the working directory in-place. -- Default to a write-capable OpenCode run by adding `--write` unless the user explicitly asks for read-only behavior or only wants review, diagnosis, or research without edits. -- Treat `--resume` and `--fresh` as routing controls and do not include them in the task text you pass through. -- `--resume` means add `--resume-last`. -- `--fresh` means do not add `--resume-last`. -- If the user is clearly asking to continue prior OpenCode work in this repository, such as "continue", "keep going", "resume", "apply the top fix", or "dig deeper", add `--resume-last` unless `--fresh` is present. -- Otherwise forward the task as a fresh `task` run. -- Preserve the user's task text as-is apart from stripping routing flags. -- Return the stdout of the `opencode-companion` command exactly as-is. -- If the Bash call fails or OpenCode cannot be invoked, return nothing. - -Response style: - -- Do not add commentary before or after the forwarded `opencode-companion` output. diff --git a/plugins/opencode/commands/rescue.md b/plugins/opencode/commands/rescue.md index 2545680..1058a63 100644 --- a/plugins/opencode/commands/rescue.md +++ b/plugins/opencode/commands/rescue.md @@ -1,69 +1,87 @@ --- -description: Delegate investigation, an explicit fix request, or follow-up rescue work to the OpenCode rescue subagent +description: Delegate investigation, an explicit fix request, or follow-up rescue work to OpenCode argument-hint: "[--background|--wait] [--worktree] [--resume|--fresh] [--model | --free] [--agent ] [what OpenCode should investigate, solve, or continue]" context: fork allowed-tools: Bash(node:*), AskUserQuestion --- -Route this request to the `opencode:opencode-rescue` subagent. -The final user-visible response must be OpenCode's output verbatim. +Forward this request to the OpenCode companion's `task` runtime through the `safe-command.mjs` bridge. +The final user-visible response must be the bridge's stdout verbatim — no commentary, summary, or paraphrase before or after it. Raw user request: $ARGUMENTS +Helper calls the command may make BEFORE the final bridged `task` invocation: + +- `node "${CLAUDE_PLUGIN_ROOT}/scripts/opencode-companion.mjs" task-resume-candidate --json` — resume-detection helper +- `node "${CLAUDE_PLUGIN_ROOT}/scripts/opencode-companion.mjs" last-review` — last-review presence check +- `node "${CLAUDE_PLUGIN_ROOT}/scripts/opencode-companion.mjs" last-review --content` — last-review full text + +Each of these is a separate Bash call that runs in the foreground and returns quickly. They are NOT the final task invocation — they are just information probes that inform how you build the final bridged call. Do not background them, and do not forward their output to the user. + Execution mode: -- If the request includes `--background`, run the `opencode:opencode-rescue` subagent in the background. -- If the request includes `--wait`, run the `opencode:opencode-rescue` subagent in the foreground. -- If neither flag is present, default to foreground. -- `--background` and `--wait` are execution flags for Claude Code. Do not forward them to `task`, and do not treat them as part of the natural-language task text. -- `--model`, `--free`, and `--agent` are runtime-selection flags. Preserve them for the forwarded `task` call, but do not treat them as part of the natural-language task text. `--free` tells the companion to pick a random first-party `opencode/*` free-tier model from `opencode models`; it is restricted to `opencode/*` because OpenRouter free models have inconsistent tool-use support. `--free` is mutually exclusive with `--model`. -- `--worktree` is an isolation flag. Preserve it for the forwarded `task` call, but do not treat it as part of the natural-language task text. When present, OpenCode runs in an isolated git worktree instead of editing the working directory in-place. -- If the request includes `--resume`, do not ask whether to continue. The user already chose. -- If the request includes `--fresh`, do not ask whether to continue. The user already chose. -- Otherwise, before starting OpenCode, check for a resumable rescue session from this Claude session by running: +- Default is foreground: the bridged Bash call runs synchronously and the user sees the companion's full output when the task finishes. +- If the request includes `--background`, the companion detaches a worker and the bridge returns a job id immediately (`OpenCode task started in background: ` plus a `Check /opencode:status for progress.` line). Return that stdout verbatim. +- The Bash call that runs the bridge must always run in the foreground — never set `run_in_background: true` on the Bash tool, even when the user passes `--background`. Companion-level backgrounding is what `--background` forwards to. +- `--wait` is accepted as a documented no-op alias for the default foreground mode. The bridge strips it. +- `--background` and `--wait` are the only execution-mode flags. Every other flag is forwarded to the companion. -```bash -node "${CLAUDE_PLUGIN_ROOT}/scripts/opencode-companion.mjs" task-resume-candidate --json -``` +Flag handling (all of these are recognized, validated, and forwarded by `safe-command.mjs`): + +- `--background` — companion detaches a worker and returns a job id. +- `--wait` — no-op alias for foreground; bridge strips it. +- `--worktree` — run OpenCode in an isolated git worktree instead of editing the working directory in-place. +- `--resume` (or `--resume-last`) — continue the most recent OpenCode session from this Claude session. The bridge translates `--resume` into the companion-native `--resume-last`. +- `--fresh` — explicit marker that the task must NOT resume. The bridge strips it (the absence of `--resume-last` already conveys "fresh"). +- `--model ` — override OpenCode's default model for this single task. Value must match `[A-Za-z0-9._/:-]+`. +- `--free` — tells the companion to pick a random first-party `opencode/*` free-tier model from `opencode models`. Restricted to `opencode/*` because OpenRouter free models have inconsistent tool-use support. +- `--agent ` — override the OpenCode agent. Value must be `build` or `plan`. +- `--free` and `--model` are mutually exclusive — the bridge rejects payloads that include both. If the user supplies both, return the bridge's error verbatim and stop. +Resume detection (runs before the final bridged call, only when neither `--resume` nor `--fresh` is in the raw user request): + +- Run `node "${CLAUDE_PLUGIN_ROOT}/scripts/opencode-companion.mjs" task-resume-candidate --json` in the foreground. - If that helper reports `available: true`, use `AskUserQuestion` exactly once to ask whether to continue the current OpenCode session or start a new one. - The two choices must be: - `Continue current OpenCode session` - `Start a new OpenCode session` - If the user is clearly giving a follow-up instruction such as "continue", "keep going", "resume", "apply the top fix", or "dig deeper", put `Continue current OpenCode session (Recommended)` first. - Otherwise put `Start a new OpenCode session (Recommended)` first. -- If the user chooses continue, add `--resume` before routing to the subagent. -- If the user chooses a new session, add `--fresh` before routing to the subagent. -- If the helper reports `available: false`, do not ask. Route normally. - -Operating rules: +- If the user chooses continue, add `--resume` at the start of the payload you feed to the bridge heredoc. +- If the user chooses a new session, do not add `--resume`. +- If the helper reports `available: false`, do not ask. Proceed without `--resume`. +- If the user is clearly asking to continue prior OpenCode work (such as "continue", "keep going", "resume", "apply the top fix", or "dig deeper") and `--fresh` is not in the raw request, add `--resume` to the payload even when the resume-detection helper was skipped. -- The subagent is a thin forwarder only. It should use one `Bash` call to invoke `node "${CLAUDE_PLUGIN_ROOT}/scripts/opencode-companion.mjs" task ...` and return that command's stdout as-is. -- Return the OpenCode companion stdout verbatim to the user. -- Do not paraphrase, summarize, rewrite, or add commentary before or after it. -- Do not ask the subagent to inspect files, monitor progress, poll `/opencode:status`, fetch `/opencode:result`, call `/opencode:cancel`, summarize output, or do follow-up work of its own. -- Leave `--agent` unset unless the user explicitly asks for a specific agent (build or plan). -- Leave the model unset unless the user explicitly asks for a specific model or `--free`. -- Leave `--resume` and `--fresh` in the forwarded request. The subagent handles that routing when it builds the `task` command. -- If the helper reports that OpenCode is missing or unauthenticated, stop and tell the user to run `/opencode:setup`. -- If the user did not supply a request, check for a saved review from `/opencode:review` or `/opencode:adversarial-review`: +Empty-request / last-review branch (runs when the raw user request contains no task text): -```bash -node "${CLAUDE_PLUGIN_ROOT}/scripts/opencode-companion.mjs" last-review -``` +- Run `node "${CLAUDE_PLUGIN_ROOT}/scripts/opencode-companion.mjs" last-review` in the foreground. +- If stdout is `LAST_REVIEW_AVAILABLE`, use `AskUserQuestion` exactly once with two options: + - `Fix issues from last review (Recommended)` — prepend the saved review content as context for the rescue task + - `Describe a new task` — ask what OpenCode should investigate or fix +- If the user chooses to fix from last review, read the saved review via `node "${CLAUDE_PLUGIN_ROOT}/scripts/opencode-companion.mjs" last-review --content` and include its stdout verbatim inside the final bridged payload, prefixed with the literal line `The following issues were found in a prior OpenCode review. Please fix them:` followed by a blank line and then the review content. +- If stdout is `NO_LAST_REVIEW`, ask the user what OpenCode should investigate or fix and use their reply as the task text. - - If stdout is `LAST_REVIEW_AVAILABLE`, use `AskUserQuestion` exactly once with two options: - - `Fix issues from last review (Recommended)` — prepend the saved review content as context for the rescue task - - `Describe a new task` — ask what OpenCode should investigate or fix - - If the user chooses to fix from last review, read the saved review via: +Final bridged call (exactly one foreground Bash invocation of `safe-command.mjs`, after any helper calls and user questions above have settled): ```bash -node "${CLAUDE_PLUGIN_ROOT}/scripts/opencode-companion.mjs" last-review --content +node "${CLAUDE_PLUGIN_ROOT}/scripts/safe-command.mjs" task <<'OPENCODE_TASK' +$ARGUMENTS +OPENCODE_TASK ``` - and include its stdout verbatim in the forwarded task text, prefixed with: +Payload rules: + +- The body of the heredoc above is the bridge's stdin. Replace `$ARGUMENTS` with the adjusted payload you have decided on: any recognized flags first, then the natural-language task text. Preserve the user's task text byte-for-byte; do not paraphrase, re-quote, or escape it. +- If you added `--resume` via the resume-detection branch, the first token of the payload must be `--resume`. +- If the user chose "Fix issues from last review", prepend the literal header line and the review content before the task text, inside the same heredoc body. +- Rescue is always write-capable. The companion's `task` runtime defaults to write mode and has no read-only switch. If the user asks for read-only, point them at `/opencode:review` or `/opencode:adversarial-review` instead. +- The single-quoted heredoc delimiter (`<<'OPENCODE_TASK'`) prevents shell expansion of anything inside the body, so apostrophes, quotes, `$()`, backticks, `;`, `&`, `|`, `<`, `>`, and newlines in the task text are all safe. Do not try to escape them yourself. - `The following issues were found in a prior OpenCode review. Please fix them:\n\n` +Return rules: - - If stdout is `NO_LAST_REVIEW`, ask what OpenCode should investigate or fix. +- Return the bridge stdout exactly as-is. +- Do not paraphrase, summarize, rewrite, or add commentary before or after it. +- Do not inspect the repository, read files, grep, monitor progress, poll `/opencode:status`, fetch `/opencode:result`, call `/opencode:cancel`, summarize output, or do any follow-up work of your own. +- If the bridge rejects the payload (unknown flag, invalid `--model` value, `--free` + `--model`, etc.), return the bridge's stderr verbatim and stop. +- If the helper reports that OpenCode is missing or unauthenticated, stop and tell the user to run `/opencode:setup`. diff --git a/plugins/opencode/scripts/lib/opencode-server.mjs b/plugins/opencode/scripts/lib/opencode-server.mjs index 456d37f..c2bb308 100644 --- a/plugins/opencode/scripts/lib/opencode-server.mjs +++ b/plugins/opencode/scripts/lib/opencode-server.mjs @@ -24,7 +24,8 @@ import os from "node:os"; import { spawn } from "node:child_process"; import fs from "node:fs"; import path from "node:path"; -import { platformShellOption } from "./process.mjs"; +import { platformShellOption, isProcessAlive as isProcessAliveWithToken } from "./process.mjs"; +import { loadState } from "./state.mjs"; const DEFAULT_PORT = 4096; const DEFAULT_HOST = "127.0.0.1"; @@ -330,16 +331,17 @@ export async function connect(opts = {}) { /** * Reap the plugin-spawned OpenCode server on SessionEnd. * - * Only kills what we started (tracked via server.json `lastServerPid`). - * Guards against killing a server the user started independently by - * re-checking isServerRunning after the kill attempt — if something - * else owns the port, leave it alone. + * Only kills what we started (tracked via server.json `lastServerPid`), + * and only when the plugin has no in-flight tracked-jobs. The previous + * implementation gated solely on `now - startedAt > 5 min`, which would + * SIGTERM the OpenCode server out from under an actively-streaming + * `sendPrompt` call if a SessionEnd happened to fire during a long + * rescue or review. Callers would see the socket drop as `terminated` + * with no timeout error in our own code path. * - * Defense-in-depth: if the server has been running longer than - * SERVER_REAP_IDLE_TIMEOUT (5 min) without a successful health check - * from a plugin client, it is also considered orphanable even if the - * PID is still alive (user may have killed the Claude session while the - * server sat idle). + * The guard is now two conditions: + * 1. server uptime is above SERVER_REAP_IDLE_TIMEOUT, and + * 2. no tracked job is in `running` state with a live companion PID. * * @param {string} workspacePath * @param {{ port?: number, host?: string }} [opts] @@ -360,8 +362,14 @@ export async function reapServerIfOurs(workspacePath, opts = {}) { return false; } - const idleMs = startedAt ? Date.now() - startedAt : Infinity; - if (idleMs < SERVER_REAP_IDLE_TIMEOUT) return false; + const uptimeMs = startedAt ? Date.now() - startedAt : Infinity; + if (uptimeMs < SERVER_REAP_IDLE_TIMEOUT) return false; + + // Do not kill the server if any tracked job is still running against + // it. Orphaned `running` jobs (process crashed without marking failed) + // are filtered out via the shared token-aware liveness check, so a + // stale state entry cannot permanently block reaping. + if (hasInFlightTrackedJob(workspacePath)) return false; try { process.kill(pid, "SIGTERM"); @@ -379,3 +387,30 @@ export async function reapServerIfOurs(workspacePath, opts = {}) { return false; } + +/** + * Returns true if any tracked job is in `running` state with a + * companion PID that is still alive. Stale `running` entries from a + * crashed companion are treated as not-in-flight so the reaper can make + * progress. + * @param {string} workspacePath + * @returns {boolean} + */ +function hasInFlightTrackedJob(workspacePath) { + let jobsState; + try { + jobsState = loadState(workspacePath); + } catch { + // If the job state file is unreadable, fail safe: assume in-flight + // work may exist and keep the server alive. The next SessionEnd + // will retry. + return true; + } + const jobs = Array.isArray(jobsState?.jobs) ? jobsState.jobs : []; + for (const job of jobs) { + if (job?.status !== "running") continue; + if (!job.pid) continue; + if (isProcessAliveWithToken(job.pid, job.pidStartToken)) return true; + } + return false; +} diff --git a/plugins/opencode/scripts/safe-command.mjs b/plugins/opencode/scripts/safe-command.mjs index 0aeae1a..f9174dd 100644 --- a/plugins/opencode/scripts/safe-command.mjs +++ b/plugins/opencode/scripts/safe-command.mjs @@ -11,26 +11,36 @@ import { spawnSync } from "node:child_process"; import fs from "node:fs"; import path from "node:path"; import process from "node:process"; +import { fileURLToPath } from "node:url"; const companionScript = path.join(import.meta.dirname, "opencode-companion.mjs"); -const subcommand = process.argv[2]; -const raw = fs.readFileSync(0, "utf8").trim(); - -try { - const args = buildForwardArgs(subcommand, raw); - const result = spawnSync(process.execPath, [companionScript, ...args], { - cwd: process.cwd(), - env: process.env, - stdio: "inherit", - }); - if (result.error) throw result.error; - process.exit(result.status ?? 1); -} catch (err) { - console.error(err.message); - process.exit(1); + +function main() { + const subcommand = process.argv[2]; + const raw = fs.readFileSync(0, "utf8").trim(); + + try { + const args = buildForwardArgs(subcommand, raw); + const result = spawnSync(process.execPath, [companionScript, ...args], { + cwd: process.cwd(), + env: process.env, + stdio: "inherit", + }); + if (result.error) throw result.error; + process.exit(result.status ?? 1); + } catch (err) { + console.error(err.message); + process.exit(1); + } } -function buildForwardArgs(command, text) { +// Only run the script body when invoked directly. Importing this file +// from a test module must not swallow stdin or exit the process. +if (process.argv[1] && fileURLToPath(import.meta.url) === process.argv[1]) { + main(); +} + +export function buildForwardArgs(command, text) { if (command === "status") { if (text) throw new Error("status does not accept arguments."); return ["status"]; @@ -48,9 +58,129 @@ function buildForwardArgs(command, text) { return ["setup", "--json", ...parseSetupArgs(text)]; } + if (command === "task") { + return ["task", ...parseTaskArgs(text)]; + } + throw new Error(`Unsupported safe command: ${command}`); } +// Parse the rescue payload: a sequence of recognized `--flag [value]` +// tokens at the front, followed by opaque natural-language task text +// that can contain anything (apostrophes, quotes, shell metacharacters, +// newlines) and is forwarded verbatim as a single positional argument. +// +// We deliberately do NOT run splitShellLike across the whole payload +// because the task text is not shell syntax — an apostrophe like +// "what's broken" would throw "Unterminated quoted argument" under a +// shell-style tokenizer. Instead we peel off one flag token at a time +// from the front, stop at the first non-flag token, and treat the rest +// of the stdin blob as a literal string. +// +// Security model: stdin is already shell-insulated by the command +// file's single-quoted heredoc, so no byte in `text` can be executed +// as shell. This parser's only job is to: +// 1. Reject unknown flags so a malicious or buggy caller cannot smuggle +// unsafe arguments through to opencode-companion.mjs. +// 2. Validate the values of value-flags (`--model`, `--agent`) against +// strict character classes / allowlists. +// 3. Translate user-facing routing flags (`--resume`, `--wait`, `--fresh`) +// into companion-native equivalents (or drop them when they are +// documented no-ops). +function parseTaskArgs(text) { + const out = []; + let remaining = text; + let sawModel = false; + let sawFree = false; + + // Consume one whitespace-separated token from the front of `remaining`. + // Returns [token, rest] or null when `remaining` is empty. + function peelToken(source) { + const m = source.match(/^(\S+)(?:\s+([\s\S]*))?$/); + if (!m) return null; + return [m[1], m[2] ?? ""]; + } + + while (remaining.length > 0) { + const peeled = peelToken(remaining); + if (!peeled) break; + const [token, rest] = peeled; + + // --- Boolean flags forwarded as-is --- + if (token === "--background" || token === "--worktree" || token === "--free") { + if (token === "--free") sawFree = true; + out.push(token); + remaining = rest; + continue; + } + + // --- Documented no-ops / routing markers --- + // --wait is a documented alias for "default foreground". The + // companion has no --wait flag, so we drop it here. + // --fresh means "do not add --resume-last", which at this layer is + // also a no-op (we only emit --resume-last when --resume is present). + if (token === "--wait" || token === "--fresh") { + remaining = rest; + continue; + } + + // --- User-facing --resume → companion-native --resume-last --- + if (token === "--resume" || token === "--resume-last") { + out.push("--resume-last"); + remaining = rest; + continue; + } + + // --- Value flag: --model --- + if (token === "--model") { + const valuePeeled = peelToken(rest); + if (!valuePeeled) throw new Error("--model requires a value."); + const [value, afterValue] = valuePeeled; + if (!/^[A-Za-z0-9._/:-]+$/.test(value)) { + throw new Error( + "--model value must match [A-Za-z0-9._/:-]+ (e.g. openrouter/anthropic/claude-haiku-4.5)." + ); + } + sawModel = true; + out.push("--model", value); + remaining = afterValue; + continue; + } + + // --- Value flag: --agent (only build|plan) --- + if (token === "--agent") { + const valuePeeled = peelToken(rest); + if (!valuePeeled) throw new Error("--agent requires a value."); + const [value, afterValue] = valuePeeled; + if (value !== "build" && value !== "plan") { + throw new Error("--agent value must be 'build' or 'plan'."); + } + out.push("--agent", value); + remaining = afterValue; + continue; + } + + // Any other `--`-prefixed token is an unknown flag. Reject rather + // than silently passing it through to opencode-companion.mjs. + if (token.startsWith("--")) { + throw new Error(`Unsupported rescue flag: ${token}`); + } + + // First non-flag token encountered. Everything from here on — this + // token PLUS the rest of the untokenized string — is the task text. + break; + } + + if (sawFree && sawModel) { + throw new Error("--free and --model are mutually exclusive; pick one."); + } + + const taskText = remaining.trim(); + if (taskText) out.push(taskText); + + return out; +} + function parseSetupArgs(text) { const tokens = splitShellLike(text); const out = []; diff --git a/plugins/opencode/scripts/session-lifecycle-hook.mjs b/plugins/opencode/scripts/session-lifecycle-hook.mjs index 7312fd2..9c4a592 100644 --- a/plugins/opencode/scripts/session-lifecycle-hook.mjs +++ b/plugins/opencode/scripts/session-lifecycle-hook.mjs @@ -22,14 +22,11 @@ async function main() { } if (event === "SessionEnd") { - // Reap the plugin-spawned OpenCode server (if idle > 5 min). - try { - await reapServerIfOurs(workspace); - } catch { - // best-effort - } - - // Clean up: check for any orphaned running jobs and mark them as failed + // Clean up orphaned running jobs first. This has to run before + // reapServerIfOurs because the reaper now checks for in-flight + // tracked jobs before killing the server, and stale `running` + // entries from a crashed companion would otherwise keep the server + // alive forever. const state = loadState(workspace); const runningJobs = (state.jobs ?? []).filter((j) => j.status === "running"); @@ -50,6 +47,14 @@ async function main() { } } } + + // Then reap the plugin-spawned OpenCode server (if uptime > 5 min + // AND no in-flight tracked jobs are running against it). + try { + await reapServerIfOurs(workspace); + } catch { + // best-effort + } } } diff --git a/plugins/opencode/skills/opencode-runtime/SKILL.md b/plugins/opencode/skills/opencode-runtime/SKILL.md deleted file mode 100644 index 192caa3..0000000 --- a/plugins/opencode/skills/opencode-runtime/SKILL.md +++ /dev/null @@ -1,42 +0,0 @@ ---- -name: opencode-runtime -description: Internal helper contract for calling the opencode-companion runtime from Claude Code -user-invocable: false ---- - -# OpenCode Runtime - -Use this skill only inside the `opencode:opencode-rescue` subagent. - -Primary helper: -- `node "${CLAUDE_PLUGIN_ROOT}/scripts/opencode-companion.mjs" task ""` - -Execution rules: -- The rescue subagent is a forwarder, not an orchestrator. Its only job is to invoke `task` once and return that stdout unchanged. -- Prefer the helper over hand-rolled `git`, direct OpenCode CLI strings, or any other Bash activity. -- Do not call `setup`, `review`, `adversarial-review`, `status`, `result`, or `cancel` from `opencode:opencode-rescue`. -- Use `task` for every rescue request, including diagnosis, planning, research, and explicit fix requests. -- You may use the `opencode-prompting` skill to rewrite the user's request into a tighter OpenCode prompt before the single `task` call. -- That prompt drafting is the only Claude-side work allowed. Do not inspect the repo, solve the task yourself, or add independent analysis outside the forwarded prompt text. -- Leave `--agent` unset unless the user explicitly requests a specific agent (build or plan). -- Leave model unset by default. Add `--model` only when the user explicitly asks for one, or `--free` when they explicitly ask for a free-tier pick. `--free` and `--model` are mutually exclusive. - -Command selection: -- Use exactly one `task` invocation per rescue handoff. -- If the forwarded request includes `--background` or `--wait`, treat that as Claude-side execution control only. Strip it before calling `task`, and do not treat it as part of the natural-language task text. -- If the forwarded request includes `--model`, pass it through to `task`. -- If the forwarded request includes `--free`, pass it through to `task`. The companion will shell out to `opencode models`, filter for first-party `opencode/*` free-tier entries (`:free` or `-free`), and pick one at random. `--free` is restricted to `opencode/*` because OpenRouter free models have inconsistent tool-use support. -- If the forwarded request includes both `--free` and `--model`, do not invoke `task` — return nothing, because the companion will reject the combination. -- If the forwarded request includes `--agent`, pass it through to `task`. -- If the forwarded request includes `--worktree`, pass it through to `task`. This runs OpenCode in a disposable git worktree so the user can keep or discard the change after the run finishes. -- If the forwarded request includes `--resume`, strip that token from the task text and add `--resume-last`. -- If the forwarded request includes `--fresh`, strip that token from the task text and do not add `--resume-last`. -- `--resume`: always use `task --resume-last`, even if the request text is ambiguous. -- `--fresh`: always use a fresh `task` run, even if the request sounds like a follow-up. - -Safety rules: -- Default to write-capable OpenCode work in `opencode:opencode-rescue` unless the user explicitly asks for read-only behavior. -- Preserve the user's task text as-is apart from stripping routing flags. -- Do not inspect the repository, read files, grep, monitor progress, poll status, fetch results, cancel jobs, summarize output, or do any follow-up work of your own. -- Return the stdout of the `task` command exactly as-is. -- If the Bash call fails or OpenCode cannot be invoked, return nothing. diff --git a/tests/reaper.test.mjs b/tests/reaper.test.mjs new file mode 100644 index 0000000..e791550 --- /dev/null +++ b/tests/reaper.test.mjs @@ -0,0 +1,232 @@ +// Tests for reapServerIfOurs — specifically the in-flight tracked-job +// guard that was added after a background review was killed mid-stream +// by a SessionEnd reap (issue: foreground/background SIGTERM race that +// surfaced as bare `terminated` from undici). +// +// These tests do not touch a real OpenCode server. Instead they: +// +// 1. Spawn a throwaway `sleep` child as the "server" PID so the +// reaper has a live but kill-safe process to target. +// 2. Write server.json and state.json directly under +// CLAUDE_PLUGIN_DATA to set up the tracked state. +// 3. Use a free high port so `isServerRunning` returns false on the +// post-kill verification step. +// +// Tests the matrix: +// - in-flight running job with a live companion PID → reap skipped, +// sleeper still alive +// - no running jobs → reap proceeds, sleeper got SIGTERM +// - stale running job whose companion PID is dead → reap proceeds +// (the stale entry must not block reaping forever) + +import { describe, it, beforeEach, afterEach } from "node:test"; +import assert from "node:assert/strict"; +import fs from "node:fs"; +import os from "node:os"; +import path from "node:path"; +import crypto from "node:crypto"; +import { spawn } from "node:child_process"; + +import { reapServerIfOurs } from "../plugins/opencode/scripts/lib/opencode-server.mjs"; +import { saveState } from "../plugins/opencode/scripts/lib/state.mjs"; + +// PID 999999 is well above pid_max on typical Linux/macOS and is a +// reliably-dead sentinel. +const DEAD_PID = 999_999; + +// Free high port nothing listens on. isServerRunning(TEST_HOST, this) +// should return false so the reaper's post-kill verification can +// conclude the port is empty. +const UNUSED_PORT = 24_099; +const TEST_HOST = "127.0.0.1"; + +function workspaceHash(workspacePath) { + return crypto.createHash("sha256").update(workspacePath).digest("hex").slice(0, 16); +} + +function serverStatePathFor(pluginData, workspace) { + return path.join(pluginData, "state", workspaceHash(workspace), "server.json"); +} + +function writeServerState(pluginData, workspace, state) { + const p = serverStatePathFor(pluginData, workspace); + fs.mkdirSync(path.dirname(p), { recursive: true }); + fs.writeFileSync(p, JSON.stringify(state, null, 2)); +} + +function readServerState(pluginData, workspace) { + const p = serverStatePathFor(pluginData, workspace); + if (!fs.existsSync(p)) return {}; + return JSON.parse(fs.readFileSync(p, "utf8")); +} + +function spawnSleeper() { + // 30s is long enough that the reaper's 1s post-kill poll will catch + // a live process in the in-flight test, but if the kill fires the + // sleeper should exit well within that window. + const proc = spawn("sleep", ["30"], { stdio: "ignore", detached: true }); + proc.unref(); + return proc; +} + +function isPidAlive(pid) { + try { + process.kill(pid, 0); + return true; + } catch { + return false; + } +} + +async function waitUntilDead(pid, timeoutMs = 3_000) { + const deadline = Date.now() + timeoutMs; + while (Date.now() < deadline) { + if (!isPidAlive(pid)) return true; + await new Promise((r) => setTimeout(r, 50)); + } + return !isPidAlive(pid); +} + +describe("reapServerIfOurs in-flight guard", () => { + let workspace; + let pluginData; + let previousPluginData; + let sleeper; + + beforeEach(() => { + workspace = fs.mkdtempSync(path.join(os.tmpdir(), "opencode-reaper-ws-")); + pluginData = fs.mkdtempSync(path.join(os.tmpdir(), "opencode-reaper-data-")); + previousPluginData = process.env.CLAUDE_PLUGIN_DATA; + process.env.CLAUDE_PLUGIN_DATA = pluginData; + sleeper = spawnSleeper(); + }); + + afterEach(async () => { + if (sleeper && sleeper.pid && isPidAlive(sleeper.pid)) { + try { process.kill(sleeper.pid, "SIGKILL"); } catch {} + } + if (previousPluginData == null) delete process.env.CLAUDE_PLUGIN_DATA; + else process.env.CLAUDE_PLUGIN_DATA = previousPluginData; + try { fs.rmSync(workspace, { recursive: true, force: true }); } catch {} + try { fs.rmSync(pluginData, { recursive: true, force: true }); } catch {} + }); + + it("skips reap when a live tracked job is still running", async () => { + // Server tracked, uptime well over the 5-min threshold. + writeServerState(pluginData, workspace, { + lastServerPid: sleeper.pid, + lastServerStartedAt: Date.now() - 10 * 60 * 1000, + }); + // In-flight job pointing at the current test process (which is + // definitely alive) so the in-flight guard triggers. + saveState(workspace, { + config: {}, + jobs: [ + { + id: "inflight-1", + type: "task", + status: "running", + pid: process.pid, + createdAt: new Date().toISOString(), + updatedAt: new Date().toISOString(), + }, + ], + }); + + const reaped = await reapServerIfOurs(workspace, { + host: TEST_HOST, + port: UNUSED_PORT, + }); + + assert.equal(reaped, false, "reaper should skip when in-flight jobs exist"); + assert.equal(isPidAlive(sleeper.pid), true, "sleeper must still be alive"); + + // Server state must not be cleared when reap is skipped. + const after = readServerState(pluginData, workspace); + assert.equal(after.lastServerPid, sleeper.pid); + }); + + it("proceeds when no running jobs reference the server", async () => { + writeServerState(pluginData, workspace, { + lastServerPid: sleeper.pid, + lastServerStartedAt: Date.now() - 10 * 60 * 1000, + }); + // No running jobs — only a completed one, which must not block. + saveState(workspace, { + config: {}, + jobs: [ + { + id: "done-1", + type: "task", + status: "completed", + pid: null, + createdAt: new Date().toISOString(), + updatedAt: new Date().toISOString(), + }, + ], + }); + + const reaped = await reapServerIfOurs(workspace, { + host: TEST_HOST, + port: UNUSED_PORT, + }); + + // We asked the reaper to SIGTERM the sleeper. Give it a moment to die. + const died = await waitUntilDead(sleeper.pid); + assert.equal(died, true, "sleeper must have been killed by the reaper"); + assert.equal(reaped, true, "reaper must report success when nothing was in-flight"); + + // Server state must be cleared after a successful reap. + const after = readServerState(pluginData, workspace); + assert.equal(after.lastServerPid, null); + assert.equal(after.lastServerStartedAt, null); + }); + + it("treats a stale `running` job with a dead PID as not-in-flight", async () => { + writeServerState(pluginData, workspace, { + lastServerPid: sleeper.pid, + lastServerStartedAt: Date.now() - 10 * 60 * 1000, + }); + // A previous companion crashed without marking its job failed. + // The stale entry must not permanently block reaping. + saveState(workspace, { + config: {}, + jobs: [ + { + id: "stale-1", + type: "task", + status: "running", + pid: DEAD_PID, + createdAt: new Date().toISOString(), + updatedAt: new Date().toISOString(), + }, + ], + }); + + const reaped = await reapServerIfOurs(workspace, { + host: TEST_HOST, + port: UNUSED_PORT, + }); + + const died = await waitUntilDead(sleeper.pid); + assert.equal(died, true, "sleeper must be killed — the stale job must not block"); + assert.equal(reaped, true); + }); + + it("does not reap when uptime is below the 5-minute threshold", async () => { + writeServerState(pluginData, workspace, { + lastServerPid: sleeper.pid, + // Fresh server — uptime < 5 min. + lastServerStartedAt: Date.now() - 30 * 1000, + }); + saveState(workspace, { config: {}, jobs: [] }); + + const reaped = await reapServerIfOurs(workspace, { + host: TEST_HOST, + port: UNUSED_PORT, + }); + + assert.equal(reaped, false); + assert.equal(isPidAlive(sleeper.pid), true, "fresh server must not be killed"); + }); +}); diff --git a/tests/safe-command-task.test.mjs b/tests/safe-command-task.test.mjs new file mode 100644 index 0000000..323e0ce --- /dev/null +++ b/tests/safe-command-task.test.mjs @@ -0,0 +1,205 @@ +// Unit tests for safe-command.mjs `task` bridge parsing. +// +// Only exercises the pure argv-building layer (`buildForwardArgs`), not +// the full spawn chain. The bridge's security contract is: given raw +// stdin text, it (1) rejects unknown flags, (2) validates value-flags, +// (3) translates routing aliases, and (4) forwards everything else as a +// single verbatim task-text positional argument. If the output argv +// matches expectations for a given input, the bridge is safe — no shell +// layer exists between here and opencode-companion.mjs. + +import { describe, it } from "node:test"; +import assert from "node:assert/strict"; +import { buildForwardArgs } from "../plugins/opencode/scripts/safe-command.mjs"; + +function task(input) { + return buildForwardArgs("task", input); +} + +describe("safe-command task bridge — happy paths", () => { + it("forwards plain task text as a single positional arg", () => { + assert.deepEqual(task("fix the bug in src/foo.js"), ["task", "fix the bug in src/foo.js"]); + }); + + it("peels --background off the front and keeps task text literal", () => { + assert.deepEqual( + task("--background investigate why sessions leak"), + ["task", "--background", "investigate why sessions leak"] + ); + }); + + it("peels multiple boolean flags in order", () => { + assert.deepEqual( + task("--background --worktree --free do the thing"), + ["task", "--background", "--worktree", "--free", "do the thing"] + ); + }); + + it("translates --resume to --resume-last", () => { + assert.deepEqual( + task("--resume keep going"), + ["task", "--resume-last", "keep going"] + ); + }); + + it("accepts --resume-last directly and emits it once", () => { + assert.deepEqual( + task("--resume-last continue from where we left off"), + ["task", "--resume-last", "continue from where we left off"] + ); + }); + + it("strips --wait (no-op alias for foreground)", () => { + assert.deepEqual( + task("--wait diagnose this"), + ["task", "diagnose this"] + ); + }); + + it("strips --fresh (routing marker — absence of --resume-last already conveys 'fresh')", () => { + assert.deepEqual( + task("--fresh start clean"), + ["task", "start clean"] + ); + }); + + it("forwards --model value after basic validation", () => { + assert.deepEqual( + task("--model openrouter/anthropic/claude-haiku-4.5 fix the bug"), + ["task", "--model", "openrouter/anthropic/claude-haiku-4.5", "fix the bug"] + ); + }); + + it("forwards --agent build and --agent plan", () => { + assert.deepEqual( + task("--agent build implement feature"), + ["task", "--agent", "build", "implement feature"] + ); + assert.deepEqual( + task("--agent plan outline the approach"), + ["task", "--agent", "plan", "outline the approach"] + ); + }); + + it("combines many flags and preserves task text literally", () => { + assert.deepEqual( + task("--background --worktree --resume --model provider/model-x --agent build refactor the worker"), + [ + "task", + "--background", + "--worktree", + "--resume-last", + "--model", + "provider/model-x", + "--agent", + "build", + "refactor the worker", + ] + ); + }); +}); + +describe("safe-command task bridge — task text is shell-insulated", () => { + it("passes apostrophes through literally", () => { + assert.deepEqual( + task("--background what's broken here"), + ["task", "--background", "what's broken here"] + ); + }); + + it("passes shell metacharacters through literally as one string", () => { + const taskText = "fix `rm -rf /tmp/x`; echo $(whoami) > /etc/passwd | tee /dev/null & \"oops\""; + assert.deepEqual(task(taskText), ["task", taskText]); + }); + + it("passes newlines through literally", () => { + const taskText = "line one\nline two\nline three"; + assert.deepEqual(task(taskText), ["task", taskText]); + }); + + it("passes leading task text that contains quotes without tokenizing", () => { + const taskText = `he said "this is broken" and she said 'also broken'`; + assert.deepEqual(task(taskText), ["task", taskText]); + }); + + it("preserves internal whitespace in task text after the first non-flag token", () => { + assert.deepEqual( + task("--worktree fix spacing here"), + ["task", "--worktree", "fix spacing here"] + ); + }); + + it("does not re-tokenize dashes in the middle of task text", () => { + assert.deepEqual( + task("check --verbose flag handling in the cli"), + ["task", "check --verbose flag handling in the cli"] + ); + }); + + it("handles task text that starts with a hyphen after a known flag", () => { + // `-x` isn't a recognized flag (not `--`-prefixed) so it becomes + // part of the task text verbatim. + assert.deepEqual( + task("--background -x is not a flag, keep it in the task text"), + ["task", "--background", "-x is not a flag, keep it in the task text"] + ); + }); +}); + +describe("safe-command task bridge — rejection paths", () => { + it("rejects unknown double-dash flags", () => { + assert.throws( + () => task("--evil-flag do something"), + /Unsupported rescue flag: --evil-flag/ + ); + }); + + it("rejects --model without a value", () => { + assert.throws(() => task("--model"), /--model requires a value/); + }); + + it("rejects --model with an invalid value", () => { + assert.throws( + () => task("--model bad;value do things"), + /--model value must match/ + ); + }); + + it("rejects --agent without a value", () => { + assert.throws(() => task("--agent"), /--agent requires a value/); + }); + + it("rejects --agent with a disallowed value", () => { + assert.throws( + () => task("--agent evil run unsafe things"), + /--agent value must be 'build' or 'plan'/ + ); + }); + + it("rejects --free combined with --model", () => { + assert.throws( + () => task("--free --model provider/model fix it"), + /--free and --model are mutually exclusive/ + ); + assert.throws( + () => task("--model provider/model --free fix it"), + /--free and --model are mutually exclusive/ + ); + }); + + it("returns just the subcommand when stdin is empty", () => { + // A completely empty payload is allowed at the bridge layer. + // opencode-companion.mjs's handleTask is responsible for rejecting + // no-task-text invocations; that's covered by its own tests. + assert.deepEqual(task(""), ["task"]); + }); +}); + +describe("safe-command task bridge — command dispatch", () => { + it("still rejects unknown top-level subcommands", () => { + assert.throws( + () => buildForwardArgs("not-a-command", ""), + /Unsupported safe command: not-a-command/ + ); + }); +});