Skip to content

fix(terminal): close SIGKILL race for #217 + preserve scrollback on worktree sleep#1454

Merged
Jinwoo-H merged 2 commits intomainfrom
Jinwoo-H/terminal-history-minimal-fix
May 6, 2026
Merged

fix(terminal): close SIGKILL race for #217 + preserve scrollback on worktree sleep#1454
Jinwoo-H merged 2 commits intomainfrom
Jinwoo-H/terminal-history-minimal-fix

Conversation

@Jinwoo-H
Copy link
Copy Markdown
Contributor

@Jinwoo-H Jinwoo-H commented May 5, 2026

Summary

Two independent terminal-history bugs fixed in one minimal change set (~470 LoC across 28 files). Implements DESIGN_DOC_TERMINAL_HISTORY_FIX_V2.md, which replaces PR #1375 (V1 deterministic-id approach, ~1650 LoC).

Bug 1 — Issue #217: terminal scrollback lost on force-quit/crash

Root cause. Two debounce layers (~450 ms total) sit between "renderer learns the new ptyId" and "ptyId on disk in `orca-data.json`". A SIGKILL inside that window orphans the mapping, so cold-restore can't find the daemon's history dir.

Fix. When `provider.spawn` resolves inside the `pty:spawn` IPC handler, the main process now patches the workspace session in memory and triggers a sync flush via `Store.persistPtyBinding` before returning the spawn result to the renderer. Renderer cannot observe spawn-success without the binding already being durable on disk.

Bug 2 — orca-internal#235: worktree sleep wipes scrollback

Root cause. `shutdownWorktreeTerminals` → `pty:kill` → `DaemonPtyAdapter.shutdown` calls `historyManager.removeSession(id)` for every kill, deleting the on-disk `terminal-history//` directory. Sleep was meant to be reversible.

Fix.

  • Add `opts: { immediate?, keepHistory? }` to `IPtyProvider.shutdown`. Sleep passes `keepHistory: true`; daemon adapter skips both `removeSession` and the killed-session tombstone (wake legitimately reattaches).
  • Add `keepIdentifiers` flag to `shutdownWorktreeTerminals`. When set, preserves `tab.ptyId`, `ptyIdsByLeafId`, `lastKnownRelayPtyIdByTabId`, `runtimePaneTitlesByTabId`, and `pendingCodexPaneRestartIds` (per-field disposition table in design doc §3.3.c).
  • Capture serializer buffers into `buffersByLeafId` before issuing `pty.kill`. Load-bearing for SSH (no on-disk history dir on the relay) and defense-in-depth for local. `shutdownBufferCaptures` is now `Map<tabId, fn>` so sleep iterates only the affected worktree.
  • Plumb `keepHistory` through the SSH provider and relay (additive wire-format change; older relays ignore unknown fields).

Why V2 over V1 (#1375)

Property V1 (deterministic-id + reaper) V2 (sync flush + keepHistory)
Race window after spawn 0 ms ~5 ms (sync flush)
Survives SIGKILL inside flush yes no (theoretical)
New code ~1650 LoC ~470 LoC
Orphan-dir lifecycle needs reaper existing `removeSession`
Cross-restart `pane:N` collision needs counter advance none
Sleep regression fix not addressed included

The 5 ms window requires SIGKILL between "main started flush" and "main completed flush" — sub-event-loop-tick on a laptop SSD. If post-ship telemetry shows non-zero rate, V1 is the next escalation.

Design review

3 review iterations via `/auto-design-review-fix`:

  • iter 1: 18 findings → 4 P1 validated, 5 severity-adjusted, 9 invalidated (50% — healthy adversarial review)
  • iter 2: 2 P1 (F-SSH sleep-time capture, F1 preload signatures) → fixed
  • iter 3: 3 P1 (App.tsx iteration site, ordering invariant, per-field disposition) → fixed
  • iter 4: 🟢 READY

Files changed (28)

Provider signature (positional bool → opts object): `types.ts`, `daemon-pty-adapter`, `daemon-pty-provider`, `daemon-pty-router`, `local-pty-provider`, `ssh-pty-provider`. IPC + persistence: `pty.ts` (spawn + kill + handler signature), `pty-management.ts`, `worktree-teardown.ts`, `persistence.ts` (new `persistPtyBinding`), `attach-main-window-services.ts`. Preload: `index.ts`, `api-types.ts` (additive `tabId`/`leafId` on spawn, `keepHistory` on kill). Renderer: `pty-dispatcher.ts`, `pty-transport.ts`, `pty-connection.ts`, `TerminalPane.tsx` (Set→Map), `App.tsx` (`.values()`), `store/slices/terminals.ts` (keepIdentifiers + per-field disposition + buffer capture), `sleep-worktree-flow.ts`. Tests: 4 test files updated for the new shutdown signature; SSH-provider test adds keepHistory forwarding assertion.

Verification

  • `pnpm run typecheck` (node + cli + web) — clean
  • `pnpm run lint` — 0 errors (6 pre-existing warnings unchanged)
  • Targeted test suite — 239/239 pass (pty, daemon-adapter, daemon-provider, local-provider, ssh-provider, pty-management, worktree-teardown, sleep-worktree-flow, persistence)
  • Full suite — 17 unrelated failures (better-sqlite3 NODE_MODULE_VERSION on Node 25 + env-dependent shell-ready tests + git ENOENT in orca-runtime); confirmed pre-existing on `main`

Closes

Made with Orca 🐋

Jinwoo-H and others added 2 commits May 5, 2026 16:54
…orktree sleep

Two independent terminal-history bugs fixed in one minimal change set
(~470 LoC across 28 files), per DESIGN_DOC_TERMINAL_HISTORY_FIX_V2.md.

Issue #217 — terminal scrollback lost on force-quit/crash:
  pty:spawn now sync-flushes the (worktreeId, tabId, leafId → ptyId)
  binding via Store.persistPtyBinding before returning, so the renderer
  cannot observe a spawn-success without the binding already durable on
  disk. Closes the ~450ms race between provider.spawn resolving and the
  renderer's debounced session writer that previously orphaned the
  daemon's history dir on SIGKILL inside that window.

Worktree sleep regression (orca-internal#235) — scrollback wiped on wake:
  Adds an opts object to IPtyProvider.shutdown carrying keepHistory and
  threads it from the renderer sleep flow through pty:kill IPC, the
  daemon adapter, the SSH provider, and the relay. When set, the daemon
  adapter skips both historyManager.removeSession and the killed-session
  tombstone — wake legitimately reattaches.

  shutdownWorktreeTerminals also gains a keepIdentifiers flag, which
  preserves tab.ptyId, ptyIdsByLeafId, lastKnownRelayPtyIdByTabId,
  runtimePaneTitlesByTabId, and pendingCodexPaneRestartIds across sleep,
  and triggers a per-tab serializer-buffer capture into
  buffersByLeafId before issuing pty.kill. The capture is load-bearing
  for SSH (no on-disk history dir on the relay) and defense-in-depth
  for local. shutdownBufferCaptures becomes Map<tabId, fn> so sleep can
  iterate only the affected worktree's tabs.

Provider signature change ripples to 11 call sites + 4 test files
(positional boolean → opts object). Schema additions are additive at
the IPC and relay layers; older relays ignore unknown fields.

Co-authored-by: Orca <help@stably.ai>
…-runtime test

CI-revealed issues from PR #1454:

1. createTerminalSlice import cycle. The terminals store slice imported
   shutdownBufferCaptures from TerminalPane.tsx, which (transitively)
   imports the store, breaking createTerminalSlice at module-init time
   and crashing every test that hydrates the store. Move the registry
   into a leaf module (shutdown-buffer-captures.ts) with no imports;
   TerminalPane and App.tsx now import from there directly.

2. orca-runtime test missed in the shutdown signature sweep — assertion
   still expected the old positional boolean instead of the opts object.

Co-authored-by: Orca <help@stably.ai>
@Jinwoo-H Jinwoo-H merged commit 30b6aaf into main May 6, 2026
2 checks passed
@Jinwoo-H Jinwoo-H deleted the Jinwoo-H/terminal-history-minimal-fix branch May 6, 2026 02:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant