V0.1.3/m2/publish and cross os verify#10
Open
Vedansi18 wants to merge 53 commits into
Adds the new src/agents/ module: four adapter interfaces (HookAdapter,
VSCodeExtensionAdapter, CLIWrapAdapter, BrowserExtensionAdapter), an
in-process registry (registerAdapter, detectAll, getAdapter), and an
empty index.ts placeholder for future adapter registrations. Unit tests
in registry.test.ts cover the registry behaviour.
Adds src/cli/commands/install.snapshot.test.ts plus its generated
baseline snapshot. The snapshot captures current installAction output
(settings.json bytes + stdout) with $HOME and platform-dependent strings
normalised so the snapshot is portable across machines. This is the
zero-diff safety net for M1 Branch 2 (claude-code refactor): that branch
must keep this snapshot byte-identical.
No existing source code is modified.
Per dev plan §1.6 in reviewduel-submodule.
Branch: v0.1.3/m1/foundation-scaffold
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
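A minimal sketch of what an in-process registry with `registerAdapter`, `detectAll`, and `getAdapter` could look like. Only those three function names come from the commit message; the `AgentAdapter` and `InstallContext` shapes, the duplicate-id check, and the `undefined` return for unknown ids are assumptions made for illustration, not the module's actual contract.

```typescript
// Hypothetical minimal shapes; the real interfaces live in src/agents/.
interface InstallContext {
  home: string;
}

interface AgentAdapter {
  id: string;
  detect(ctx: InstallContext): Promise<boolean>;
}

// Module-level map keeps the registry in-process (no I/O, no persistence).
const adapters = new Map<string, AgentAdapter>();

/** Adds an adapter; rejecting duplicate ids is an assumption of this sketch. */
function registerAdapter(adapter: AgentAdapter): void {
  if (adapters.has(adapter.id)) {
    throw new Error(`adapter already registered: ${adapter.id}`);
  }
  adapters.set(adapter.id, adapter);
}

/** Looks up one adapter by id, or undefined when nothing is registered. */
function getAdapter(id: string): AgentAdapter | undefined {
  return adapters.get(id);
}

/** Returns every registered adapter whose detect() resolves true. */
async function detectAll(ctx: InstallContext): Promise<AgentAdapter[]> {
  const found: AgentAdapter[] = [];
  for (const adapter of Array.from(adapters.values())) {
    if (await adapter.detect(ctx)) found.push(adapter);
  }
  return found;
}
```

Keeping the map private to the module means an adapter can self-register through a side-effect import, which is the pattern the later commits describe.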
Moves the six Claude Code hook helpers (getClaudeSettingsPath,
buildHookCommand, buildStopHookCommand, buildHookEntry, writeHookEntry,
removeHookEntry) from src/cli/commands/install.ts to
src/agents/adapters/claude-code.ts. Function bodies are byte-identical.
install.ts re-exports them so existing imports (and install.test.ts)
continue to work unchanged.
Adds claudeCodeAdapter (HookAdapter) that wraps the moved functions and
self-registers via src/agents/index.ts side-effect import.
installAction's Claude Code branch in the for-loop now delegates to the
adapter via getAdapter('claude-code').install(ctx).
Adds optional settingsPath override to InstallContext so callers can
decouple the target file path from ctx.home — preserves the pre-refactor
pattern where paths.claudeSettings was passed independently of homedir()
(used by install.test.ts to inject custom tmp paths without stubbing
HOME). Without this, tests would write hook entries to the real
~/.claude/settings.json instead of their tmp dir.
Adds src/agents/adapters/claude-code.test.ts (18 unit tests) covering
the moved helpers + adapter contract (detect, settingsPath, buildHooks,
install, uninstall) + the settingsPath override behaviour.
Zero-diff invariant preserved: install snapshot from M1 Branch 1 remains
byte-identical. All 177 relevant tests pass. typecheck clean.
Branch: v0.1.3/m1/claude-code-refactor (off v0.1.3/m1/foundation-scaffold,
which sits on upstream/user-experience-improvements-sub-7).
Per dev plan §3.0 in reviewduel-submodule.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The original install.ts comment was a single line:
// Register the advisory pipeline hook (separate from MCP — different file)
The previous M1/B2 commit (d93852e) expanded this into a four-line
comment explaining the adapter delegation. Per team feedback, comments
on existing pre-refactor code should be kept verbatim: the §1.5 strict
zero-diff guarantee includes comments on existing code.
No behavioural change. Tests + snapshot unchanged (177/177 pass, install
snapshot remains byte-identical with M1 Branch 1's baseline).
Branch: v0.1.3/m1/claude-code-refactor.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Branch 1 of Milestone M2 (v0.1.3/m2/extension-skeleton). Establishes the
src/ext-vscode/ sub-package with an esbuild-driven build pipeline
(ESM source -> CJS bundle for the VS Code host), activates on
onStartupFinished, and ships the four scoped modules:
M1 - Skeleton: package.json (activationEvents, activity-bar container +
placeholder view backed by viewsWelcome so the icon actually renders),
tsconfig.json, esbuild.config.mjs, src/extension.ts entrypoint.
M5 - IPC stub: src/ipc.ts. spawnAuto(prompt, sessionId) and
spawnStop(sessionId) spawn the nexpath CLI as subprocesses and parse
the decision-session JSON payload from stdout, with typed errors
(NexpathBinaryNotFoundError, NexpathMalformedPayloadError) and
configurable binary-path resolution
(opts.binaryPath -> NEXPATH_BIN env -> 'nexpath' on PATH).
The exact stdin envelope vs. Layer C input contract is intentionally
a stub here; Branch 4 (cursor-windsurf-adapters) finalises it.
M11 - Onboarding: src/onboarding.ts. First-launch consent toast persists
the user's choice to globalState; on macOS, additionally shows a
Full-Disk-Access guidance toast that deep-links to the System
Settings privacy pane.
M12 - Icon: media/icon.svg. Y-fork (branching path) representing
"next path" decision points; monochrome currentColor, scalable.
25 unit tests co-located alongside source (8 onboarding, 11 ipc, 6 extension),
runnable via root vitest with vi.mock('vscode') stubs. Sub-package has its
own tsconfig + package-lock; root tsconfig now excludes src/ext-vscode/ so
each side owns its TS build. Both root and sub-package tsc --noEmit are
clean. Full root test suite: 1851 passing + 18 pre-existing unrelated
TtySelectFn Windows-sim failures (carried forward from dev plan §3.0).
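The binary-path resolution order described for the IPC stub (opts.binaryPath -> NEXPATH_BIN env -> 'nexpath' on PATH) can be sketched as a small pure function. The function name `resolveNexpathBinary` and the explicit `env` parameter are hypothetical, chosen so the sketch runs without touching the real process environment; the actual src/ipc.ts may be structured differently.

```typescript
// Hypothetical option bag; only the binaryPath field is named in the commit.
interface ResolveOptions {
  binaryPath?: string;
}

/**
 * Resolves the nexpath binary in priority order:
 *   1. explicit opts.binaryPath
 *   2. NEXPATH_BIN from the supplied environment
 *   3. bare "nexpath", leaving lookup to PATH at spawn time
 */
function resolveNexpathBinary(
  opts: ResolveOptions,
  env: Record<string, string | undefined>,
): string {
  if (opts.binaryPath) return opts.binaryPath;
  const fromEnv = env["NEXPATH_BIN"];
  if (fromEnv) return fromEnv;
  return "nexpath";
}
```

Passing the environment in as an argument mirrors the dependency-injection style the tests in this branch rely on.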
Deferred (flagged for follow-up, not blockers for this branch):
- 5 moderate npm-audit warnings in the esbuild -> vite -> vitest dev chain
(dev-only; will be addressed during M5 hardening).
- IPC stdin envelope contract: real wiring lands in Branch 4.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Branch 2 of Milestone M2 (v0.1.3/m2/chat-history-capture). Stacked on
M2 Branch 1 (commit 879ed5e). Adds the three scoped modules:
M2 - chat-history-watcher.ts: fs.watch on Cursor's state.vscdb and
Windsurf's ~/.codeium/windsurf/ dir, debounced (default 250ms), reads
ItemTable via injectable readItemTableFn (sql.js by default), diffs
against seenSignatures, emits {prompt, rawSessionId, capturedAt,
sourcePath, extractorId}. Dependency-injectable throughout (watchFn,
readFileFn, readItemTableFn, nowFn) so the unit tests run without
sql.js or real fs.watch.
M3 - extractors/: four per-version row decoders implementing the
ChatHistoryExtractor contract from chat-history-types.ts.
- cursor-v2024-q4 (aiService.prompts global key, pre-Composer)
- cursor-v2025-q1 (composerData.composerData, Composer era)
- cursor-v2025-q2 (cursorAIChatService.chatHistory.<tabId> per-tab
keys, current)
- windsurf (cascade.* placeholder; real Windsurf decoding lands in
Branch 4 alongside windsurfAdapter)
Each Cursor extractor handles both `role`/`type` and `content`/`text`
field variants seen across minor versions. All four are TODO-flagged
for verification against real dumps before Branch 6 publishes;
scripts/dump-cursor-state.ts (below) captures those dumps.
M4 - pickExtractor in extractors/index.ts: prefix-match each
extractor's fingerprintKeys against the observed ItemTable keys, pick
the highest match count (ties broken by registry order = newest
first). Returns FingerprintResult; unknown schemas surface
observedSampleKeys for the "schema unknown" toast hook.
scripts/dump-cursor-state.ts: dev-only helper (npx tsx) for capturing
state.vscdb fixtures from a machine with Cursor installed. Filters to
chat-related key prefixes, optional --redact for sensitive content.
Outputs to src/ext-vscode/test-fixtures/state-vscdb-samples/.
Sub-package additions:
- dependencies: sql.js ^1 (runtime; loaded via dynamic import so wasm
boot is lazy). Marked external in esbuild so the .vsix ships
node_modules/sql.js rather than inlining it.
- devDependencies: tsx ^4 (for running the dump script).
57 new unit tests (sub-package totals: 82 passing across 9 files):
- cursor-v2024-q4: 9 tests
- cursor-v2025-q1: 10 tests
- cursor-v2025-q2: 11 tests
- windsurf: 4 tests
- extractors/index: 12 tests
- chat-history-watcher: 11 tests
Verification: root tsc --noEmit clean; sub-package tsc --noEmit clean;
sub-package vitest 82/82 pass; full root test suite 1908 passing + 18
pre-existing TtySelectFn Windows-sim failures (carried forward from M1
3.0, unrelated); esbuild bundle still builds out/extension.js.
Deferred to follow-up (flagged, not blockers):
- Real-dump verification of all 4 extractors (use dump-cursor-state.ts
on machines with each Cursor version installed; replace TODO comments
in extractors with fixture-driven regression tests).
- Windsurf JSON-file decoder (Branch 4).
- Wiring the watcher into extension.ts activate() (Branch 3 webview-ui
or Branch 4 adapters, depending on UI surface integration).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
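The fingerprinting rule described for pickExtractor (prefix-match, highest match count, ties broken by registry order) could be sketched as below. The `Extractor` and `FingerprintResult` field names other than `fingerprintKeys` and `observedSampleKeys` are hypothetical, and "match count" is interpreted here as the number of fingerprint prefixes that match at least one observed key, which is an assumption; the real implementation may count differently.

```typescript
// Hypothetical shapes; only fingerprintKeys and observedSampleKeys are named in the commit.
interface Extractor {
  id: string;
  fingerprintKeys: string[];
}

interface FingerprintResult {
  extractor: Extractor | null;
  matchCount: number;
  observedSampleKeys: string[];
}

/**
 * Picks the extractor whose fingerprint prefixes match the most observed keys.
 * The registry is assumed to be ordered newest first, so using a strict ">"
 * comparison keeps the earlier (newer) extractor on ties.
 */
function pickExtractor(registry: Extractor[], observedKeys: string[]): FingerprintResult {
  let best: Extractor | null = null;
  let bestCount = 0;
  for (const candidate of registry) {
    const count = candidate.fingerprintKeys.filter((prefix) =>
      observedKeys.some((key) => key.startsWith(prefix)),
    ).length;
    if (count > bestCount) {
      best = candidate;
      bestCount = count;
    }
  }
  return {
    extractor: best,
    matchCount: bestCount,
    // For unknown schemas, surface a few keys so a caller can report them.
    observedSampleKeys: best === null ? observedKeys.slice(0, 5) : [],
  };
}
```

Returning a result object rather than throwing lets the caller decide how to surface an unknown schema.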
Real-machine inspection on Cursor 3.4.20 (2026-05-15) surfaced three issues with the Branch 2 extractor designs. This commit fixes the verifiable ones, captures redacted fixtures, and documents the still- unknown bits for the next round. Issue 1 — SQLite WAL mode. The dump script previously used sql.js, which only reads the buffer of the main `.vscdb` file. Live Cursor writes go to the sibling `.vscdb-wal` (185 KB while the main file was 4 KB), so sql.js saw "no such table: ItemTable" even though the table exists. Fix: switched the dump script to better-sqlite3 (native, WAL-aware). Copies main + wal + shm siblings to a tmp staging dir before reading so the live Cursor write path is never touched, then runs `PRAGMA wal_checkpoint(TRUNCATE)` on the staged copy for consistency. The PRODUCTION watcher in `chat-history-watcher.ts` still uses sql.js via dynamic import; the same WAL problem will surface when Branch 4 wires the watcher live. Flagged for Branch 4 design — options are: (a) switch the watcher to better-sqlite3 (native binding in .vsix), or (b) implement copy + checkpoint via sql.js. Out of scope for B2. Issue 2 — `cursor-v2025-q1` extractor's fingerprint key was wrong. Community docs said `composerData.composerData`; Cursor 3.4.20 actually uses `composer.composerData`. Updated the key in both the extractor and its tests + the fingerprint test. Open finding: the `composer.composerData` value on a chat-less Cursor 3.4.20 workspace DB is metadata only (selectedComposerIds, migration flags) — not the conversation messages this extractor's decodeRow logic parses for. Logic falls through cleanly (returns [] when the expected `allComposers` field is absent) and the JSDoc now documents that the real Composer message storage location is still TBD and needs a post-chat snapshot to confirm. Issue 3 — `cursor-v2025-q2` extractor's fingerprint prefix (`cursorAIChatService.chatHistory.`) was NOT observed on Cursor 3.4.20. 
The extractor still ships (in case older versions use it) but the JSDoc now flags this as unverified and points to the dump script for capturing a real fixture before Branch 6 ships. Dump script additions: - Discovers ALL state.vscdb under Cursor's config tree (global + per-workspace) — chat messages live in the workspace DB, not global. - Dumps both `ItemTable` (filtered to chat-related key prefixes) AND `cursorDiskKV` (Cursor 3.x's parallel KV table; currently empty but may hold Composer messages once chats happen). - One output JSON per discovered DB; suffixed with `global` or `workspace-<id>` for traceability. - `--redact` replaces string values > 8 chars with same-length asterisks. Dependencies: - Added better-sqlite3 ^11 + @types/better-sqlite3 ^7 as devDependencies in the sub-package. Dev-only — the production extension bundle is unaffected. Captured fixtures (redacted) — all three DBs from a chat-less Cursor 3.4.20 session, committed for regression testing: - cursor-3-4-20-initial-global.json (9 rows) - cursor-3-4-20-initial-workspace-1778826246907.json (7 rows) - cursor-3-4-20-initial-workspace-empty-window.json (2 rows) Verification: - Sub-package tsc --noEmit clean. - Sub-package vitest 82/82 pass. - Root tsc --noEmit clean. - Full root test suite 1908 passing + 18 pre-existing TtySelectFn carry-forward. Next step (manual, user-driven): submit a real prompt in Cursor's Ask mode AND in Composer mode, then re-run the dump script to capture a chat-bearing snapshot. The new keys / tables that appear will pin down the Composer-mode message storage location, and a follow-up commit will finalise the extractor decode logic against that real data. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes the unit-test audit gap surfaced for M2 Branch 2. The dump script
had real business logic (`redactValue`, `shouldKeepItemTable`,
`parseArgs`, `cursorConfigRoot`, `discoverAllStateVscdb`) with zero test
coverage — `redactValue` in particular has data-leak consequences if
buggy.
Extracted the pure / near-pure helpers into a new module:
- `src/cursor-state-dump-helpers.ts` — lives under tsconfig rootDir
so it's typechecked by the sub-package's main `tsc --noEmit` and
auto-picked-up by vitest. Re-exports `KEEP_ITEMTABLE_PREFIXES`,
`shouldKeepItemTable`, `redactValue`, `cursorConfigRoot`,
`discoverAllStateVscdb` (with injectable fs helpers), and
`parseArgs` (returns a tagged-union result instead of calling
`process.exit`, so the error paths are testable).
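A sketch of the tagged-union idea behind `parseArgs`: the parser returns a value describing success, a help request, or an error, and the script entry point decides whether to exit. The flag names (`--name`, `--src`, `--redact`, `--help`, `-h`) are the ones this commit's test list mentions; the `kind` discriminator, the field names, and the error wording are hypothetical.

```typescript
// Hypothetical result type; the real helper's union members may be named differently.
type ParseResult =
  | { kind: "ok"; name: string; src: string | undefined; redact: boolean }
  | { kind: "help" }
  | { kind: "error"; message: string };

function parseArgs(argv: string[]): ParseResult {
  let name: string | undefined;
  let src: string | undefined;
  let redact = false;

  for (let i = 0; i < argv.length; i++) {
    const arg = argv[i];
    if (arg === "--help" || arg === "-h") return { kind: "help" };
    if (arg === "--redact") {
      redact = true;
      continue;
    }
    if (arg === "--name" || arg === "--src") {
      const value = argv[i + 1];
      // A following flag or end of input means the value is missing.
      if (value === undefined || value.startsWith("-")) {
        return { kind: "error", message: `missing value for ${arg}` };
      }
      if (arg === "--name") name = value;
      else src = value;
      i++; // skip the consumed value
      continue;
    }
    return { kind: "error", message: `unknown argument: ${arg}` };
  }

  if (name === undefined) return { kind: "error", message: "--name is required" };
  return { kind: "ok", name, src, redact };
}
```

Because no branch calls `process.exit`, every error path can be asserted on directly in a unit test.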
Co-located tests: `src/cursor-state-dump-helpers.test.ts` — 28 tests
covering:
- `shouldKeepItemTable`: each default prefix matched, unrelated keys
dropped, custom prefix lists, prefix-not-exact match.
- `redactValue`: short-string preservation, long-string redaction,
nested object/array recursion, non-string value preservation, bulk
redact for non-JSON input, JSON-string root, exact 9-char boundary.
- `cursorConfigRoot`: linux / darwin / win32 / unknown-platform paths
and APPDATA fallback.
- `discoverAllStateVscdb`: empty tree, global-only, global + multiple
workspaces, skip workspace dirs missing the DB, injectable fs.
- `parseArgs`: required `--name`, optional `--src` / `--redact`,
`--help` / `-h` signal, missing-value rejection, unknown-argument
rejection.
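The redaction cases listed above suggest roughly the following behaviour for `redactValue`. This is a sketch under assumptions, not the helper's actual source: the 9-character threshold follows from the earlier "string values > 8 chars" rule, while treating non-JSON input as one string subject to the same threshold, and the internal helper names, are guesses.

```typescript
// Strings longer than 8 characters are replaced; 9 is the first redacted length.
const REDACT_MIN_LENGTH = 9;

/** Replaces a long string with asterisks of the same length. */
function redactString(text: string): string {
  return text.length >= REDACT_MIN_LENGTH ? "*".repeat(text.length) : text;
}

/** Walks parsed JSON, redacting string leaves and preserving everything else. */
function redactParsed(value: unknown): unknown {
  if (typeof value === "string") return redactString(value);
  if (Array.isArray(value)) return value.map(redactParsed);
  if (value !== null && typeof value === "object") {
    const source = value as Record<string, unknown>;
    const out: Record<string, unknown> = {};
    for (const key of Object.keys(source)) out[key] = redactParsed(source[key]);
    return out;
  }
  return value; // numbers, booleans, null
}

/** Redacts a raw stored value: JSON is walked, anything else is treated as one string. */
function redactValue(raw: string): string {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch (_err) {
    return redactString(raw); // assumption: non-JSON input is redacted in bulk
  }
  return JSON.stringify(redactParsed(parsed));
}
```

Preserving string length keeps fixtures structurally comparable while hiding content.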
Script entry-point `scripts/dump-cursor-state.ts` now imports from
`../src/cursor-state-dump-helpers.js` and retains only the I/O
orchestration (file copy to tmp staging dir, better-sqlite3 read, fixture
write). Behaviour is byte-for-byte unchanged — verified by re-running
against the live machine and producing identical row counts to the
previous commit (`3794bc3`).
Sub-package totals:
- Test files: 10 (was 9)
- Tests: 110 passing (was 82) — +28 helpers tests
- Sub-package tsc --noEmit clean
- Root tsc --noEmit clean
- Full root suite: 1936 passing + 18 pre-existing TtySelectFn
Windows-sim failures (M1 §3.0 carry-forward, unrelated)
Closes the only remaining audit gap for M2 Branch 2. No further unit-test
work pending; per the auto-commit rule the branch is now closed pending
push.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Branch 3 of Milestone M2 (v0.1.3/m2/webview-ui). Stacked on M2 Branch 1 (commit 879ed5e) — does NOT depend on M2 B2's watcher, only on B1's skeleton + the DecisionSessionPayload type from ipc.ts. Delivers the three scoped modules: M6 — WebviewViewProvider: src/webview/view-provider.ts. NexpathDecisionSessionViewProvider implements vscode.WebviewViewProvider for the nexpath.status activity-bar view. resolveWebviewView wires webview.options (enableScripts + localResourceRoots), sets initial HTML, registers onDidReceiveMessage + onDidDispose. publishPayload stores the payload, updates the HTML, and calls webviewView.show(true) for the auto-reveal UX matching architecture rev 2 §4. Payload survives view dispose/re-show. Injectable onSelect dependency for tests. Exposes getCurrentPayload() + handleMessage() for direct message-routing tests. M7 — HTML template: src/webview/html.ts. renderDecisionSessionHtml(payload, opts) — pure function, no I/O. Returns the full self-contained HTML for the webview. Two modes: empty/watching state (no scripts, just "Nexpath is active…") and populated state (advisory + numbered option buttons + dismiss). CSP: default-src 'none' with nonce-scoped scripts. All user-controlled strings HTML-escaped. Theming via --vscode-* CSS variables so the UI inherits Cursor/Windsurf's theme. Tests verify both states, nonce handling, HTML escaping (incl. <script> + onerror= injection attempts), and empty-options array. M8 — Prompt injection: src/webview/prompt-injection.ts. handleOptionSelection writes the selected option text to the system clipboard via vscode.env.clipboard.writeText + shows a non-modal info toast directing the user to paste. This is the ONLY reliable path — VS Code text-editing APIs target editor documents, not the host's (Cursor's) chat input panel (dev plan §2.4). Branch 4 may discover a Cursor-specific command that lets us write directly; until then clipboard + toast is the primary path. Injectable deps for tests. 
extension.ts updates: - Registers the view provider on activate via vscode.window.registerWebviewViewProvider(VIEW_ID, instance). - Pushes the registration disposable onto context.subscriptions for cleanup on deactivate. - Holds the provider at module scope; exposes via getViewProvider() so Branch 4's adapter wiring can publish payloads. - View provider registration runs BEFORE onboarding so the icon shows immediately on activation, even while consent toasts are open. - Onboarding errors still swallowed (per existing B1 behaviour). package.json updates: - nexpath.status view now declares "type": "webview" (was implicit tree). - viewsWelcome entry removed — webview-type views render their own empty state from inside the webview HTML, not via viewsWelcome. The empty state in renderDecisionSessionHtml replaces it. 38 new unit tests: - html.test.ts: 13 (escapeHtml + empty state + populated state + nonce + HTML escaping in advisory/options + empty options array) - view-provider.test.ts: 14 (VIEW_ID + resolveWebviewView × 4 + publishPayload × 3 + clearPayload + handleMessage × 5) - prompt-injection.test.ts: 6 (clipboard write + toast + error paths + DI + empty string) - extension.test.ts: +5 (registration test + subscriptions push + getViewProvider + onboarding-rejects-but-view-still-registered + the deactivate clears viewProvider) Sub-package totals at branch HEAD: 63 tests across 6 files (was 25 in B1, +38 here). Root tsc + sub-package tsc clean. Full root test suite 1889 passing + 18 pre-existing TtySelectFn carry-forward (unrelated). Esbuild bundle grew from 3.4 KB → 11.0 KB (includes the new webview modules + their CSS template strings). Deferred (flagged, not blockers for this branch): - Pre-prompt blocking on Cursor/Windsurf — current architecture only shows guidance after the host sends the prompt. Pre-send blocking would need a keybinding hijack (architecture doc §11 open question 7). 
- Cursor-specific "write to chat input" command — discover in Branch 4 if it exists, otherwise clipboard + toast remains the only path. - E2E test against a real Cursor instance — Branch 5 (smoke-test) gate. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two refinements after cross-confirmation review against the dev plan +
a read of the Layer C TTY UI in src/decision-session/TtySelectFn.ts.
## 1. injectFn contract: addresses Drift hi0001234d#3 (primary text-editing path)
prompt-injection.ts now defines:
- `OptionInjector = (text: string) => Promise<boolean>`: the contract
for a direct-injection function (agent-specific, lives in B4).
- `PromptInjectionDeps.injectFn?`: optional adapter-supplied injector.
B3 default is absent → clipboard fallback always wins.
handleOptionSelection now has two paths:
1. If `deps.injectFn` is provided AND `injectFn(text)` resolves true:
skip clipboard. Text is in the chat input. Done.
2. Otherwise (injectFn absent, returned false, or threw): fall back to
clipboard + info toast.
B4 (cursor-windsurf-adapters / M9 + M10) will:
- Discover Cursor / Windsurf command ids that write text to the AI chat
input (via `vscode.commands.getCommands(true)`).
- Implement `cursorChatInputInject` / `windsurfChatInputInject` of type
OptionInjector.
- Pass them through the view-provider constructor's onSelect arg as:
const onSelect = (text) =>
handleOptionSelection(text, { injectFn: cursorChatInputInject });
Decision saved to memory at
~/.claude/projects/-home-emptyops-Documents-Vedanshi-NexPathMain-reviewduel/memory/project_b4_prompt_injection_contract.md,
marked load-bearing (do not delete or rename the named symbols). This
guarantees the deferred work doesn't get forgotten in a future session.
4 new unit tests in prompt-injection.test.ts:
- injectFn returning true → clipboard NOT touched
- injectFn returning false → falls back to clipboard
- injectFn throwing → falls back to clipboard
- injectFn absent → clipboard path (default B3 behaviour)
## 2. Keyboard shortcuts: addresses Drift hi0001234d#2 (Layer C UX consistency)
After reading TtySelectFn.ts, the relevant UX patterns to mirror:
- Ctrl+X = opt-out / dismiss (matches Layer C's `\\x18` keypress handler
at TtySelectFn.ts:128 + the install disclosure copy: "press Ctrl+X
during an advisory")
- Esc = standard web cancel (TTY doesn't have this but it's expected
web UX)
Added to the webview HTML script:
- keydown handler for Ctrl+X → dispatches `{type: 'dismiss'}`
- keydown handler for Esc → dispatches `{type: 'dismiss'}`
- keydown handler for digits 1-9 → dispatches `{type: 'select'}`
against the Nth option (matches the visible numbering)
- First option focused on render so keyboard users land on something
actionable.
- Visible kbd-hint text in the options header and on the dismiss button
so the shortcuts are discoverable.
Patterns NOT mirrored (intentional, rationale):
- TTY's two-step "Send to Claude now" / "Copy to clipboard" sub-menu:
redundant in the webview, because until B4's injectFn lands every path
ends in clipboard anyway. The two-step adds friction without value.
- 60s auto-dismiss timeout: the webview is non-modal; the user can let
it sit indefinitely. Adds complexity without UX gain.
- Arrow-key navigation (Tab already works natively in HTML; number keys
are the faster path for our short option lists).
5 new unit tests in html.test.ts:
- keyboard hint string visible in options header
- hint range scoped to option count (capped at 9)
- keydown handler dispatches select on digits 1-9
- Esc + Ctrl+X handlers dispatch dismiss
- first option button focused on render
## Verification
- Sub-package tsc --noEmit clean
- Sub-package vitest: 72/72 pass (was 63, +9 new)
- Root tsc --noEmit clean
- Full root test suite: 1898 passing + 18 pre-existing TtySelectFn
carry-forward
- Esbuild bundle: 11.0 KB → 12.3 KB (the new keyboard handler script +
injectFn branch)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
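The two-path injectFn contract from this commit can be sketched without the VS Code API by injecting the clipboard and toast as plain functions. `OptionInjector` and the optional `injectFn` field come from the commit message; `writeClipboard`, `showInfo`, the toast wording, and the string return value are hypothetical stand-ins added so the sketch is runnable and observable.

```typescript
type OptionInjector = (text: string) => Promise<boolean>;

interface PromptInjectionDeps {
  injectFn?: OptionInjector;
  // Hypothetical stand-ins for vscode.env.clipboard.writeText and the info toast.
  writeClipboard: (text: string) => Promise<void>;
  showInfo: (message: string) => void;
}

/**
 * Tries direct injection first; falls back to clipboard + toast when the
 * injector is absent, resolves false, or throws.
 */
async function handleOptionSelection(
  text: string,
  deps: PromptInjectionDeps,
): Promise<"injected" | "clipboard"> {
  if (deps.injectFn) {
    try {
      if (await deps.injectFn(text)) return "injected";
    } catch (_err) {
      // A failing injector is treated the same as "not available".
    }
  }
  await deps.writeClipboard(text);
  deps.showInfo("Option copied to clipboard. Paste it into the chat input.");
  return "clipboard";
}
```

Treating a thrown error like a `false` result keeps the clipboard path as the single fallback, which matches the four test cases listed in the commit.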
Cross-confirmation audit caught one real resilience gap + two missing
unit-test coverage points. All scoped to M2/B3 work.
## Resilience fix in view-provider.ts
NexpathDecisionSessionViewProvider.handleMessage previously did:
await this.onSelect(msg.optionLabel);
If onSelect rejected (which a real B4 injectFn can — e.g. when a Cursor
command is missing or throws), the rejection propagated up. The caller
chain is `webview.onDidReceiveMessage` → `void this.handleMessage(raw)`
in resolveWebviewView, which has no `await` to catch it — so it would
have surfaced as an unhandled promise rejection in the extension host.
Fixed by wrapping the onSelect call in try/catch + console.error. The
user-facing error path stays in handleOptionSelection (which already
shows a toast on clipboard failure); the catch here is a last-resort
guard so the extension host doesn't see unhandled rejections.
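A minimal sketch of the last-resort guard described above, written as a standalone factory rather than the real class method. `makeHandleMessage`, the `SelectMessage` shape beyond `optionLabel`, the injected `logError`, and the log prefix are hypothetical; the commit itself uses console.error inside NexpathDecisionSessionViewProvider.handleMessage.

```typescript
// Hypothetical message shape; only optionLabel is named in the commit.
interface SelectMessage {
  type: "select";
  optionLabel: string;
}

type OnSelect = (label: string) => Promise<void>;
type LogError = (prefix: string, err: unknown) => void;

/**
 * Builds a message handler that never rejects: a failing onSelect is logged
 * instead of escaping as an unhandled promise rejection.
 */
function makeHandleMessage(onSelect: OnSelect, logError: LogError) {
  return async function handleMessage(msg: SelectMessage): Promise<void> {
    try {
      await onSelect(msg.optionLabel);
    } catch (err) {
      // Last-resort guard: callers invoke this with `void`, so nothing
      // upstream would observe the rejection.
      logError("[nexpath] onSelect failed:", err);
    }
  };
}
```

A caller that fires the handler with `void handleMessage(raw)` then gets a promise that always settles successfully.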
## 3 new unit tests covering previously-untested behaviour
view-provider.test.ts (+2):
- "a second publishPayload replaces the first (no stacking)" — verifies
the latest payload wins, both currentPayload and webview.html
reflect it, view.show is called per publish.
- "catches errors from onSelect so they never become unhandled
rejections" — proves the new try/catch works + the error is logged
to console.error with the right prefix.
html.test.ts (+1):
- "escapes attribute-breaking quote characters in option id and label"
— the existing escape test covered `<` `>` `&`. Quotes (`"`) inside
a data-option-id="..." attribute would close the attribute and
allow injection. Verifies escapeHtml correctly converts `"` to
`&quot;` in both data-option-id and data-option-label.
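To illustrate why the quote case matters, here is a sketch of an escaper and an option-button renderer. The name `escapeHtml` and the `data-option-id` / `data-option-label` attributes come from the commit; `renderOptionButton`, the exact markup, and the single-quote escape are assumptions for illustration only.

```typescript
/** Escapes the characters that can break out of HTML text or attribute values. */
function escapeHtml(input: string): string {
  return input
    .replace(/&/g, "&amp;") // must run first so later entities are not double-escaped
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;"); // assumption: the real helper may not escape single quotes
}

/** Hypothetical renderer for one numbered option button. */
function renderOptionButton(id: string, label: string, index: number): string {
  return (
    `<button data-option-id="${escapeHtml(id)}" ` +
    `data-option-label="${escapeHtml(label)}">` +
    `${index + 1}. ${escapeHtml(label)}</button>`
  );
}
```

Without the quote replacement, an id such as `x" onclick="evil` would close the attribute early and add a new one.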
## Verification
- Sub-package tsc --noEmit clean
- Sub-package vitest: 75/75 pass (was 72; +3)
- Root tsc --noEmit clean
- Full root test suite: 1901 passing + 18 pre-existing TtySelectFn
- Esbuild bundle: still builds clean (~12.3 KB)
Closes the M2/B3 unit-test audit gap. Per auto-commit rule.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…actors to wire alongside the view-provider from B3
…ursor/Windsurf CLI adapters live in src/agents/adapters/ which is M1's territory
Branch 4 of Milestone M2 (v0.1.3/m2/cursor-windsurf-adapters). This is
the integration branch — stacked on B3 (`3d0957e`), with B2
(`94d81dc`) merged in (`536bca8`) and M1 (`66dd54b`) merged in
(`21f3f48`) so all four prerequisite contracts are available in one
working tree: M1's adapter registry, B2's chat-history watcher +
extractors, B3's webview view-provider + injectFn contract.
This commit covers the narrow dev-plan scope for B4: the CLI-side
adapters (M9 + M10). The bigger wiring (extension host-detection,
chat-input injector, WAL fix for the production watcher, and
extension.ts activate wiring) lands as a follow-up commit on this
same branch — keeps the diff reviewable.
M9 — src/agents/adapters/cursor.ts: VSCodeExtensionAdapter.
- detect() checks for Cursor's OS-specific config dir
(~/.config/Cursor on linux, Library/Application Support/Cursor on
darwin, %APPDATA%/Cursor on win32).
- install() prints deep-link install instructions when Cursor is
present (Open VSX + VS Code Marketplace URLs + cursor
--install-extension CLI fallback). Returns status: 'skipped' if
Cursor isn't installed.
- chatHistoryPaths() returns the User/workspaceStorage base dir; the
extension enumerates per-workspace state.vscdb files at activation
time, not at install time.
- extractPrompt() returns null. The architecture interface declares
it for symmetric API shape, but actual row decoding lives in the
extension runtime via src/ext-vscode/src/extractors/ — the CLI
never runs the watcher. JSDoc spells this out.
- Self-registers via the agent registry side-effect on module load.
M10 — src/agents/adapters/windsurf.ts: same shape as cursor.ts.
- Windsurf is also a VS Code fork; ships the same extension.
- Detection checks BOTH ~/.config/Windsurf/ AND the legacy
~/.codeium/windsurf/ Cascade directory (Windsurf may populate
either or both depending on version). chatHistoryPaths returns
both for the watcher to track. extractPrompt stubbed identically.
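The detection rules for both adapters can be sketched as pure path functions with an injected existence check. `cursorConfigDir` is named in the test list below; its signature here, the `PathExists` type, `detectCursor`, `detectWindsurf`, and returning null when APPDATA is missing are all assumptions. The Windsurf paths shown are the Linux ones quoted in this commit only.

```typescript
// Injected so the sketch never touches the real filesystem.
type PathExists = (path: string) => boolean;

/** Returns Cursor's per-OS config dir, or null when the platform is unsupported. */
function cursorConfigDir(platform: string, home: string, appData?: string): string | null {
  switch (platform) {
    case "linux":
      return `${home}/.config/Cursor`;
    case "darwin":
      return `${home}/Library/Application Support/Cursor`;
    case "win32":
      return appData ? `${appData}\\Cursor` : null;
    default:
      return null;
  }
}

/** Cursor counts as present when its config dir exists. */
function detectCursor(platform: string, home: string, exists: PathExists, appData?: string): boolean {
  const dir = cursorConfigDir(platform, home, appData);
  return dir !== null && exists(dir);
}

/** Windsurf counts as present when EITHER the config dir or the legacy Cascade dir exists. */
function detectWindsurf(home: string, exists: PathExists): boolean {
  return exists(`${home}/.config/Windsurf`) || exists(`${home}/.codeium/windsurf`);
}
```

Injecting `exists` is what makes the "windsurf-only, codeium-only, both" branches cheap to test.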
src/agents/index.ts: side-effect imports both adapters so
`nexpath install` picks them up via the registry's detectAll/getAdapter.
Tests (31 new, both adapters):
- cursor.test.ts: 15 tests covering cursorConfigDir × 4 OS branches,
static fields, detect (present/absent), chatHistoryPaths shape,
extractPrompt-returns-null, install (skip + present + log content),
uninstall (skip + present), registry self-registration.
- windsurf.test.ts: 16 tests covering the same surface area + the
"detect by EITHER config dir" branches (windsurf-only,
codeium-only, both).
Verification:
- Root tsc --noEmit clean.
- Full root test suite: 2047 passing + 18 pre-existing TtySelectFn
Windows-sim failures (M1 §3.0 carry-forward, unrelated).
- Snapshot invariant preserved — no install.ts modification.
Deferred to a follow-up commit on this same branch (within scope):
- WAL fix: switch chat-history-watcher.ts's default reader from
sql.js to better-sqlite3 (per dev plan §2.5). The dev-only dump
script already uses better-sqlite3.
- Extension host-detector (decide Cursor vs Windsurf vs plain VS Code
at activation time via vscode.env.appName).
- chat-input-injector implementing the OptionInjector contract (per
memory project_b4_prompt_injection_contract.md) — try Cursor /
Windsurf chat-input commands via vscode.commands.executeCommand
then fall back to clipboard.
- extension.ts wiring: construct WatchTargets from host-detector
paths, hook watcher.onEvent to spawn nexpath auto/stop, publish
payloads to the view provider with the injectFn-aware onSelect.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ovider wiring
Wires the architectural pieces that B4's narrow scope (cursor + windsurf
CLI adapters) leaves open. Four concerns, all closely related, all sized
to land together.
## 1. WAL-mode fix — chat-history-watcher's default reader
Dev plan §2.5 flagged this as a B4-decision: sql.js operates on a
buffer of the main .vscdb file and CANNOT see the WAL siblings
(.vscdb-wal / .vscdb-shm), which is where Cursor 3.4.20's live writes
actually land. The dev-only dump-cursor-state script already uses
better-sqlite3 + a copy-to-staging-dir strategy; this commit lifts the
same approach into the production watcher.
Changes:
- Swapped sql.js → better-sqlite3 in the production watcher's default
readItemTableFn. The new reader copies main + .vscdb-wal + .vscdb-shm
to a tmp staging dir, opens read-only, runs
PRAGMA wal_checkpoint(TRUNCATE) belt-and-braces, queries ItemTable,
cleans up the staging dir.
- **API change:** ReadItemTableFn signature changed from
`(dbBytes: Buffer) => Promise<rows>` to
`(dbPath: string) => Promise<rows>` so the reader can access the WAL
siblings itself (the buffer-based form couldn't). Watcher no longer
needs readFileFn — removed it from ChatHistoryWatcherOptions. Tests
updated to match (one test scenario removed: the readFileFn
error-forwarding test; that failure mode is now subsumed by
readItemTableFn errors, which have their own test).
- Defensive: if ItemTable doesn't exist on the file (freshly-created
state.vscdb), reader returns [] rather than throwing.
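The copy-to-staging strategy in the list above can be sketched with the filesystem and the SQLite query both injected, so the control flow is visible without native bindings. Everything named here (`StagingFs`, `readViaStagedCopy`, `queryItemTable`, the `Row` shape) is hypothetical, and the sketch is synchronous for brevity whereas the real `ReadItemTableFn` returns a Promise; the actual reader uses better-sqlite3 for the query step.

```typescript
// Hypothetical filesystem surface; a real implementation would wrap node:fs.
interface StagingFs {
  makeTempDir(): string;
  exists(path: string): boolean;
  copyFile(from: string, to: string): void;
  removeDir(path: string): void;
}

interface Row {
  key: string;
  value: string;
}

/**
 * Copies the main DB plus its -wal / -shm siblings into a staging dir,
 * runs the injected query against the copy, and always cleans up.
 */
function readViaStagedCopy(
  dbPath: string,
  fs: StagingFs,
  queryItemTable: (stagedDbPath: string) => Row[],
): Row[] {
  const stagingDir = fs.makeTempDir();
  try {
    const stagedDb = `${stagingDir}/state.vscdb`;
    for (const suffix of ["", "-wal", "-shm"]) {
      // Siblings are optional: a cleanly checkpointed DB may have neither.
      if (fs.exists(dbPath + suffix)) fs.copyFile(dbPath + suffix, stagedDb + suffix);
    }
    return queryItemTable(stagedDb);
  } finally {
    fs.removeDir(stagingDir); // runs even when the query throws
  }
}
```

Working on a copy means the live database that the editor is writing to is only ever read, never opened by the watcher's SQLite connection.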
Package + bundle changes:
- better-sqlite3 moved from devDependencies to dependencies (now a
runtime dep, not just dev-only).
- sql.js removed from dependencies (no longer used by either the
watcher or the dump script).
- esbuild external list updated: 'vscode', 'better-sqlite3'. The
.vsix needs to ship node_modules/better-sqlite3 with prebuilt
binaries for each platform — Branch 6 (publish) responsibility.
## 2. host-detector — Cursor vs Windsurf vs plain VS Code
src/ext-vscode/src/host-detector.ts — small pure module:
- classifyHost(appName): maps "Cursor*" → cursor, "Windsurf*" →
windsurf, everything else → vscode-generic.
- detectHost(deps?): reads vscode.env.appName (or injected override).
- chatHistoryBaseDir(inputs?): per-host OS-specific config dir;
returns null for vscode-generic (no AI chat to watch).
- workspaceStorageDir(inputs?): appends User/workspaceStorage to the
base — the directory the watcher will enumerate for per-workspace
state.vscdb paths.
11 unit tests covering all host × platform × inputs combinations.
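The host classification rule could look roughly like this. `classifyHost` and the three host labels come from the commit; the case-sensitive prefix match is an assumption about how "Cursor*" and "Windsurf*" are interpreted, and `hasAiChat` is a hypothetical helper added to show how the null-for-generic behaviour might be consumed.

```typescript
type Host = "cursor" | "windsurf" | "vscode-generic";

/** Maps the editor's application name to a host kind by prefix. */
function classifyHost(appName: string): Host {
  if (appName.startsWith("Cursor")) return "cursor";
  if (appName.startsWith("Windsurf")) return "windsurf";
  return "vscode-generic";
}

/** Hypothetical helper: only fork hosts have an AI chat history to watch. */
function hasAiChat(host: Host): boolean {
  return host !== "vscode-generic";
}
```

Keeping the classifier free of any `vscode` import is what lets it be unit-tested with plain strings.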
## 3. chat-input-injector — fills the B4 injectFn contract
src/ext-vscode/src/chat-input-injector.ts — implements OptionInjector
per memory `project_b4_prompt_injection_contract`:
- For vscode-generic host → returns false immediately (no AI chat to
inject into; clipboard fallback wins).
- For cursor / windsurf:
1. Get the live command list via vscode.commands.getCommands(true).
2. Try each host-specific candidate id (in order). First one that
executes without throwing returns true.
3. If none available or all fail → returns false (clipboard fallback
takes over in handleOptionSelection).
**Candidate command IDs are HEURISTIC GUESSES** based on community
documentation. They're explicitly marked unverified — Branch 5
(smoke-test) is where the engineer hand-verifies against a live
Cursor / Windsurf, prunes / re-orders the list. Until then the
practical net effect on Cursor 3.4.20 is "no match → fall through to
clipboard", which is the safe behaviour.
8 unit tests covering: vscode-generic short-circuit, cursor happy
path, candidate-try-order, command-list filtering, all-fail-fallback,
getCommands throwing, windsurf branch, exported candidate list shape.
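The candidate-trying loop described in this section can be sketched with the command API injected. `chatInputInject` and the `{ host }` argument follow the commit's wording; the `commands` dependency, the `CommandApi` shape, and above all the command ids are hypothetical. The ids below are deliberately fake placeholders and are not real Cursor or Windsurf commands.

```typescript
type Host = "cursor" | "windsurf" | "vscode-generic";

// Hypothetical stand-in for vscode.commands.getCommands / executeCommand.
interface CommandApi {
  getCommands(): Promise<string[]>;
  executeCommand(id: string, text: string): Promise<unknown>;
}

// Placeholder ids for illustration only; NOT real editor commands.
const CANDIDATE_COMMANDS: Record<Host, string[]> = {
  cursor: ["example.cursor.chatInput.first", "example.cursor.chatInput.second"],
  windsurf: ["example.windsurf.chatInput.first"],
  "vscode-generic": [],
};

/** Resolves true when some available candidate command ran without throwing. */
async function chatInputInject(
  text: string,
  deps: { host: Host; commands: CommandApi },
): Promise<boolean> {
  if (deps.host === "vscode-generic") return false;
  let available: Set<string>;
  try {
    available = new Set(await deps.commands.getCommands());
  } catch (_err) {
    return false; // cannot list commands, so let the clipboard fallback win
  }
  for (const id of CANDIDATE_COMMANDS[deps.host]) {
    if (!available.has(id)) continue;
    try {
      await deps.commands.executeCommand(id, text);
      return true;
    } catch (_err) {
      // Try the next candidate.
    }
  }
  return false;
}
```

Returning false in every failure case means the caller needs only one fallback branch.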
## 4. extension.ts wiring
extension.ts now constructs the view provider with an injectFn-aware
onSelect:
    const host = detectHost();
    const onSelect = (text) =>
      handleOptionSelection(text, {
        injectFn: (t) => chatInputInject(t, { host }),
      });
    viewProvider = new NexpathDecisionSessionViewProvider(
      context.extensionUri,
      onSelect,
    );
The chat-history watcher is NOT yet started in activate() — that
wiring is deferred to a B5 follow-up where it can be smoke-tested
against a real Cursor instance (the "stop trigger" timing — when do
we call `nexpath stop`? — needs real-Cursor behaviour to settle).
For now the view-provider just sits with the empty-state HTML; B4 +
B5 close the loop.
extension.test.ts updated:
- Added 4 new vi.mock blocks for the new modules (prompt-injection,
host-detector, chat-input-injector) + extended the vscode mock
with env.appName + commands.
- Adjusted "constructs the view provider" test to expect the second
onSelect argument.
## Verification
- Root tsc --noEmit clean.
- Sub-package tsc --noEmit clean.
- Sub-package vitest: 181/181 pass (was 160 baseline; +21 from B4
follow-up: 11 host-detector + 8 chat-input-injector + 2 net
elsewhere).
- Full root suite: 2068 passing + 18 pre-existing TtySelectFn
carry-forward.
- Esbuild bundle: 14.7 KB (was 12.3 KB in B3 — added host-detector +
chat-input-injector + wiring).
## Memory update
The `project_b4_prompt_injection_contract` memory said B4 must
"Implement cursorChatInputInject / windsurfChatInputInject of type
OptionInjector. Wire them through the view-provider constructor's
onSelect arg." Done — both halves filled in. The memory remains
load-bearing because the symbols still exist; what's changed is
the candidate command list is now an EDUCATED-GUESS placeholder
awaiting real-Cursor verification in Branch 5.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…Prompt docs
Closes the two drifts surfaced in the M2/B4 cross-confirmation review.
## Drift #2 — installAction + uninstallAction now invoke registry adapters
Before this commit: cursorAdapter and windsurfAdapter self-registered
in the agent registry but `nexpath install` never called them. The
legacy `detectAgents()` for-loop only routed `claude-cli` agents
through the registry (`getAdapter('claude-code').install`); everything
else was gated by `REGISTER_MCP_SERVER = false`. The B4 acceptance
line "`nexpath install` detects both, prints correct deep-link
instructions, registers with registry" was strictly not met — the
adapters existed but weren't reached.
Fix: add a small registry-iteration block AFTER the legacy for-loop in
both `installAction` and `uninstallAction`. Iterates
`await detectAll(adapterCtx)` and calls `adapter.install(ctx)` /
`adapter.uninstall(ctx)` for every detected adapter except
`claude-code` (already handled in the legacy loop above).
- 17 new lines in installAction + 17 new lines in uninstallAction.
- Built `adapterCtx: InstallContext` from `homedir()` +
`process.cwd()` + `dbPath` so adapters read the same OS-level paths
they do when called directly.
- Errors from `adapter.install(ctx)` are caught + logged as a single
`✗ <label> — failed: <message>` line; don't halt the loop.
- Imports: added `detectAll` to the existing
`import { getAdapter } from '../../agents/registry.js'` line +
`InstallContext` type to the existing types import.
## Snapshot invariant — preserved byte-identical
The B1 install-snapshot test (`install.snapshot.test.ts`) runs in a
tmp HOME that has neither `~/.config/Cursor` nor `~/.config/Windsurf`.
The cursor + windsurf adapters' `detect()` return false → registry
iteration block prints nothing → install-snapshot bytes are unchanged.
CI fails red on snapshot diff. Verified: snapshot test passes
post-change.
## 6 new install/uninstall tests
`install.test.ts` now has 6 additional `installAction` /
`uninstallAction` tests covering the new registry behaviour:
- "calls cursor adapter and prints deep-link instructions when Cursor
is detected" — sets up `~/.config/Cursor` inside the test tmpDir,
stubs HOME to tmpDir, asserts the Cursor block appears.
- "calls windsurf adapter and prints deep-link instructions when
Windsurf is detected" — same.
- "prints both cursor + windsurf deep-link blocks when both are
detected".
- "does NOT double-invoke the claude-code adapter from the registry
loop" — counts `advisory hook written to` lines == 1 (would be 2 if
the registry loop double-called claude-code).
- "calls cursor adapter uninstall and prints uninstall instructions
when Cursor is detected" — mirror in uninstallAction.
- "calls windsurf adapter uninstall and prints uninstall instructions
when Windsurf is detected" — mirror.
All use `vi.stubEnv('HOME', tmpDir)` to keep the tests hermetic and
independent of whether the dev machine actually has Cursor / Windsurf
installed.
## Drift #5 — extractPrompt JSDoc upgrade (no functional change)
`extractPrompt` on `cursorAdapter` and `windsurfAdapter` remains a
stub that returns null. The architectural decision was already made in
the original B4 commit; this commit upgrades the JSDoc to:
- Explain WHY it's a stub (no CLI caller decodes rows — the
extension's chat-history-watcher does, via the extractor modules in
`src/ext-vscode/src/extractors/`).
- Document the migration path if a CLI tool ever needs row decoding:
promote extractors to `src/agents/chat-history-extractors/`, widen
sub-package tsconfig.rootDir, leave re-export shims at the old paths,
wire the adapter's `extractPrompt` to call `pickExtractor` +
`extractor.decodeRow`.
- Acknowledge this is a non-trivial cross-tree refactor (esbuild
externals + tsconfig + vitest config) and intentionally deferred
because there's currently no caller demanding it.
Per the user's no-code-removing constraint: nothing removed. The stub
behaviour is contract-compliant ("null = I don't know"). When a CLI
caller appears, the migration path is documented in the JSDoc.
## Verification
- Root tsc --noEmit clean.
- Sub-package tsc --noEmit clean.
- Full root test suite: 2074 passing + 18 pre-existing TtySelectFn
carry-forward (was 2068; +6 from the new registry install/uninstall
tests).
- Sub-package vitest: 181/181 pass unchanged.
- install-snapshot test passes byte-identical (zero-diff invariant
preserved).
- Esbuild bundle: 14.7 KB unchanged.
The B4 acceptance line now strictly holds.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
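The registry-iteration block from the Drift #2 fix above can be sketched as follows. This is an illustration under stated assumptions: `installDetected`, the `Adapter` interface, and the injected `log` sink are hypothetical names; only `detectAll`, the `claude-code` exclusion, and the `✗ <label> — failed: <message>` line format come from the commit message.

```typescript
// Field names follow the commit text (homedir + cwd + dbPath); the exact
// InstallContext shape in the repository may differ.
interface InstallContext { home: string; cwd: string; dbPath: string; }

// Hypothetical minimal adapter surface for this sketch.
interface Adapter {
  id: string;
  label: string;
  install(ctx: InstallContext): Promise<void>;
}

async function installDetected(
  detectAll: (ctx: InstallContext) => Promise<Adapter[]>,
  ctx: InstallContext,
  log: (line: string) => void,
): Promise<void> {
  for (const adapter of await detectAll(ctx)) {
    // claude-code is already handled by the legacy for-loop.
    if (adapter.id === "claude-code") continue;
    try {
      await adapter.install(ctx);
    } catch (err) {
      // One failing adapter is logged and must not halt the others.
      const message = err instanceof Error ? err.message : String(err);
      log(`✗ ${adapter.label} — failed: ${message}`);
    }
  }
}
```

The uninstall side would mirror this with `adapter.uninstall(ctx)`.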
Closes the two unit-test gaps surfaced in the M2/B4 audit. Both
identified previously-untested production paths.
## Gap 1 — defaultReadItemTable (WAL-aware better-sqlite3 reader)
The watcher's tests all inject a mock readItemTableFn, so the real
production reader added in M2/B4 follow-up (commit fa3c134) was never
exercised. That reader contains the entire WAL fix from dev plan §2.5:
copy main + .vscdb-wal + .vscdb-shm → tmp staging dir, open
better-sqlite3 read-only, run PRAGMA wal_checkpoint(TRUNCATE), check
for ItemTable existence, query, cleanup. A typo in any of those steps
would have shipped untested.
Fix: exported defaultReadItemTable from chat-history-watcher.ts and
added 4 integration-style tests that build real .vscdb files with
better-sqlite3 directly, then assert the production reader handles
them correctly:
- happy-path: 3 ItemTable rows, all retrieved correctly
- defensive: .vscdb with NO ItemTable returns [] (real production
scenario — freshly-created VS Code state.vscdb)
- WAL-mode: .vscdb opened with `PRAGMA journal_mode = WAL` (the Cursor
scenario) → rows in WAL siblings are read correctly
- source-untouched: 3 consecutive reads do not modify the source
file's size — verifies the copy-to-staging strategy keeps the live
file safe (important when Cursor is actively writing)
## Gap 2 — adapter-error catch blocks in install/uninstall
The audit follow-up (commit 55477c2) added registry-iteration blocks
in installAction + uninstallAction with `try/catch` around each
adapter call. The happy path got 6 tests. The catch blocks
(`✗ <adapter> — failed: <message>`) didn't.
Fix: 2 new tests using
`vi.spyOn(cursorAdapter, 'install').mockRejectedValueOnce` to simulate
a real adapter failure. Each test asserts:
- The synthetic error is surfaced in the console output
- The loop continues (windsurf's install/uninstall also ran)
This proves the registry loop's resilience guarantee — one failing
adapter doesn't halt the others. Mirrors the proven legacy for-loop's
catch block contract.
## Verification
- Root tsc --noEmit clean.
- Full root suite: 2080 passing + 18 pre-existing TtySelectFn
carry-forward (was 2074; +4 watcher reader + +2 install/uninstall
catch = +6).
- Watcher test count: 14 (was 10) — defaultReadItemTable suite added.
- Snapshot invariant preserved (no install.ts source change).
## What's still NOT tested (acceptable gaps)
- chatInputInject candidate command IDs against a live Cursor. These
are heuristic guesses; B5 (smoke-test) is where they're verified
against a real running instance. Documented in JSDoc.
- Extension watcher start-up wiring (intentionally deferred to B5).
- extractPrompt(rowKey, rowValue) on cursor/windsurf adapters — stub
returns null; documented architectural choice. No caller exists.
Closes B4 unit-test audit. Per auto-commit rule.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Branch 5 of Milestone M2 (v0.1.3/m2/smoke-test). Stacked on B4
(f6a916b), which already has B1+B2+B3+M1 merged in — so this branch
has every prerequisite in one working tree.
This commit closes the only remaining wiring gap from B4 (chat-history
watcher start-up was deferred from B4 to here so it could be
smoke-tested against a real Cursor). After this, the full chain works:
    Cursor types prompt
    → state.vscdb wal-write fires
    → fs.watch event in chat-history-watcher
    → debounce + read
    → extractor decodes user prompt
    → chat-pipeline.handleChatEvent
    → ipc.spawnAuto(prompt, workspace-prefixed session id)
    → ipc.spawnStop(session id)
    → DecisionSessionPayload | null
    → view-provider.publishPayload (if non-null)
    → webview auto-reveals with advisory + numbered options +
      keyboard shortcuts (1-9 / Esc / Ctrl+X)
    → click / number key
    → handleOptionSelection
    → injectFn primary path (per-host chat-input command)
    → clipboard + toast fallback
## Files
- src/path-enumerator.ts — enumerateStateVscdbPaths(workspaceStorageDir)
walks <base>/<workspace-id>/state.vscdb and returns the paths that
exist. Returns [] for null base / missing dir / empty dir. Skips
non-directory siblings + workspace dirs that lack state.vscdb.
Injectable fs for tests; production uses node:fs defaults. +8 unit
tests.
- src/chat-pipeline.ts — createChatEventHandler(deps) builds the
(event) => Promise<void> handler the watcher calls. Orchestrates
spawnAuto → spawnStop → publishPayload. Three independent try/catch
blocks so a failure at any stage logs + returns without propagating to
the watcher (the watcher's onEvent is fire-and-forget; unhandled
rejections would crash the extension host). Optional composeSessionId
lets the caller prefix the session with workspace id. +7 unit tests
covering happy path, null-payload skip, custom composer, each error
path, never-propagates guarantee.
- src/onboarding.ts — exported CONSENT_KEY (was a module-private
const) so extension.ts reads the same globalState key the onboarding
writes to. No behaviour change.
- src/extension.ts — substantially rewritten activate(). New flow:
  1. detectHost() — Cursor / Windsurf / vscode-generic
  2. Construct + register the view provider with injectFn-aware
  onSelect (B4 wiring unchanged)
  3. await showOnboardingIfNeeded(context) — consent prompt
  4. Watcher gating — all of these must be true to start:
    - context.globalState.get<boolean>(CONSENT_KEY) === true
    - host !== 'vscode-generic' (no AI chat on plain VS Code)
    - enumerateStateVscdbPaths returns at least one path
  5. Build watcher targets from the discovered paths, kind
  'cursor-sqlite' (extractor selected by fingerprint at read time)
  6. Build the chat-event handler with workspace-prefixed session-id
  composer
  7. Create the watcher with onEvent calling the handler;
  onSchemaUnknown surfacing a friendly toast; onError logging
  8. watcher.start() + push a stop disposable onto
  context.subscriptions
deactivate() now also stops the watcher and clears the module-level
handles.
- src/extension.test.ts — substantially rewritten with vi.hoisted
mocks for the new imports (host-detector, path-enumerator,
chat-history-watcher, chat-pipeline, ipc). +15 tests covering:
  - activation log, view-provider registration regardless of consent
  - watcher NOT started when consent undefined / false / plain VS Code
  host / no dbs
  - watcher started when consent=true + host=cursor + dbs present
  - chat-event handler built with workspace-prefixed composer
  - watcher.stop() called on deactivate
  - getViewProvider() lookup
  - onboarding-rejects-but-rest-continues resilience
- SMOKE-TEST.md — manual smoke-test procedure that the engineer runs
against a live Cursor / Windsurf install. Walks through:
  - Build extension + nexpath CLI install + extension install
  (Extension Development Host or .vsix path)
  - Activation verification (log line, consent toast, icon, watcher
  start log)
  - Trigger round-trip + observe webview auto-reveal
  - Verify (and update) the heuristic candidate chat-input command IDs
  in chat-input-injector.ts using vscode.commands.getCommands(true)
  - Verification table to paste back as B5 acceptance evidence
  - Troubleshooting section + explicit non-goals (cross-OS = B6,
  pre-prompt blocking = open question)
## Verification
- Root tsc --noEmit clean.
- Sub-package tsc --noEmit clean.
- Sub-package vitest: 204 / 204 pass across 17 files (was 185 in B4;
+19: 8 path-enumerator + 7 chat-pipeline + 4 net extension).
- Full root test suite: 2099 passing + 18 pre-existing TtySelectFn
carry-forward (was 2080; +19 from B5).
- Esbuild bundle: 31.2 KB (was 14.7 KB — the new wiring pulls
chat-history-watcher + extractors + chat-pipeline into the extension's
bundle now that activate() actually uses them).
- Install snapshot byte-identical (no install.ts source touched).
## B5 acceptance gate (manual)
The acceptance line for B5 is "End-to-end on dev machine: type real
prompt in Cursor → real round-trip → decision UI appears." That's a
MANUAL test the engineer runs per SMOKE-TEST.md. The code-side
deliverable (the wiring) is in this commit; the acceptance evidence
goes back as a verification-table entry in SMOKE-TEST.md.
## Deferred to B6 / M5 (explicitly out of scope)
- macOS + Windows verification (B6 — needs VM / physical access)
- Marketplace publish (B6)
- Real-Cursor verification of candidate chat-input command IDs (a
manual step in SMOKE-TEST.md step 6; the engineer updates
chat-input-injector.ts based on what they find)
- "Response done" detection for a smarter spawnStop trigger time
(currently auto + stop fire back-to-back; M5 hardening)
- Multi-workspace concurrency formal testing (M5)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
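The spawnAuto → spawnStop → publishPayload orchestration with three independent try/catch blocks, as described for src/chat-pipeline.ts, could be sketched like this. The `ChatEvent` field names, the `logError` dependency, and the generic payload parameter are assumptions made for illustration; the real handler's dependency shape is not shown in the commit message.

```typescript
// Hypothetical event and dependency shapes for this sketch.
interface ChatEvent { prompt: string; rawSessionId: string; }

interface PipelineDeps<P> {
  spawnAuto(prompt: string, sessionId: string): Promise<void>;
  spawnStop(sessionId: string): Promise<P | null>;
  publishPayload(payload: P): void;
  composeSessionId?: (raw: string) => string;
  logError(stage: string, err: unknown): void;
}

function createChatEventHandler<P>(
  deps: PipelineDeps<P>,
): (event: ChatEvent) => Promise<void> {
  return async (event) => {
    // Optional composer lets the caller prefix the session id (e.g. with a workspace id).
    const sessionId = deps.composeSessionId
      ? deps.composeSessionId(event.rawSessionId)
      : event.rawSessionId;

    try {
      await deps.spawnAuto(event.prompt, sessionId);
    } catch (err) {
      deps.logError("auto", err);
      return; // log + return; never propagate to the watcher
    }

    let payload: P | null;
    try {
      payload = await deps.spawnStop(sessionId);
    } catch (err) {
      deps.logError("stop", err);
      return;
    }

    if (payload === null) return; // nothing to show

    try {
      deps.publishPayload(payload);
    } catch (err) {
      deps.logError("publish", err);
    }
  };
}
```

Because each stage has its own catch, the returned promise resolves in every stage-failure case, which matters when the watcher invokes the handler fire-and-forget.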
… Layer C changes)
Closes Drift #6 from the M2/B5 cross-confirmation review. The B5 smoke
test would have failed at the "advisory appears" step because:
(a) `nexpath stop` (Layer C) expects a full `StopPayload` shape on
stdin — `{session_id?, cwd, hook_event_name, stop_hook_active, ...}`
(see src/cli/commands/stop.ts:37-43). Our `ipc.spawnStop` was only
sending `{session_id}`, so `runStop` saw `cwd === undefined` and
failed project-root resolution.
(b) `nexpath auto` defaults `--project` to `process.cwd()` of the
spawned process. Our ipc was inheriting whatever cwd the extension
host process had (typically the user's home, not the workspace).
`.env` loading and hook-stats writes were therefore landing in the
wrong directory.
Both fixed inside our layer — `src/ext-vscode/src/ipc.ts` only. Layer
C remains entirely untouched (per the boundary rule and your standing
instruction). The fix uses Layer C's existing public stdin contract;
we just send the right shape.
## Changes
`src/ext-vscode/src/ipc.ts`:
- Added `cwd?: string` to `IpcOptions`. Documents WHY it's needed
(project-root resolution on the Layer C side).
- New helper `buildSpawnOptions(opts)` constructs `SpawnOptions` with
`stdio: ['pipe', 'pipe', 'pipe']` AND `cwd: opts.cwd ?? process.cwd()`.
Both `spawnAuto` and `spawnStop` now use it.
- `spawnStop` stdin payload changed from `{session_id}` to the full
`StopPayload` shape:
    {
      session_id,
      cwd: opts.cwd ?? process.cwd(),
      hook_event_name: 'Stop',
      stop_hook_active: false,
    }
`last_assistant_message` is omitted (we capture user prompts only; no
assistant signal yet — M5 hardening concern).
- JSDoc on both spawn functions now references the exact Layer C
file:line where the contract is defined.
`src/ext-vscode/src/extension.ts`:
- The chat-pipeline now curries `spawnAuto` / `spawnStop` with the
workspace folder's fsPath as `cwd`. When no workspace is open, falls
back to `process.cwd()` of the extension host.
- Same workspace path is used as the session-id prefix (was already
the case; just consolidated to one variable).
`src/ext-vscode/src/ipc.test.ts`:
- +5 new tests covering:
  - `spawnAuto` passes opts.cwd to the spawned process options
  - `spawnAuto` defaults to `process.cwd()` when omitted
  - `spawnStop` writes the FULL StopPayload shape to stdin
  (session_id + cwd + hook_event_name='Stop' + stop_hook_active=false)
  - `spawnStop` defaults stdin cwd to `process.cwd()` when omitted
  - `spawnStop` passes opts.cwd to spawn options
The full-payload-shape test is the load-bearing one — it locks the
Layer C stdin contract so a regression breaks loudly.
`src/ext-vscode/src/extension.test.ts`:
- Adjusted the "composeSessionId" assertion to accept either
`process.cwd()` or any path prefix (was pinned to literal
'no-workspace' which no longer matches).
`src/ext-vscode/SMOKE-TEST.md`:
- Troubleshooting table gained a row for "nexpath auto runs but no
advisory appears later" — explains the OPENAI_API_KEY .env path + how
to check the prompt-store.db for captured prompts.
## Verification
- Root tsc --noEmit clean.
- Sub-package tsc --noEmit clean.
- Sub-package vitest: 209/209 pass across 17 files (was 204; +5 from
the new ipc cwd / payload-shape tests).
- Full root test suite: 2104 passing + 18 pre-existing TtySelectFn
carry-forward (was 2099).
- Esbuild bundle: 31.5 KB (was 31.2 KB — minor growth for the new
payload + buildSpawnOptions helper).
- Install snapshot byte-identical.
The B5 smoke test should now succeed at the "advisory appears" step
when the engineer runs it per SMOKE-TEST.md.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
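The stdin payload change can be illustrated with a small pure helper. `buildStopPayload` and its `fallbackCwd` parameter are hypothetical names introduced here so the sketch stays free of `process.cwd()`; the four fields and their values are the ones listed in the commit message.

```typescript
// Subset of the StopPayload shape quoted in the commit message; the
// authoritative definition lives in src/cli/commands/stop.ts.
interface StopPayload {
  session_id?: string;
  cwd: string;
  hook_event_name: string;
  stop_hook_active: boolean;
}

// fallbackCwd stands in for process.cwd() (an assumption, to keep this pure).
function buildStopPayload(
  sessionId: string,
  cwd: string | undefined,
  fallbackCwd: string,
): StopPayload {
  return {
    session_id: sessionId,
    cwd: cwd ?? fallbackCwd,
    hook_event_name: "Stop",
    stop_hook_active: false,
    // last_assistant_message is intentionally omitted: only user prompts
    // are captured at this stage.
  };
}
```

A caller would serialise the result with JSON.stringify and write it to the child process's stdin.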
Closes the one B5 unit-test gap surfaced in the audit. extension.ts
constructs the chat-history watcher with three callbacks (onEvent,
onError, onSchemaUnknown) but none of them were exercised by tests.
The most consequential is onEvent — it's the integration proof that
watcher events actually reach the chat-event handler.
The watcher itself is mocked in these tests, so the strategy is:
capture the opts object passed to createChatHistoryWatcher and
invoke each callback directly.
## Changes
`src/ext-vscode/src/extension.test.ts` — +3 tests via a shared
`activateWithWatcher()` helper:
1. "routes watcher onEvent through the chat-event handler (the
integration proof)" — captures the handler returned by
createChatEventHandler, invokes opts.onEvent with a synthetic
event, asserts the tracked handler was called with that event.
This is the load-bearing test — a refactor that silently
disconnects watcher → handler would break it loudly.
2. "watcher onSchemaUnknown surfaces a visible info toast with
path + observed keys" — invokes opts.onSchemaUnknown, asserts
vscode.window.showInformationMessage was called once with a
message containing the path and the first observed key.
3. "watcher onError logs to console.error (does not crash the
extension)" — invokes opts.onError with a fake Error,
asserts no throw + console.error called with the right
prefix.
## Verification
- Root tsc --noEmit clean.
- Sub-package tsc --noEmit clean.
- Sub-package vitest: 212 / 212 pass (was 209; +3).
- Full root suite: 2107 passing + 18 pre-existing TtySelectFn
carry-forward (was 2104).
- Esbuild bundle unchanged (no source-code changes — tests only).
Closes B5 audit. No remaining unit-test gaps in scope.
## What's still NOT tested (acceptable per B5 scope)
- End-to-end against a real Cursor (manual smoke test per
SMOKE-TEST.md).
- Multi-workspace concurrency (deferred to M5).
- "Response done" timing (auto+stop fire back-to-back — deferred
to M5 hardening).
- The watcher's actual fs.watch firing on real state.vscdb writes
(covered by chat-history-watcher.test.ts at the unit level with
a synthetic fs.watch stub).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes the gap identified during the M2 cross-confirmation: dev plan
§2.3 acceptance #2 calls for the watcher to monitor BOTH state.vscdb
AND ~/.codeium/windsurf when host=windsurf, but the extension layer
only consumed state.vscdb paths via path-enumerator. The
windsurfAdapter already declared the codeium path in
chatHistoryPaths(), but nothing read it.
Wiring:
- host-detector.ts: new windsurfCodeiumDir(home) — local mirror of the
CLI-side codeiumCascadeDir convention (sub-package tsconfig rootDir
prevents importing from src/agents/)
- extension.ts activate(): when host=windsurf and the codeium dir
exists at activate-time, append it as a windsurf-dir WatchTarget. Same
activate-time-only limitation that path-enumerator already documents
for workspaceStorage
- chat-history-watcher.ts:
  * defaultReadWindsurfJsonFiles — production reader: shallow .json
  scan, skips malformed files + missing dir
  * readWindsurfJsonFilesFn + decodeWindsurfFn injection points for
  tests (parallel to the existing readItemTableFn pattern)
  * processWindsurfTarget — replaces the B2 no-op stub with the real
  read → decode → dedup → onEvent flow
- extractors/windsurf.ts: new decodeWindsurfJsonFile(parsed,
sourcePath) pure function. Body is currently a stub (no events) —
Windsurf's per-session JSON schema is still TBD per dev plan §2.5. The
FS plumbing around it is now real; only field extraction remains, and
the JSDoc walks the engineer through the inspection step.
Tests (+19 across 4 files, all green):
- host-detector.test.ts (+3): windsurfCodeiumDir shape + cross-file
invariant with the windsurfAdapter convention + os.homedir default
- extractors/windsurf.test.ts (+3): decodeWindsurfJsonFile contract
(no-op stub returns [] for empty/null/non-object, always returns an
array)
- chat-history-watcher.test.ts (+9): processWindsurfTarget pipes read
→ decode → onEvent; dedups across multiple files in one scan; forwards
reader errors to onError; defaultReadWindsurfJsonFiles handles missing
dir / empty dir / .json filter (case-insensitive) / malformed JSON
skipping / no recursion into subdirectories
- extension.test.ts (+4): windsurf-host adds codeium dir when it
exists; skips when it doesn't; activates with only the codeium dir (no
state.vscdb yet); cursor-host never consults the codeium helper
Verification:
- Sub-package vitest: 243 / 243 across 18 files (+19 from this fix)
- Root vitest: 2141 passing + 18 pre-existing TtySelectFn (+19)
- Root + sub-package tsc clean
- install.snapshot.test.ts byte-identical (windsurfAdapter detect()
still returns false in the snapshot's tmpHome → no leakage)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… liveness)
Live manual testing Round 2 surfaced a critical gap: typing prompts in
Cursor's Ask mode wrote to ItemTable correctly (verified via dev-probe
cursor extract showing the prompts in aiService.prompts), but the
watcher never re-read the file. fs.watch on the main state.vscdb file
alone is insufficient because SQLite WAL mode (which Cursor uses)
routes all writes to state.vscdb-wal first — the main file only
changes at checkpoint time, which can be minutes or hours later.
Confirmed via filesystem mtimes:
    state.vscdb      last modified 15:27 (when workspace opened)
    state.vscdb-wal  last modified 16:36 (when user typed prompt)
    state.vscdb-shm  last modified 16:36
    current time     16:44
fs.watch on the main file never fired. Net effect: extension
activated, view provider registered, consent granted, watcher.start()
invoked — but no events ever emitted because no filesystem change
event reached the listener.
Fix in chat-history-watcher.ts start(): for cursor-sqlite targets,
register additional fs.watch instances on `<path>-wal` and
`<path>-shm`. Their listeners just re-schedule the main target (which
already debounces + reads via the WAL-aware defaultReadItemTable that
copies all three files to a staging dir + checkpoints before reading).
Sibling watch is wrapped in try/catch because WAL/SHM files don't
exist until Cursor first writes to the DB; the main-file watch is the
fallback for the not-yet-WAL case. Windsurf-dir targets skip the
sibling logic (Windsurf uses JSON files, not SQLite, so no WAL).
Test infra fix in chat-history-watcher.test.ts: the fake watchFn now
actually wires the listener arg to the FakeFSWatcher's 'change' event
(mirroring node:fs watch() behaviour). Without this the WAL-fire test
couldn't trigger the listener via .emit(). Side benefit: makes
existing dedup + debounce tests more rigorous — they previously passed
because the synthesised .emit calls were no-ops.
Tests +3:
- start() registers main + -wal + -shm = 3 watchers per cursor-sqlite
target
- WAL sibling change triggers re-read of the main target + emits new
events
- windsurf-dir targets do NOT get WAL siblings watched
Also lands scripts/dev-probe.cjs — a multi-command diagnostic tool
(store schema/recent/today/search/stats, cursor
workspaces/probe/extract, trigger ping/auto, config show,
exthost-log). All future manual-testing rounds use this tool so test
cells are single commands with no shell-quoting issues. .cjs extension
forces CommonJS regardless of the parent package.json's
"type": "module".
Verification:
- Sub-package vitest: 253 / 253 across 20 files (+8 from R2 work)
- Root + sub-package tsc clean
- vsce package produces clean 2.19 MB .vsix
- Bundle contains the new sibling-watch loop (grepped extension.cjs)
Engineer action: uninstall old extension dir, re-install the
freshly-built .vsix, restart Cursor. Watcher will now fire on every
Cursor chat-write within the 250ms debounce window.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
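The sibling-watch registration from the WAL fix could be sketched as below. `pathsToWatch`, `startWatching`, and the `WatchFn` seam are hypothetical names for this illustration; in production the seam would wrap node:fs watch(), and the `schedule` callback stands in for the watcher's debounced re-read of the main target.

```typescript
type TargetKind = "cursor-sqlite" | "windsurf-dir";
interface WatchTarget { kind: TargetKind; path: string; }

// Hypothetical seam over fs.watch so the sketch is testable without a filesystem.
type WatchFn = (path: string, onChange: () => void) => { close(): void };

// SQLite targets also watch the -wal and -shm siblings; directory targets do not.
function pathsToWatch(target: WatchTarget): string[] {
  if (target.kind !== "cursor-sqlite") return [target.path];
  return [target.path, `${target.path}-wal`, `${target.path}-shm`];
}

function startWatching(
  target: WatchTarget,
  watchFn: WatchFn,
  schedule: (target: WatchTarget) => void,
): Array<{ close(): void }> {
  const [main, ...siblings] = pathsToWatch(target);
  const handles = [watchFn(main, () => schedule(target))];
  for (const sibling of siblings) {
    try {
      // Sibling listeners only re-schedule the main target's read.
      handles.push(watchFn(sibling, () => schedule(target)));
    } catch {
      // WAL/SHM files may not exist until the first write; the main watch remains.
    }
  }
  return handles;
}
```

Routing every sibling event back to the main target keeps dedup and debounce in one place instead of per file.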
R2.4 originally instructed engineers to type "delete all files in
~/Downloads" in Cursor's chat. The intent was to trigger Layer C's
advisory pipeline so the warning UI would intercept BEFORE any action.
But during live testing 2026-05-18, the prompt was entered in Cursor's
Agent mode while nexpath's watcher wasn't capturing yet (the WAL
fs.watch bug fixed in 13db716). Cursor's Agent interpreted the prompt
as a command and permanently deleted ~1.4 GB of files. The advisory
pipeline that was supposed to prevent this didn't fire because the
upstream capture chain was broken.
The lesson: test design must NOT rely on the safety net being
functional. Every manual test prompt that contains hazard keywords
must be:
- Phrased as an information-retrieval question, not a command
- Run in Ask mode only (NOT Agent / Composer)
Replaced "delete all files in ~/Downloads" with:
- "explain why 'rm -rf ~/Downloads/*' is dangerous"
- "what are the risks of 'git push --force' to main"
- "what does 'DROP TABLE users' do and how do databases prevent
accidents"
These contain the same hazard keywords (classifier still triggers) but
are pure question-asking — Agent mode executing them just retrieves
information. No destructive side effect possible.
Also added an explicit warning block at the top of Step 5 calling out
the Ask-mode-only rule + the question-form contract.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Live manual testing surfaced that our extension's console.log calls
weren't reaching any discoverable log destination — Cursor's exthost.log
captures only Cursor's internal lifecycle events, the Output panel
"Window" / "Extension Host" channels don't show extension stdout, and
the Developer Tools Console is easy to miss + non-persistent. We could
verify the extension activated (via exthost.log lifecycle events) but
couldn't see whether the watcher started, what it observed, or where
the chain broke. This blocked diagnosing why R2.1 captures showed 0
prompts despite the WAL fix being in the bundle.
Fix: extension.ts now creates a dedicated VS Code OutputChannel named
"Nexpath" at activation and routes every lifecycle / watcher event
through it. Engineers + users can open the channel via View → Output →
select "Nexpath" from the dropdown. Each entry is timestamped (ISO 8601)
so timing-sensitive issues (race conditions, debounce windows, spawn
delays) are debuggable from the log alone.
New log entries (in addition to keeping console.log for backwards
compat with the existing test-spy assertions):
- extension activated
- consent state + host (JSON)
- onboarding failed (with full stack)
- enumerated N state.vscdb file(s) + codeiumExists
- no workspace state.vscdb found
- host is plain VS Code
- consent not granted
- watcher started on N file(s)
- watcher event: prompt="..." raw_session_id=... extractor=...
- watcher error
- schema unknown for <path>; sample keys: ...
- extension deactivated
Test: extension.test.ts vi.mock('vscode') now stubs createOutputChannel
to return a mock with appendLine + dispose so activate() doesn't throw
in the test environment. 253/253 sub-package tests still pass.
Engineer action: reinstall the new .vsix. After reload, every nexpath
log line is visible at View → Output → "Nexpath" dropdown — including
the watcher.start() line we couldn't see before, the per-event captures,
and any errors that previously failed silently.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
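The timestamped logging described in the OutputChannel commit above can be sketched with a small factory. `createLogger`, the `LineSink` interface, and the exact "timestamp, space, message" layout are assumptions; the commit only states that entries are ISO 8601 timestamped and routed through a channel with appendLine. In the extension, the sink would be the object returned by vscode.window.createOutputChannel("Nexpath").

```typescript
// Minimal surface this sketch needs from an output channel.
interface LineSink { appendLine(line: string): void; }

// The clock is injectable so tests can pin the timestamp.
function createLogger(
  sink: LineSink,
  now: () => Date = () => new Date(),
): (message: string) => void {
  return (message) => {
    // Date.prototype.toISOString() always renders UTC in ISO 8601 form.
    sink.appendLine(`${now().toISOString()} ${message}`);
  };
}
```

Usage under the same assumptions: `const log = createLogger(channel); log("[nexpath] extension activated");`.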
…bug)
R2.1 live test (round 2) revealed silent prompt-drop:
- Watcher started cleanly (verified via new Nexpath Output channel)
- aiService.prompts had 10 entries (5 new Ask-mode prompts confirmed)
- state.vscdb-wal modified at correct times
- Zero "[nexpath] watcher event:" log lines fired
- Store had 0 captures since May 7
Root cause: pickExtractor returns a single extractor based on
fingerprint-match count. Modern Cursor workspaces have BOTH keys:
- aiService.prompts → cursor-v2024-q4 fingerprint match (1 key)
- composer.composerData → cursor-v2025-q1 fingerprint match (1 key)
The count ties at 1. Registry order picks cursor-v2025-q1 first. But
cursor-v2025-q1's ownsKey only returns true for composer.composerData
(metadata-only on Cursor 3.4.20 per dev plan §2.5 → 0 events). It
returns false for aiService.prompts → all real Ask-mode prompts were
silently skipped at the row iteration step.
Fix in chat-history-watcher.ts processSqliteTarget: per-row, run EVERY
extractor whose ownsKey returns true, not just the fingerprint-winner.
pickExtractor is still used to drive the schema-unknown toast
(any-match = known schema, no-match = unknown), but the per-row decode
loop now uses ALL_EXTRACTORS.
extractorCache type bumped from Map<string, ChatHistoryExtractor> to
Map<string, readonly ChatHistoryExtractor[]> so the cached set is
populated once and reused on subsequent watcher reads.
This implicitly also fixes the §2.5 Composer-mode TBD partially: when
Composer storage migrates to ItemTable (or a future extractor adds
support), the new extractor will run alongside the existing ones
without requiring a "pick the right one" decision.
Tests +1:
- "cursor-sqlite: runs ALL extractors that own a row, not just the
fingerprint-winner" — regression test injects rows with BOTH
composer.composerData (metadata-only) AND aiService.prompts (real
prompt), asserts the real prompt fires as an onEvent.
Verification:
- Sub-package vitest: 254 / 254 across 20 files (+1 regression)
- Root + sub-package tsc clean
- vsce package: clean .vsix, grep finds the multi-extractor loop in
bundle
Engineer action: reinstall the new .vsix. After reload + 1 Ask-mode
prompt in Cursor, the Nexpath Output channel will show:
    [nexpath] watcher event: prompt="..." raw_session_id=... extractor=cursor-v2024-q4
And `dev-probe.cjs store today` will show the captured row.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
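The per-row "run every extractor that owns the key" loop from the fix above might be sketched as follows. The `Row`, `Extractor`, and `DecodedEvent` shapes and the `decodeRows` name are hypothetical simplifications; the real ChatHistoryExtractor interface and the actual aiService.prompts value format are not reproduced here.

```typescript
// Hypothetical simplified shapes for this sketch.
interface Row { key: string; value: string; }
interface Extractor {
  id: string;
  ownsKey(key: string): boolean;
  decodeRow(row: Row): string[]; // decoded user prompts, possibly empty
}
interface DecodedEvent { prompt: string; extractor: string; }

function decodeRows(rows: readonly Row[], extractors: readonly Extractor[]): DecodedEvent[] {
  const events: DecodedEvent[] = [];
  for (const row of rows) {
    // Every extractor that claims the key runs; there is no single
    // fingerprint winner, so a tie can no longer hide a row.
    for (const extractor of extractors) {
      if (!extractor.ownsKey(row.key)) continue;
      for (const prompt of extractor.decodeRow(row)) {
        events.push({ prompt, extractor: extractor.id });
      }
    }
  }
  return events;
}
```

With this shape, an extractor that yields zero events for its own key (the metadata-only case) simply contributes nothing and cannot suppress another extractor's rows.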
Adds *.vsix to .gitignore so packaged build artefacts (e.g. src/ext-vscode/nexpath-vscode-linux-x64-0.1.3.vsix) stop appearing as untracked after every `vsce package` run. Build outputs do not belong in git history. Adds eight cursor-live-smoke-*.json fixtures captured 2026-05-16 by scripts/dump-cursor-state.ts during the M2 live-testing session. They pair with the existing cursor-3-4-20-initial-*.json set as a before/after delta of Cursor 3.4.20 state.vscdb contents (initial empty state vs state after a live smoke run). All files are redacted (composer IDs and similar identifiers replaced with asterisks; the "redacted": true flag is set on each). No test currently imports these by name — they are reference snapshots for future extractor / fingerprint-tie debugging. Branch: v0.1.3/m2/publish-and-cross-os-verify Cleanup follow-up to the 2026-05-20 testing checkpoint. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
storeToday() was labelling its output with `startOfDay.toISOString().slice(0,10)`, which renders the UTC date. In non-UTC timezones (IST during M2 testing) the header showed the previous day even when the row window was computed correctly from the local-time `startOfDay` Date. The row filter was already correct — only the cosmetic label was wrong, but it caused real confusion when verifying live captures against wall-clock time. Now formats the date from `getFullYear/getMonth/getDate` and labels the line "(local time)" so the source of truth is explicit. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
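A small sketch of the label fix, assuming a helper named `localDateLabel` (hypothetical; the commit does not name one). It builds the date from the local-time getters instead of slicing the UTC ISO string.

```typescript
// Format YYYY-MM-DD from local-time components so the label matches the
// local-time row window, whatever the machine's timezone is.
function localDateLabel(d: Date): string {
  const yyyy = d.getFullYear();
  const mm = String(d.getMonth() + 1).padStart(2, "0"); // getMonth() is 0-based
  const dd = String(d.getDate()).padStart(2, "0");
  return `${yyyy}-${mm}-${dd} (local time)`;
}

// Local midnight on 20 May 2026. In IST, toISOString() would render this
// instant as 2026-05-19T18:30:00.000Z, which is the confusing header.
const startOfDay = new Date(2026, 4, 20, 0, 0, 0);
const label = localDateLabel(startOfDay);
```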
…istorical prompts
The chat-history watcher's initial pass over an existing state.vscdb was
emitting EVERY pre-existing prompt as if it were brand new, on every
extension activation. This flooded Layer C with backlog inputs the
Claude-Code hook semantics were never designed to handle: Layer C's
session-state machine (`SessionStateManager`, untouched) keys on
projectRoot and accumulates promptCount across all events it sees, so
once N existing prompts cleared the 3-prompt warmup gate
(`MIN_PROMPTS_BEFORE_ADVISORY = 3` at `src/cli/commands/auto.ts:51`),
multiple advisories would fire back-to-back in rapid succession, well
above the intended ~1-per-5-to-7-prompts cadence enforced by the
post-advisory cooldown (`POST_ADVISORY_COOLDOWN = 5`).
Surfaced during M2 manual testing R2/R3 on 2026-05-20.
Layer C was never modified: diff against
`upstream/user-experience-improvements-sub-7` for `src/server/`,
`src/classifier/`, `src/decision-session/`, `src/store/`,
`src/cli/commands/auto.ts`, and `src/cli/commands/stop.ts` is empty. The
bug was at the Layer-B → Layer-C boundary: our M2 watcher was feeding
Layer C a flood of inputs the original Claude-Code hook flow never sent.
Locked decision hi0001234d#6 (Layer C untouched) holds.
Fix: introduce `primedTargets: Set<string>` tracking which targets have
completed their initial read. On the first read for a target, rows are
processed through extractors and their signatures registered in
`seenSignatures` as usual, but `onEvent` is NOT called. Subsequent
fs.watch fires (when Cursor writes a new prompt to state.vscdb-wal or
the main file) then emit only truly-new signatures. This matches the
"only fire on NEW prompts" semantics that the Claude-Code hook always
provided.
Trade-off: a prompt typed during the brief window between Cursor
finishing startup and the extension finishing activation will be
primed-not-emitted. The very next prompt after that emits correctly.
Tests: the existing 'initial-pass after start() reads + emits events'
test (which asserted the old buggy behaviour) is updated to assert the
new prime-only contract. Four other tests that incidentally assumed the
old behaviour are updated to use a prime-empty-then-add-row pattern so
they exercise the post-prime emit path.
Test count unchanged at 254 across 20 sub-package files; full root tsc
clean; 26/26 chat-history-watcher tests pass. Pre-existing TtySelectFn
Windows-sim 18 failures carry forward (out of scope per dev plan §3.0).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
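The prime-only contract can be sketched as follows. This is an assumption-laden illustration: `createPrimingDedup` and `processTarget` are hypothetical names, the signature is simplified to `target|prompt`, and extractors are left out entirely.

```typescript
type Emit = (prompt: string) => void;

function createPrimingDedup(onEvent: Emit) {
  const primedTargets = new Set<string>();  // targets whose first read is done
  const seenSignatures = new Set<string>(); // simplified: target|prompt
  return function processTarget(target: string, prompts: string[]): void {
    const isInitialPass = !primedTargets.has(target);
    for (const prompt of prompts) {
      const sig = `${target}|${prompt}`;
      if (seenSignatures.has(sig)) continue;
      seenSignatures.add(sig);
      // First read: register the signature but stay silent.
      if (!isInitialPass) onEvent(prompt);
    }
    primedTargets.add(target);
  };
}

const emitted: string[] = [];
const processTarget = createPrimingDedup(p => emitted.push(p));
processTarget("ws/state.vscdb", ["old 1", "old 2"]);        // initial pass: primes only
processTarget("ws/state.vscdb", ["old 1", "old 2", "new"]); // later read: emits "new"
```

The trade-off named in the commit is visible here: anything present at the first read, however recent, is registered without being emitted.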
…and read
Live R3 testing on 2026-05-20 surfaced ENOENT errors on two of four
enumerated workspaces: Cursor cleaned up
/home/emptyops/.config/Cursor/User/workspaceStorage/1779274878669 and
/1779274961784 between the activate-time path-enumerator scan
(11:20:07Z) and the first debounced read (11:20:36Z).
`defaultReadItemTable` then threw ENOENT through `copyFile()`, surfaced
via `onError` and logged to the Nexpath OutputChannel on every fs.watch
fire for those phantom targets.
Fix: `defaultReadItemTable` now checks `existsSync(dbPath)` up front and
returns `[]` cleanly when the main file is gone. The copyFile is also
wrapped in try/catch so a race that wins the existsSync but loses the
copy (file deleted between the two syscalls) returns `[]` rather than
throwing. WAL/SHM sibling copy is also try/catch'd because those races
are common; better-sqlite3 reads the main file regardless once the
staged copy is opened.
Three new tests:
- `prime-then-new-prompt: existing rows are primed silently, NEW row
after start() emits once`: end-to-end of the prime-only contract
through `createChatHistoryWatcher`, using a realistic extractor that
decodes `aiService.prompts` JSON rows (mirrors the Cursor v2024-q4
path that fired the original flood).
- `returns [] when the .vscdb path does not exist (host cleaned up
workspace between activate and first read)`: direct unit test of the
new defensive guard.
- `returns [] when the .vscdb is deleted between existsSync and copyFile
(race)`: covers the race-window branch.
Sub-package tests: 257 / 257 across 20 files (+3). Root tsc clean.
Pre-existing TtySelectFn Windows-sim 18 failures carry forward.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
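A rough sketch of the defensive staging step, using synchronous Node fs calls for brevity. `stageDbCopy` is a hypothetical helper; the real `defaultReadItemTable` goes on to open the staged copy with better-sqlite3 and returns rows, which is not shown here (null stands in for the `[]` case).

```typescript
import { copyFileSync, existsSync, mkdtempSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Returns the path of the staged copy, or null when the source is gone.
function stageDbCopy(dbPath: string, stagingDir: string): string | null {
  if (!existsSync(dbPath)) return null; // host already cleaned the workspace up
  const staged = join(stagingDir, "state.vscdb");
  try {
    copyFileSync(dbPath, staged);
  } catch {
    return null; // deleted between existsSync and the copy
  }
  for (const suffix of ["-wal", "-shm"]) {
    try {
      copyFileSync(dbPath + suffix, staged + suffix);
    } catch {
      // Sidecar files are optional; a missing one is not an error.
    }
  }
  return staged;
}

const stagingDir = mkdtempSync(join(tmpdir(), "nexpath-sketch-"));
const missing = stageDbCopy(join(stagingDir, "does-not-exist.vscdb"), stagingDir);
```

The existsSync check is only a fast path; the try/catch around the copy is what actually closes the race.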
…r's FIFO shift
M2 R3 live testing on 2026-05-20 surfaced a second flood path not covered
by the prime-only fix: every Cursor restart with existing Ask-mode chat
history still re-emitted the entire aiService.prompts backlog after the
user submitted ANY new prompt. Diagnostic onTrace lines confirmed initial
pass primed 10 signatures correctly (seenSigs=10), but the next read
after a user-submitted prompt produced 10 NEW signatures (seenSigs=20)
even though the prompts were the same ones already primed.
Root cause: Cursor 3.4.20's `aiService.prompts` is a rolling FIFO buffer
capped at ~10 entries. When the user submits a new prompt the oldest is
dropped and ALL existing prompts shift left by one. The cursor-v2024-q4
extractor used `rawSessionId: prompts-index:${i}` — positional. The
watcher dedup signature `sourcePath|rawSessionId|prompt` therefore
shifted along with the array: "what is 2 plus 2" at index 0 became
"what is ai9" at index 0 after a shift, producing a brand-new signature
that the dedup set didn't recognise. All 10 indices changed text → all
10 signatures looked "new" → all 10 emitted.
Layer C is still byte-identical to sub-7. The prime-only behaviour from
ae5cc49 still works correctly during the stable-FIFO window. This bug
only surfaces when the FIFO has shifted between two reads — exactly the
scenario "user types a new prompt with a full FIFO".
Fix: change cursor-v2024-q4's rawSessionId from `prompts-index:${i}` to
the constant `'ask-mode'`. The watcher signature becomes
`sourcePath|ask-mode|<prompt text>` — driven entirely by text content,
not array position. A shifted FIFO with the same prompts produces the
same signatures, dedup correctly skips them, only the genuinely new
prompt at the tail emits.
Trade-off documented in the cursor-v2024-q4 JSDoc: a user submitting the
*exact same text* twice within the FIFO window will only emit once. The
other Cursor extractors (cursor-v2025-q1, cursor-v2025-q2) and the
Windsurf extractor already use stable session ids (composer/tab/file)
so they're unaffected.
Also lands the diagnostic `onTrace` callback in chat-history-watcher
that surfaced this bug. The trace logs every processSqliteTarget /
processWindsurfTarget invocation with target path, isInitialPass,
rowsLen, primedTargets.size, seenSignatures.size. extension.ts wires
it to the Nexpath OutputChannel so future similar bugs can be diagnosed
from a single test run.
Tests:
- cursor-v2024-q4 test updated: asserts rawSessionId is the constant
'ask-mode' for every prompt (was 'prompts-index:N').
- NEW cursor-v2024-q4 test: FIFO-shift regression — drop oldest +
append newest → only the tail signature is "new".
- NEW chat-history-watcher integration test: wires the real
cursorV2024Q4 extractor into the watcher and asserts that a FIFO
shift produces exactly ONE emit (the newest prompt), not 10.
Sub-package: 259 / 259 tests pass (+2). Root tsc clean. Pre-existing
TtySelectFn 18 failures carry forward.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
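To make the FIFO-shift arithmetic concrete, the sketch below counts how many signatures look "new" after one shift under each rawSessionId scheme. The `signature` and `newSignatures` helpers and the fixed `"db"` source path are illustrative stand-ins, not the watcher's own functions.

```typescript
// Same shape as the watcher signature: sourcePath|rawSessionId|prompt.
const signature = (sourcePath: string, rawSessionId: string, prompt: string): string =>
  `${sourcePath}|${rawSessionId}|${prompt}`;

// How many entries of `after` have a signature that `before` never produced?
function newSignatures(before: string[], after: string[], sessionIdFor: (i: number) => string): number {
  const seen = new Set(before.map((p, i) => signature("db", sessionIdFor(i), p)));
  return after.filter((p, i) => !seen.has(signature("db", sessionIdFor(i), p))).length;
}

// A full 10-entry buffer, then one shift: drop the oldest, append the newest.
const before = Array.from({ length: 10 }, (_, i) => `prompt ${i}`);
const after = [...before.slice(1), "prompt 10"];

const positional = newSignatures(before, after, i => `prompts-index:${i}`); // old scheme
const constant = newSignatures(before, after, () => "ask-mode");            // new scheme
```

Under the positional scheme every index now holds different text, so all ten entries look new; with the constant id only the appended prompt does.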
… tests in place) The optional `onTrace` callback in chat-history-watcher.ts was added in the d9fd2cf chain to diagnose why the prime-only fix didn't catch the FIFO-shift case. It surfaced the exact race in one test run: seenSigs stayed stable at 9 across 7 traces, then jumped to 20 after a single user prompt — proving the dedup signatures were all changing on FIFO turnover. Root cause is now fixed in d9fd2cf with the stable 'ask-mode' rawSessionId, and two regression tests pin the contract (extractor-level + end-to-end through createChatHistoryWatcher). With the bug squashed and the tests in place, the diagnostic is no longer needed. Removing it keeps the production OutputChannel focused on the four signal types that matter to end users: extension lifecycle, watcher start/error, watcher event:, and schema-unknown. If a similar bug surfaces in the future, the pattern is easy to re-add: optional `onTrace` field on ChatHistoryWatcherOptions, fire before primedTargets.add inside processSqliteTarget / processWindsurfTarget, wire to log() in extension.ts. Sub-package tests: 259/259. Root tsc clean. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Cursor + Windsurf route non-modal showInformationMessage toasts to the
silent notification stack (bell icon) instead of surfacing them as
transient bottom-right popups the way VS Code does. Before this fix the
first-launch consent toast in onboarding.ts was invisible until the user
manually clicked the bell — which we saw cause the consent flow to
silently hang during R2.5 live testing.
Fix is one extra line in extension.ts.activate(): when the detected host
is not vscode-generic, call `vscode.commands.executeCommand('notifications.showList')`
immediately before `showOnboardingIfNeeded()`. The call is best-effort
and any rejection is swallowed — discoverability hint, not load-bearing.
VS Code keeps its native bottom-right toast UX (no preempt). Dev plan
§2.2 M11 consent-toast design preserved verbatim.
+4 tests in extension.test.ts covering: showList is called on cursor,
showList is called on windsurf, showList is NOT called on vscode-generic,
showList rejection does not break activation.
263 sub-pkg tests pass; root tsc clean. Live verified on Cursor 3.4.20 and
VS Code: consent toast appears immediately in both hosts.
chat-pipeline.ts's default logger only writes to console.error, which is only visible in Developer Tools Console — invisible to end users at View → Output → Nexpath. spawnAuto / spawnStop failures (e.g. ENOENT when NEXPATH_BIN points at a missing binary) were silently swallowed during B1.4 live testing. Wire a logger field into createChatEventHandler in extension.ts that writes the formatted error to BOTH log() (which appends to the Nexpath OutputChannel) AND console.error (preserves existing test assertions). Non-Error rejection values are stringified via String(err). +2 unit tests covering the wired logger (Error + non-Error inputs). Live-verified 2026-05-22 on Cursor 3.4.20: with NEXPATH_BIN exported to a bogus path BEFORE launching Cursor (so the extension host inherits the env at fork time — Developer: Reload Window does NOT re-read parent process.env, which was yesterday's failed-attempt root cause), Output → Nexpath now shows both the watcher event line and the spawnAuto NexpathBinaryNotFoundError line for a fresh Ask-mode prompt. 265/265 sub-pkg tests pass, tsc clean root + sub-package. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
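A minimal sketch of a logger that fans one formatted line out to several sinks. `createDualLogger`, `formatError`, and the exact line format are assumptions; in the extension the two sinks would be `log()` and `console.error`, replaced here by arrays so the behaviour can be inspected.

```typescript
// Error instances keep name + message; anything else goes through String().
function formatError(err: unknown): string {
  return err instanceof Error ? `${err.name}: ${err.message}` : String(err);
}

function createDualLogger(sinks: Array<(line: string) => void>) {
  return (context: string, err: unknown): void => {
    const line = `[nexpath] ${context} failed: ${formatError(err)}`;
    for (const sink of sinks) sink(line); // every sink receives the same line
  };
}

const channel: string[] = [];      // stand-in for the Nexpath OutputChannel
const consoleLines: string[] = []; // stand-in for console.error
const logger = createDualLogger([l => channel.push(l), l => consoleLines.push(l)]);
logger("spawnAuto", new Error("ENOENT"));
logger("spawnStop", 42); // non-Error rejection value
```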
When two Cursor windows are open, every extension instance enumerated
all workspaceStorage/*/state.vscdb files. Each instance attributed
captured prompts to its own fixed workspaceCwd rather than the source
db's workspace, producing duplicate rows in prompt-store.db with one row
under the wrong project_root.
Two-layer fix:
- Enumeration filter: keep only state.vscdbs whose sibling
workspace.json#folder matches the instance's own workspaceCwd; each db
is now watched by exactly one instance.
- Per-event cwd: thread ChatHistoryEvent into spawnAuto/spawnStop deps;
resolve cwd from event.sourcePath via sibling workspace.json (cached);
fall back to instance cwd for edge cases (missing workspace.json,
multi-root .code-workspace).
Live-verified in real Cursor: two windows, each prompt landed exactly
once with correct project_root. 273/273 tests passing.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
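The per-event cwd lookup might look roughly like this. Everything named here is hypothetical (`createCwdResolver`, the injected `readText`, caching the fallback), and the sketch assumes workspace.json stores `folder` as a `file://` URI and that paths are POSIX-style.

```typescript
import { dirname, join } from "node:path";
import { fileURLToPath } from "node:url";

// readText is injected so the sketch runs without touching the real disk.
function createCwdResolver(readText: (path: string) => string, fallbackCwd: string) {
  const cache = new Map<string, string>();
  return (sourcePath: string): string => {
    const dir = dirname(sourcePath);
    const hit = cache.get(dir);
    if (hit !== undefined) return hit;
    let cwd = fallbackCwd;
    try {
      const parsed = JSON.parse(readText(join(dir, "workspace.json"))) as { folder?: unknown };
      if (typeof parsed.folder === "string" && parsed.folder.startsWith("file://")) {
        cwd = fileURLToPath(parsed.folder);
      }
    } catch {
      // Missing or malformed workspace.json: keep the instance cwd.
    }
    cache.set(dir, cwd); // assumption: the fallback is cached as well
    return cwd;
  };
}

let reads = 0;
const fakeRead = (path: string): string => {
  reads++;
  if (path.includes("aaa")) return JSON.stringify({ folder: "file:///home/u/proj" });
  throw new Error("ENOENT");
};
const resolveCwd = createCwdResolver(fakeRead, "/instance/cwd");
const known = resolveCwd("/ws/aaa/state.vscdb");   // sibling workspace.json found
const unknown = resolveCwd("/ws/bbb/state.vscdb"); // no workspace.json: fallback
resolveCwd("/ws/bbb/state.vscdb");                 // served from the cache
```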
The S01 manual test (2026-05-25) captured 0 prompts despite 15 paste-
submit cycles. Investigation traced the prompts to globalStorage
state.vscdb's cursorDiskKV table (NOT workspaceStorage's ItemTable that
our existing extractors watch). Cursor's modern Composer / Agent mode
(the right-side chat panel, default UX in 3.4.20+) stores conversations
as `composerData:<uuid>` + `bubbleId:<composerId>:<bubbleId>` rows in
the shared globalStorage file. Without this, the extension was
effectively blind to the majority of real-world prompts — silently
emitting zero events and producing the "no advisory ever fires" symptom
in the S01 run.
Layer C remains byte-identical to upstream/sub-7 per the locked
constraint; all changes are in `src/ext-vscode/`.
Changes:
1. New extractor `src/ext-vscode/src/extractors/cursor-composer-bubble.ts`
- Owns rows keyed `cursorDiskKV/bubbleId:<composerId>:<bubbleId>` (the
reader prefixes keys with the table name so existing ItemTable
extractors stay safely unaware)
- Decodes the bubble JSON, filters to user-type bubbles (type=1),
extracts the prompt from `.text`, uses composerId as rawSessionId
so distinct conversations don't dedup against each other
- Trims trailing whitespace (Cursor appends newlines)
- 11 unit tests covering happy path, assistant-skip, malformed JSON,
empty / whitespace text, key-shape edge cases, per-composer scoping
2. `defaultReadItemTable` (`chat-history-watcher.ts`) now reads BOTH
`ItemTable` AND `cursorDiskKV` if either is present. cursorDiskKV
keys are prefixed with `cursorDiskKV/` so they remain distinguishable
from ItemTable rows that might happen to share a key (also serves as
the fingerprint signal for the new extractor).
3. `globalStorageStateVscdbPath()` in `path-enumerator.ts` — sibling
helper to `enumerateStateVscdbPaths`. Returns the path to
`<host-config>/User/globalStorage/state.vscdb` or null when missing.
5 new unit tests.
4. `extension.ts` wires globalStorage as an unconditional WatchTarget on
Cursor hosts (bypasses the R4.3 cross-workspace filter because
globalStorage is a single shared file). Documents the multi-window
caveat: two open Cursor windows would each emit each bubble (no
cross-instance dedup); acceptable v0.1.3 limitation.
5. `ALL_EXTRACTORS` order: composer-bubble first (it owns rows from a
different table; cleaner to dispatch on first per-row check).
Tests:
- 297/297 sub-package tests pass (+16 from prior: 11 in
cursor-composer-bubble.test.ts + 5 in path-enumerator.test.ts
globalStorage block)
- Root tsc clean; sub-package tsc clean
- 18 pre-existing TtySelectFn carry-forward failures (Layer C, out of
scope per dev plan §3.0)
Bundle: 42 KB (+2.3 KB from prior 39.7 KB). Deployed to both
~/.cursor/extensions/emptyops.nexpath-vscode-0.1.3/out/ and
~/.vscode/extensions/emptyops.nexpath-vscode-0.1.3/out/.
Closes F2 gap from dev plan §2.10 (Composer/Agent mode capture).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
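A condensed sketch of the bubble decoding rules listed under change 1 above. `decodeBubble` and `BubbleEvent` are hypothetical names, and the only bubble fields assumed are `type` and `text`, as described in the commit message.

```typescript
interface BubbleEvent { rawSessionId: string; prompt: string }

const BUBBLE_PREFIX = "cursorDiskKV/bubbleId:";

function decodeBubble(key: string, value: string): BubbleEvent | null {
  if (!key.startsWith(BUBBLE_PREFIX)) return null; // ItemTable rows are not ours
  const [composerId, bubbleId, ...rest] = key.slice(BUBBLE_PREFIX.length).split(":");
  if (!composerId || !bubbleId || rest.length > 0) return null; // unexpected key shape
  let bubble: { type?: unknown; text?: unknown };
  try {
    bubble = JSON.parse(value);
  } catch {
    return null; // malformed JSON
  }
  if (bubble === null || typeof bubble !== "object") return null;
  if (bubble.type !== 1 || typeof bubble.text !== "string") return null; // user bubbles only
  const prompt = bubble.text.replace(/\s+$/, ""); // drop trailing newlines
  if (prompt.trim() === "") return null;
  return { rawSessionId: composerId, prompt }; // scope dedup per conversation
}

const user = decodeBubble("cursorDiskKV/bubbleId:c1:b1", JSON.stringify({ type: 1, text: "hi\n" }));
const assistant = decodeBubble("cursorDiskKV/bubbleId:c1:b2", JSON.stringify({ type: 2, text: "hello" }));
const malformed = decodeBubble("cursorDiskKV/bubbleId:c1:b3", "{not json");
const otherTable = decodeBubble("aiService.prompts", "[]");
```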
Discovered live during S01 manual testing 2026-05-27: Cursor 3.x writes
Composer prompts to BOTH globalStorage cursorDiskKV (decoded by
cursor-composer-bubble as a bubbleId rawSessionId) AND workspaceStorage
ItemTable.aiService.prompts (decoded by cursor-v2024-q4 as the constant
'ask-mode' rawSessionId). The primary signatureOf() dedup keyed by
sourcePath|rawSessionId|prompt has different signatures for the two
mirror events → both pass dedup → each prompt counted twice → Layer C's
prompt_count grows at 2x rate → classifier fires at wrong positions.
Live watcher log evidence (S01 P1 captured twice, 1.5s apart):
[10:45:42] watcher event prompt="make me a website..." raw_session_id=b6ea7b60-... extractor=cursor-composer-bubble
[10:45:44] watcher event prompt="make me a website..." raw_session_id=ask-mode extractor=cursor-v2024-q4
Fix: secondary dedup map `recentPromptTimestamps: Map<string, number>`
keyed by trimmed prompt text, with a 60-second window. Real mirror
emissions (Composer→Ask arrive within seconds) get deduped. Genuine
re-submissions of the same text outside the window pass through.
Initial-pass priming subtlety: globalStorage state.vscdb is
workspace-agnostic, so initial-pass sees bubble rows from ALL prior
workspaces' sessions. Priming entries with NOW timestamp would block
fresh user prompts that happen to match an old text within 60s of
watcher activation (discovered when S01 P1 was dropped because a
historical 'make me a website...' bubble had just been primed). Fix:
prime with timestamp 0 (far past) so `now - 0 >> DEDUP_WINDOW_MS` and
the entry doesn't block fresh emissions.
Three co-located unit tests added:
1. Mirror dedup: same prompt text from two extractors within 60s emits
once
2. Window expiry: same prompt re-submitted after 60s passes through
3. Initial-pass priming does NOT block fresh emissions of same text
(regression guard for the priming bug)
Tests: 33/33 chat-history-watcher (+3 from prior 30). Full sub-package
suite: 292/292 (+3). Root tsc clean. Bundle rebuilt + deployed to
~/.cursor/extensions/emptyops.nexpath-vscode-0.1.3/ (md5 6bab1426...).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
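The secondary dedup and its priming rule, sketched with an explicit clock so the three cases are easy to follow. `createMirrorDedup`, `prime`, and `shouldEmit` are hypothetical names; only `recentPromptTimestamps`, `DEDUP_WINDOW_MS`, and the 60-second window come from the commit message.

```typescript
const DEDUP_WINDOW_MS = 60_000;

function createMirrorDedup() {
  const recentPromptTimestamps = new Map<string, number>();
  return {
    // Initial pass: remember the text with a far-past timestamp so it
    // never blocks a fresh prompt with the same text.
    prime(prompt: string): void {
      recentPromptTimestamps.set(prompt.trim(), 0);
    },
    // Returns true when the prompt should be emitted at time `now` (ms).
    shouldEmit(prompt: string, now: number): boolean {
      const key = prompt.trim();
      const last = recentPromptTimestamps.get(key);
      if (last !== undefined && now - last < DEDUP_WINDOW_MS) return false;
      recentPromptTimestamps.set(key, now);
      return true;
    },
  };
}

const dedup = createMirrorDedup();
const t0 = 1_700_000_000_000; // arbitrary wall-clock time in ms
dedup.prime("make me a website");
const fresh = dedup.shouldEmit("make me a website", t0);           // primed entry does not block
const mirror = dedup.shouldEmit("make me a website ", t0 + 1_500); // mirror copy 1.5s later
const resubmitted = dedup.shouldEmit("make me a website", t0 + 61_000);
```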
… spam) The watcher's unknown-schema branch deliberately does not cache an extractor — it must keep re-checking, since a fresh Cursor workspace's state.vscdb starts with no chat keys and only gains them once the user chats. But it re-invoked onSchemaUnknown on every fs.watch fire, which re-logs and re-pops the info toast. On Windsurf this is acute: its workspaceStorage state.vscdb (real chat lives under ~/.codeium/windsurf/) never matches any extractor, so the "schema unknown" log + toast fired endlessly, flooding the Output channel. Fix: track reportedUnknownPaths and notify at most once per path while still re-checking silently thereafter. Co-located test asserts a path read 4× as unknown surfaces onSchemaUnknown exactly once. Note: this only quiets the noise. Windsurf prompt CAPTURE remains unimplemented — decodeWindsurfJsonFile is still a stub and Cascade stores conversations as protobuf (.pb), not the top-level *.json the scanner assumes. Tracked separately. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
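The once-per-path notification can be sketched in a few lines. `createUnknownSchemaNotifier` is a hypothetical wrapper; the watcher itself keeps re-checking the schema on every read, and only the callback is throttled.

```typescript
function createUnknownSchemaNotifier(onSchemaUnknown: (path: string) => void) {
  const reportedUnknownPaths = new Set<string>();
  return (path: string): void => {
    if (reportedUnknownPaths.has(path)) return; // already surfaced once
    reportedUnknownPaths.add(path);
    onSchemaUnknown(path);
  };
}

const toasts: string[] = [];
const notifyUnknown = createUnknownSchemaNotifier(p => toasts.push(p));
// Four reads of the same still-unknown path, then one read of another path.
for (let i = 0; i < 4; i++) notifyUnknown("windsurf/workspaceStorage/x/state.vscdb");
notifyUnknown("cursor/workspaceStorage/y/state.vscdb");
```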
Live Windsurf 2.0 inspection (2026-05-28) resolved the long-standing "implement decodeWindsurfJsonFile once a real install is inspected" TODO: Cascade encrypts conversations at rest (~/.codeium/windsurf/cascade/<id>.pb, entropy 8.00), the only plaintext is session metadata (title/cwd/timestamps, not the prompt), and workspaceStorage state.vscdb holds no chat. File-watching has no readable prompt to decode, so Windsurf capture is out of v1 scope. Replaces the misleading TODO with the finding so no future engineer repeats the inspection. Decoders stay no-op; host detection + windsurf-dir watch wiring kept inert in case a future non-encrypted path appears. Comment-only; 41 extractor+watcher tests still pass, tsc clean. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Branch 6 of Milestone M2 (v0.1.3/m2/publish-and-cross-os-verify). Stacked on B5 (4aa977b). The final M2 branch — delivers the code-side infrastructure for publishing the Nexpath VS Code extension to Open VSX + VS Code Marketplace, plus the manual cross-OS verification procedure the engineer follows on macOS + Windows. This branch does NOT itself publish anything or run any cross-OS test — those need: - VSCE_PAT / OVSX_PAT marketplace publisher tokens - Real macOS + Windows hosts (or CI runners) - The engineer's hands on the keyboard Both are documented step-by-step (PUBLISH.md, CROSS-OS-VERIFY.md) so the engineer can run them when ready. ## Module covered (per dev plan §3 M2 §2.2) M14 — Open VSX + VS Code Marketplace publish + macOS / Windows verification. Estimated 8-12 hrs in the dev plan; the code-side portion ships here, the engineer-run portion ships when they run it. ## Files added ### Publishing infrastructure - src/ext-vscode/package.json (modified) Marketplace-ready metadata added: - license: Apache-2.0 (was a stale "SEE LICENSE IN ../../LICENSE") - keywords: cursor, windsurf, ai, prompt, guidance, advisory, nexpath - repository: { type, url, directory: src/ext-vscode } - bugs, homepage - galleryBanner (dark theme, matches the icon palette) - categories: Other, Machine Learning, Programming Languages - engines.node: ">=18" - icon: media/icon.png (was icon.svg; marketplace requires raster) Added devDeps: - @vscode/vsce ^3 — package + publish to VS Code Marketplace - ovsx ^0.10 — publish to Open VSX Added npm scripts: - package (vsce package on current platform) - package:all-platforms (calls scripts/package-per-platform.mjs) - publish:vsce, publish:ovsx - src/ext-vscode/.vscodeignore (NEW) Controls what ships in the .vsix. Excludes src/, tests, dev configs, test fixtures, dump-cursor-state output, .map files, internal docs (SMOKE-TEST.md / PUBLISH.md / CROSS-OS-VERIFY.md), and CI files. 
Keeps: out/extension.js, media/, package.json, README.md, LICENSE, node_modules/better-sqlite3/ (native dep marked external by esbuild). - src/ext-vscode/LICENSE (NEW — copied from repo root) vsce requires the LICENSE file inside the sub-package directory; a `SEE LICENSE IN ../../LICENSE` reference doesn't satisfy it. Copying is simpler than symlinks (which behave inconsistently on Windows tar extraction). Same Apache-2.0 content as the repo-root LICENSE. - src/ext-vscode/media/icon-marketplace.svg (NEW) + src/ext-vscode/media/icon.png (NEW, 128×128, generated from the marketplace SVG via `convert -density 384`) VS Code Marketplace rejects SVG icons. The activity-bar contribution keeps the original icon.svg (VS Code handles SVG there with currentColor theming). The marketplace listing icon is now icon.png, with the source SVG retained for reproducibility — if the design changes, regenerate via: convert -density 384 -resize 128x128 \\ media/icon-marketplace.svg media/icon.png - src/ext-vscode/src/package-per-platform-helpers.ts (NEW) Pure helpers for the per-platform packager: SUPPORTED_TARGETS list, buildVsixFilename, parsePackagerArgs. Lives in src/ so it's typechecked + co-located test runs. 12 unit tests. - src/ext-vscode/scripts/package-per-platform.mjs (NEW) Runs `vsce package --target <id>` for each target. Drives 5 platforms by default: linux-x64, linux-arm64, darwin-x64, darwin-arm64, win32-x64 (matches better-sqlite3 prebuild list). Output to ./dist-vsix/. Accepts --targets a,b,c + --out <dir>. Each .vsix gets marked platform-specific via vsce's --target flag so the marketplace serves the right binary to each user. ### CI - .github/workflows/publish-extension.yml (NEW) GitHub Actions workflow with three jobs: package — matrix on macos-13 / macos-latest / ubuntu-latest / windows-latest, each produces its platform's .vsix as a CI artefact. Runs npm ci + typecheck + tests + build + vsce package. 
package-linux-arm64 — Linux arm64 via QEMU emulation inside a node:20-bullseye container. Slower than the native runners but the only way to get the arm64 better-sqlite3 prebuild without physical hardware. publish — Pulls every matrix .vsix artefact, then vsce publish + ovsx publish for each. Gated behind VSCE_PAT + OVSX_PAT repo secrets. Triggers on push of a `vsix-v*` tag OR workflow_dispatch with publish=true. Designed so a one-off `gh workflow run publish-extension.yml -f publish=true` produces and publishes the full multi-platform release. ### Engineer-facing procedures (these are MANUAL steps the engineer runs) - src/ext-vscode/PUBLISH.md (NEW) Full publishing procedure: one-time setup (marketplace publisher accounts + PATs), version-bump checklist, two publishing paths (local single-platform, multi-platform via CI), post-publish verification, rollback strategy, marketplace review SLAs. - src/ext-vscode/CROSS-OS-VERIFY.md (NEW) The cross-OS verification matrix the engineer runs after a publish: 6 cells (macOS Intel × {Cursor, Windsurf}, macOS Apple Silicon × {Cursor, Windsurf}, Windows 11 × {Cursor, Windsurf}), each walked through with install + activate + OS-gotcha-resolution + smoke-test reuse from B5. Closes with a fill-in verification table that ships as B6 acceptance evidence. ## Verification done in this commit - Root tsc --noEmit clean. - Sub-package tsc --noEmit clean. - Sub-package vitest: 224 / 224 pass across 18 files (was 212; +12 from the new package-per-platform-helpers tests). - Full root suite: 2119 passing + 18 pre-existing TtySelectFn carry-forward (was 2107; +12). - Local `vsce package --target linux-x64` produces a valid 3.83 MB .vsix containing: out/extension.js, media/ (icon.png + icon.svg + icon-marketplace.svg), package.json, README.md, LICENSE, and node_modules/better-sqlite3/ (the native dep). Zero vsce warnings post-fixes. - Install snapshot byte-identical. 
## What B6 does NOT do (and intentionally cannot from this dev box) - **Actual publish to Open VSX or VS Code Marketplace.** Needs VSCE_PAT and OVSX_PAT secrets. The engineer sets these up per PUBLISH.md and triggers the CI workflow (or runs the publish scripts locally). - **Actual macOS / Windows verification.** Needs VMs or physical access to those OSes. The engineer follows CROSS-OS-VERIFY.md. Step 0 of that doc captures the environment, Steps 1-4 walk through install + smoke test, Step 5 cleans up between matrix cells, and the final table is the acceptance evidence. ## M2 status All 6 M2 branches now complete (B1-B6). The engineer's remaining actions to ship M2 v0.1.3: 1. Open the 6 stacked PRs to upstream/user-experience-improvements-sub-7 (or wait for them to merge linearly). 2. Set up marketplace publisher accounts + tokens (PUBLISH.md §1). 3. Run the publish workflow (or `npm run package:all-platforms` + `publish:*` scripts locally). 4. Run CROSS-OS-VERIFY.md on each (OS, host) cell. 5. Paste the verification table back as the M2 closeout evidence. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
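For orientation, the packager helpers named above could look something like the sketch below. The signatures are guesses, not the contents of package-per-platform-helpers.ts; only the target list, the `--targets` / `--out` flags, the ./dist-vsix default, and the `nexpath-vscode-linux-x64-0.1.3.vsix` filename pattern come from this PR's text.

```typescript
const SUPPORTED_TARGETS = ["linux-x64", "linux-arm64", "darwin-x64", "darwin-arm64", "win32-x64"] as const;
type Target = (typeof SUPPORTED_TARGETS)[number];

function buildVsixFilename(name: string, target: Target, version: string): string {
  return `${name}-${target}-${version}.vsix`;
}

// Assumed CLI shape: [--targets a,b,c] [--out <dir>]
function parsePackagerArgs(argv: string[]): { targets: Target[]; outDir: string } {
  let targets: Target[] = [...SUPPORTED_TARGETS];
  let outDir = "./dist-vsix";
  for (let i = 0; i < argv.length; i += 2) {
    const flag = argv[i];
    const value = argv[i + 1];
    if (flag !== "--targets" && flag !== "--out") throw new Error(`unknown argument: ${flag}`);
    if (value === undefined) throw new Error(`${flag} needs a value`);
    if (flag === "--out") {
      outDir = value;
    } else {
      targets = value.split(",").map(t => {
        if (!(SUPPORTED_TARGETS as readonly string[]).includes(t)) throw new Error(`unsupported target: ${t}`);
        return t as Target;
      });
    }
  }
  return { targets, outDir };
}

const defaults = parsePackagerArgs([]);
const custom = parsePackagerArgs(["--targets", "linux-x64,win32-x64", "--out", "build"]);
const filename = buildVsixFilename("nexpath-vscode", "linux-x64", "0.1.3");
```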
…dry-run Resolves dev plan §6 Q3 (VS Code extension marketplace ID): chose emptyops.nexpath-vscode over nexpath.coding-agent-hooks — org-scoped under the emptyops publisher namespace, lower collision risk on both marketplaces, durable across future browser / JetBrains / etc. Nexpath extensions that'll share the publisher. Changes: - package.json#publisher: nexpath -> emptyops - cursor.ts / windsurf.ts MARKETPLACE_ID + their tests - PUBLISH.md / CROSS-OS-VERIFY.md / SMOKE-TEST.md URLs - publish-extension.yml sanity-check URLs Snapshot invariant preserved: install.snapshot.test.ts.snap doesn't embed the marketplace ID (cursor/windsurf detect() returns false in tmpHome -> those adapters never run in the snapshot scenario). Drift hi0001234d#3 fix: pinned CI runners from floating *-latest labels to specific versions (ubuntu-22.04, windows-2022, macos-14) so the matrix is reproducible. macos-13 was already pinned. New PUBLISH.md section "Pre-merge CI dry-run" walks the engineer through triggering the workflow with publish=false to validate every platform's .vsix builds clean before opening the B6 PR. Verification: - Sub-package vitest: 224 / 224 across 18 files - Root vitest: 2119 passing + 18 pre-existing TtySelectFn - Root + sub-package tsc clean - Local vsce package --target linux-x64: clean 3.83 MB .vsix, zero warnings Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Audit revealed that publisher + name in src/ext-vscode/package.json and MARKETPLACE_ID in cursor.ts + windsurf.ts encode the same logical value (the marketplace ID emptyops.nexpath-vscode) but live in three files with no test guaranteeing they agree. If one drifts, vsce/ovsx still publishes under package.json's values, but the adapters' install output prints a stale URL — users land on a 404. New file src/agents/adapters/marketplace-id.test.ts reads package.json at test time and asserts both adapters' marketplace.openVsx + .vsCode equal `<publisher>.<name>`. Three tests: - cursorAdapter agrees with package.json - windsurfAdapter agrees with package.json - cursor + windsurf agree with each other Lives in its own file (not inside cursor.test.ts / windsurf.test.ts) because the invariant is about the file boundary, not either adapter's behaviour in isolation. Verification: - Root vitest src/agents/adapters/: 53 / 53 pass (19 + 15 + 16 + 3) - Sub-package vitest unchanged: 224 / 224 Audit rationale for everything else in B6: - package-per-platform-helpers.ts: 12 tests, covers SUPPORTED_TARGETS, buildVsixFilename, parsePackagerArgs (defaults, parsing, errors) - scripts/package-per-platform.mjs: build orchestration script, exempt per precedent (esbuild.config.mjs is not tested) - package.json / .vscodeignore / LICENSE / icon assets / *.md docs / publish-extension.yml: non-testable config / data / declarative CI Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Cosmetic follow-up to commit d4feb8d (which updated package.json#license from "SEE LICENSE IN ../../LICENSE" to "Apache-2.0" but didn't run `npm install` to refresh the lockfile mirror of that field). Caught during the manual smoke-test build step in v0.1.3/m2/publish-and-cross-os-verify. Single 1-line change, no dependency graph changes. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…activation failure)
Manual install into Cursor surfaced a silent activation failure: extension
dir created under ~/.cursor/extensions/ but no activity-bar icon, no
consent toast, nothing visible in Cursor.
Root cause: package.json has `"type": "module"` (so Node treats every
.js file in the package as ESM), but esbuild bundled the entry,
out/extension.js, as CommonJS. VS Code's extension host loads main via
`require()`; Node's CJS loader sees a .js file under type:module and
throws ERR_REQUIRE_ESM. The extension host catches that, deactivates
silently, and the user sees nothing.
Fix:
- esbuild outfile: out/extension.js → out/extension.cjs
- package.json#main: ./out/extension.js → ./out/extension.cjs
- .vscodeignore comment updated for accuracy
Why .cjs vs other approaches:
- .cjs forces Node's CJS loader regardless of the package's type field
- Doesn't require dropping type:module (which the .ts source's ESM-style
imports rely on via tsc/vitest's NodeNext resolution)
- One-line config change, no source code touched
Locked by new src/package-main-format.test.ts (+2 tests):
- "main file extension is .cjs when type is 'module'" — prevents the
exact regression above by reading package.json directly
- "package.json main matches the esbuild outfile path" — prevents
silent drift between the two config files
The test reads both files at runtime; if either side moves without the
other, the test fails red with a diagnostic explaining ERR_REQUIRE_ESM.
Verification:
- Local require('./out/extension.cjs') now loads cleanly (errors only
with 'Cannot find module vscode' which is expected outside the host)
- Sub-package vitest: 245 / 245 across 19 files (+2 from new tests)
- Root vitest: 2143 passing + 18 pre-existing TtySelectFn (+2)
- Root + sub-package tsc clean
- vsce package --target linux-x64: clean 3.83 MB .vsix, zero warnings,
manifest shows main: './out/extension.cjs', bundle present as
extension/out/extension.cjs
Engineer action required (one-time):
1. Uninstall old broken extension:
rm -rf ~/.cursor/extensions/emptyops.nexpath-vscode-0.1.3
rm -rf ~/.vscode/extensions/emptyops.nexpath-vscode-0.1.3
2. Reinstall the new .vsix via Cursor UI:
Ctrl+Shift+P → "Extensions: Install from VSIX..." → pick the
freshly built nexpath-vscode-linux-x64-0.1.3.vsix
3. Reload Cursor and proceed with manual testing Round 1.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ctron 39)
Live install into Cursor surfaced the second activation-blocking bug after
.cjs: Cursor's Extension Host runs inside Electron 39 (NODE_MODULE_VERSION
140), but the .vsix shipped better-sqlite3 compiled for system Node 22
(NODE_MODULE_VERSION 127). The watcher loaded fine via require() of its
JS wrapper, then crashed the moment it tried to instantiate Database()
with the ABI-mismatch error. Extension activation succeeds, decision-
session UI never appears.
Three problems were tangled together:
1. better-sqlite3 v11 (^11 pinned) doesn't compile against Electron 39's
V8 — uses v8::Context::GetIsolate() which was removed. Upgraded to v12
(latest 12.10.0) which fixes the V8 API usage. v12 is API-compatible
with v11 for our usage (same Database / Statement / pragma surface).
2. Even with v12, the .vsix needs better-sqlite3 rebuilt against the
target Electron's ABI, not the system Node's. Standard tool is
@electron/rebuild — added as devDep, pinned to ^3.
3. We need dev-vs-package isolation: `npm test` uses real better-sqlite3
in test fixtures and runs under Node 22 ABI 127. `vsce package` needs
the Electron ABI 140 binary inside the .vsix. Solved by an orchestrator
script that rebuilds → packages → restores in a finally block.
Wiring:
- package.json#nexpathTargets: declares the target Electron version
(39.0.0) + its ABI (140) as a single source of truth. Bump together
when Cursor/VS Code upgrade Electron.
- package.json#scripts: new rebuild:electron + rebuild:node aliases.
Existing `package` script delegates to the orchestrator instead of
calling vsce directly.
- scripts/package-with-electron-abi.mjs: orchestrates electron-rebuild
→ vsce package → npm rebuild restore. Restore runs in a finally-style
block so the dev tree is never left half-built even on package failure.
Forwards extra argv to vsce (e.g. --target linux-x64 --out dist-vsix/).
- scripts/package-per-platform.mjs: switched the per-target loop from
bare `vsce package` to the orchestrator so all 5 platform .vsix builds
get the right ABI.
- .github/workflows/publish-extension.yml: both the matrix job and the
linux-arm64 QEMU job use the orchestrator (was calling vsce directly).
- .vscodeignore: tightened native-module slimming — drops 25 MB of
build-time-only weight (better-sqlite3 src + deps + node-gyp
intermediates + prebuild-install transitive deps). Final .vsix is
2.16 MB vs old 3.83 MB — smaller AND with the correct binding.
Required --force flag: @electron/rebuild caches "already rebuilt for
this Electron version" via a marker file. Without --force, subsequent
runs skip the rebuild even when the disk binary has been restored to
Node-ABI by the prior orchestrator run's cleanup. First package
succeeds, second package silently ships a Node-ABI binary. Caught
during local verification — added --force + a JSDoc comment.
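The rebuild → package → restore ordering can be sketched as follows. The three steps are injected as plain functions (hypothetical names) so the control flow is visible without spawning anything; the real orchestrator script is not reproduced here.

```typescript
// Each step is expected to throw on failure (e.g. a non-zero exit status).
type Step = () => void;

/**
 * Run the Electron-ABI rebuild and the packaging step, then always restore
 * the Node-ABI binary, even if an earlier step threw.
 */
function packageWithRestore(
  rebuildForElectron: Step, // assumed to force a rebuild on every run
  runVsce: Step,
  restoreNodeAbi: Step,
): void {
  try {
    rebuildForElectron();
    runVsce();
  } finally {
    // Runs on success and on failure, so the dev tree is never left
    // holding the Electron-ABI binary.
    restoreNodeAbi();
  }
}
```

Because the restore step puts the Node-ABI binary back on disk after every run, the rebuild step has to rebuild unconditionally on the next run, which is the role the --force flag plays above.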
New src/native-binding-abi.test.ts (+6 tests) locks the cross-file
invariant across four files (package.json#nexpathTargets, the rebuild
script, the orchestrator, and the CI workflow). Specifically prevents:
- nexpathTargets.electron drifting from .electronAbi
- rebuild:electron script not reading nexpathTargets.electron
- package script reverting to bare vsce
- orchestrator hardcoding the version instead of reading package.json
- CI workflow calling `npx vsce package` directly
The grep for "npx vsce package" in the CI workflow fails red if anyone
accidentally bypasses the orchestrator.
Verification:
- Local require + new Database(':memory:') under Node 22: works (Node-ABI
binary restored by orchestrator)
- dlopen of the binary INSIDE the produced .vsix fails under Node 22
with "NODE_MODULE_VERSION 140 requires 127" — confirms Electron ABI
- Sub-package vitest: 251 / 251 across 20 files (+6 from new invariant)
- Root vitest: 2149 passing + 18 pre-existing TtySelectFn
- Root + sub-package tsc clean
- .vsix size: 3.83 MB → 2.16 MB (slimmer .vscodeignore)
Engineer action required (re-install Cursor extension):
rm -rf ~/.cursor/extensions/emptyops.nexpath-vscode-0.1.3
rm -rf ~/.vscode/extensions/emptyops.nexpath-vscode-0.1.3
Then Cursor UI: Ctrl+Shift+P → "Extensions: Install from VSIX..." →
pick src/ext-vscode/nexpath-vscode-linux-x64-0.1.3.vsix → Reload.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
d4949ce to
85a29cb
Compare
tsc does not preserve the execute bit, so a rebuild of dist/cli/index.js (e.g. after rm -rf dist) leaves it 0644. The npm-link symlink still points at it, so `nexpath` becomes 'Permission denied' and the Cursor extension's spawnAuto fails with 'nexpath binary not found or not executable'. The postbuild step restores 0755 on both bins, cross-platform. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
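A postbuild step of this kind might look like the sketch below. The function name and the idea of passing the bin paths in are assumptions for illustration; the actual script and its list of bins are not shown in this commit message.

```typescript
import * as fs from "node:fs";

/**
 * Restore the execute bit on each built bin.
 * Returns the paths that were actually updated.
 */
function restoreExecBits(binPaths: string[]): string[] {
  const changed: string[] = [];
  for (const binPath of binPaths) {
    if (!fs.existsSync(binPath)) continue; // bin not built: nothing to fix
    // On Windows chmod only toggles the read-only flag, so this is harmless there.
    fs.chmodSync(binPath, 0o755);
    changed.push(binPath);
  }
  return changed;
}
```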
Layer C's `nexpath stop` opens the decision-session popup via gnome-terminal, a DBus client. When Cursor is launched without DBUS_SESSION_BUS_ADDRESS (desktop launchers, remote/VNC, DISPLAY=:1, or an already-running instance from another session), the extension host — and the spawned `nexpath stop` — has no session bus, so gnome-terminal silently fails and the advisory is stop_skipped with no popup (observed on a tester's box: advisory fires, but no window). ipc.resolveSpawnEnv() restores the standard per-user bus (/run/user/<uid>/bus) when the var is absent and the socket exists. Never overrides an existing address; no-ops on non-Linux / missing socket. Extension- side only — Layer C untouched. Co-located tests added. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
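The rules stated above (restore only when absent, never override, no-op off Linux or without the socket) can be sketched as a pure function. The name, parameter list, and injected `socketExists` probe are hypothetical; the real ipc.resolveSpawnEnv() signature is not shown in the commit. The `unix:path=` prefix is the standard D-Bus address form for a socket path.

```typescript
type Env = Record<string, string | undefined>;

/** Return an env that carries a session bus address when one can be inferred. */
function resolveSpawnEnvSketch(
  env: Env,
  platform: string,
  uid: number | undefined,
  socketExists: (socketPath: string) => boolean,
): Env {
  if (platform !== "linux") return env; // no-op on non-Linux
  if (env.DBUS_SESSION_BUS_ADDRESS) return env; // never override an existing address
  if (uid === undefined) return env;
  const socketPath = `/run/user/${uid}/bus`; // standard per-user bus socket
  if (!socketExists(socketPath)) return env; // no-op when the socket is missing
  return { ...env, DBUS_SESSION_BUS_ADDRESS: `unix:path=${socketPath}` };
}
```

In an extension this would typically be called with `process.env`, `process.platform`, `process.getuid?.()`, and `fs.existsSync`.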
…+ install)
`scripts/package-with-electron-abi.mjs` ran `spawnSync('npx'|'npm'|'vsce', …)`
without `shell:true`. On Windows those are `.cmd` shims that Node's spawn refuses
to execute without a shell (post CVE-2024-27980), so electron-rebuild failed with
exit 1 BEFORE running — the script aborted before `vsce package`, no .vsix was
produced, and `code --install-extension ./nexpath-vscode-0.1.3.vsix` then failed
with ENOENT. Now spawn through the shell on win32 only (POSIX unchanged) and
surface the previously-swallowed spawn error.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…r B only) The Layer C decision-session popup is a separate OS window. macOS (osascript `activate`) and Windows (`cmd /c start` → ShellExecuteEx) already foreground it at launch, but Linux's gnome-terminal has no equivalent flag — under GNOME's focus-stealing prevention the window opens BEHIND Cursor, so testers "hardly see" or miss it. New popup-foreground.ts (extension-side, Layer C untouched) raises the "Nexpath — Action Required" window via wmctrl (fallback xdotool), polling for the ~1s it takes to appear; wired into ipc.spawnStop in the real spawn path only. Graceful no-op when not Linux, no display, or neither tool is installed. 9 unit tests; ipc suite still green. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…auto failed") ipc.ts buildSpawnOptions did not set `shell`, so on Windows spawnAuto/spawnStop tried to execute `nexpath.cmd` directly — Node refuses to run a `.cmd` shim without a shell (post CVE-2024-27980) → "spawnauto failed" → no capture → no advisory → no popup. Set shell:true on win32 only (POSIX unchanged); CLI args are simple (auto/stop + --db <~/.nexpath path>) so shell concatenation is safe. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
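A minimal sketch of the win32-only shell switch follows. The function name and its parameters are assumptions; the real buildSpawnOptions in ipc.ts may set further options not shown here.

```typescript
import type { SpawnOptions } from "node:child_process";

/** Build spawn options that go through a shell on Windows only. */
function buildSpawnOptionsSketch(
  platform: NodeJS.Platform,
  env: NodeJS.ProcessEnv,
): SpawnOptions {
  return {
    env,
    // .cmd shims need a shell on Windows; POSIX keeps the direct, shell-free spawn.
    shell: platform === "win32",
  };
}
```

With `shell: true` the arguments are joined into a single command line, so this pattern is only appropriate when the arguments are simple and trusted, as the commit notes for `auto`/`stop` plus `--db`.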
…ture) fs.watch is unreliable on Windows for SQLite files written by another process: WAL checkpoints delete+recreate the -wal/-shm siblings, orphaning the Windows watch handle so it fires once (catching whatever rows accumulated) then goes silent — observed live as "ran 7 prompts, only the first 2 captured (batched at one timestamp), then nothing". Add a polling backstop: when pollMs>0 every target is re-read on that interval IN ADDITION to fs.watch; the dedup set means a poll only surfaces genuinely new prompts (no re-emit). Default off (existing tests unchanged); extension.ts wires pollMs:2000 in production. New unit test simulates fs.watch never firing and proves the poll captures + stop() cancels it. Layer C untouched; 35 watcher tests + 306 full suite pass. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
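The interplay between the polling backstop and the dedup set can be sketched like this. `createScanner`, `readIds`, and `emit` are hypothetical names; the real watcher reads prompt rows from the SQLite targets, which is abstracted here as a function returning IDs.

```typescript
/**
 * Build a scanner whose `scan` can be triggered by fs.watch events and by a
 * timer alike. The shared `seen` set means a prompt is emitted at most once.
 */
function createScanner(readIds: () => string[], emit: (id: string) => void) {
  const seen = new Set<string>();

  const scan = (): void => {
    for (const id of readIds()) {
      if (seen.has(id)) continue; // already surfaced by an earlier scan
      seen.add(id);
      emit(id);
    }
  };

  /** Start the polling backstop; returns a function that cancels it. */
  const startPolling = (pollMs: number): (() => void) => {
    if (pollMs <= 0) return () => {}; // default off
    const timer = setInterval(scan, pollMs);
    return () => clearInterval(timer);
  };

  return { scan, startPolling };
}
```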
…but no popup") nexpath auto records project_root = process.cwd() of its spawned process, which the OS canonicalizes (resolves symlinks; normalizes Windows drive-letter case). The extension passed `stop` the NON-canonical workspace path, so stop's exact `WHERE project_root = ?` missed the row auto wrote -> stop_no_pending -> the popup never opened. On macOS the throwaway workspace under /tmp (a symlink to /private/tmp) triggered this every time; on Windows the c:\ vs C:\ drive-case did. (Also why Cursor never appeared in mac Automation: stop never reached osascript.) Fix (Layer B only; Layer C byte-identical): new canonicalizeCwd() (realpath) in resolve-db-workspace.ts; cwdForEvent() canonicalizes the resolved workspace path so auto and stop always agree on project_root. Proven deterministically: stop with the symlink path leaves the advisory 'pending' (miss); with the canonical path it flips to 'shown' (found). 3 unit tests; full suite 309 pass. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
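A realpath-based canonicalizer might be sketched as below. The function name is hypothetical, and the fall-back-to-input behaviour on error is an assumption rather than something the commit states.

```typescript
import * as fs from "node:fs";

/**
 * Resolve symlinks so the path matches what the OS reports as process.cwd()
 * for a child spawned in that directory.
 */
function canonicalizeCwdSketch(workspacePath: string): string {
  try {
    // The native variant defers to the OS for the canonical form.
    return fs.realpathSync.native(workspacePath);
  } catch {
    // Assumption: if the path cannot be resolved, keep the caller's value.
    return workspacePath;
  }
}
```

For the macOS case above, `/tmp/...` would resolve to `/private/tmp/...`, so `auto` and `stop` see the same project_root.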
…visory fallback
Two Layer-B-only fixes (Layer C byte-identical) for a reliable cross-OS flow:
1. Terminal-popup selection now lands in the chat input. `nexpath stop` only
ever writes `{decision:'block', reason}` to stdout (its real Layer C contract);
the extension mis-cast that as a `{advisory,options}` payload and fed the
webview garbage, so a user's selection did nothing. `spawnStop` now returns a
typed `StopSelection`, and the pipeline injects the chosen prompt via the
existing chat-input-injector (clipboard fallback). Loop is now complete.
2. In-editor advisory fallback for when the terminal popup can't open (headless
Linux, no terminal emulator, Wayland, missing session bus). New read-only
store reader (`advisory-store-reader.ts`) pulls the advisory Layer C parked in
`pending_advisories`; `advisory-fallback.ts` lights a status-bar item + the
`nexpath.showAdvisory` command that reveals it in the existing webview, where
selecting an option injects it. Reuses the built (previously starved) webview.
Also: cap child stderr buffering (64KB) and validate stop's stdout shape.
ext-vscode: 338 tests pass (+29), tsc clean, bundle builds. No Layer C changes.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
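Validating stop's stdout shape could look like the sketch below. It assumes the `{decision:'block', reason}` payload is emitted as JSON, and the `StopSelection` fields shown are a guess at the typed result; neither detail is confirmed by the commit message.

```typescript
// Hypothetical shape mirroring the stdout contract quoted above.
interface StopSelection {
  decision: "block";
  reason: string;
}

/** Parse stop's stdout; returns null for empty or malformed output. */
function parseStopStdout(stdout: string): StopSelection | null {
  const text = stdout.trim();
  if (text === "") return null; // nothing was selected
  let value: unknown;
  try {
    value = JSON.parse(text);
  } catch {
    return null; // not JSON at all
  }
  if (typeof value !== "object" || value === null) return null;
  const record = value as Record<string, unknown>;
  if (record.decision !== "block" || typeof record.reason !== "string") return null;
  return { decision: "block", reason: record.reason };
}
```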
…(Windows)
On Windows, Layer C's `nexpath stop` writes the chosen option to stdout and calls
process.exit(0) — but that forced exit trips a libuv assertion (src\win\async.c:76),
so the process dies with code 3221226505 and the stdout payload can be lost. The
extension was rejecting on any non-zero exit BEFORE using the selection, so the
popup worked but the chosen prompt was never injected into Cursor's chat input.
Fix (Layer B only, Layer C untouched): `nexpath stop` persists the selection to
session_states.lastInjectedPrompt BEFORE it exits. spawnStop now:
- accepts a stdout selection regardless of exit code (covers a crash where
stdout still flushed), and
- on a non-zero exit with no usable stdout, recovers the selection from the
store (new readInjectedPrompt reader) and injects it anyway.
Malformed-stdout-on-clean-exit still rejects. Refactored the staged read into a
shared helper. ext-vscode: 347 tests pass (+9), tsc clean, bundle builds.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
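The decision order described above can be sketched as a pure function. All names are hypothetical, and two branches are assumptions: that an empty stdout on a clean exit means "no selection", and that a crash with nothing in the store still rejects.

```typescript
type StopOutcome =
  | { kind: "inject"; prompt: string; source: "stdout" | "store" }
  | { kind: "none" }
  | { kind: "reject"; why: string };

interface StopResult {
  exitCode: number;
  selection: string | null; // chosen prompt parsed from stdout, if any
  stdoutMalformed: boolean; // stdout was non-empty but failed validation
}

function resolveStopOutcome(
  result: StopResult,
  readInjectedPrompt: () => string | null, // store reader, injected for testing
): StopOutcome {
  // 1. A valid stdout selection wins regardless of exit code.
  if (result.selection !== null) {
    return { kind: "inject", prompt: result.selection, source: "stdout" };
  }
  // 2. Clean exit: malformed stdout still rejects; empty stdout means no choice.
  if (result.exitCode === 0) {
    return result.stdoutMalformed
      ? { kind: "reject", why: "malformed stdout on clean exit" }
      : { kind: "none" };
  }
  // 3. Non-zero exit with no usable stdout: recover the selection from the store.
  const recovered = readInjectedPrompt();
  return recovered !== null
    ? { kind: "inject", prompt: recovered, source: "store" }
    : { kind: "reject", why: `stop exited with code ${result.exitCode}` };
}
```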
…p/no-fallback)
On macOS the advisory fires and is parked correctly, but Layer C's popup is
`spawnSync('osascript', …)` → Terminal.app, and the FIRST Apple-event blocks on
the Automation-permission dialog — so `nexpath stop` can hang indefinitely. The
fallback was triggered only after `stop` returned (onNoSelection), so a blocked
popup meant neither the popup NOR the fallback ever appeared.
Fix (Layer B only): arm the in-editor fallback right after `auto` parks an
advisory — BEFORE the popup runs — via a new `onAfterCapture` hook, so the
status-bar/webview path is available regardless of whether the popup opens,
blocks, or fails. A popup selection still injects and clears the fallback.
Renamed the fallback entry point noteCycleWithoutSelection → armIfPending.
(The popup itself still needs the macOS Automation permission granted — that's
Layer C / OS, untouched here.) ext-vscode: 347 tests pass, tsc clean, bundle builds.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Closes Milestone M2 (the VS Code extension for Cursor + Windsurf) by delivering M14: publishing infrastructure for both Open VSX and the VS Code Marketplace, a per-platform `.vsix` packager, and engineer-driven procedures for macOS + Windows verification.
This is the last of 6 stacked branches in M2 (`v0.1.3/m2/{extension-skeleton, chat-history-capture, webview-ui, cursor-windsurf-adapters, cross-host-smoke-test, publish-and-cross-os-verify}`). The branch base is B5 (`4aa977b`); merging this completes M2.
Branch is intentionally code-complete but not yet released: the actual marketplace publish + macOS/Windows manual verification are engineer-driven steps that need marketplace tokens and VM access, documented in `PUBLISH.md` + `CROSS-OS-VERIFY.md`.
What changed
Publishing infrastructure
- `src/ext-vscode/package.json`: marketplace metadata (publisher, license, keywords, repository, gallery banner), `vsce` + `ovsx` devDeps, `package` / `package:all-platforms` / `publish:{vsce,ovsx}` npm scripts.
- `src/ext-vscode/.vscodeignore` (NEW): controls `.vsix` inclusion list (keeps `out/`, `media/`, `node_modules/better-sqlite3/`; excludes `src/`, tests, scripts, internal docs).
- `src/ext-vscode/LICENSE` (NEW): Apache-2.0 copied from repo root (vsce requires file in sub-package dir).
- `src/ext-vscode/media/icon.png` (NEW, 128×128) + `icon-marketplace.svg` (source): Marketplace rejects SVG icons.
Marketplace ID resolved (dev plan §6 Q3)
Open question from the dev plan resolved to `emptyops.nexpath-vscode`: option 2 (org-scoped under `emptyops` publisher) plus a descriptive extension name. Lower collision risk than the bare `nexpath` publisher; durable across future browser / JetBrains / coding-agent-X Nexpath extensions sharing the publisher.
Applied across:
- `src/ext-vscode/package.json#publisher`
- `src/agents/adapters/cursor.ts#MARKETPLACE_ID` + tests
- `src/agents/adapters/windsurf.ts#MARKETPLACE_ID` + tests
- `src/ext-vscode/{PUBLISH,CROSS-OS-VERIFY,SMOKE-TEST}.md`
- `.github/workflows/publish-extension.yml` sanity-check URLs
Per-platform packager
- `src/ext-vscode/src/package-per-platform-helpers.ts` (NEW): pure helpers: `SUPPORTED_TARGETS` (5 platforms matching `better-sqlite3` prebuilds), `buildVsixFilename`, `parsePackagerArgs`. 12 unit tests.
- `src/ext-vscode/scripts/package-per-platform.mjs` (NEW): runnable orchestrator that loops `vsce package --target <id>` over the 5 targets.
Cross-file invariant test (audit gap closure)
- `src/agents/adapters/marketplace-id.test.ts` (NEW, 3 tests): reads `src/ext-vscode/package.json` at test time and asserts both adapters' `marketplace.{openVsx,vsCode}` equal `<publisher>.<name>`. Catches the failure mode where someone bumps the publisher in `package.json` for a republish but forgets the adapter constants (or vice versa): publish would succeed but the installer's printed URL would 404.
CI workflow
- `.github/workflows/publish-extension.yml` (NEW): matrix builds one `.vsix` per platform (`linux-x64` / `linux-arm64` via QEMU / `darwin-x64` / `darwin-arm64` / `win32-x64`) on pinned runners (`ubuntu-22.04`, `macos-13`, `macos-14`, `windows-2022`). Publish job gated on tag push (`vsix-v*`) or manual dispatch with `publish=true`.
Engineer procedures
- `src/ext-vscode/PUBLISH.md` (NEW): marketplace account setup, PAT handling, version-bump checklist, local vs multi-platform publish paths, post-publish verification, rollback, pre-merge CI dry-run procedure (`gh workflow run … -f publish=false`).
- `src/ext-vscode/CROSS-OS-VERIFY.md` (NEW): 6-cell matrix (macOS Intel/Silicon × Cursor/Windsurf + Windows 11 × Cursor/Windsurf) with per-OS gotchas table (FDA prompt, Gatekeeper, SmartScreen, long paths).
Testing
Automated
- `tsc --noEmit`: clean
- `vsce package --target linux-x64`: clean 3.83 MB .vsix, zero warnings, contents verified (LICENSE, package.json, README, media/, node_modules/better-sqlite3/, out/extension.js)