V0.1.3/m2/smoke test #9
Open
Vedansi18 wants to merge 37 commits into
Adds the new src/agents/ module: four adapter interfaces (HookAdapter,
VSCodeExtensionAdapter, CLIWrapAdapter, BrowserExtensionAdapter), an
in-process registry (registerAdapter, detectAll, getAdapter), and an
empty index.ts placeholder for future adapter registrations. Unit tests
in registry.test.ts cover the registry behaviour.
Adds src/cli/commands/install.snapshot.test.ts plus its generated
baseline snapshot. The snapshot captures current installAction output
(settings.json bytes + stdout) with $HOME and platform-dependent strings
normalised so the snapshot is portable across machines. This is the
zero-diff safety net for M1 Branch 2 (claude-code refactor): that branch
must keep this snapshot byte-identical.
No existing source code is modified.
Per dev plan §1.6 in reviewduel-submodule.
Branch: v0.1.3/m1/foundation-scaffold
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
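A minimal sketch of how an in-process registry with these three entry points might look. The `AgentAdapter` and `DetectContext` shapes are assumptions made for illustration; the real interfaces in src/agents/ are not shown in this conversation and may differ.

```typescript
// Hypothetical shapes; the real adapter interfaces may carry more fields.
interface DetectContext {
  home: string;
}

interface AgentAdapter {
  id: string;
  detect(ctx: DetectContext): Promise<boolean>;
}

const adapters = new Map<string, AgentAdapter>();

// Stores the adapter under its id. Re-registering an id overwrites the
// previous entry (an assumption; the real registry may reject duplicates).
function registerAdapter(adapter: AgentAdapter): void {
  adapters.set(adapter.id, adapter);
}

// Looks up a single adapter by id, or undefined when nothing is registered.
function getAdapter(id: string): AgentAdapter | undefined {
  return adapters.get(id);
}

// Returns every registered adapter whose detect() resolves true,
// in registration order.
async function detectAll(ctx: DetectContext): Promise<AgentAdapter[]> {
  const found: AgentAdapter[] = [];
  for (const adapter of Array.from(adapters.values())) {
    if (await adapter.detect(ctx)) found.push(adapter);
  }
  return found;
}
```

Keeping the registry as a module-level map is what makes side-effect registration possible: importing an adapter module is enough to make it visible to `detectAll`.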
Moves the six Claude Code hook helpers (getClaudeSettingsPath,
buildHookCommand, buildStopHookCommand, buildHookEntry, writeHookEntry,
removeHookEntry) from src/cli/commands/install.ts to
src/agents/adapters/claude-code.ts. Function bodies are byte-identical.
install.ts re-exports them so existing imports (and install.test.ts)
continue to work unchanged.
Adds claudeCodeAdapter (HookAdapter) that wraps the moved functions and
self-registers via src/agents/index.ts side-effect import.
installAction's Claude Code branch in the for-loop now delegates to the
adapter via getAdapter('claude-code').install(ctx).
Adds optional settingsPath override to InstallContext so callers can
decouple the target file path from ctx.home — preserves the pre-refactor
pattern where paths.claudeSettings was passed independently of homedir()
(used by install.test.ts to inject custom tmp paths without stubbing
HOME). Without this, tests would write hook entries to the real
~/.claude/settings.json instead of their tmp dir.
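The override described above can be pictured as a single fallback. This is a sketch under assumptions: the `InstallContext` here contains only the two fields the paragraph mentions, `resolveSettingsPath` is a hypothetical helper name, and the path is joined with `/` for brevity rather than with a platform-aware join.

```typescript
// Reduced, hypothetical view of InstallContext.
interface InstallContext {
  home: string;
  settingsPath?: string; // optional override, e.g. a tmp file in tests
}

// Prefers the explicit override; otherwise derives the path from ctx.home.
function resolveSettingsPath(ctx: InstallContext): string {
  return ctx.settingsPath ?? `${ctx.home}/.claude/settings.json`;
}
```

A test can then pass `{ home: realHome, settingsPath: tmpFile }` and be sure hook entries land in the tmp file without stubbing HOME.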
Adds src/agents/adapters/claude-code.test.ts (18 unit tests) covering
the moved helpers + adapter contract (detect, settingsPath, buildHooks,
install, uninstall) + the settingsPath override behaviour.
Zero-diff invariant preserved: install snapshot from M1 Branch 1 remains
byte-identical. All 177 relevant tests pass. typecheck clean.
Branch: v0.1.3/m1/claude-code-refactor (off v0.1.3/m1/foundation-scaffold,
which sits on upstream/user-experience-improvements-sub-7).
Per dev plan §3.0 in reviewduel-submodule.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The original install.ts comment was a single line:
// Register the advisory pipeline hook (separate from MCP — different file)
The previous M1/B2 commit (d93852e) expanded this into a four-line
comment explaining the adapter delegation. Per team feedback, comments
on existing pre-refactor code should be kept verbatim — the §1.5 strict
zero-diff guarantee includes comments on existing code.
No behavioural change. Tests + snapshot unchanged (177/177 pass, install
snapshot remains byte-identical with M1 Branch 1's baseline).
Branch: v0.1.3/m1/claude-code-refactor.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Branch 1 of Milestone M2 (v0.1.3/m2/extension-skeleton). Establishes the
src/ext-vscode/ sub-package with an esbuild-driven build pipeline
(ESM source -> CJS bundle for the VS Code host), activates on
onStartupFinished, and ships the four scoped modules:
M1 - Skeleton: package.json (activationEvents, activity-bar container +
placeholder view backed by viewsWelcome so the icon actually renders),
tsconfig.json, esbuild.config.mjs, src/extension.ts entrypoint.
M5 - IPC stub: src/ipc.ts. spawnAuto(prompt, sessionId) and
spawnStop(sessionId) spawn the nexpath CLI as subprocesses and parse
the decision-session JSON payload from stdout, with typed errors
(NexpathBinaryNotFoundError, NexpathMalformedPayloadError) and
configurable binary-path resolution
(opts.binaryPath -> NEXPATH_BIN env -> 'nexpath' on PATH).
The exact stdin envelope vs. Layer C input contract is intentionally
a stub here; Branch 4 (cursor-windsurf-adapters) finalises it.
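The binary-path resolution order in the M5 item can be sketched as a three-step fallback. The function name `resolveNexpathBinary` and the injected `env` parameter are assumptions made so the example stays self-contained; the real code in src/ipc.ts presumably reads the process environment directly.

```typescript
interface ResolveOptions {
  binaryPath?: string;
  // Injected environment (hypothetical parameter, used here instead of process.env).
  env?: Record<string, string | undefined>;
}

// Resolution order: explicit option, then NEXPATH_BIN, then 'nexpath' on PATH.
function resolveNexpathBinary(opts: ResolveOptions = {}): string {
  if (opts.binaryPath) return opts.binaryPath;
  const fromEnv = opts.env?.['NEXPATH_BIN'];
  if (fromEnv) return fromEnv;
  return 'nexpath';
}
```

Empty strings are treated as "not set" in this sketch, so an empty `NEXPATH_BIN` falls through to the PATH lookup.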
M11 - Onboarding: src/onboarding.ts. First-launch consent toast persists
the user's choice to globalState; on macOS, additionally shows a
Full-Disk-Access guidance toast that deep-links to the System
Settings privacy pane.
M12 - Icon: media/icon.svg. Y-fork (branching path) representing
"next path" decision points; monochrome currentColor, scalable.
25 unit tests co-located alongside source (8 onboarding, 11 ipc, 6 extension),
runnable via root vitest with vi.mock('vscode') stubs. Sub-package has its
own tsconfig + package-lock; root tsconfig now excludes src/ext-vscode/ so
each side owns its TS build. Both root and sub-package tsc --noEmit are
clean. Full root test suite: 1851 passing + 18 pre-existing unrelated
TtySelectFn Windows-sim failures (carried forward from dev plan §3.0).
Deferred (flagged for follow-up, not blockers for this branch):
- 5 moderate npm-audit warnings in the esbuild -> vite -> vitest dev chain
(dev-only; will be addressed during M5 hardening).
- IPC stdin envelope contract: real wiring lands in Branch 4.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Branch 2 of Milestone M2 (v0.1.3/m2/chat-history-capture). Stacked on
M2 Branch 1 (commit 879ed5e). Adds the three scoped modules:
M2 - chat-history-watcher.ts: fs.watch on Cursor's state.vscdb and
Windsurf's ~/.codeium/windsurf/ dir, debounced (default 250ms), reads
ItemTable via injectable readItemTableFn (sql.js by default), diffs
against seenSignatures, emits {prompt, rawSessionId, capturedAt,
sourcePath, extractorId}. Dependency-injectable throughout (watchFn,
readFileFn, readItemTableFn, nowFn) so the unit tests run without
sql.js or real fs.watch.
M3 - extractors/: four per-version row decoders implementing the
ChatHistoryExtractor contract from chat-history-types.ts.
- cursor-v2024-q4 (aiService.prompts global key, pre-Composer)
- cursor-v2025-q1 (composerData.composerData, Composer era)
- cursor-v2025-q2 (cursorAIChatService.chatHistory.<tabId> per-tab
keys, current)
- windsurf (cascade.* placeholder; real Windsurf decoding lands in
Branch 4 alongside windsurfAdapter)
Each Cursor extractor handles both `role`/`type` and `content`/`text`
field variants seen across minor versions. All four are TODO-flagged
for verification against real dumps before Branch 6 publishes —
scripts/dump-cursor-state.ts (below) captures those dumps.
M4 - pickExtractor in extractors/index.ts: prefix-match each
extractor's fingerprintKeys against the observed ItemTable keys, pick
the highest match count (ties broken by registry order = newest first).
Returns FingerprintResult; unknown schemas surface observedSampleKeys
for the "schema unknown" toast hook.
scripts/dump-cursor-state.ts: dev-only helper (npx tsx) for capturing
state.vscdb fixtures from a machine with Cursor installed. Filters to
chat-related key prefixes, optional --redact for sensitive content.
Outputs to src/ext-vscode/test-fixtures/state-vscdb-samples/.
Sub-package additions:
- dependencies: sql.js ^1 (runtime; loaded via dynamic import so wasm
boot is lazy). Marked external in esbuild so the .vsix ships
node_modules/sql.js rather than inlining it.
- devDependencies: tsx ^4 (for running the dump script).
57 new unit tests (sub-package totals: 82 passing across 9 files):
- cursor-v2024-q4: 9 tests
- cursor-v2025-q1: 10 tests
- cursor-v2025-q2: 11 tests
- windsurf: 4 tests
- extractors/index: 12 tests
- chat-history-watcher: 11 tests
Verification: root tsc --noEmit clean; sub-package tsc --noEmit clean;
sub-package vitest 82/82 pass; full root test suite 1908 passing + 18
pre-existing TtySelectFn Windows-sim failures (carried forward from M1
3.0, unrelated); esbuild bundle still builds out/extension.js.
Deferred to follow-up (flagged, not blockers):
- Real-dump verification of all 4 extractors (use dump-cursor-state.ts
on machines with each Cursor version installed; replace TODO comments
in extractors with fixture-driven regression tests).
- Windsurf JSON-file decoder (Branch 4).
- Wiring the watcher into extension.ts activate() (Branch 3 webview-ui
or Branch 4 adapters — depends on UI surface integration).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
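The fingerprint selection described for M4 might be sketched as follows. Several details are assumptions: "match count" is read here as the number of an extractor's fingerprint prefixes that match at least one observed key, the `FingerprintResult` fields other than `observedSampleKeys` are hypothetical, and the sample size of five is arbitrary.

```typescript
// Reduced, hypothetical view of an extractor: only what fingerprinting needs.
interface ExtractorFingerprint {
  id: string;
  fingerprintKeys: string[]; // key prefixes that identify this schema
}

interface FingerprintResult {
  extractorId: string | null;   // null means "schema unknown"
  matchCount: number;
  observedSampleKeys: string[]; // populated only when nothing matched
}

// registry is expected newest-first, so the strict ">" below resolves ties
// in favour of the newer extractor.
function pickExtractor(
  registry: ExtractorFingerprint[],
  observedKeys: string[],
): FingerprintResult {
  let best: ExtractorFingerprint | null = null;
  let bestCount = 0;
  for (const extractor of registry) {
    const count = extractor.fingerprintKeys.filter((prefix) =>
      observedKeys.some((key) => key.startsWith(prefix)),
    ).length;
    if (count > bestCount) {
      best = extractor;
      bestCount = count;
    }
  }
  return {
    extractorId: best ? best.id : null,
    matchCount: bestCount,
    observedSampleKeys: best ? [] : observedKeys.slice(0, 5),
  };
}
```

Returning the sample keys only on a miss keeps the result small in the common case while still giving a "schema unknown" toast something concrete to report.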
Real-machine inspection on Cursor 3.4.20 (2026-05-15) surfaced three
issues with the Branch 2 extractor designs. This commit fixes the
verifiable ones, captures redacted fixtures, and documents the
still-unknown bits for the next round.
Issue 1 — SQLite WAL mode. The dump script previously used sql.js,
which only reads the buffer of the main `.vscdb` file. Live Cursor
writes go to the sibling `.vscdb-wal` (185 KB while the main file was
4 KB), so sql.js saw "no such table: ItemTable" even though the table
exists.
Fix: switched the dump script to better-sqlite3 (native, WAL-aware).
Copies main + wal + shm siblings to a tmp staging dir before reading so
the live Cursor write path is never touched, then runs
`PRAGMA wal_checkpoint(TRUNCATE)` on the staged copy for consistency.
The PRODUCTION watcher in `chat-history-watcher.ts` still uses sql.js
via dynamic import; the same WAL problem will surface when Branch 4
wires the watcher live. Flagged for Branch 4 design — options are:
(a) switch the watcher to better-sqlite3 (native binding in .vsix), or
(b) implement copy + checkpoint via sql.js. Out of scope for B2.
Issue 2 — `cursor-v2025-q1` extractor's fingerprint key was wrong.
Community docs said `composerData.composerData`; Cursor 3.4.20 actually
uses `composer.composerData`. Updated the key in both the extractor and
its tests + the fingerprint test.
Open finding: the `composer.composerData` value on a chat-less Cursor
3.4.20 workspace DB is metadata only (selectedComposerIds, migration
flags) — not the conversation messages this extractor's decodeRow logic
parses for. Logic falls through cleanly (returns [] when the expected
`allComposers` field is absent) and the JSDoc now documents that the
real Composer message storage location is still TBD and needs a
post-chat snapshot to confirm.
Issue 3 — `cursor-v2025-q2` extractor's fingerprint prefix
(`cursorAIChatService.chatHistory.`) was NOT observed on Cursor 3.4.20.
The extractor still ships (in case older versions use it) but the JSDoc
now flags this as unverified and points to the dump script for
capturing a real fixture before Branch 6 ships.
Dump script additions:
- Discovers ALL state.vscdb under Cursor's config tree (global +
per-workspace) — chat messages live in the workspace DB, not global.
- Dumps both `ItemTable` (filtered to chat-related key prefixes) AND
`cursorDiskKV` (Cursor 3.x's parallel KV table; currently empty but
may hold Composer messages once chats happen).
- One output JSON per discovered DB; suffixed with `global` or
`workspace-<id>` for traceability.
- `--redact` replaces string values > 8 chars with same-length
asterisks.
Dependencies:
- Added better-sqlite3 ^11 + @types/better-sqlite3 ^7 as
devDependencies in the sub-package. Dev-only — the production
extension bundle is unaffected.
Captured fixtures (redacted) — all three DBs from a chat-less Cursor
3.4.20 session, committed for regression testing:
- cursor-3-4-20-initial-global.json (9 rows)
- cursor-3-4-20-initial-workspace-1778826246907.json (7 rows)
- cursor-3-4-20-initial-workspace-empty-window.json (2 rows)
Verification:
- Sub-package tsc --noEmit clean.
- Sub-package vitest 82/82 pass.
- Root tsc --noEmit clean.
- Full root test suite 1908 passing + 18 pre-existing TtySelectFn
carry-forward.
Next step (manual, user-driven): submit a real prompt in Cursor's Ask
mode AND in Composer mode, then re-run the dump script to capture a
chat-bearing snapshot. The new keys / tables that appear will pin down
the Composer-mode message storage location, and a follow-up commit will
finalise the extractor decode logic against that real data.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes the unit-test audit gap surfaced for M2 Branch 2. The dump script
had real business logic (`redactValue`, `shouldKeepItemTable`,
`parseArgs`, `cursorConfigRoot`, `discoverAllStateVscdb`) with zero test
coverage — `redactValue` in particular has data-leak consequences if
buggy.
Extracted the pure / near-pure helpers into a new module:
- `src/cursor-state-dump-helpers.ts` — lives under tsconfig rootDir
so it's typechecked by the sub-package's main `tsc --noEmit` and
auto-picked-up by vitest. Re-exports `KEEP_ITEMTABLE_PREFIXES`,
`shouldKeepItemTable`, `redactValue`, `cursorConfigRoot`,
`discoverAllStateVscdb` (with injectable fs helpers), and
`parseArgs` (returns a tagged-union result instead of calling
`process.exit`, so the error paths are testable).
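A tagged-union `parseArgs` along the lines described above might look like this. The flags (`--name`, `--src`, `--redact`, `--help` / `-h`) come from the commit message; the `ParseResult` field names, the error wording, and the rule that a value starting with `-` counts as missing are assumptions.

```typescript
// Hypothetical result type: callers switch on `kind` instead of the parser
// calling process.exit, which is what makes the error paths unit-testable.
type ParseResult =
  | { kind: 'ok'; name: string; src: string | undefined; redact: boolean }
  | { kind: 'help' }
  | { kind: 'error'; message: string };

function parseArgs(argv: string[]): ParseResult {
  let name: string | undefined;
  let src: string | undefined;
  let redact = false;

  for (let i = 0; i < argv.length; i++) {
    const arg = argv[i];
    if (arg === '--help' || arg === '-h') return { kind: 'help' };
    if (arg === '--redact') {
      redact = true;
      continue;
    }
    if (arg === '--name' || arg === '--src') {
      const value = argv[i + 1];
      // Assumption: a following flag means the value was omitted.
      if (value === undefined || value.startsWith('-')) {
        return { kind: 'error', message: `Missing value for ${arg}` };
      }
      if (arg === '--name') name = value;
      else src = value;
      i++; // skip the consumed value
      continue;
    }
    return { kind: 'error', message: `Unknown argument: ${arg}` };
  }

  if (name === undefined) return { kind: 'error', message: '--name is required' };
  return { kind: 'ok', name, src, redact };
}
```

The script entry point can then map `help` and `error` to its own exit codes, keeping `process.exit` out of the tested module.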
Co-located tests: `src/cursor-state-dump-helpers.test.ts` — 28 tests
covering:
- `shouldKeepItemTable`: each default prefix matched, unrelated keys
dropped, custom prefix lists, prefix-not-exact match.
- `redactValue`: short-string preservation, long-string redaction,
nested object/array recursion, non-string value preservation, bulk
redact for non-JSON input, JSON-string root, exact 9-char boundary.
- `cursorConfigRoot`: linux / darwin / win32 / unknown-platform paths
and APPDATA fallback.
- `discoverAllStateVscdb`: empty tree, global-only, global + multiple
workspaces, skip workspace dirs missing the DB, injectable fs.
- `parseArgs`: required `--name`, optional `--src` / `--redact`,
`--help` / `-h` signal, missing-value rejection, unknown-argument
rejection.
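The recursive part of the redaction rule (strings longer than 8 characters become same-length asterisks) can be sketched on already-parsed JSON. `redactParsed` is a hypothetical name; the real `redactValue` also handles raw, non-JSON string input, which this sketch leaves out.

```typescript
type Json = string | number | boolean | null | Json[] | { [key: string]: Json };

// Replaces every string longer than 8 characters with asterisks of the same
// length; recurses into arrays and objects; leaves other values untouched.
function redactParsed(value: Json): Json {
  if (typeof value === 'string') {
    return value.length > 8 ? '*'.repeat(value.length) : value;
  }
  if (Array.isArray(value)) {
    return value.map(redactParsed);
  }
  if (value !== null && typeof value === 'object') {
    const out: { [key: string]: Json } = {};
    for (const [key, inner] of Object.entries(value)) {
      out[key] = redactParsed(inner);
    }
    return out;
  }
  return value; // numbers, booleans, null
}
```

Object keys are deliberately left in clear text in this sketch so that fixtures stay usable for fingerprint tests; whether the real helper does the same is not stated in the commit message.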
Script entry-point `scripts/dump-cursor-state.ts` now imports from
`../src/cursor-state-dump-helpers.js` and retains only the I/O
orchestration (file copy to tmp staging dir, better-sqlite3 read, fixture
write). Behaviour is byte-for-byte unchanged — verified by re-running
against the live machine and producing identical row counts to the
previous commit (`3794bc3`).
Sub-package totals:
- Test files: 10 (was 9)
- Tests: 110 passing (was 82) — +28 helpers tests
- Sub-package tsc --noEmit clean
- Root tsc --noEmit clean
- Full root suite: 1936 passing + 18 pre-existing TtySelectFn
Windows-sim failures (M1 §3.0 carry-forward, unrelated)
Closes the only remaining audit gap for M2 Branch 2. No further unit-test
work pending; per the auto-commit rule the branch is now closed pending
push.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Branch 3 of Milestone M2 (v0.1.3/m2/webview-ui). Stacked on M2 Branch 1
(commit 879ed5e) — does NOT depend on M2 B2's watcher, only on B1's
skeleton + the DecisionSessionPayload type from ipc.ts. Delivers the
three scoped modules:
M6 — WebviewViewProvider: src/webview/view-provider.ts.
NexpathDecisionSessionViewProvider implements
vscode.WebviewViewProvider for the nexpath.status activity-bar view.
resolveWebviewView wires webview.options (enableScripts +
localResourceRoots), sets initial HTML, registers onDidReceiveMessage +
onDidDispose. publishPayload stores the payload, updates the HTML, and
calls webviewView.show(true) for the auto-reveal UX matching
architecture rev 2 §4. Payload survives view dispose/re-show.
Injectable onSelect dependency for tests. Exposes getCurrentPayload() +
handleMessage() for direct message-routing tests.
M7 — HTML template: src/webview/html.ts.
renderDecisionSessionHtml(payload, opts) — pure function, no I/O.
Returns the full self-contained HTML for the webview. Two modes:
empty/watching state (no scripts, just "Nexpath is active…") and
populated state (advisory + numbered option buttons + dismiss). CSP:
default-src 'none' with nonce-scoped scripts. All user-controlled
strings HTML-escaped. Theming via --vscode-* CSS variables so the UI
inherits Cursor/Windsurf's theme. Tests verify both states, nonce
handling, HTML escaping (incl. <script> + onerror= injection attempts),
and empty-options array.
M8 — Prompt injection: src/webview/prompt-injection.ts.
handleOptionSelection writes the selected option text to the system
clipboard via vscode.env.clipboard.writeText + shows a non-modal info
toast directing the user to paste. This is the ONLY reliable path — VS
Code text-editing APIs target editor documents, not the host's
(Cursor's) chat input panel (dev plan §2.4). Branch 4 may discover a
Cursor-specific command that lets us write directly; until then
clipboard + toast is the primary path. Injectable deps for tests.
extension.ts updates:
- Registers the view provider on activate via
vscode.window.registerWebviewViewProvider(VIEW_ID, instance).
- Pushes the registration disposable onto context.subscriptions for
cleanup on deactivate.
- Holds the provider at module scope; exposes via getViewProvider() so
Branch 4's adapter wiring can publish payloads.
- View provider registration runs BEFORE onboarding so the icon shows
immediately on activation, even while consent toasts are open.
- Onboarding errors still swallowed (per existing B1 behaviour).
package.json updates:
- nexpath.status view now declares "type": "webview" (was implicit
tree).
- viewsWelcome entry removed — webview-type views render their own
empty state from inside the webview HTML, not via viewsWelcome. The
empty state in renderDecisionSessionHtml replaces it.
38 new unit tests:
- html.test.ts: 13 (escapeHtml + empty state + populated state + nonce
+ HTML escaping in advisory/options + empty options array)
- view-provider.test.ts: 14 (VIEW_ID + resolveWebviewView × 4 +
publishPayload × 3 + clearPayload + handleMessage × 5)
- prompt-injection.test.ts: 6 (clipboard write + toast + error paths +
DI + empty string)
- extension.test.ts: +5 (registration test + subscriptions push +
getViewProvider + onboarding-rejects-but-view-still-registered + the
deactivate clears viewProvider)
Sub-package totals at branch HEAD: 63 tests across 6 files (was 25 in
B1, +38 here). Root tsc + sub-package tsc clean. Full root test suite
1889 passing + 18 pre-existing TtySelectFn carry-forward (unrelated).
Esbuild bundle grew from 3.4 KB → 11.0 KB (includes the new webview
modules + their CSS template strings).
Deferred (flagged, not blockers for this branch):
- Pre-prompt blocking on Cursor/Windsurf — current architecture only
shows guidance after the host sends the prompt. Pre-send blocking
would need a keybinding hijack (architecture doc §11 open question 7).
- Cursor-specific "write to chat input" command — discover in Branch 4
if it exists, otherwise clipboard + toast remains the only path.
- E2E test against a real Cursor instance — Branch 5 (smoke-test) gate.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two refinements after cross-confirmation review against the dev plan +
a read of the Layer C TTY UI in src/decision-session/TtySelectFn.ts.
## 1. injectFn contract — addresses Drift hi0001234d#3 (primary text-editing path)
prompt-injection.ts now defines:
- `OptionInjector = (text: string) => Promise<boolean>` — the contract
for a direct-injection function (agent-specific, lives in B4).
- `PromptInjectionDeps.injectFn?` — optional adapter-supplied injector.
B3 default is absent → clipboard fallback always wins.
handleOptionSelection now has two paths:
1. If `deps.injectFn` is provided AND `injectFn(text)` resolves true:
skip clipboard. Text is in the chat input. Done.
2. Otherwise (injectFn absent, returned false, or threw): fall back to
clipboard + info toast.
B4 (cursor-windsurf-adapters / M9 + M10) will:
- Discover Cursor / Windsurf command ids that write text to the AI chat
input (via `vscode.commands.getCommands(true)`).
- Implement `cursorChatInputInject` / `windsurfChatInputInject` of type
OptionInjector.
- Pass them through the view-provider constructor's onSelect arg as:
const onSelect = (text) =>
handleOptionSelection(text, { injectFn: cursorChatInputInject });
Decision saved to memory at
~/.claude/projects/-home-emptyops-Documents-Vedanshi-NexPathMain-reviewduel/memory/project_b4_prompt_injection_contract.md
— marked load-bearing (do not delete or rename the named symbols). This
guarantees the deferred work doesn't get forgotten in a future session.
4 new unit tests in prompt-injection.test.ts:
- injectFn returning true → clipboard NOT touched
- injectFn returning false → falls back to clipboard
- injectFn throwing → falls back to clipboard
- injectFn absent → clipboard path (default B3 behaviour)
## 2. Keyboard shortcuts — addresses Drift hi0001234d#2 (Layer C UX consistency)
After reading TtySelectFn.ts, the relevant UX patterns to mirror:
- Ctrl+X = opt-out / dismiss (matches Layer C's `\\x18` keypress
handler at TtySelectFn.ts:128 + the install disclosure copy: "press
Ctrl+X during an advisory")
- Esc = standard web cancel (TTY doesn't have this but it's expected
web UX)
Added to the webview HTML script:
- keydown handler for Ctrl+X → dispatches `{type: 'dismiss'}`
- keydown handler for Esc → dispatches `{type: 'dismiss'}`
- keydown handler for digits 1-9 → dispatches `{type: 'select'}`
against the Nth option (matches the visible numbering)
- First option focused on render so keyboard users land on something
actionable.
- Visible kbd-hint text in the options header and on the dismiss button
so the shortcuts are discoverable.
Patterns NOT mirrored (intentional, rationale):
- TTY's two-step "Send to Claude now" / "Copy to clipboard" sub-menu:
redundant in the webview — until B4's injectFn lands, every path ends
in clipboard anyway. The two-step adds friction without value.
- 60s auto-dismiss timeout: the webview is non-modal; the user can let
it sit indefinitely. Adds complexity without UX gain.
- Arrow-key navigation (Tab already works natively in HTML; number keys
are the faster path for our short option lists).
5 new unit tests in html.test.ts:
- keyboard hint string visible in options header
- hint range scoped to option count (capped at 9)
- keydown handler dispatches select on digits 1-9
- Esc + Ctrl+X handlers dispatch dismiss
- first option button focused on render
## Verification
- Sub-package tsc --noEmit clean
- Sub-package vitest: 72/72 pass (was 63, +9 new)
- Root tsc --noEmit clean
- Full root test suite: 1898 passing + 18 pre-existing TtySelectFn
carry-forward
- Esbuild bundle: 11.0 KB → 12.3 KB (the new keyboard handler script +
injectFn branch)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
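The two-path behaviour of handleOptionSelection in section 1 of this commit might be sketched like this. `OptionInjector` and `injectFn` follow the commit message; `writeClipboard` and `showInfo` are hypothetical injected stand-ins for `vscode.env.clipboard.writeText` and the info toast, and both the return value and the toast wording are assumptions made to keep the example observable.

```typescript
type OptionInjector = (text: string) => Promise<boolean>;

interface PromptInjectionDeps {
  injectFn?: OptionInjector;
  writeClipboard: (text: string) => Promise<void>; // stand-in for the clipboard API
  showInfo: (message: string) => void;             // stand-in for the info toast
}

// Path 1: injectFn present and resolves true, so the clipboard is skipped.
// Path 2: injectFn absent, resolved false, or threw, so fall back to
// clipboard + toast.
async function handleOptionSelection(
  text: string,
  deps: PromptInjectionDeps,
): Promise<'injected' | 'clipboard'> {
  if (deps.injectFn) {
    try {
      if (await deps.injectFn(text)) return 'injected';
    } catch {
      // An injector failure is not fatal; the clipboard path below takes over.
    }
  }
  await deps.writeClipboard(text);
  deps.showInfo('Option copied. Paste it into the chat input.'); // placeholder wording
  return 'clipboard';
}
```

Treating a thrown injector the same as one that returns false means an adapter bug can degrade the experience but never loses the user's selection.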
Cross-confirmation audit caught one real resilience gap + two missing
unit-test coverage points. All scoped to M2/B3 work.
## Resilience fix in view-provider.ts
NexpathDecisionSessionViewProvider.handleMessage previously did:
await this.onSelect(msg.optionLabel);
If onSelect rejected (which a real B4 injectFn can — e.g. when a Cursor
command is missing or throws), the rejection propagated up. The caller
chain is `webview.onDidReceiveMessage` → `void this.handleMessage(raw)`
in resolveWebviewView, which has no `await` to catch it — so it would
have surfaced as an unhandled promise rejection in the extension host.
Fixed by wrapping the onSelect call in try/catch + console.error. The
user-facing error path stays in handleOptionSelection (which already
shows a toast on clipboard failure); the catch here is a last-resort
guard so the extension host doesn't see unhandled rejections.
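The last-resort guard can be sketched in isolation. `guardedSelect` is a hypothetical free-function version of what the commit describes inside handleMessage, the logger is injected so the example can be tested, and the log prefix is a placeholder rather than the real one.

```typescript
type SelectHandler = (optionLabel: string) => Promise<void>;
type ErrorLogger = (message: string, error: unknown) => void;

// Awaits onSelect and converts any rejection into a log line, so a caller
// that invokes this with `void guardedSelect(...)` never produces an
// unhandled promise rejection.
async function guardedSelect(
  onSelect: SelectHandler,
  optionLabel: string,
  logError: ErrorLogger,
): Promise<void> {
  try {
    await onSelect(optionLabel);
  } catch (error) {
    logError('[nexpath] onSelect failed:', error); // placeholder prefix
  }
}
```

In the extension itself the logger would presumably be `console.error`, as the commit message states.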
## 3 new unit tests covering previously-untested behaviour
view-provider.test.ts (+2):
- "a second publishPayload replaces the first (no stacking)" — verifies
the latest payload wins, both currentPayload and webview.html
reflect it, view.show is called per publish.
- "catches errors from onSelect so they never become unhandled
rejections" — proves the new try/catch works + the error is logged
to console.error with the right prefix.
html.test.ts (+1):
- "escapes attribute-breaking quote characters in option id and label"
— the existing escape test covered `<` `>` `&`. Quotes (`"`) inside
a data-option-id="..." attribute would close the attribute and
allow injection. Verifies escapeHtml correctly converts `"` to
`&quot;` in both data-option-id and data-option-label.
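For reference, an attribute-safe escaper of the kind this test exercises might look like the sketch below. It is not the project's escapeHtml; escaping the single quote as well is an extra precaution assumed here.

```typescript
// Escapes the characters that can break out of HTML text or a quoted
// attribute value. "&" must be replaced first so that the entities produced
// by the later replacements are not escaped a second time.
function escapeHtml(input: string): string {
  return input
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}
```

With this in place, an option id such as `x" onclick="evil()` stays inside the `data-option-id="..."` attribute instead of closing it.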
## Verification
- Sub-package tsc --noEmit clean
- Sub-package vitest: 75/75 pass (was 72; +3)
- Root tsc --noEmit clean
- Full root test suite: 1901 passing + 18 pre-existing TtySelectFn
- Esbuild bundle: still builds clean (~12.3 KB)
Closes the M2/B3 unit-test audit gap. Per auto-commit rule.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…actors to wire alongside the view-provider from B3
…ursor/Windsurf CLI adapters live in src/agents/adapters/ which is M1's territory
Branch 4 of Milestone M2 (v0.1.3/m2/cursor-windsurf-adapters). This is
the integration branch — stacked on B3 (`3d0957e`), with B2
(`94d81dc`) merged in (`536bca8`) and M1 (`66dd54b`) merged in
(`21f3f48`) so all four prerequisite contracts are available in one
working tree: M1's adapter registry, B2's chat-history watcher +
extractors, B3's webview view-provider + injectFn contract.
This commit covers the narrow dev-plan scope for B4: the CLI-side
adapters (M9 + M10). The bigger wiring (extension host-detection,
chat-input injector, WAL fix for the production watcher, and
extension.ts activate wiring) lands as a follow-up commit on this
same branch — keeps the diff reviewable.
M9 — src/agents/adapters/cursor.ts: VSCodeExtensionAdapter.
- detect() checks for Cursor's OS-specific config dir
(~/.config/Cursor on linux, Library/Application Support/Cursor on
darwin, %APPDATA%/Cursor on win32).
- install() prints deep-link install instructions when Cursor is
present (Open VSX + VS Code Marketplace URLs + cursor
--install-extension CLI fallback). Returns status: 'skipped' if
Cursor isn't installed.
- chatHistoryPaths() returns the User/workspaceStorage base dir; the
extension enumerates per-workspace state.vscdb files at activation
time, not at install time.
- extractPrompt() returns null. The architecture interface declares
it for symmetric API shape, but actual row decoding lives in the
extension runtime via src/ext-vscode/src/extractors/ — the CLI
never runs the watcher. JSDoc spells this out.
- Self-registers via the agent registry side-effect on module load.
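The per-OS lookup behind detect() might be sketched as below. The three directories come from the M9 description; the function signature, the `/` path joining, and returning null for any other platform are assumptions, and the real cursorConfigDir may behave differently for unknown platforms.

```typescript
// Returns the directory whose presence signals a Cursor install, or null
// when the platform is not one of the three documented ones (assumption).
function cursorConfigDir(
  platform: string,
  home: string,
  appData: string, // value of %APPDATA%, only consulted on win32
): string | null {
  switch (platform) {
    case 'linux':
      return `${home}/.config/Cursor`;
    case 'darwin':
      return `${home}/Library/Application Support/Cursor`;
    case 'win32':
      return `${appData}/Cursor`;
    default:
      return null;
  }
}
```

detect() then reduces to an existence check on the returned directory, which is easy to fake in tests by pointing `home` at a tmp dir.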
M10 — src/agents/adapters/windsurf.ts: same shape as cursor.ts.
- Windsurf is also a VS Code fork; ships the same extension.
- Detection checks BOTH ~/.config/Windsurf/ AND the legacy
~/.codeium/windsurf/ Cascade directory (Windsurf may populate
either or both depending on version). chatHistoryPaths returns
both for the watcher to track. extractPrompt stubbed identically.
src/agents/index.ts: side-effect imports both adapters so
`nexpath install` picks them up via the registry's detectAll/getAdapter.
Tests (31 new, both adapters):
- cursor.test.ts: 15 tests covering cursorConfigDir × 4 OS branches,
static fields, detect (present/absent), chatHistoryPaths shape,
extractPrompt-returns-null, install (skip + present + log content),
uninstall (skip + present), registry self-registration.
- windsurf.test.ts: 16 tests covering the same surface area + the
"detect by EITHER config dir" branches (windsurf-only,
codeium-only, both).
Verification:
- Root tsc --noEmit clean.
- Full root test suite: 2047 passing + 18 pre-existing TtySelectFn
failures (M1 §3.0 carry-forward, unrelated).
- Snapshot invariant preserved — no install.ts modification.
Deferred to a follow-up commit on this same branch (within scope):
- WAL fix: switch chat-history-watcher.ts's default reader from
sql.js to better-sqlite3 (per dev plan §2.5). The dev-only dump
script already uses better-sqlite3.
- Extension host-detector (decide Cursor vs Windsurf vs plain VS Code
at activation time via vscode.env.appName).
- chat-input-injector implementing the OptionInjector contract (per
memory project_b4_prompt_injection_contract.md) — try Cursor /
Windsurf chat-input commands via vscode.commands.executeCommand
then fall back to clipboard.
- extension.ts wiring: construct WatchTargets from host-detector
paths, hook watcher.onEvent to spawn nexpath auto/stop, publish
payloads to the view provider with the injectFn-aware onSelect.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ovider wiring
Wires the architectural pieces that B4's narrow scope (cursor + windsurf
CLI adapters) leaves open. Four concerns, all closely related, all sized
to land together.
## 1. WAL-mode fix — chat-history-watcher's default reader
Dev plan §2.5 flagged this as a B4-decision: sql.js operates on a
buffer of the main .vscdb file and CANNOT see the WAL siblings
(.vscdb-wal / .vscdb-shm), which is where Cursor 3.4.20's live writes
actually land. The dev-only dump-cursor-state script already uses
better-sqlite3 + a copy-to-staging-dir strategy; this commit lifts the
same approach into the production watcher.
Changes:
- Swapped sql.js → better-sqlite3 in the production watcher's default
readItemTableFn. The new reader copies main + .vscdb-wal + .vscdb-shm
to a tmp staging dir, opens read-only, runs
PRAGMA wal_checkpoint(TRUNCATE) belt-and-braces, queries ItemTable,
cleans up the staging dir.
- **API change:** ReadItemTableFn signature changed from
`(dbBytes: Buffer) => Promise<rows>` to
`(dbPath: string) => Promise<rows>` so the reader can access the WAL
siblings itself (the buffer-based form couldn't). Watcher no longer
needs readFileFn — removed it from ChatHistoryWatcherOptions. Tests
updated to match (one test scenario removed: the readFileFn error
forwarding test, whose failure mode is now covered by the
readItemTableFn error test).
- Defensive: if ItemTable doesn't exist on the file (freshly-created
state.vscdb), reader returns [] rather than throwing.
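The copy-to-staging step can be sketched without touching SQLite at all. Everything here is illustrative: `StagingFs` is a hypothetical injected file-system facade, `stageDatabase` is not the project's function name, and the actual read (opening the staged copy with better-sqlite3 and querying ItemTable) is deliberately omitted.

```typescript
// Minimal file-system facade so the staging logic can run against a fake.
interface StagingFs {
  exists(path: string): boolean;
  copy(from: string, to: string): void;
}

// SQLite keeps WAL-mode state in "<db>-wal" and "<db>-shm" next to the main file.
function walSiblingPaths(dbPath: string): string[] {
  return [dbPath, `${dbPath}-wal`, `${dbPath}-shm`];
}

// Copies the main DB plus whichever WAL siblings exist into stagingDir and
// returns the staged main path; the live files are only ever read.
function stageDatabase(dbPath: string, stagingDir: string, fs: StagingFs): string {
  if (!fs.exists(dbPath)) {
    throw new Error(`database not found: ${dbPath}`);
  }
  let stagedMain = '';
  for (const source of walSiblingPaths(dbPath)) {
    if (!fs.exists(source)) continue; // -wal / -shm may be absent
    const fileName = source.split(/[\\/]/).pop() ?? source;
    const target = `${stagingDir}/${fileName}`;
    fs.copy(source, target);
    if (source === dbPath) stagedMain = target;
  }
  return stagedMain;
}
```

Keeping the sibling file names intact in the staging dir matters: SQLite locates the WAL by appending `-wal` to the path of the database it opens.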
Package + bundle changes:
- better-sqlite3 moved from devDependencies to dependencies (now a
runtime dep, not just dev-only).
- sql.js removed from dependencies (no longer used by either the
watcher or the dump script).
- esbuild external list updated: 'vscode', 'better-sqlite3'. The
.vsix needs to ship node_modules/better-sqlite3 with prebuilt
binaries for each platform — Branch 6 (publish) responsibility.
## 2. host-detector — Cursor vs Windsurf vs plain VS Code
src/ext-vscode/src/host-detector.ts — small pure module:
- classifyHost(appName): maps "Cursor*" → cursor, "Windsurf*" →
windsurf, everything else → vscode-generic.
- detectHost(deps?): reads vscode.env.appName (or injected override).
- chatHistoryBaseDir(inputs?): per-host OS-specific config dir;
returns null for vscode-generic (no AI chat to watch).
- workspaceStorageDir(inputs?): appends User/workspaceStorage to the
base — the directory the watcher will enumerate for per-workspace
state.vscdb paths.
11 unit tests covering all host × platform × inputs combinations.
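The host classification is small enough to sketch in full. The three result values follow the commit message; treating "Cursor*" and "Windsurf*" as case-sensitive prefix matches is an assumption about what the wildcard means.

```typescript
type HostKind = 'cursor' | 'windsurf' | 'vscode-generic';

// Maps the app name reported by the editor (e.g. vscode.env.appName) to a
// host kind. Anything that is not a recognised fork is treated as generic.
function classifyHost(appName: string): HostKind {
  if (appName.startsWith('Cursor')) return 'cursor';
  if (appName.startsWith('Windsurf')) return 'windsurf';
  return 'vscode-generic';
}
```

Keeping this pure (string in, tag out) is what lets detectHost accept an injected override in tests instead of a mocked `vscode` module.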
## 3. chat-input-injector — fills the B4 injectFn contract
src/ext-vscode/src/chat-input-injector.ts — implements OptionInjector
per memory `project_b4_prompt_injection_contract`:
- For vscode-generic host → returns false immediately (no AI chat to
inject into; clipboard fallback wins).
- For cursor / windsurf:
1. Get the live command list via vscode.commands.getCommands(true).
2. Try each host-specific candidate id (in order). First one that
executes without throwing returns true.
3. If none available or all fail → returns false (clipboard fallback
takes over in handleOptionSelection).
**Candidate command IDs are HEURISTIC GUESSES** based on community
documentation. They're explicitly marked unverified — Branch 5
(smoke-test) is where the engineer hand-verifies against a live
Cursor / Windsurf, prunes / re-orders the list. Until then the
practical net effect on Cursor 3.4.20 is "no match → fall through to
clipboard", which is the safe behaviour.
8 unit tests covering: vscode-generic short-circuit, cursor happy
path, candidate-try-order, command-list filtering, all-fail-fallback,
getCommands throwing, windsurf branch, exported candidate list shape.
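The candidate-try loop described in this section might be sketched as follows. The VS Code calls are injected (`getCommands` and `executeCommand` stand in for `vscode.commands.getCommands(true)` and `vscode.commands.executeCommand`), the candidate lists are passed in rather than hard-coded, and the `InjectorDeps` shape is an assumption. No real Cursor or Windsurf command ids appear here because the commit itself marks them unverified.

```typescript
type HostKind = 'cursor' | 'windsurf' | 'vscode-generic';

interface InjectorDeps {
  host: HostKind;
  getCommands: () => Promise<string[]>;
  executeCommand: (id: string, text: string) => Promise<unknown>;
  candidates: Record<'cursor' | 'windsurf', string[]>;
}

// Lists the live command ids, or null when the lookup itself fails.
async function listCommands(deps: InjectorDeps): Promise<Set<string> | null> {
  try {
    return new Set(await deps.getCommands());
  } catch {
    return null;
  }
}

// Resolves true as soon as one available candidate executes without
// throwing; resolves false so the caller can fall back to the clipboard.
async function chatInputInject(text: string, deps: InjectorDeps): Promise<boolean> {
  const host = deps.host;
  if (host === 'vscode-generic') return false; // no AI chat to inject into
  const available = await listCommands(deps);
  if (available === null) return false;
  for (const id of deps.candidates[host]) {
    if (!available.has(id)) continue; // skip ids the host does not expose
    try {
      await deps.executeCommand(id, text);
      return true;
    } catch {
      // This candidate failed; try the next one in order.
    }
  }
  return false;
}
```

Filtering against the live command list first avoids invoking ids that do not exist, which keeps the "no match" case quiet.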
## 4. extension.ts wiring
extension.ts now constructs the view provider with an injectFn-aware
onSelect:
    const host = detectHost();
    const onSelect = (text) =>
      handleOptionSelection(text, {
        injectFn: (t) => chatInputInject(t, { host }),
      });
    viewProvider = new NexpathDecisionSessionViewProvider(
      context.extensionUri,
      onSelect,
    );
The chat-history watcher is NOT yet started in activate() — that
wiring is deferred to a B5 follow-up where it can be smoke-tested
against a real Cursor instance (the "stop trigger" timing — when do
we call `nexpath stop`? — needs real-Cursor behaviour to settle).
For now the view-provider just sits with the empty-state HTML; B4 +
B5 close the loop.
extension.test.ts updated:
- Added 4 new vi.mock blocks for the new modules (prompt-injection,
host-detector, chat-input-injector) + extended the vscode mock
with env.appName + commands.
- Adjusted "constructs the view provider" test to expect the second
onSelect argument.
## Verification
- Root tsc --noEmit clean.
- Sub-package tsc --noEmit clean.
- Sub-package vitest: 181/181 pass (was 160 baseline; +21 from B4
follow-up: 11 host-detector + 8 chat-input-injector + 2 net
elsewhere).
- Full root suite: 2068 passing + 18 pre-existing TtySelectFn
carry-forward.
- Esbuild bundle: 14.7 KB (was 12.3 KB in B3 — added host-detector +
chat-input-injector + wiring).
## Memory update
The `project_b4_prompt_injection_contract` memory said B4 must
"Implement cursorChatInputInject / windsurfChatInputInject of type
OptionInjector. Wire them through the view-provider constructor's
onSelect arg." Done — both halves filled in. The memory remains
load-bearing because the symbols still exist; what's changed is
the candidate command list is now an EDUCATED-GUESS placeholder
awaiting real-Cursor verification in Branch 5.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…Prompt docs
Closes the two drifts surfaced in the M2/B4 cross-confirmation review.
## Drift #2 — installAction + uninstallAction now invoke registry adapters
Before this commit: cursorAdapter and windsurfAdapter self-registered in
the agent registry but `nexpath install` never called them. The legacy
`detectAgents()` for-loop only routed `claude-cli` agents through the
registry (`getAdapter('claude-code').install`); everything else was
gated by `REGISTER_MCP_SERVER = false`. The B4 acceptance line
"`nexpath install` detects both, prints correct deep-link instructions,
registers with registry" was strictly not met — the adapters existed
but weren't reached.
Fix: add a small registry-iteration block AFTER the legacy for-loop in
both `installAction` and `uninstallAction`. Iterates
`await detectAll(adapterCtx)` and calls `adapter.install(ctx)` /
`adapter.uninstall(ctx)` for every detected adapter except
`claude-code` (already handled in the legacy loop above).
- 17 new lines in installAction + 17 new lines in uninstallAction.
- Built `adapterCtx: InstallContext` from `homedir()` + `process.cwd()`
+ `dbPath` so adapters read the same OS-level paths they do when
called directly.
- Errors from `adapter.install(ctx)` are caught + logged as a single
`✗ <label> — failed: <message>` line; don't halt the loop.
- Imports: added `detectAll` to the existing
`import { getAdapter } from '../../agents/registry.js'` line +
`InstallContext` type to the existing types import.
## Snapshot invariant — preserved byte-identical
The B1 install-snapshot test (`install.snapshot.test.ts`) runs in a
tmp HOME that has neither `~/.config/Cursor` nor `~/.config/Windsurf`.
The cursor + windsurf adapters' `detect()` return false → registry
iteration block prints nothing → install-snapshot bytes are unchanged.
CI fails red on snapshot diff. Verified: snapshot test passes
post-change.
## 6 new install/uninstall tests
`install.test.ts` now has 6 additional `installAction` /
`uninstallAction` tests covering the new registry behaviour:
- "calls cursor adapter and prints deep-link instructions when Cursor
is detected" — sets up `~/.config/Cursor` inside the test tmpDir,
stubs HOME to tmpDir, asserts the Cursor block appears.
- "calls windsurf adapter and prints deep-link instructions when
Windsurf is detected" — same.
- "prints both cursor + windsurf deep-link blocks when both are
detected".
- "does NOT double-invoke the claude-code adapter from the registry
loop" — counts `advisory hook written to` lines == 1 (would be 2 if
the registry loop double-called claude-code).
- "calls cursor adapter uninstall and prints uninstall instructions
when Cursor is detected" — mirror in uninstallAction.
- "calls windsurf adapter uninstall and prints uninstall instructions
when Windsurf is detected" — mirror.
All use `vi.stubEnv('HOME', tmpDir)` to keep the tests hermetic and
independent of whether the dev machine actually has Cursor / Windsurf
installed.
## Drift #5 — extractPrompt JSDoc upgrade (no functional change)
`extractPrompt` on `cursorAdapter` and `windsurfAdapter` remains a stub
that returns null. The architectural decision was already made in the
original B4 commit; this commit upgrades the JSDoc to:
- Explain WHY it's a stub (no CLI caller decodes rows — the extension's
chat-history-watcher does, via the extractor modules in
`src/ext-vscode/src/extractors/`).
- Document the migration path if a CLI tool ever needs row decoding:
promote extractors to `src/agents/chat-history-extractors/`, widen
sub-package tsconfig.rootDir, leave re-export shims at the old paths,
wire the adapter's `extractPrompt` to call `pickExtractor` +
`extractor.decodeRow`.
- Acknowledge this is a non-trivial cross-tree refactor (esbuild
externals + tsconfig + vitest config) and intentionally deferred
because there's currently no caller demanding it.
Per the user's no-code-removing constraint: nothing removed. The stub
behaviour is contract-compliant ("null = I don't know"). When a CLI
caller appears, the migration path is documented in the JSDoc.
## Verification
- Root tsc --noEmit clean.
- Sub-package tsc --noEmit clean.
- Full root test suite: 2074 passing + 18 pre-existing TtySelectFn
carry-forward (was 2068; +6 from the new registry install/uninstall
tests).
- Sub-package vitest: 181/181 pass unchanged.
- install-snapshot test passes byte-identical (zero-diff invariant
preserved).
- Esbuild bundle: 14.7 KB unchanged.
The B4 acceptance line now strictly holds.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes the two unit-test gaps surfaced in the M2/B4 audit. Both
identified previously-untested production paths.
## Gap 1 — defaultReadItemTable (WAL-aware better-sqlite3 reader)
The watcher's tests all inject a mock readItemTableFn, so the real
production reader added in M2/B4 follow-up (commit fa3c134) was never
exercised. That reader contains the entire WAL fix from dev plan §2.5:
copy main + .vscdb-wal + .vscdb-shm → tmp staging dir, open
better-sqlite3 read-only, run PRAGMA wal_checkpoint(TRUNCATE), check
for ItemTable existence, query, cleanup. A typo in any of those steps
would have shipped untested.
Fix: exported defaultReadItemTable from chat-history-watcher.ts and
added 4 integration-style tests that build real .vscdb files with
better-sqlite3 directly, then assert the production reader handles
them correctly:
- happy-path: 3 ItemTable rows, all retrieved correctly
- defensive: .vscdb with NO ItemTable returns [] (real production
scenario — freshly-created VS Code state.vscdb)
- WAL-mode: .vscdb opened with `PRAGMA journal_mode = WAL` (the Cursor
scenario) → rows in WAL siblings are read correctly
- source-untouched: 3 consecutive reads do not modify the source
file's size — verifies the copy-to-staging strategy keeps the live
file safe (important when Cursor is actively writing)
## Gap 2 — adapter-error catch blocks in install/uninstall
The audit follow-up (commit 55477c2) added registry-iteration blocks
in installAction + uninstallAction with `try/catch` around each
adapter call. The happy path got 6 tests. The catch blocks
(`✗ <adapter> — failed: <message>`) didn't.
Fix: 2 new tests using
`vi.spyOn(cursorAdapter, 'install').mockRejectedValueOnce` to simulate
a real adapter failure. Each test asserts:
- The synthetic error is surfaced in the console output
- The loop continues (windsurf's install/uninstall also ran)
This proves the registry loop's resilience guarantee — one failing
adapter doesn't halt the others. Mirrors the proven legacy for-loop's
catch block contract.
## Verification
- Root tsc --noEmit clean.
- Full root suite: 2080 passing + 18 pre-existing TtySelectFn
carry-forward (was 2074; +4 watcher reader + +2 install/uninstall
catch = +6).
- Watcher test count: 14 (was 10) — defaultReadItemTable suite added.
- Snapshot invariant preserved (no install.ts source change).
## What's still NOT tested (acceptable gaps)
- chatInputInject candidate command IDs against a live Cursor. These
are heuristic guesses; B5 (smoke-test) is where they're verified
against a real running instance. Documented in JSDoc.
- Extension watcher start-up wiring (intentionally deferred to B5).
- extractPrompt(rowKey, rowValue) on cursor/windsurf adapters — stub
returns null; documented architectural choice. No caller exists.
Closes B4 unit-test audit. Per auto-commit rule.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
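The copy-to-staging step of the WAL-aware reader could look roughly like the sketch below. It covers only the file staging and cleanup using Node's standard library; opening the staged copy with better-sqlite3 and checkpointing are left out, and `stageVscdbForRead` is a hypothetical name.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical helper: copies a SQLite file plus its -wal / -shm siblings
// into a fresh temp dir so the live file is never opened directly.
function stageVscdbForRead(sourcePath: string): { stagedPath: string; cleanup: () => void } {
  const stagingDir = fs.mkdtempSync(path.join(os.tmpdir(), "vscdb-stage-"));
  const stagedPath = path.join(stagingDir, path.basename(sourcePath));

  fs.copyFileSync(sourcePath, stagedPath);
  for (const suffix of ["-wal", "-shm"]) {
    const sibling = sourcePath + suffix;
    // Siblings only exist once the database has been written in WAL mode.
    if (fs.existsSync(sibling)) fs.copyFileSync(sibling, stagedPath + suffix);
  }

  return {
    stagedPath,
    cleanup: () => fs.rmSync(stagingDir, { recursive: true, force: true }),
  };
}
```

A caller would open `stagedPath` read-only, run its query, and call `cleanup()` in a `finally` block so a failed read does not leak staging directories.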
Branch 5 of Milestone M2 (v0.1.3/m2/smoke-test). Stacked on B4
(f6a916b), which already has B1+B2+B3+M1 merged in — so this branch
has every prerequisite in one working tree.
This commit closes the only remaining wiring gap from B4 (chat-history
watcher start-up was deferred from B4 to here so it could be
smoke-tested against a real Cursor). After this, the full chain works:
    Cursor types prompt
    → state.vscdb wal-write fires
    → fs.watch event in chat-history-watcher
    → debounce + read
    → extractor decodes user prompt
    → chat-pipeline.handleChatEvent
    → ipc.spawnAuto(prompt, workspace-prefixed session id)
    → ipc.spawnStop(session id)
    → DecisionSessionPayload | null
    → view-provider.publishPayload (if non-null)
    → webview auto-reveals with advisory + numbered options +
      keyboard shortcuts (1-9 / Esc / Ctrl+X)
    → click / number key
    → handleOptionSelection
    → injectFn primary path (per-host chat-input command)
    → clipboard + toast fallback
## Files
- src/path-enumerator.ts — enumerateStateVscdbPaths(workspaceStorageDir)
walks <base>/<workspace-id>/state.vscdb and returns the paths that
exist. Returns [] for null base / missing dir / empty dir. Skips
non-directory siblings + workspace dirs that lack state.vscdb.
Injectable fs for tests; production uses node:fs defaults. +8 unit
tests.
- src/chat-pipeline.ts — createChatEventHandler(deps) builds the
(event) => Promise<void> handler the watcher calls. Orchestrates
spawnAuto → spawnStop → publishPayload. Three independent try/catch
blocks so a failure at any stage logs + returns without propagating
to the watcher (the watcher's onEvent is fire-and-forget; unhandled
rejections would crash the extension host). Optional composeSessionId
lets the caller prefix the session with workspace id. +7 unit tests
covering happy path, null-payload skip, custom composer, each error
path, never-propagates guarantee.
- src/onboarding.ts — exported CONSENT_KEY (was a module-private const)
so extension.ts reads the same globalState key the onboarding writes
to. No behaviour change.
- src/extension.ts — substantially rewritten activate(). New flow:
1. detectHost() — Cursor / Windsurf / vscode-generic
2. Construct + register the view provider with injectFn-aware
onSelect (B4 wiring unchanged)
3. await showOnboardingIfNeeded(context) — consent prompt
4. Watcher gating — all of these must be true to start:
- context.globalState.get<boolean>(CONSENT_KEY) === true
- host !== 'vscode-generic' (no AI chat on plain VS Code)
- enumerateStateVscdbPaths returns at least one path
5. Build watcher targets from the discovered paths, kind
'cursor-sqlite' (extractor selected by fingerprint at read time)
6. Build the chat-event handler with workspace-prefixed session-id
composer
7. Create the watcher with onEvent calling the handler;
onSchemaUnknown surfacing a friendly toast; onError logging
8. watcher.start() + push a stop disposable onto
context.subscriptions
deactivate() now also stops the watcher and clears the module-level
handles.
- src/extension.test.ts — substantially rewritten with vi.hoisted mocks
for the new imports (host-detector, path-enumerator,
chat-history-watcher, chat-pipeline, ipc). +15 tests covering:
- activation log, view-provider registration regardless of consent
- watcher NOT started when consent undefined / false / plain VS Code
host / no dbs
- watcher started when consent=true + host=cursor + dbs present
- chat-event handler built with workspace-prefixed composer
- watcher.stop() called on deactivate
- getViewProvider() lookup
- onboarding-rejects-but-rest-continues resilience
- SMOKE-TEST.md — manual smoke-test procedure that the engineer runs
against a live Cursor / Windsurf install. Walks through:
- Build extension + nexpath CLI install + extension install
(Extension Development Host or .vsix path)
- Activation verification (log line, consent toast, icon, watcher
start log)
- Trigger round-trip + observe webview auto-reveal
- Verify (and update) the heuristic candidate chat-input command IDs
in chat-input-injector.ts using vscode.commands.getCommands(true)
- Verification table to paste back as B5 acceptance evidence
- Troubleshooting section + explicit non-goals (cross-OS = B6,
pre-prompt blocking = open question)
## Verification
- Root tsc --noEmit clean.
- Sub-package tsc --noEmit clean.
- Sub-package vitest: 204 / 204 pass across 17 files (was 185 in B4;
+19: 8 path-enumerator + 7 chat-pipeline + 4 net extension).
- Full root test suite: 2099 passing + 18 pre-existing TtySelectFn
carry-forward (was 2080; +19 from B5).
- Esbuild bundle: 31.2 KB (was 14.7 KB — the new wiring pulls
chat-history-watcher + extractors + chat-pipeline into the
extension's bundle now that activate() actually uses them).
- Install snapshot byte-identical (no install.ts source touched).
## B5 acceptance gate (manual)
The acceptance line for B5 is "End-to-end on dev machine: type real
prompt in Cursor → real round-trip → decision UI appears." That's a
MANUAL test the engineer runs per SMOKE-TEST.md. The code-side
deliverable (the wiring) is in this commit; the acceptance evidence
goes back as a verification-table entry in SMOKE-TEST.md.
## Deferred to B6 / M5 (explicitly out of scope)
- macOS + Windows verification (B6 — needs VM / physical access)
- Marketplace publish (B6)
- Real-Cursor verification of candidate chat-input command IDs (a
manual step in SMOKE-TEST.md step 6; the engineer updates
chat-input-injector.ts based on what they find)
- "Response done" detection for a smarter spawnStop trigger time
(currently auto + stop fire back-to-back; M5 hardening)
- Multi-workspace concurrency formal testing (M5)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
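To make the "three independent try/catch blocks" design concrete, here is a minimal sketch of such a handler. The `ChatEvent`, `Payload`, and `Deps` shapes and the `log` hook are simplified assumptions, not the actual chat-pipeline.ts types.

```typescript
// Hypothetical, simplified shapes; the real module's types may differ.
interface ChatEvent { prompt: string; rawSessionId: string; }
interface Payload { advisory: string; options: string[]; }
interface Deps {
  spawnAuto(prompt: string, sessionId: string): Promise<void>;
  spawnStop(sessionId: string): Promise<Payload | null>;
  publishPayload(payload: Payload): void;
  composeSessionId?: (raw: string) => string;
  log?: (msg: string, err: unknown) => void;
}

function createChatEventHandler(deps: Deps): (event: ChatEvent) => Promise<void> {
  const log = deps.log ?? ((msg: string, err: unknown) => console.error(msg, err));
  return async (event) => {
    const sessionId = deps.composeSessionId
      ? deps.composeSessionId(event.rawSessionId)
      : event.rawSessionId;

    // Stage 1: a spawnAuto failure logs and stops; nothing reaches the watcher.
    try {
      await deps.spawnAuto(event.prompt, sessionId);
    } catch (err) {
      log("spawnAuto failed", err);
      return;
    }

    // Stage 2: same containment for spawnStop.
    let payload: Payload | null;
    try {
      payload = await deps.spawnStop(sessionId);
    } catch (err) {
      log("spawnStop failed", err);
      return;
    }
    if (payload === null) return; // no advisory to show

    // Stage 3: publishing is also isolated, so the promise never rejects.
    try {
      deps.publishPayload(payload);
    } catch (err) {
      log("publishPayload failed", err);
    }
  };
}
```

Because every stage is contained, a fire-and-forget caller can invoke the handler without attaching its own `.catch`.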
… Layer C changes)
Closes Drift #6 from the M2/B5 cross-confirmation review. The B5 smoke
test would have failed at the "advisory appears" step because:
(a) `nexpath stop` (Layer C) expects a full `StopPayload` shape on
stdin — `{session_id?, cwd, hook_event_name, stop_hook_active, ...}`
(see src/cli/commands/stop.ts:37-43). Our `ipc.spawnStop` was only
sending `{session_id}`, so `runStop` saw `cwd === undefined` and
failed project-root resolution.
(b) `nexpath auto` defaults `--project` to `process.cwd()` of the
spawned process. Our ipc was inheriting whatever cwd the extension
host process had (typically the user's home, not the workspace).
`.env` loading and hook-stats writes were therefore landing in the
wrong directory.
Both fixed inside our layer — `src/ext-vscode/src/ipc.ts` only. Layer C
remains entirely untouched (per the boundary rule and your standing
instruction). The fix uses Layer C's existing public stdin contract;
we just send the right shape.
## Changes
`src/ext-vscode/src/ipc.ts`:
- Added `cwd?: string` to `IpcOptions`. Documents WHY it's needed
(project-root resolution on the Layer C side).
- New helper `buildSpawnOptions(opts)` constructs `SpawnOptions` with
`stdio: ['pipe', 'pipe', 'pipe']` AND `cwd: opts.cwd ?? process.cwd()`.
Both `spawnAuto` and `spawnStop` now use it.
- `spawnStop` stdin payload changed from `{session_id}` to the full
`StopPayload` shape:
    {
      session_id,
      cwd: opts.cwd ?? process.cwd(),
      hook_event_name: 'Stop',
      stop_hook_active: false,
    }
`last_assistant_message` is omitted (we capture user prompts only; no
assistant signal yet — M5 hardening concern).
- JSDoc on both spawn functions now references the exact Layer C
file:line where the contract is defined.
`src/ext-vscode/src/extension.ts`:
- The chat-pipeline now curries `spawnAuto` / `spawnStop` with the
workspace folder's fsPath as `cwd`. When no workspace is open, falls
back to `process.cwd()` of the extension host.
- Same workspace path is used as the session-id prefix (was already
the case; just consolidated to one variable).
`src/ext-vscode/src/ipc.test.ts`:
- +5 new tests covering:
- `spawnAuto` passes opts.cwd to the spawned process options
- `spawnAuto` defaults to `process.cwd()` when omitted
- `spawnStop` writes the FULL StopPayload shape to stdin (session_id +
cwd + hook_event_name='Stop' + stop_hook_active=false)
- `spawnStop` defaults stdin cwd to `process.cwd()` when omitted
- `spawnStop` passes opts.cwd to spawn options
The full-payload-shape test is the load-bearing one — it locks the
Layer C stdin contract so a regression breaks loudly.
`src/ext-vscode/src/extension.test.ts`:
- Adjusted the "composeSessionId" assertion to accept either
`process.cwd()` or any path prefix (was pinned to literal
'no-workspace' which no longer matches).
`src/ext-vscode/SMOKE-TEST.md`:
- Troubleshooting table gained a row for "nexpath auto runs but no
advisory appears later" — explains the OPENAI_API_KEY .env path + how
to check the prompt-store.db for captured prompts.
## Verification
- Root tsc --noEmit clean.
- Sub-package tsc --noEmit clean.
- Sub-package vitest: 209/209 pass across 17 files (was 204; +5 from
the new ipc cwd / payload-shape tests).
- Full root test suite: 2104 passing + 18 pre-existing TtySelectFn
carry-forward (was 2099).
- Esbuild bundle: 31.5 KB (was 31.2 KB — minor growth for the new
payload + buildSpawnOptions helper).
- Install snapshot byte-identical.
The B5 smoke test should now succeed at the "advisory appears" step
when the engineer runs it per SMOKE-TEST.md.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
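The payload and spawn-option shapes described in this commit can be sketched as follows. `buildSpawnOptions` and the field names come from the message above; `buildStopPayload` is a hypothetical helper name used here only to isolate the stdin shape, and the `StopPayload` interface lists just the fields the message mentions.

```typescript
import type { SpawnOptions } from "node:child_process";

interface IpcOptions { cwd?: string; }

// Only the fields named in the commit message; the Layer C type may have more.
interface StopPayload {
  session_id: string;
  cwd: string;
  hook_event_name: "Stop";
  stop_hook_active: boolean;
}

// Pipes on all three streams, and an explicit cwd so the child does not
// silently inherit the extension host's working directory.
function buildSpawnOptions(opts: IpcOptions = {}): SpawnOptions {
  return { stdio: ["pipe", "pipe", "pipe"], cwd: opts.cwd ?? process.cwd() };
}

// Hypothetical helper: builds the object that would be JSON-serialised to stdin.
function buildStopPayload(sessionId: string, opts: IpcOptions = {}): StopPayload {
  return {
    session_id: sessionId,
    cwd: opts.cwd ?? process.cwd(),
    hook_event_name: "Stop",
    stop_hook_active: false,
  };
}
```

Deriving both the spawn `cwd` and the payload `cwd` from the same `opts.cwd` keeps the two from drifting apart.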
Closes the one B5 unit-test gap surfaced in the audit. extension.ts
constructs the chat-history watcher with three callbacks (onEvent,
onError, onSchemaUnknown) but none of them were exercised by tests.
The most consequential is onEvent — it's the integration proof that
watcher events actually reach the chat-event handler.
The watcher itself is mocked in these tests, so the strategy is:
capture the opts object passed to createChatHistoryWatcher and
invoke each callback directly.
## Changes
`src/ext-vscode/src/extension.test.ts` — +3 tests via a shared
`activateWithWatcher()` helper:
1. "routes watcher onEvent through the chat-event handler (the
integration proof)" — captures the handler returned by
createChatEventHandler, invokes opts.onEvent with a synthetic
event, asserts the tracked handler was called with that event.
This is the load-bearing test — a refactor that silently
disconnects watcher → handler would break it loudly.
2. "watcher onSchemaUnknown surfaces a visible info toast with
path + observed keys" — invokes opts.onSchemaUnknown, asserts
vscode.window.showInformationMessage was called once with a
message containing the path and the first observed key.
3. "watcher onError logs to console.error (does not crash the
extension)" — invokes opts.onError with a fake Error,
asserts no throw + console.error called with the right
prefix.
## Verification
- Root tsc --noEmit clean.
- Sub-package tsc --noEmit clean.
- Sub-package vitest: 212 / 212 pass (was 209; +3).
- Full root suite: 2107 passing + 18 pre-existing TtySelectFn
carry-forward (was 2104).
- Esbuild bundle unchanged (no source-code changes — tests only).
Closes B5 audit. No remaining unit-test gaps in scope.
## What's still NOT tested (acceptable per B5 scope)
- End-to-end against a real Cursor (manual smoke test per
SMOKE-TEST.md).
- Multi-workspace concurrency (deferred to M5).
- "Response done" timing (auto+stop fire back-to-back — deferred
to M5 hardening).
- The watcher's actual fs.watch firing on real state.vscdb writes
(covered by chat-history-watcher.test.ts at the unit level with
a synthetic fs.watch stub).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes the gap identified during the M2 cross-confirmation: dev plan
§2.3 acceptance #2 calls for the watcher to monitor BOTH state.vscdb
AND ~/.codeium/windsurf when host=windsurf, but the extension layer
only consumed state.vscdb paths via path-enumerator. The
windsurfAdapter already declared the codeium path in
chatHistoryPaths(), but nothing read it.
Wiring:
- host-detector.ts: new windsurfCodeiumDir(home) — local mirror of the
CLI-side codeiumCascadeDir convention (sub-package tsconfig rootDir
prevents importing from src/agents/)
- extension.ts activate(): when host=windsurf and the codeium dir
exists at activate-time, append it as a windsurf-dir WatchTarget.
Same activate-time-only limitation that path-enumerator already
documents for workspaceStorage
- chat-history-watcher.ts:
* defaultReadWindsurfJsonFiles — production reader: shallow .json
scan, skips malformed files + missing dir
* readWindsurfJsonFilesFn + decodeWindsurfFn injection points for
tests (parallel to the existing readItemTableFn pattern)
* processWindsurfTarget — replaces the B2 no-op stub with the real
read → decode → dedup → onEvent flow
- extractors/windsurf.ts: new decodeWindsurfJsonFile(parsed,
sourcePath) pure function. Body is currently a stub (no events) —
Windsurf's per-session JSON schema is still TBD per dev plan §2.5.
The FS plumbing around it is now real; only field extraction remains,
and the JSDoc walks the engineer through the inspection step.
Tests (+19 across 4 files, all green):
- host-detector.test.ts (+3): windsurfCodeiumDir shape + cross-file
invariant with the windsurfAdapter convention + os.homedir default
- extractors/windsurf.test.ts (+3): decodeWindsurfJsonFile contract
(no-op stub returns [] for empty/null/non-object, always returns an
array)
- chat-history-watcher.test.ts (+9): processWindsurfTarget pipes read
→ decode → onEvent; dedups across multiple files in one scan;
forwards reader errors to onError; defaultReadWindsurfJsonFiles
handles missing dir / empty dir / .json filter (case-insensitive) /
malformed JSON skipping / no recursion into subdirectories
- extension.test.ts (+4): windsurf-host adds codeium dir when it
exists; skips when it doesn't; activates with only the codeium dir
(no state.vscdb yet); cursor-host never consults the codeium helper
Verification:
- Sub-package vitest: 243 / 243 across 18 files (+19 from this fix)
- Root vitest: 2141 passing + 18 pre-existing TtySelectFn (+19)
- Root + sub-package tsc clean
- install.snapshot.test.ts byte-identical (windsurfAdapter detect()
still returns false in the snapshot's tmpHome → no leakage)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
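A shallow, malformed-tolerant JSON scan of the kind described for defaultReadWindsurfJsonFiles might look like the sketch below. `readJsonFilesShallow` and `ParsedJsonFile` are hypothetical names, and treating a missing directory as an empty result is this sketch's reading of "skips ... missing dir".

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Hypothetical result shape for illustration.
interface ParsedJsonFile { sourcePath: string; parsed: unknown; }

function readJsonFilesShallow(dir: string): ParsedJsonFile[] {
  if (!fs.existsSync(dir)) return []; // missing dir: nothing to read

  const out: ParsedJsonFile[] = [];
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    if (!entry.isFile()) continue; // shallow: never recurse into subdirs
    if (!entry.name.toLowerCase().endsWith(".json")) continue; // case-insensitive filter
    const sourcePath = path.join(dir, entry.name);
    try {
      out.push({ sourcePath, parsed: JSON.parse(fs.readFileSync(sourcePath, "utf8")) });
    } catch {
      // Malformed or unreadable file: skip it rather than fail the scan.
    }
  }
  return out;
}
```

Each `parsed` value would then be handed to a pure decoder, which keeps filesystem concerns and schema concerns separately testable.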
… liveness)
Live manual testing Round 2 surfaced a critical gap: typing prompts in
Cursor's Ask mode wrote to ItemTable correctly (verified via dev-probe
cursor extract showing the prompts in aiService.prompts), but the
watcher never re-read the file. fs.watch on the main state.vscdb file
alone is insufficient because SQLite WAL mode (which Cursor uses)
routes all writes to state.vscdb-wal first — the main file only
changes at checkpoint time, which can be minutes or hours later.
Confirmed via filesystem mtimes:
    state.vscdb      last modified 15:27 (when workspace opened)
    state.vscdb-wal  last modified 16:36 (when user typed prompt)
    state.vscdb-shm  last modified 16:36
    current time     16:44
fs.watch on the main file never fired. Net effect: extension
activated, view provider registered, consent granted, watcher.start()
invoked — but no events ever emitted because no filesystem change
event reached the listener.
Fix in chat-history-watcher.ts start(): for cursor-sqlite targets,
register additional fs.watch instances on `<path>-wal` and
`<path>-shm`. Their listeners just re-schedule the main target (which
already debounces + reads via the WAL-aware defaultReadItemTable that
copies all three files to a staging dir + checkpoints before reading).
Sibling watch is wrapped in try/catch because WAL/SHM files don't
exist until Cursor first writes to the DB; the main-file watch is the
fallback for the not-yet-WAL case. Windsurf-dir targets skip the
sibling logic (Windsurf uses JSON files, not SQLite, so no WAL).
Test infra fix in chat-history-watcher.test.ts: the fake watchFn now
actually wires the listener arg to the FakeFSWatcher's 'change' event
(mirroring node:fs watch() behaviour). Without this the WAL-fire test
couldn't trigger the listener via .emit(). Side benefit: makes
existing dedup + debounce tests more rigorous — they previously passed
because the synthesised .emit calls were no-ops.
Tests +3:
- start() registers main + -wal + -shm = 3 watchers per cursor-sqlite
target
- WAL sibling change triggers re-read of the main target + emits new
events
- windsurf-dir targets do NOT get WAL siblings watched
Also lands scripts/dev-probe.cjs — a multi-command diagnostic tool
(store schema/recent/today/search/stats, cursor
workspaces/probe/extract, trigger ping/auto, config show,
exthost-log). All future manual-testing rounds use this tool so test
cells are single commands with no shell-quoting issues. .cjs extension
forces CommonJS regardless of the parent package.json's
"type": "module".
Verification:
- Sub-package vitest: 253 / 253 across 20 files (+8 from R2 work)
- Root + sub-package tsc clean
- vsce package produces clean 2.19 MB .vsix
- Bundle contains the new sibling-watch loop (grepped extension.cjs)
Engineer action: uninstall old extension dir, re-install the
freshly-built .vsix, restart Cursor. Watcher will now fire on every
Cursor chat-write within the 250ms debounce window.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
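The sibling-watch registration can be sketched with an injected watch function, as below. `WatchFn`, `Closable`, and `registerWatchers` are hypothetical names; in production the injected function would wrap `fs.watch`.

```typescript
// Hypothetical shapes for illustration.
type TargetKind = "cursor-sqlite" | "windsurf-dir";
interface WatchTarget { kind: TargetKind; path: string; }
interface Closable { close(): void; }
type WatchFn = (path: string, listener: () => void) => Closable;

function registerWatchers(
  target: WatchTarget,
  watchFn: WatchFn,
  schedule: (target: WatchTarget) => void,
): Closable[] {
  // The main path is always watched.
  const watchers: Closable[] = [watchFn(target.path, () => schedule(target))];

  if (target.kind === "cursor-sqlite") {
    for (const suffix of ["-wal", "-shm"]) {
      try {
        // Sibling changes re-schedule the MAIN target, which does the read.
        watchers.push(watchFn(target.path + suffix, () => schedule(target)));
      } catch {
        // Sibling may not exist yet; the main-file watch is the fallback.
      }
    }
  }
  return watchers;
}
```

Routing every listener through the same `schedule` call means the existing debounce also coalesces bursts of WAL and SHM changes into a single read.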
R2.4 originally instructed engineers to type "delete all files in
~/Downloads" in Cursor's chat. The intent was to trigger Layer C's
advisory pipeline so the warning UI would intercept BEFORE any action.
But during live testing 2026-05-18, the prompt was entered in Cursor's
Agent mode while nexpath's watcher wasn't capturing yet (the WAL
fs.watch bug fixed in 13db716). Cursor's Agent interpreted the prompt
as a command and permanently deleted ~1.4 GB of files. The advisory
pipeline that was supposed to prevent this didn't fire because the
upstream capture chain was broken.
The lesson: test design must NOT rely on the safety net being
functional. Every manual test prompt that contains hazard keywords
must be:
- Phrased as an information-retrieval question, not a command
- Run in Ask mode only (NOT Agent / Composer)
Replaced "delete all files in ~/Downloads" with:
- "explain why 'rm -rf ~/Downloads/*' is dangerous"
- "what are the risks of 'git push --force' to main"
- "what does 'DROP TABLE users' do and how do databases prevent
accidents"
These contain the same hazard keywords (classifier still triggers) but
are pure question-asking — Agent mode executing them just retrieves
information. No destructive side effect possible.
Also added an explicit warning block at the top of Step 5 calling out
the Ask-mode-only rule + the question-form contract.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Live manual testing surfaced that our extension's console.log calls
weren't reaching any discoverable log destination — Cursor's exthost.log
captures only Cursor's internal lifecycle events, the Output panel
"Window" / "Extension Host" channels don't show extension stdout, and
the Developer Tools Console is easy to miss + non-persistent. We could
verify the extension activated (via exthost.log lifecycle events) but
couldn't see whether the watcher started, what it observed, or where
the chain broke. This blocked diagnosing why R2.1 captures showed 0
prompts despite the WAL fix being in the bundle.
Fix: extension.ts now creates a dedicated VS Code OutputChannel named
"Nexpath" at activation and routes every lifecycle / watcher event
through it. Engineers + users can open the channel via View → Output →
select "Nexpath" from the dropdown. Each entry is timestamped (ISO 8601)
so timing-sensitive issues (race conditions, debounce windows, spawn
delays) are debuggable from the log alone.
New log entries (in addition to keeping console.log for backwards
compat with the existing test-spy assertions):
- extension activated
- consent state + host (JSON)
- onboarding failed (with full stack)
- enumerated N state.vscdb file(s) + codeiumExists
- no workspace state.vscdb found
- host is plain VS Code
- consent not granted
- watcher started on N file(s)
- watcher event: prompt="..." raw_session_id=... extractor=...
- watcher error
- schema unknown for <path>; sample keys: ...
- extension deactivated
Test: extension.test.ts vi.mock('vscode') now stubs createOutputChannel
to return a mock with appendLine + dispose so activate() doesn't throw
in the test environment. 253/253 sub-package tests still pass.
Engineer action: reinstall the new .vsix. After reload, every nexpath
log line is visible at View → Output → "Nexpath" dropdown — including
the watcher.start() line we couldn't see before, the per-event captures,
and any errors that previously failed silently.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…bug)
R2.1 live test (round 2) revealed silent prompt-drop:
- Watcher started cleanly (verified via new Nexpath Output channel)
- aiService.prompts had 10 entries (5 new Ask-mode prompts confirmed)
- state.vscdb-wal modified at correct times
- Zero "[nexpath] watcher event:" log lines fired
- Store had 0 captures since May 7
Root cause: pickExtractor returns a single extractor based on
fingerprint-match count. Modern Cursor workspaces have BOTH keys:
- aiService.prompts → cursor-v2024-q4 fingerprint match (1 key)
- composer.composerData → cursor-v2025-q1 fingerprint match (1 key)
The count ties at 1. Registry order picks cursor-v2025-q1 first. But
cursor-v2025-q1's ownsKey only returns true for composer.composerData
(metadata-only on Cursor 3.4.20 per dev plan §2.5 → 0 events). It
returns false for aiService.prompts → all real Ask-mode prompts were
silently skipped at the row iteration step.
Fix in chat-history-watcher.ts processSqliteTarget: per-row, run EVERY
extractor whose ownsKey returns true, not just the fingerprint-winner.
pickExtractor is still used to drive the schema-unknown toast
(any-match = known schema, no-match = unknown), but the per-row decode
loop now uses ALL_EXTRACTORS. extractorCache type bumped from
Map<string, ChatHistoryExtractor> to
Map<string, readonly ChatHistoryExtractor[]> so the cached set is
populated once and reused on subsequent watcher reads.
This implicitly also fixes the §2.5 Composer-mode TBD partially: when
Composer storage migrates to ItemTable (or a future extractor adds
support), the new extractor will run alongside the existing ones
without requiring a "pick the right one" decision.
Tests +1:
- "cursor-sqlite: runs ALL extractors that own a row, not just the
fingerprint-winner" — regression test injects rows with BOTH
composer.composerData (metadata-only) AND aiService.prompts (real
prompt), asserts the real prompt fires as an onEvent.
Verification:
- Sub-package vitest: 254 / 254 across 20 files (+1 regression)
- Root + sub-package tsc clean
- vsce package: clean .vsix, grep finds the multi-extractor loop in
bundle
Engineer action: reinstall the new .vsix. After reload + 1 Ask-mode
prompt in Cursor, the Nexpath Output channel will show:
    [nexpath] watcher event: prompt="..." raw_session_id=... extractor=cursor-v2024-q4
And `dev-probe.cjs store today` will show the captured row.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
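The per-row "run every owning extractor" loop can be sketched as below. The `Row` and `Extractor` shapes are simplified assumptions (the real ChatHistoryExtractor interface is not shown in this log), and `decodeRows` is a hypothetical name.

```typescript
// Hypothetical, simplified shapes.
interface Row { key: string; value: string; }
interface Extractor {
  id: string;
  ownsKey(key: string): boolean;
  decodeRow(row: Row): string[]; // decoded user prompts, possibly empty
}
interface DecodedEvent { prompt: string; extractor: string; }

function decodeRows(rows: readonly Row[], extractors: readonly Extractor[]): DecodedEvent[] {
  const events: DecodedEvent[] = [];
  for (const row of rows) {
    // Every extractor that claims the key gets a turn, so a metadata-only
    // extractor cannot shadow one that can actually decode prompts.
    for (const extractor of extractors) {
      if (!extractor.ownsKey(row.key)) continue;
      for (const prompt of extractor.decodeRow(row)) {
        events.push({ prompt, extractor: extractor.id });
      }
    }
  }
  return events;
}
```

With this shape, registry order affects only the order of emitted events, not whether a row is decoded at all.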
Adds *.vsix to .gitignore so packaged build artefacts (e.g. src/ext-vscode/nexpath-vscode-linux-x64-0.1.3.vsix) stop appearing as untracked after every `vsce package` run. Build outputs do not belong in git history. Adds eight cursor-live-smoke-*.json fixtures captured 2026-05-16 by scripts/dump-cursor-state.ts during the M2 live-testing session. They pair with the existing cursor-3-4-20-initial-*.json set as a before/after delta of Cursor 3.4.20 state.vscdb contents (initial empty state vs state after a live smoke run). All files are redacted (composer IDs and similar identifiers replaced with asterisks; the "redacted": true flag is set on each). No test currently imports these by name — they are reference snapshots for future extractor / fingerprint-tie debugging. Branch: v0.1.3/m2/publish-and-cross-os-verify Cleanup follow-up to the 2026-05-20 testing checkpoint. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
storeToday() was labelling its output with `startOfDay.toISOString().slice(0,10)`, which renders the UTC date. In non-UTC timezones (IST during M2 testing) the header showed the previous day even when the row window was computed correctly from the local-time `startOfDay` Date. The row filter was already correct — only the cosmetic label was wrong, but it caused real confusion when verifying live captures against wall-clock time. Now formats the date from `getFullYear/getMonth/getDate` and labels the line "(local time)" so the source of truth is explicit. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
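A small sketch of the labelling change, assuming a hypothetical helper name `localDateLabel` (the real storeToday() code is not shown here). It builds the date from local-time fields instead of slicing `toISOString()`, which is always UTC.

```typescript
// Format a Date as YYYY-MM-DD from local-time fields rather than toISOString() (UTC).
function localDateLabel(d: Date): string {
  const pad = (n: number): string => String(n).padStart(2, '0');
  return `${d.getFullYear()}-${pad(d.getMonth() + 1)}-${pad(d.getDate())}`;
}

const startOfDay = new Date();
startOfDay.setHours(0, 0, 0, 0); // local midnight, the same instant the row filter uses
console.log(`${localDateLabel(startOfDay)} (local time)`);
```

In a timezone ahead of UTC, local midnight falls on the previous UTC day, which is why the sliced ISO string showed yesterday's date.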
…istorical prompts

The chat-history watcher's initial pass over an existing state.vscdb was emitting EVERY pre-existing prompt as if it were brand new, on every extension activation. This flooded Layer C with backlog inputs the Claude-Code hook semantics were never designed to handle: Layer C's session-state machine (`SessionStateManager`, untouched) keys on projectRoot and accumulates promptCount across all events it sees, so once N existing prompts cleared the 3-prompt warmup gate (`MIN_PROMPTS_BEFORE_ADVISORY = 3` at `src/cli/commands/auto.ts:51`), multiple advisories would fire back-to-back in rapid succession, well above the intended ~1-per-5-to-7-prompts cadence enforced by the post-advisory cooldown (`POST_ADVISORY_COOLDOWN = 5`). Surfaced during M2 manual testing R2/R3 on 2026-05-20.

Layer C was never modified: diff against `upstream/user-experience-improvements-sub-7` for `src/server/`, `src/classifier/`, `src/decision-session/`, `src/store/`, `src/cli/commands/auto.ts`, and `src/cli/commands/stop.ts` is empty. The bug was at the Layer-B → Layer-C boundary: our M2 watcher was feeding Layer C a flood of inputs the original Claude-Code hook flow never sent. Locked decision hi0001234d#6 (Layer C untouched) holds.

Fix: introduce `primedTargets: Set<string>` tracking which targets have completed their initial read. On the first read for a target, rows are processed through extractors and their signatures registered in `seenSignatures` as usual, but `onEvent` is NOT called. Subsequent fs.watch fires (when Cursor writes a new prompt to state.vscdb-wal or the main file) then emit only truly-new signatures. This matches the "only fire on NEW prompts" semantics that the Claude-Code hook always provided.

Trade-off: a prompt typed during the brief window between Cursor finishing startup and the extension finishing activation will be primed-not-emitted. The very next prompt after that emits correctly.

Tests: the existing 'initial-pass after start() reads + emits events' test (which asserted the old buggy behaviour) is updated to assert the new prime-only contract. Four other tests that incidentally assumed the old behaviour are updated to use a prime-empty-then-add-row pattern so they exercise the post-prime emit path. Test count unchanged at 254 across 20 sub-package files; full root tsc clean; 26/26 chat-history-watcher tests pass. Pre-existing TtySelectFn Windows-sim 18 failures carry forward (out of scope per dev plan §3.0).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
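The prime-only contract can be illustrated with a small sketch. The names `primedTargets` and `seenSignatures` come from the commit message; the factory `createPrimingDedup`, the simplified `target|prompt` signature, and the string-only event are assumptions made for brevity.

```typescript
type Emit = (prompt: string) => void;

// First read of a target registers signatures silently; later reads emit only new ones.
function createPrimingDedup(onEvent: Emit) {
  const primedTargets = new Set<string>();
  const seenSignatures = new Set<string>();

  return function processTarget(target: string, prompts: readonly string[]): void {
    const isInitialPass = !primedTargets.has(target);
    for (const prompt of prompts) {
      const signature = `${target}|${prompt}`; // simplified signature for the sketch
      if (seenSignatures.has(signature)) continue;
      seenSignatures.add(signature);
      if (!isInitialPass) onEvent(prompt); // priming pass never emits
    }
    primedTargets.add(target);
  };
}
```

A second read of the same target with one extra prompt emits exactly that prompt, while the backlog seen on the first read stays silent.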
…and read

Live R3 testing on 2026-05-20 surfaced ENOENT errors on two of four enumerated workspaces: Cursor cleaned up /home/emptyops/.config/Cursor/User/workspaceStorage/1779274878669 and /1779274961784 between the activate-time path-enumerator scan (11:20:07Z) and the first debounced read (11:20:36Z). `defaultReadItemTable` then threw ENOENT through `copyFile()`, surfaced via `onError` and logged to the Nexpath OutputChannel on every fs.watch fire for those phantom targets.

Fix: `defaultReadItemTable` now checks `existsSync(dbPath)` up front and returns `[]` cleanly when the main file is gone. The copyFile is also wrapped in try/catch so a race that wins the existsSync but loses the copy (file deleted between the two syscalls) returns `[]` rather than throwing. WAL/SHM sibling copy is also try/catch'd because those races are common; better-sqlite3 reads the main file regardless once the staged copy is opened.

Three new tests:
- `prime-then-new-prompt: existing rows are primed silently, NEW row after start() emits once`: end-to-end of the prime-only contract through `createChatHistoryWatcher`, using a realistic extractor that decodes `aiService.prompts` JSON rows (mirrors the Cursor v2024-q4 path that fired the original flood).
- `returns [] when the .vscdb path does not exist (host cleaned up workspace between activate and first read)`: direct unit test of the new defensive guard.
- `returns [] when the .vscdb is deleted between existsSync and copyFile (race)`: covers the race-window branch.

Sub-package tests: 257 / 257 across 20 files (+3). Root tsc clean. Pre-existing TtySelectFn Windows-sim 18 failures carry forward.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
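A rough sketch of the defensive read, under stated assumptions: the `FsLike` interface, the `readRowsDefensively` name, the `readStaged` callback standing in for the better-sqlite3 query, and the use of a synchronous copy (the commit mentions `copyFile()`) are all simplifications for illustration.

```typescript
// Hypothetical injectable fs surface so the sketch needs no Node typings.
interface FsLike {
  existsSync(path: string): boolean;
  copyFileSync(src: string, dest: string): void;
}
interface Row { key: string; value: string }

function readRowsDefensively(
  dbPath: string,
  stagedPath: string,
  fs: FsLike,
  readStaged: (path: string) => Row[], // stand-in for the SQLite read of the staged copy
): Row[] {
  // Guard 1: the workspace directory may have been cleaned up since enumeration.
  if (!fs.existsSync(dbPath)) return [];
  // Guard 2: the file can still vanish between existsSync and the copy.
  try {
    fs.copyFileSync(dbPath, stagedPath);
  } catch {
    return [];
  }
  // WAL/SHM siblings are optional; a failed copy here is tolerated.
  for (const suffix of ['-wal', '-shm']) {
    try {
      fs.copyFileSync(dbPath + suffix, stagedPath + suffix);
    } catch {
      // sibling missing or deleted mid-copy: read the main file only
    }
  }
  return readStaged(stagedPath);
}
```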
…r's FIFO shift
M2 R3 live testing on 2026-05-20 surfaced a second flood path not covered
by the prime-only fix: every Cursor restart with existing Ask-mode chat
history still re-emitted the entire aiService.prompts backlog after the
user submitted ANY new prompt. Diagnostic onTrace lines confirmed initial
pass primed 10 signatures correctly (seenSigs=10), but the next read
after a user-submitted prompt produced 10 NEW signatures (seenSigs=20)
even though the prompts were the same ones already primed.
Root cause: Cursor 3.4.20's `aiService.prompts` is a rolling FIFO buffer
capped at ~10 entries. When the user submits a new prompt the oldest is
dropped and ALL existing prompts shift left by one. The cursor-v2024-q4
extractor used `rawSessionId: prompts-index:${i}` — positional. The
watcher dedup signature `sourcePath|rawSessionId|prompt` therefore
shifted along with the array: "what is 2 plus 2" at index 0 became
"what is ai9" at index 0 after a shift, producing a brand-new signature
that the dedup set didn't recognise. All 10 indices changed text → all
10 signatures looked "new" → all 10 emitted.
Layer C is still byte-identical to sub-7. The prime-only behaviour from
ae5cc49 still works correctly during the stable-FIFO window. This bug
only surfaces when the FIFO has shifted between two reads — exactly the
scenario "user types a new prompt with a full FIFO".
Fix: change cursor-v2024-q4's rawSessionId from `prompts-index:${i}` to
the constant `'ask-mode'`. The watcher signature becomes
`sourcePath|ask-mode|<prompt text>` — driven entirely by text content,
not array position. A shifted FIFO with the same prompts produces the
same signatures, dedup correctly skips them, only the genuinely new
prompt at the tail emits.
Trade-off documented in the cursor-v2024-q4 JSDoc: a user submitting the
*exact same text* twice within the FIFO window will only emit once. The
other Cursor extractors (cursor-v2025-q1, cursor-v2025-q2) and the
Windsurf extractor already use stable session ids (composer/tab/file)
so they're unaffected.
Also lands the diagnostic `onTrace` callback in chat-history-watcher
that surfaced this bug. The trace logs every processSqliteTarget /
processWindsurfTarget invocation with target path, isInitialPass,
rowsLen, primedTargets.size, seenSignatures.size. extension.ts wires
it to the Nexpath OutputChannel so future similar bugs can be diagnosed
from a single test run.
Tests:
- cursor-v2024-q4 test updated: asserts rawSessionId is the constant
'ask-mode' for every prompt (was 'prompts-index:N').
- NEW cursor-v2024-q4 test: FIFO-shift regression — drop oldest +
append newest → only the tail signature is "new".
- NEW chat-history-watcher integration test: wires the real
cursorV2024Q4 extractor into the watcher and asserts that a FIFO
shift produces exactly ONE emit (the newest prompt), not 10.
Sub-package: 259 / 259 tests pass (+2). Root tsc clean. Pre-existing
TtySelectFn 18 failures carry forward.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
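To make the FIFO-shift failure concrete, here is a small sketch comparing positional and constant session ids. The signature format `sourcePath|rawSessionId|prompt` and the two id schemes come from the commit message; the `newSignatures` helper and the three-entry buffer are hypothetical.

```typescript
const signatureOf = (sourcePath: string, rawSessionId: string, prompt: string): string =>
  `${sourcePath}|${rawSessionId}|${prompt}`;

// Returns the signatures that the dedup set has not seen yet, and records them.
function newSignatures(
  seen: Set<string>,
  sourcePath: string,
  prompts: readonly string[],
  sessionIdFor: (index: number) => string,
): string[] {
  const fresh: string[] = [];
  prompts.forEach((prompt, index) => {
    const signature = signatureOf(sourcePath, sessionIdFor(index), prompt);
    if (!seen.has(signature)) {
      seen.add(signature);
      fresh.push(signature);
    }
  });
  return fresh;
}

const before = ['p0', 'p1', 'p2'];
const after = ['p1', 'p2', 'p3']; // oldest dropped, newest appended at the tail

const positional = (index: number): string => `prompts-index:${index}`; // old scheme
const constant = (): string => 'ask-mode'; // new scheme
```

Feeding `before` then `after` through the positional scheme marks every entry as new, because each text now sits at a different index. The constant scheme reports only the tail prompt.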
… tests in place) The optional `onTrace` callback in chat-history-watcher.ts was added in the d9fd2cf chain to diagnose why the prime-only fix didn't catch the FIFO-shift case. It surfaced the exact race in one test run: seenSigs stayed stable at 9 across 7 traces, then jumped to 20 after a single user prompt — proving the dedup signatures were all changing on FIFO turnover. Root cause is now fixed in d9fd2cf with the stable 'ask-mode' rawSessionId, and two regression tests pin the contract (extractor-level + end-to-end through createChatHistoryWatcher). With the bug squashed and the tests in place, the diagnostic is no longer needed. Removing it keeps the production OutputChannel focused on the four signal types that matter to end users: extension lifecycle, watcher start/error, watcher event:, and schema-unknown. If a similar bug surfaces in the future, the pattern is easy to re-add: optional `onTrace` field on ChatHistoryWatcherOptions, fire before primedTargets.add inside processSqliteTarget / processWindsurfTarget, wire to log() in extension.ts. Sub-package tests: 259/259. Root tsc clean. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Cursor + Windsurf route non-modal showInformationMessage toasts to the
silent notification stack (bell icon) instead of surfacing them as
transient bottom-right popups the way VS Code does. Before this fix the
first-launch consent toast in onboarding.ts was invisible until the user
manually clicked the bell — which we saw cause the consent flow to
silently hang during R2.5 live testing.
Fix is one extra line in extension.ts.activate(): when the detected host
is not vscode-generic, call `vscode.commands.executeCommand('notifications.showList')`
immediately before `showOnboardingIfNeeded()`. The call is best-effort
and any rejection is swallowed — discoverability hint, not load-bearing.
VS Code keeps its native bottom-right toast UX (no preempt). Dev plan
§2.2 M11 consent-toast design preserved verbatim.
+4 tests in extension.test.ts covering: showList is called on cursor,
showList is called on windsurf, showList is NOT called on vscode-generic,
showList rejection does not break activation.
263 sub-pkg tests pass; root tsc clean. Live verified on Cursor 3.4.20 and
VS Code: consent toast appears immediately in both hosts.
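The best-effort call can be sketched as below. Only the command id `notifications.showList` and the host names come from the commit message; the `revealNotificationsIfNeeded` helper and the injected `executeCommand` parameter (standing in for `vscode.commands.executeCommand`) are assumptions so the snippet runs outside an extension host.

```typescript
type Host = 'cursor' | 'windsurf' | 'vscode-generic';
type ExecuteCommand = (command: string) => PromiseLike<unknown>;

// Open the notification list on hosts that route toasts to the bell icon.
async function revealNotificationsIfNeeded(host: Host, executeCommand: ExecuteCommand): Promise<void> {
  if (host === 'vscode-generic') return; // VS Code already shows the toast itself
  try {
    await executeCommand('notifications.showList');
  } catch {
    // best-effort discoverability hint: a rejection must not break activation
  }
}
```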
chat-pipeline.ts's default logger only writes to console.error, which is only visible in Developer Tools Console — invisible to end users at View → Output → Nexpath. spawnAuto / spawnStop failures (e.g. ENOENT when NEXPATH_BIN points at a missing binary) were silently swallowed during B1.4 live testing. Wire a logger field into createChatEventHandler in extension.ts that writes the formatted error to BOTH log() (which appends to the Nexpath OutputChannel) AND console.error (preserves existing test assertions). Non-Error rejection values are stringified via String(err). +2 unit tests covering the wired logger (Error + non-Error inputs). Live-verified 2026-05-22 on Cursor 3.4.20: with NEXPATH_BIN exported to a bogus path BEFORE launching Cursor (so the extension host inherits the env at fork time — Developer: Reload Window does NOT re-read parent process.env, which was yesterday's failed-attempt root cause), Output → Nexpath now shows both the watcher event line and the spawnAuto NexpathBinaryNotFoundError line for a fresh Ask-mode prompt. 265/265 sub-pkg tests pass, tsc clean root + sub-package. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
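A minimal sketch of a logger that writes to both sinks, assuming a hypothetical `createDualLogger` factory and line format. The actual logger field on createChatEventHandler and its message layout are not shown in the commit message.

```typescript
type Sink = (line: string) => void;

// Write each failure to the OutputChannel sink and to console.error.
function createDualLogger(outputChannelLog: Sink, consoleError: Sink) {
  return (context: string, err: unknown): void => {
    // Non-Error rejection values are stringified via String(err).
    const detail = err instanceof Error ? `${err.name}: ${err.message}` : String(err);
    const line = `[nexpath] ${context} failed: ${detail}`; // format is illustrative only
    outputChannelLog(line);
    consoleError(line);
  };
}
```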
When two Cursor windows are open, every extension instance enumerated all workspaceStorage/*/state.vscdb files. Each instance attributed captured prompts to its own fixed workspaceCwd rather than the source db's workspace, producing duplicate rows in prompt-store.db with one row under the wrong project_root.

Two-layer fix:
- Enumeration filter: keep only state.vscdbs whose sibling workspace.json#folder matches the instance's own workspaceCwd; each db is now watched by exactly one instance.
- Per-event cwd: thread ChatHistoryEvent into spawnAuto/spawnStop deps; resolve cwd from event.sourcePath via sibling workspace.json (cached); fall back to instance cwd for edge cases (missing workspace.json, multi-root .code-workspace).

Live-verified in real Cursor: two windows, each prompt landed exactly once with correct project_root. 273/273 tests passing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
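The per-event cwd lookup could look roughly like this. It is a sketch under assumptions: `createCwdResolver` and the injected `readFile` are hypothetical, paths are treated as POSIX, and the `folder` value is assumed to be a `file://` URI string.

```typescript
// Resolve the workspace folder for a state.vscdb from its sibling workspace.json.
function createCwdResolver(readFile: (path: string) => string, instanceCwd: string) {
  const cache = new Map<string, string>();

  return (sourcePath: string): string => {
    const dir = sourcePath.slice(0, sourcePath.lastIndexOf('/'));
    const cached = cache.get(dir);
    if (cached !== undefined) return cached;

    let cwd = instanceCwd; // fallback for missing or unusable workspace.json
    try {
      const parsed = JSON.parse(readFile(`${dir}/workspace.json`)) as { folder?: unknown };
      if (typeof parsed.folder === 'string' && parsed.folder.startsWith('file://')) {
        cwd = decodeURIComponent(parsed.folder.slice('file://'.length));
      }
    } catch {
      // unreadable or malformed workspace.json: keep the instance cwd
    }
    cache.set(dir, cwd);
    return cwd;
  };
}
```

Caching the fallback as well keeps a missing workspace.json from being re-read on every event.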
The S01 manual test (2026-05-25) captured 0 prompts despite 15 paste-
submit cycles. Investigation traced the prompts to globalStorage
state.vscdb's cursorDiskKV table (NOT workspaceStorage's ItemTable that
our existing extractors watch). Cursor's modern Composer / Agent mode
(the right-side chat panel, default UX in 3.4.20+) stores conversations
as `composerData:<uuid>` + `bubbleId:<composerId>:<bubbleId>` rows in
the shared globalStorage file. Without this, the extension was
effectively blind to the majority of real-world prompts — silently
emitting zero events and producing the "no advisory ever fires" symptom
in the S01 run.
Layer C remains byte-identical to upstream/sub-7 per the locked
constraint; all changes are in `src/ext-vscode/`.
Changes:
1. New extractor `src/ext-vscode/src/extractors/cursor-composer-bubble.ts`
- Owns rows keyed `cursorDiskKV/bubbleId:<composerId>:<bubbleId>` (the
reader prefixes keys with the table name so existing ItemTable
extractors stay safely unaware)
- Decodes the bubble JSON, filters to user-type bubbles (type=1),
extracts the prompt from `.text`, uses composerId as rawSessionId
so distinct conversations don't dedup against each other
- Trims trailing whitespace (Cursor appends newlines)
- 11 unit tests covering happy path, assistant-skip, malformed JSON,
empty / whitespace text, key-shape edge cases, per-composer scoping
2. `defaultReadItemTable` (`chat-history-watcher.ts`) now reads BOTH
`ItemTable` AND `cursorDiskKV` if either is present. cursorDiskKV
keys are prefixed with `cursorDiskKV/` so they remain distinguishable
from ItemTable rows that might happen to share a key (also serves as
the fingerprint signal for the new extractor).
3. `globalStorageStateVscdbPath()` in `path-enumerator.ts` — sibling
helper to `enumerateStateVscdbPaths`. Returns the path to
`<host-config>/User/globalStorage/state.vscdb` or null when missing.
5 new unit tests.
4. `extension.ts` wires globalStorage as an unconditional WatchTarget on
Cursor hosts (bypasses the R4.3 cross-workspace filter because
globalStorage is a single shared file). Documents the multi-window
caveat: two open Cursor windows would each emit each bubble (no
cross-instance dedup); acceptable v0.1.3 limitation.
5. `ALL_EXTRACTORS` order: composer-bubble first (it owns rows from a
different table; cleaner to dispatch on first per-row check).
Tests:
- 297/297 sub-package tests pass (+16 from prior: 11 in
cursor-composer-bubble.test.ts + 5 in path-enumerator.test.ts
globalStorage block)
- Root tsc clean; sub-package tsc clean
- 18 pre-existing TtySelectFn carry-forward failures (Layer C, out of
scope per dev plan §3.0)
Bundle: 42 KB (+2.3 KB from prior 39.7 KB). Deployed to both
~/.cursor/extensions/emptyops.nexpath-vscode-0.1.3/out/ and
~/.vscode/extensions/emptyops.nexpath-vscode-0.1.3/out/.
Closes F2 gap from dev plan §2.10 (Composer/Agent mode capture).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
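The bubble decoding described in change 1 can be sketched as follows. The key shape, the `type=1` filter, the `.text` field, and the composerId-as-rawSessionId choice come from the commit message; the `decodeBubble` function and its return type are hypothetical and do not reproduce cursor-composer-bubble.ts.

```typescript
interface BubbleEvent { prompt: string; rawSessionId: string }

const BUBBLE_KEY_PREFIX = 'cursorDiskKV/bubbleId:';

// Decode one cursorDiskKV bubble row into a user-prompt event, or null to skip it.
function decodeBubble(key: string, value: string): BubbleEvent | null {
  if (!key.startsWith(BUBBLE_KEY_PREFIX)) return null;
  const [composerId, bubbleId, ...rest] = key.slice(BUBBLE_KEY_PREFIX.length).split(':');
  if (!composerId || !bubbleId || rest.length > 0) return null;

  let bubble: unknown;
  try {
    bubble = JSON.parse(value);
  } catch {
    return null; // malformed JSON is skipped, not thrown
  }
  if (typeof bubble !== 'object' || bubble === null) return null;

  const { type, text } = bubble as { type?: unknown; text?: unknown };
  if (type !== 1 || typeof text !== 'string') return null; // only user-type bubbles

  const prompt = text.replace(/\s+$/, ''); // drop trailing newlines
  if (prompt.length === 0) return null;
  return { prompt, rawSessionId: composerId };
}
```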
Discovered live during S01 manual testing 2026-05-27: Cursor 3.x writes Composer prompts to BOTH globalStorage cursorDiskKV (decoded by cursor-composer-bubble as a bubbleId rawSessionId) AND workspaceStorage ItemTable.aiService.prompts (decoded by cursor-v2024-q4 as the constant 'ask-mode' rawSessionId). The primary signatureOf() dedup keyed by sourcePath|rawSessionId|prompt has different signatures for the two mirror events → both pass dedup → each prompt counted twice → Layer C's prompt_count grows at 2x rate → classifier fires at wrong positions.

Live watcher log evidence (S01 P1 captured twice, 1.5s apart):
[10:45:42] watcher event prompt="make me a website..." raw_session_id=b6ea7b60-... extractor=cursor-composer-bubble
[10:45:44] watcher event prompt="make me a website..." raw_session_id=ask-mode extractor=cursor-v2024-q4

Fix: secondary dedup map `recentPromptTimestamps: Map<string, number>` keyed by trimmed prompt text, with a 60-second window. Real mirror emissions (Composer→Ask arrive within seconds) get deduped. Genuine re-submissions of the same text outside the window pass through.

Initial-pass priming subtlety: globalStorage state.vscdb is workspace-agnostic, so initial-pass sees bubble rows from ALL prior workspaces' sessions. Priming entries with NOW timestamp would block fresh user prompts that happen to match an old text within 60s of watcher activation (discovered when S01 P1 was dropped because a historical 'make me a website...' bubble had just been primed). Fix: prime with timestamp 0 (far past) so `now - 0 >> DEDUP_WINDOW_MS` and the entry doesn't block fresh emissions.

Three co-located unit tests added:
1. Mirror dedup: same prompt text from two extractors within 60s emits once
2. Window expiry: same prompt re-submitted after 60s passes through
3. Initial-pass priming does NOT block fresh emissions of same text (regression guard for the priming bug)

Tests: 33/33 chat-history-watcher (+3 from prior 30). Full sub-package suite: 292/292 (+3). Root tsc clean. Bundle rebuilt + deployed to ~/.cursor/extensions/emptyops.nexpath-vscode-0.1.3/ (md5 6bab1426...).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
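The secondary dedup can be sketched as a small time-windowed map. `recentPromptTimestamps`, `DEDUP_WINDOW_MS`, the 60-second window, and priming with timestamp 0 come from the commit message; the `createMirrorDedup` factory and its `prime` / `shouldEmit` methods are hypothetical names.

```typescript
const DEDUP_WINDOW_MS = 60 * 1000;

function createMirrorDedup() {
  const recentPromptTimestamps = new Map<string, number>();

  return {
    // Initial pass: record the text with timestamp 0 so it never blocks a fresh prompt.
    prime(prompt: string): void {
      const key = prompt.trim();
      if (!recentPromptTimestamps.has(key)) recentPromptTimestamps.set(key, 0);
    },
    // Live pass: suppress a repeat of the same text inside the window.
    shouldEmit(prompt: string, now: number): boolean {
      const key = prompt.trim();
      const last = recentPromptTimestamps.get(key);
      if (last !== undefined && now - last < DEDUP_WINDOW_MS) return false;
      recentPromptTimestamps.set(key, now);
      return true;
    },
  };
}
```

Because a primed entry carries timestamp 0, `now - 0` is far larger than the window for any realistic clock value, so the first live emission of that text passes.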
… spam) The watcher's unknown-schema branch deliberately does not cache an extractor — it must keep re-checking, since a fresh Cursor workspace's state.vscdb starts with no chat keys and only gains them once the user chats. But it re-invoked onSchemaUnknown on every fs.watch fire, which re-logs and re-pops the info toast. On Windsurf this is acute: its workspaceStorage state.vscdb (real chat lives under ~/.codeium/windsurf/) never matches any extractor, so the "schema unknown" log + toast fired endlessly, flooding the Output channel. Fix: track reportedUnknownPaths and notify at most once per path while still re-checking silently thereafter. Co-located test asserts a path read 4× as unknown surfaces onSchemaUnknown exactly once. Note: this only quiets the noise. Windsurf prompt CAPTURE remains unimplemented — decodeWindsurfJsonFile is still a stub and Cascade stores conversations as protobuf (.pb), not the top-level *.json the scanner assumes. Tracked separately. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
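The notify-once behaviour amounts to a set of already-reported paths. `reportedUnknownPaths` and `onSchemaUnknown` are named in the commit message; the `createUnknownSchemaNotifier` wrapper is a hypothetical way to package it.

```typescript
// Surface the schema-unknown notice at most once per path; later checks stay silent.
function createUnknownSchemaNotifier(onSchemaUnknown: (path: string) => void) {
  const reportedUnknownPaths = new Set<string>();
  return (path: string): void => {
    if (reportedUnknownPaths.has(path)) return;
    reportedUnknownPaths.add(path);
    onSchemaUnknown(path);
  };
}
```

The watcher can keep calling this on every read, so a workspace that later gains chat keys is still re-checked without repeating the toast.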
Live Windsurf 2.0 inspection (2026-05-28) resolved the long-standing "implement decodeWindsurfJsonFile once a real install is inspected" TODO: Cascade encrypts conversations at rest (~/.codeium/windsurf/cascade/<id>.pb, entropy 8.00), the only plaintext is session metadata (title/cwd/timestamps, not the prompt), and workspaceStorage state.vscdb holds no chat. File-watching has no readable prompt to decode, so Windsurf capture is out of v1 scope. Replaces the misleading TODO with the finding so no future engineer repeats the inspection. Decoders stay no-op; host detection + windsurf-dir watch wiring kept inert in case a future non-encrypted path appears. Comment-only; 41 extractor+watcher tests still pass, tsc clean. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The integration branch that makes the full chat-history → advisory →
webview round-trip actually run inside Cursor / Windsurf. Wires
together all the pieces built in B1-B4: extension activation now
starts the chat-history watcher (consent-gated, host-aware,
multi-workspace-enumerated), each captured user-prompt event drives
nexpath auto → nexpath stop → view-provider, and the manual smoke-test procedure is documented for engineer verification against
a real host.
Stacked on B4 (v0.1.3/m2/cursor-windsurf-adapters, commit f6a916b), which already has B1+B2+B3+M1 merged in, so all prerequisite contracts are available in one tree.
Module covered (per dev plan §3 M2 §2.2)
SMOKE-TEST.md shipped. Manual run is the acceptance gate (engineer runs it …)
What B5 actually delivers (code-side)
- src/ext-vscode/src/path-enumerator.ts: enumerateStateVscdbPaths(workspaceStorageDir) walks per-workspace subdirs, returns existing state.vscdb paths. Injectable fs for tests.
- src/ext-vscode/src/chat-pipeline.ts: createChatEventHandler(deps) orchestrating spawnAuto → spawnStop → publishPayload with 3 independent try/catch blocks (never propagates exceptions to the …)
- src/ext-vscode/src/onboarding.ts (mod): CONSENT_KEY so extension.ts reads the same globalState key the onboarding writes
- src/ext-vscode/src/extension.ts (rewritten): activate() flow: detect host → register view provider → onboarding → consent-gated watcher start-up with per-host paths + workspace-prefixed … deactivate() stops watcher.
- src/ext-vscode/src/extension.test.ts (rewritten)
- src/ext-vscode/src/ipc.ts (mod): cwd? to IpcOptions. spawnAuto passes workspace cwd to the spawn process options. spawnStop now sends the FULL StopPayload shape Layer C expects: {session_id, cwd, hook_event_name: 'Stop', stop_hook_active: false}.
- src/ext-vscode/SMOKE-TEST.md
The full activate() flow now
activate(context)
├─ detectHost() (B4)
├─ Construct view provider with injectFn-aware onSelect (B3 + B4)
├─ Register view provider with vscode.window (B3)
├─ await showOnboardingIfNeeded(context) (B1)
├─ Gate: consent === true, else skip-return (NEW B5)
├─ Gate: host !== 'vscode-generic', else skip-return (NEW B5)
├─ workspaceStorageDir + enumerateStateVscdbPaths (B4 + NEW B5)
├─ Gate: dbPaths.length > 0, else skip-return (NEW B5)
├─ Build chat-pipeline handler (NEW B5)
│ - spawnAuto/spawnStop curried with workspace cwd
│ - composeSessionId prefixes with workspace fsPath
├─ createChatHistoryWatcher with onEvent/onError/onSchemaUnknown
├─ watcher.start() + push stop-disposable on subscriptions
└─ Done
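The three gates in the flow above can be read as a single pure decision. The sketch below assumes each gate states the condition required to proceed (consent granted, a non-generic host, at least one database path); `watcherGate`, `GateResult`, and the reason strings are hypothetical names, not part of extension.ts.

```typescript
type Host = 'cursor' | 'windsurf' | 'vscode-generic';
type GateResult =
  | { start: true }
  | { start: false; reason: 'no-consent' | 'generic-host' | 'no-databases' };

// Decide whether activate() should go on to build the pipeline and start the watcher.
function watcherGate(
  consent: boolean | undefined,
  host: Host,
  dbPaths: readonly string[],
): GateResult {
  if (consent !== true) return { start: false, reason: 'no-consent' };
  if (host === 'vscode-generic') return { start: false, reason: 'generic-host' };
  if (dbPaths.length === 0) return { start: false, reason: 'no-databases' };
  return { start: true };
}
```

Keeping the gates in one function would let the ordering (consent first, then host, then paths) be unit-tested without a host stub.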