feat: interceptor diagnose — debugging snapshot for agent diagnosis#105
feat: interceptor diagnose — debugging snapshot for agent diagnosis#105trillium wants to merge 3 commits into
Conversation
Add 'interceptor diagnose' — a new command that collapses the 4-5 follow-up calls an agent normally issues after a failure into one. Output: daemon: running (pid 12345) extension: connected tab 3: https://example.com "Example Domain" elements: 24 interactive monitor: no sessions Works without a daemon (reports local state), surfaces richer context when daemon + extension are reachable. Probes are capped at 2 s each and run in parallel so the command stays fast even on a heavy page. --json emits a structured DiagnoseSnapshot for programmatic use.
📝 WalkthroughWalkthroughAdds a ChangesDiagnose CLI Command
Daemon lock-file and lifecycle
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@cli/commands/diagnose.ts`:
- Around line 24-29: The probeWithTimeout function schedules a timeout that is
never cleared, leaving the timer running and causing artificial tail latency;
modify probeWithTimeout to store the timeout id from setTimeout and clear it
(clearTimeout) as soon as either fn() resolves or rejects so the timer doesn't
keep the process alive—update the Promise.race wrapper around fn() and the
timeout to ensure the timeout is cleared when fn wins (and also clear the
timeout if the timeout path runs), adjusting types for the timeout id
(NodeJS.Timeout vs number) if needed; locate and change the probeWithTimeout
function to implement this cleanup.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: b446d9ff-ec71-4480-a557-9034f81b8f46
📒 Files selected for processing (3)
cli/commands/diagnose.tscli/help.tscli/index.ts
Two fixes:
1. probeWithTimeout: clear the timer in finally{} so it doesn't keep
the process alive after fn() resolves. CodeRabbit catch on PR Hacker-Valley-Media#105.
2. Context-awareness (brain-svjg): without --context, enumerate ALL
connected browser contexts via the 'contexts' daemon action and
probe each one in parallel. In a dual-browser setup (Chrome + Brave)
both contexts appear side-by-side — the silent mismatch that made
the previous version misleading is now visible.
With --context <id>: probe only that context (for targeted use).
Single 'default' context: output format unchanged (flat, no header).
Multiple or named contexts: indented under 'context <id>:' blocks.
--json output promotes tab/extension/elements into a 'contexts'
array so callers can diff contexts programmatically.
Lock file (daemon/lifecycle.ts, shared/platform.ts, daemon/index.ts):
- New LockFileData type: pid, version, execPath, startedAt, socketPath, wsPort, mode
- writeLockFile() / readLockFile() / clearLockFile() helpers
- checkLockFileDuplicate() exits early when a live duplicate daemon is found
- Daemon writes lock on startup, clears on SIGINT/SIGTERM/SIGHUP
- LOCK_PATH exported from shared/platform.ts ($INTERCEPTOR_LOCK_PATH override)
Binary mismatch detection (cli/commands/diagnose.ts):
- Reads lock file to get the execPath of the daemon that owns the socket
- Reads each browser's NMH manifest to get the path Chrome/Brave will spawn
- When they differ, surfaces a prominent warning immediately after the daemon line
This is the exact signal that was missing in the incident where 'interceptor
tabs' timed out despite Chrome being open: the socket daemon was the Homebrew
binary but the NMH manifest pointed to the dev binary, so the extension and
CLI were talking to different processes.
Output when mismatch detected:
daemon: running (pid 12345, /opt/homebrew/bin/interceptor-daemon)
⚠ binary mismatch (chrome):
socket daemon: /opt/homebrew/bin/interceptor-daemon
NMH manifest: /Users/trillium/Projects/interceptor/daemon/interceptor-daemon
Chrome will spawn the manifest binary; CLI talks to the socket binary.
Fix: run 'interceptor init' or update the NMH manifest to match.
There was a problem hiding this comment.
Actionable comments posted: 3
🧹 Nitpick comments (3)
daemon/lifecycle.ts (1)
30-36: 💤 Low valueConsider adding minimal schema validation to
readLockFile.The function parses JSON and type-asserts it as
LockFileDatawithout validating that required fields exist. If the lock file is corrupted or contains valid JSON with a different shape, consumers (likecli/commands/diagnose.tsaccessinglock.execPath,lock.startedAt) could receiveundefinedfor expected fields.Given the lock file is written by this same codebase and read shortly after, the risk is low. However, a minimal check could improve robustness.
♻️ Optional: Add minimal field validation
export function readLockFile(lockPath: string): LockFileData | null { try { - return JSON.parse(readFileSync(lockPath, "utf-8")) as LockFileData + const data = JSON.parse(readFileSync(lockPath, "utf-8")) + if (typeof data?.pid !== "number" || typeof data?.execPath !== "string") { + return null + } + return data as LockFileData } catch { return null } }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@daemon/lifecycle.ts` around lines 30 - 36, readLockFile currently JSON-parses and type-asserts to LockFileData without verifying required fields, which can yield undefined for consumers (e.g., cli/commands/diagnose.ts reading lock.execPath or lock.startedAt); modify readLockFile to perform minimal schema validation after parsing: ensure the resulting object is non-null, has the expected keys (at least execPath and startedAt) and proper primitive types (string/number), and return null if validation fails; reference the readLockFile function and LockFileData type and update callers to handle a null result safely if not already.cli/commands/diagnose.ts (2)
73-80: 💤 Low valueConsider runtime validation of manifest structure.
The type assertion
as { path?: string }doesn't verify the actual JSON shape at runtime. If the manifest has unexpected structure (e.g.,pathis not a string), the function silently returnsnullrather than reporting the issue.For a diagnostic tool where surfacing configuration problems is valuable, consider validating the parsed JSON:
♻️ Optional: Add runtime validation
function readNmhManifestPath(manifestFile: string): string | null { try { - const manifest = JSON.parse(readFileSync(manifestFile, "utf-8")) as { path?: string } - return manifest.path ?? null + const manifest = JSON.parse(readFileSync(manifestFile, "utf-8")) + if (typeof manifest === "object" && manifest !== null && typeof manifest.path === "string") { + return manifest.path + } + return null } catch { return null } }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@cli/commands/diagnose.ts` around lines 73 - 80, The function readNmhManifestPath currently asserts the JSON shape but doesn't validate it; update it to perform runtime validation after parsing: parse manifest into an unknown, assert it's an object, check that manifest.path exists and is of type string (e.g. typeof manifest.path === "string"), and if not, surface a clear diagnostic (throw or return an Error/diagnostic message) rather than silently returning null; update error handling in readNmhManifestPath to differentiate parse/file errors from invalid manifest shape and include the offending value/type in the message so callers can report configuration problems.
191-199: 💤 Low valueConsider clarifying which binary is authoritative in the mismatch message.
The fix suggestion "run 'interceptor init' or update the NMH manifest to match" doesn't indicate which binary should be considered correct. Users may not know whether to align the manifest to the running daemon or vice versa.
Consider making the guidance more specific, e.g., "
interceptor initwill update the manifest to use the CLI's current binary" or explicitly state which path is expected.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@cli/commands/diagnose.ts` around lines 191 - 199, The binary mismatch messaging (in the loop over snap.binaryMismatches) should explicitly state which binary is authoritative and what 'interceptor init' does: update the NMH manifest to match the CLI/daemon binary. Update the lines.push for the Fix and/or surrounding text to mention that the socket daemon path (m.runningPath) is the expected/authoritative binary, show both paths (m.runningPath and m.manifestPath) as you're already doing, and change the Fix line to say something like "'interceptor init' will update the NMH manifest to use the CLI's current/socket binary (m.runningPath) or run the opposite if you intentionally want the manifest binary to be authoritative." Ensure you edit the messages constructed in the for loop over snap.binaryMismatches so users know which path to align.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@cli/commands/diagnose.ts`:
- Around line 30-33: NMH_PATHS currently hardcodes macOS paths causing
detectBinaryMismatches to silently fail on Linux/Windows; update the code to
detect platform (process.platform) and build NMH paths using os.homedir() for
POSIX and process.env.APPDATA or process.env.USERPROFILE for Windows, providing
correct per-platform paths (macOS: Library/Application Support/..., Linux:
~/.config/.../NativeMessagingHosts, Windows:
%APPDATA%\...\NativeMessagingHosts), then ensure detectBinaryMismatches reads
from these platform-specific NMH_PATHS and gracefully handles missing env vars
by falling back to os.homedir() or throwing a clear error/log via the same
function names (NMH_PATHS, detectBinaryMismatches).
In `@daemon/index.ts`:
- Around line 514-516: Remove the early signal handlers that call
clearLockFile(LOCK_PATH) followed by process.exit(0) and instead invoke
clearLockFile(LOCK_PATH) from inside gracefulShutdown(); locate and update the
gracefulShutdown function to perform lock cleanup (call
clearLockFile(LOCK_PATH)) as part of its cleanup sequence and let
gracefulShutdown control process.exit timing, and ensure SIGINT/SIGTERM (and
optionally SIGHUP) are hooked to call gracefulShutdown() so pending requests,
servers, PID/socket cleanup and the lock file all run in the same shutdown flow.
In `@daemon/lifecycle.ts`:
- Around line 113-120: clearDaemonRuntimeFiles now calls
deps.unlinkSync(deps.lockPath) in addition to socketPath and pidPath, so the
test expecting deps.unlinked toEqual [deps.socketPath, deps.pidPath] will fail;
update the test assertion that checks deps.unlinked to include deps.lockPath
(i.e. expect deps.unlinked toEqual [deps.socketPath, deps.pidPath,
deps.lockPath]) so it matches the behavior of clearDaemonRuntimeFiles (which
references unlinkSync and lockPath).
---
Nitpick comments:
In `@cli/commands/diagnose.ts`:
- Around line 73-80: The function readNmhManifestPath currently asserts the JSON
shape but doesn't validate it; update it to perform runtime validation after
parsing: parse manifest into an unknown, assert it's an object, check that
manifest.path exists and is of type string (e.g. typeof manifest.path ===
"string"), and if not, surface a clear diagnostic (throw or return an
Error/diagnostic message) rather than silently returning null; update error
handling in readNmhManifestPath to differentiate parse/file errors from invalid
manifest shape and include the offending value/type in the message so callers
can report configuration problems.
- Around line 191-199: The binary mismatch messaging (in the loop over
snap.binaryMismatches) should explicitly state which binary is authoritative and
what 'interceptor init' does: update the NMH manifest to match the CLI/daemon
binary. Update the lines.push for the Fix and/or surrounding text to mention
that the socket daemon path (m.runningPath) is the expected/authoritative
binary, show both paths (m.runningPath and m.manifestPath) as you're already
doing, and change the Fix line to say something like "'interceptor init' will
update the NMH manifest to use the CLI's current/socket binary (m.runningPath)
or run the opposite if you intentionally want the manifest binary to be
authoritative." Ensure you edit the messages constructed in the for loop over
snap.binaryMismatches so users know which path to align.
In `@daemon/lifecycle.ts`:
- Around line 30-36: readLockFile currently JSON-parses and type-asserts to
LockFileData without verifying required fields, which can yield undefined for
consumers (e.g., cli/commands/diagnose.ts reading lock.execPath or
lock.startedAt); modify readLockFile to perform minimal schema validation after
parsing: ensure the resulting object is non-null, has the expected keys (at
least execPath and startedAt) and proper primitive types (string/number), and
return null if validation fails; reference the readLockFile function and
LockFileData type and update callers to handle a null result safely if not
already.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 1c14bac4-58cf-4b11-aad4-517a6823afc3
📒 Files selected for processing (5)
cli/commands/diagnose.tsdaemon/index.tsdaemon/lifecycle.tsshared/platform.tstest/daemon-lifecycle.test.ts
| const NMH_PATHS: Record<string, string> = { | ||
| chrome: `${process.env.HOME}/Library/Application Support/Google/Chrome/NativeMessagingHosts/com.interceptor.host.json`, | ||
| brave: `${process.env.HOME}/Library/Application Support/BraveSoftware/Brave-Browser/NativeMessagingHosts/com.interceptor.host.json`, | ||
| } |
There was a problem hiding this comment.
Binary mismatch detection is macOS-only.
NMH_PATHS hardcodes macOS paths using process.env.HOME, which breaks on Linux and Windows:
HOMEis undefined on Windows (useUSERPROFILEoros.homedir())- Linux stores NMH manifests in
~/.config/google-chrome/NativeMessagingHosts/and~/.config/BraveSoftware/Brave-Browser/NativeMessagingHosts/ - Windows uses
%APPDATA%\Google\Chrome\NativeMessagingHosts\and%APPDATA%\BraveSoftware\Brave-Browser\NativeMessagingHosts\
detectBinaryMismatches will silently return an empty array on non-macOS platforms, hiding critical mismatches.
🔧 Proposed fix to add platform detection
+import { homedir, platform } from "node:os"
+import { join } from "node:path"
+
-const NMH_PATHS: Record<string, string> = {
- chrome: `${process.env.HOME}/Library/Application Support/Google/Chrome/NativeMessagingHosts/com.interceptor.host.json`,
- brave: `${process.env.HOME}/Library/Application Support/BraveSoftware/Brave-Browser/NativeMessagingHosts/com.interceptor.host.json`,
+function getNmhPaths(): Record<string, string> {
+ const home = homedir()
+ const plat = platform()
+
+ if (plat === "darwin") {
+ return {
+ chrome: join(home, "Library/Application Support/Google/Chrome/NativeMessagingHosts/com.interceptor.host.json"),
+ brave: join(home, "Library/Application Support/BraveSoftware/Brave-Browser/NativeMessagingHosts/com.interceptor.host.json"),
+ }
+ } else if (plat === "linux") {
+ return {
+ chrome: join(home, ".config/google-chrome/NativeMessagingHosts/com.interceptor.host.json"),
+ brave: join(home, ".config/BraveSoftware/Brave-Browser/NativeMessagingHosts/com.interceptor.host.json"),
+ }
+ } else if (plat === "win32") {
+ const appData = process.env.APPDATA || join(home, "AppData/Roaming")
+ return {
+ chrome: join(appData, "Google/Chrome/NativeMessagingHosts/com.interceptor.host.json"),
+ brave: join(appData, "BraveSoftware/Brave-Browser/NativeMessagingHosts/com.interceptor.host.json"),
+ }
+ }
+ return {}
}
+
+const NMH_PATHS = getNmhPaths()Then update line 85:
function detectBinaryMismatches(lock: LockFileData | null): BinaryMismatch[] {
if (!lock?.execPath) return []
const mismatches: BinaryMismatch[] = []
- for (const [browser, manifestFile] of Object.entries(NMH_PATHS)) {
+ const paths = getNmhPaths()
+ for (const [browser, manifestFile] of Object.entries(paths)) {
if (!existsSync(manifestFile)) continue🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@cli/commands/diagnose.ts` around lines 30 - 33, NMH_PATHS currently hardcodes
macOS paths causing detectBinaryMismatches to silently fail on Linux/Windows;
update the code to detect platform (process.platform) and build NMH paths using
os.homedir() for POSIX and process.env.APPDATA or process.env.USERPROFILE for
Windows, providing correct per-platform paths (macOS: Library/Application
Support/..., Linux: ~/.config/.../NativeMessagingHosts, Windows:
%APPDATA%\...\NativeMessagingHosts), then ensure detectBinaryMismatches reads
from these platform-specific NMH_PATHS and gracefully handles missing env vars
by falling back to os.homedir() or throwing a clear error/log via the same
function names (NMH_PATHS, detectBinaryMismatches).
| for (const sig of ["SIGINT", "SIGTERM", "SIGHUP"] as const) { | ||
| process.on(sig, () => { clearLockFile(LOCK_PATH); process.exit(0) }) | ||
| } |
There was a problem hiding this comment.
Signal handlers conflict with existing graceful shutdown logic.
These handlers call process.exit(0) immediately after clearing the lock file. However, lines 1192-1193 register gracefulShutdown() handlers for the same signals (SIGTERM, SIGINT). Since the handlers here are registered first and call process.exit(0), the graceful shutdown logic never runs:
- Pending requests won't be drained (line 1170-1175)
- Socket/WS servers won't be stopped (lines 1176-1180)
- PID file and socket won't be cleaned up by
gracefulShutdown(lines 1181-1182)
Integrate lock file cleanup into gracefulShutdown() instead.
🐛 Move lock file cleanup into gracefulShutdown
Remove the separate handlers:
-for (const sig of ["SIGINT", "SIGTERM", "SIGHUP"] as const) {
- process.on(sig, () => { clearLockFile(LOCK_PATH); process.exit(0) })
-}And add lock file cleanup to gracefulShutdown:
function gracefulShutdown(signal: string) {
log(`${signal} received, draining ${pendingRequests.size} pending requests`)
for (const [id, req] of pendingRequests) {
clearTimeout(req.timer)
socketWriteFramed(req.socket, JSON.stringify({ id, result: { success: false, error: "daemon shutting down" } }))
}
pendingRequests.clear()
if (socketServer) {
socketServer.stop(true)
socketServer = null
}
if (wsServer) wsServer.stop(true)
try { unlinkSync(SOCKET_PATH) } catch {}
try { unlinkSync(PID_PATH) } catch {}
+ clearLockFile(LOCK_PATH)
log("shutdown complete")
process.exit(0)
}For SIGHUP, add a handler to gracefulShutdown calls if desired:
process.on("SIGTERM", () => gracefulShutdown("SIGTERM"))
process.on("SIGINT", () => gracefulShutdown("SIGINT"))
+process.on("SIGHUP", () => gracefulShutdown("SIGHUP"))🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@daemon/index.ts` around lines 514 - 516, Remove the early signal handlers
that call clearLockFile(LOCK_PATH) followed by process.exit(0) and instead
invoke clearLockFile(LOCK_PATH) from inside gracefulShutdown(); locate and
update the gracefulShutdown function to perform lock cleanup (call
clearLockFile(LOCK_PATH)) as part of its cleanup sequence and let
gracefulShutdown control process.exit timing, and ensure SIGINT/SIGTERM (and
optionally SIGHUP) are hooked to call gracefulShutdown() so pending requests,
servers, PID/socket cleanup and the lock file all run in the same shutdown flow.
| export function clearDaemonRuntimeFiles(deps: Pick<LifecycleDeps, "unlinkSync" | "pidPath" | "lockPath" | "socketPath" | "isWin" | "log">, reason: string): void { | ||
| deps.log(`clearing daemon runtime files: ${reason}`) | ||
| if (!deps.isWin) { | ||
| try { deps.unlinkSync(deps.socketPath) } catch {} | ||
| } | ||
| try { deps.unlinkSync(deps.pidPath) } catch {} | ||
| try { deps.unlinkSync(deps.lockPath) } catch {} | ||
| } |
There was a problem hiding this comment.
Test will fail: clearDaemonRuntimeFiles now unlinks three files but test expects two.
The test at test/daemon-lifecycle.test.ts:72 asserts:
expect(deps.unlinked).toEqual([deps.socketPath, deps.pidPath])With the addition of unlinkSync(deps.lockPath) at line 119, the actual result will be [socketPath, pidPath, lockPath], causing the test to fail.
🐛 Update the test expectation
In test/daemon-lifecycle.test.ts, update line 72:
- expect(deps.unlinked).toEqual([deps.socketPath, deps.pidPath])
+ expect(deps.unlinked).toEqual([deps.socketPath, deps.pidPath, deps.lockPath])🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@daemon/lifecycle.ts` around lines 113 - 120, clearDaemonRuntimeFiles now
calls deps.unlinkSync(deps.lockPath) in addition to socketPath and pidPath, so
the test expecting deps.unlinked toEqual [deps.socketPath, deps.pidPath] will
fail; update the test assertion that checks deps.unlinked to include
deps.lockPath (i.e. expect deps.unlinked toEqual [deps.socketPath, deps.pidPath,
deps.lockPath]) so it matches the behavior of clearDaemonRuntimeFiles (which
references unlinkSync and lockPath).
feat:
interceptor diagnose— debugging snapshot for agent diagnosisHey there, I ran into a problem where 2 interceptor processes were running and went down a troubleshooting rabbit hole 🐇🕳️. Thought this could potentially happent o someone else.
What
Adds
interceptor diagnose— a single command that surfaces everything needed to understand why Interceptor isn't working, including a new lock file mechanism and binary mismatch detection that catches the specific failure mode that started this rabbit hole.The incident this solves
The root cause was a manifest/binary split: the NMH manifest pointed to
~/Projects/interceptor/daemon/interceptor-daemon(dev build) but the running daemon on the socket was/opt/homebrew/bin/interceptor-daemon(Homebrew install). Chrome spawned the manifest binary when the extension connected; the CLI talked to the socket daemon. Two different processes, zero visibility into each other —interceptor tabstimed out for 15 seconds with no useful signal about why.interceptor diagnosenow catches this immediately:Why
diagnoseinstead ofstatusstatusis a pre-flight check — it tells you if the daemon is running and the bridge is installed.diagnoseis a post-failure tool — it tells you what's actually connected right now and why commands might be failing. They answer different questions.Full output (healthy single-browser setup)
Dual-browser setup (Chrome + Brave):
Daemon not running:
--jsonavailable for all of the above.Changes
New:
interceptor diagnose(cli/commands/diagnose.ts)execPathagainst every installed NMH manifest — flags mismatchescontexts, probes each in parallel (2 s cap per probe)--context <id>to scope to a single context;--jsonfor structured outputNew: lock file (
daemon/lifecycle.ts,shared/platform.ts,daemon/index.ts)/tmp/interceptor.lockon startup:{ pid, version, execPath, startedAt, socketPath, wsPort, mode }SIGINT/SIGTERM/SIGHUPcheckLockFileDuplicate()exits early with a clear error when a second live daemon is detected at startup — prevents the silent forkLOCK_PATHexported fromshared/platform.ts, overridable via$INTERCEPTOR_LOCK_PATHUpdated:
cli/index.ts,cli/help.tsdiagnosewired into routing, added toNO_DAEMONset (never auto-spawns)diagnose,diagnose --context <id>,diagnose --jsonNotes for reviewer
execPathusesprocess.execPath— the actual binary path, not argv[0], so symlinks resolve correctlyextension/src/inject-net.tsandtest/screenshot-minimized-preflight.test.tsare not introduced by this PR