fix(claude): skip redundant credential rewrites on every Claude spawn (closes #1507)#1529

Merged
nwparker merged 3 commits into main from
nwparker/fix-claude-credentials-eperm-1507
May 7, 2026

@nwparker nwparker commented May 7, 2026

Problem

On Windows, starting a worktree from a Linear issue (or any flow that spawns Claude while a managed account is active) intermittently fails with:

Error invoking remote method 'pty:spawn': EPERM: operation not permitted, rename
'C:\Users\<u>\.claude\.credentials.json.<pid>.<uuid>.tmp' -> 'C:\Users\<u>\.claude\.credentials.json'

Reported in #1507 (screenshot in the issue).

Root cause

Every Claude pty spawn calls prepareForClaudeLaunch -> syncForCurrentSelection, which unconditionally performs an atomic rewrite of ~/.claude/.credentials.json and the oauthAccount field of ~/.claude.json, even when the contents haven't changed since the last sync. On Windows, the rename step of that atomic write intermittently fails with EPERM/EBUSY when a sibling Claude CLI process or antivirus (AV) software briefly holds the destination open.

The existing renameWithRetry (3 attempts, ~150 ms total) was both too short for typical AV scan windows and used a Date.now() spin loop that pegged a CPU core on the Electron main thread during backoff (which is also where IPC runs).

Fix

Two complementary, focused changes:

  1. writeRuntimeCredentials / writeJson are now no-ops when the content is unchanged (src/main/claude-accounts/runtime-auth-service.ts). In-memory cache check first (covers the steady-state spawn loop, zero I/O), on-disk fallback covers the first sync after app restart. The credentials file only actually changes on token refresh / account switch, so the hot path no longer touches disk and the contention window disappears entirely.

  2. renameWithRetry is hardened (src/main/codex-accounts/fs-utils.ts):

    • 6 attempts on Windows (~750 ms total backoff) instead of 3, covering typical AV scan windows.
    • Also retries on EBUSY (not just EPERM/EACCES).
    • Uses Atomics.wait on a private SharedArrayBuffer for a real OS-level thread sleep, instead of a busy-wait. The Electron main thread now actually yields during backoff and IPC stays responsive.
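The hardened retry described in item 2 can be sketched roughly as follows. This is a minimal illustration, not the code from fs-utils.ts: the name renameWithRetrySketch, the parameter list, the injectable rename function, and the linear 50 ms backoff schedule are all assumptions made for the example. A linear schedule happens to sum to 750 ms across six attempts, but the PR does not state the actual schedule.

```typescript
import * as fs from "node:fs";

// Error codes treated as transient, per the PR description.
const RETRYABLE_CODES = new Set(["EPERM", "EACCES", "EBUSY"]);

// Blocks the calling thread without spinning. The buffer is private and
// nothing ever notifies it, so Atomics.wait always returns by timing out.
function sleepSync(ms: number): void {
  const cell = new Int32Array(new SharedArrayBuffer(4));
  Atomics.wait(cell, 0, 0, ms);
}

// Hypothetical sketch: retries a synchronous rename on transient errors.
// `attempts` and `baseDelayMs` are illustrative defaults (assumptions).
function renameWithRetrySketch(
  from: string,
  to: string,
  attempts = 6,
  baseDelayMs = 50,
  rename: (from: string, to: string) => void = fs.renameSync,
): void {
  for (let attempt = 1; ; attempt++) {
    try {
      rename(from, to);
      return;
    } catch (err) {
      const code = (err as { code?: string }).code;
      const retryable = code !== undefined && RETRYABLE_CODES.has(code);
      // Give up on the last attempt or on any non-transient error.
      if (attempt >= attempts || !retryable) {
        throw err;
      }
      // Assumed linear backoff: 50, 100, 150, ... milliseconds.
      sleepSync(baseDelayMs * attempt);
    }
  }
}
```

Passing the rename function as a parameter is only there so the sketch can be exercised without a real locked file. The point of sleepSync is that the thread parks in the OS rather than looping on Date.now().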

The first change is the real fix — it makes the contention impossible on the hot path. The second is defense-in-depth for cases that legitimately need to rewrite the file (token refresh, account switch).
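As an illustration of the first change, a skip-if-unchanged write might look like the sketch below. It is written under assumptions and is not the service's actual code: writeIfChanged and the Map-based cache are hypothetical stand-ins for writeRuntimeCredentials / writeJson and the existing lastWrittenCredentialsJson cache, and the temp-file naming merely mirrors the pattern visible in the error message.

```typescript
import { randomUUID } from "node:crypto";
import * as fs from "node:fs";

// Hypothetical in-memory cache: file path -> last content written or verified.
const lastWritten = new Map<string, string>();

// Returns true if the file was rewritten, false if the write was skipped.
function writeIfChanged(file: string, content: string): boolean {
  // 1. In-memory check: covers the steady-state spawn loop with zero I/O.
  if (lastWritten.get(file) === content) {
    return false;
  }

  // 2. On-disk fallback: covers the first sync after a restart, when the
  //    cache is empty but the file may already hold the same content.
  try {
    if (fs.readFileSync(file, "utf8") === content) {
      lastWritten.set(file, content);
      return false;
    }
  } catch {
    // Missing or unreadable file: fall through and write it.
  }

  // 3. Content genuinely differs: write a temp file, then rename over the
  //    destination so readers never observe a partial file.
  const tmp = `${file}.${process.pid}.${randomUUID()}.tmp`;
  fs.writeFileSync(tmp, content, { mode: 0o600 });
  fs.renameSync(tmp, file);
  lastWritten.set(file, content);
  return true;
}
```

With this shape, the rename that can fail with EPERM only runs when the content actually changes, which per the description above happens only on token refresh or account switch.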

Why this is the elegant version

  • Do less work. The bug isn't "rename is unreliable on Windows", it's "we're rewriting an unchanged file on every spawn". Fixing the redundancy fixes the root cause; the retry hardening is just a backstop.
  • No new abstractions or moving parts. Reuses the existing lastWrittenCredentialsJson cache and the existing atomic-write helper.
  • No semantic change to the credential lifecycle. readBackRefreshedTokens still runs first, so an externally-refreshed token is still captured before the no-op check; if the managed credentials genuinely differ from disk, we still write.
  • Main thread stays responsive. Replacing the spin loop with Atomics.wait is a small, surgical correctness fix to a sync API that has to remain sync.

Testing

  • npx tsc --noEmit — clean.
  • npx vitest run src/main/codex-accounts/fs-utils.test.ts — 4/4 pass.
  • Manual trace of the call graph confirms that prepareForClaudeLaunch -> syncForCurrentSelection -> writeRuntimeCredentials is the path that hits the EPERM in the issue screenshot, and that the in-memory short-circuit eliminates the file write on every spawn after the first.

A heavier integration test for ClaudeRuntimeAuthService (mocking electron app, Store, and the keychain helpers) would be valuable but is out of scope here — leaving as a follow-up.

Risk

Low. The new code paths are strictly narrower than before (they skip a write when the destination already matches), the retry change only widens an existing retry, and the sleep change is a same-semantics replacement of a CPU-spin with a thread-park.

Closes #1507.

Made with Orca 🐋

nwparker and others added 3 commits May 6, 2026 23:16
… rename retry

Closes #1507

On Windows, starting a worktree could fail with:
  Error invoking remote method 'pty:spawn': EPERM: operation not permitted,
  rename '...\.credentials.json.<pid>.<uuid>.tmp' -> '...\.credentials.json'

Root cause: every Claude pty spawn calls prepareForClaudeLaunch ->
syncForCurrentSelection, which unconditionally atomically rewrites
~/.claude/.credentials.json (and the oauthAccount field of ~/.claude.json).
On Windows the rename step intermittently fails with EPERM/EBUSY when a
sibling Claude CLI process or AV briefly holds the destination open. The
existing retry (3 attempts, ~150 ms total, busy-wait) is both too short
and pegs a CPU core on the Electron main thread.

Fixes:

  1. writeRuntimeCredentials / writeJson now no-op when the target content
     is already identical (in-memory check first, on-disk check as fallback
     to handle the first sync after app restart). The credentials file
     only actually changes on token refresh / account switch, so the hot
     path no longer touches disk and the contention window disappears.

  2. renameWithRetry now retries 6 times on Windows (~750 ms total),
     also handles EBUSY, and uses Atomics.wait on a SharedArrayBuffer for
     a real OS-level thread sleep instead of a Date.now() spin loop, so
     IPC and the renderer aren't frozen during backoff.

Co-authored-by: Orca <help@stably.ai>
@nwparker nwparker merged commit b7a583d into main May 7, 2026
2 checks passed