feat: ship Slice K1 — auth bootstrap fix + invalid_grant + expiring status#87
Merged
Three concrete bugs were keeping `claude login` + `mc auth bootstrap` from being a
reliable recovery path:
1. Silent no-op when ~/.mc/credentials.json existed: bootstrap printed "Already
bootstrapped" and never overwrote, so the stale token stayed in place. Now
always overwrites and prints the old → new expiry diff. `--dry-run` still
works and shows the same diff without writing.
2. HTTP 400 `invalid_grant` was treated as transient: the !response.ok branch
silently returned the cached token if still valid, hiding permanent
refresh-token revocation until the access token expired and surfaced as a
generic "network error". New `RefreshTokenRevokedError` throws immediately
on 400 with `error: invalid_grant` (parsed from JSON, not regex — narrower
surface). HTTP 401 keeps throwing `RefreshTokenRejectedError` unchanged.
3. No proactive heads-up before expiry: operators only learned auth was about
to break when the chat lane hung. New `'expiring'` `RefresherStatus`
emitted when cached token has under 2h remaining (claudeclaw-os pattern).
Daemon maps it to `statusRegistry.setUnverified('claudeCode', ...)` plus a
WARN log.
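
The `invalid_grant` handling in item 2 can be sketched roughly as follows. This is a minimal illustration, not the code in `refresher.ts`: `RefreshTokenRevokedError` and `isInvalidGrantBody()` are named in this PR, but their bodies here are assumptions, and `classifyRefreshFailure()` and the `RefreshOutcome` type are hypothetical helpers added only to show the branching.

```typescript
// Sketch only: class and function names follow the PR text, bodies are assumed.
class RefreshTokenRevokedError extends Error {
  constructor(message = "Refresh token revoked: run `claude login` then `mc auth bootstrap`.") {
    super(message);
    this.name = "RefreshTokenRevokedError";
  }
}

// True only when the body is valid JSON with a top-level `error` of "invalid_grant".
// A regex over the raw text could also match the phrase inside other fields.
function isInvalidGrantBody(body: string): boolean {
  try {
    const parsed: unknown = JSON.parse(body);
    return (
      typeof parsed === "object" &&
      parsed !== null &&
      (parsed as { error?: unknown }).error === "invalid_grant"
    );
  } catch {
    return false; // Non-JSON bodies stay on the transient path.
  }
}

// Hypothetical helper: maps a failed refresh response to an outcome.
type RefreshOutcome = "revoked" | "rejected" | "transient";

function classifyRefreshFailure(status: number, body: string): RefreshOutcome {
  if (status === 400 && isInvalidGrantBody(body)) return "revoked"; // permanent
  if (status === 401) return "rejected"; // unchanged 401 behavior
  return "transient"; // other non-200 responses, including other 400 bodies
}

// Hypothetical caller: throw immediately on permanent revocation.
function throwIfRevoked(status: number, body: string): void {
  if (classifyRefreshFailure(status, body) === "revoked") {
    throw new RefreshTokenRevokedError();
  }
}
```

Under these assumptions a 400 body such as `{"error":"invalid_request"}` still lands in the transient branch, even if its description text happens to mention `invalid_grant`.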
Also exposes `OAuthRefresher.invalidate()` for future credentials-watch wiring
(deferred to its own slice — the existing config-watcher → respawn path still
recovers correctly today).
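
A rough sketch of the `'expiring'` fast path and `invalidate()` described above, under stated assumptions: the `TokenCache` class, the `CachedToken` shape, and the `fastPath()` method are hypothetical stand-ins for the real `OAuthRefresher`, and the 2h value of `EXPIRING_THRESHOLD_MS` is inferred from "under 2h remaining" rather than read from the source.

```typescript
// Assumed value: the PR says "under 2h remaining"; the constant name comes from the test list.
const EXPIRING_THRESHOLD_MS = 2 * 60 * 60 * 1000;

// Hypothetical shapes for illustration only.
type CachedToken = { accessToken: string; expiresAt: number }; // expiresAt in epoch ms
type FastPathStatus = "ready" | "expiring";

class TokenCache {
  private cached: CachedToken | null = null;
  private readonly clock: () => number;

  constructor(clock: () => number = () => Date.now()) {
    this.clock = clock;
  }

  set(token: CachedToken): void {
    this.cached = token;
  }

  // Drop the cached token so the next read has to reload or refresh.
  invalidate(): void {
    this.cached = null;
  }

  // Returns null when nothing usable is cached (the caller would then refresh).
  fastPath(): { token: CachedToken; status: FastPathStatus } | null {
    const now = this.clock(); // one snapshot, reused for both comparisons below
    const token = this.cached;
    if (token === null || token.expiresAt <= now) return null;
    const remainingMs = token.expiresAt - now;
    const status: FastPathStatus = remainingMs < EXPIRING_THRESHOLD_MS ? "expiring" : "ready";
    return { token, status };
  }
}
```

Taking `clock()` once means the validity check and the threshold check see the same instant, which is the stale-snapshot concern the single-snapshot change addresses.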
Notable code:
- `refresher.ts:33–45` — `RefreshTokenRevokedError` class
- `refresher.ts:181–193` — `isInvalidGrantBody()` JSON parse
- `refresher.ts:128–139` — single `clock()` snapshot for the fast-path
expiring/ready emission
- `auth.ts:73–98` — `printBootstrapResult()` flattens dry-run/write print paths
- `mcd-main.ts:312–332` — switch with exhaustiveness guard for RefresherStatus
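
The exhaustiveness guard in the last item can be illustrated with a minimal sketch. It is not the code in `mcd-main.ts`: the `RefresherStatus` union below is an assumed subset (only `'expiring'`, `'ready'`, and `'disabled'` are mentioned in this PR), `setOk()` and `applyRefresherStatus()` are hypothetical, and only `setUnverified('claudeCode', ...)` comes from the description.

```typescript
// Assumed subset of the real union, for illustration.
type RefresherStatus = "ready" | "expiring" | "disabled";

interface StatusRegistry {
  setUnverified(component: string, reason: string): void; // named in the PR
  setOk(component: string): void; // hypothetical
}

function applyRefresherStatus(
  status: RefresherStatus,
  registry: StatusRegistry,
  warn: (msg: string) => void,
): void {
  switch (status) {
    case "ready":
      registry.setOk("claudeCode");
      return;
    case "expiring":
      // "Still works, but act now": reuse the existing unverified state.
      registry.setUnverified("claudeCode", "OAuth token expires in under 2h");
      warn("claudeCode auth expiring: run `claude login` then `mc auth bootstrap`");
      return;
    case "disabled":
      return; // explicit no-op so the guard below stays exhaustive
    default: {
      // Compile-time guard: adding a new status without a case makes this
      // assignment a type error, because `status` is no longer `never` here.
      const _exhaustive: never = status;
      throw new Error(`Unhandled RefresherStatus: ${String(_exhaustive)}`);
    }
  }
}
```

If a new member is added to `RefresherStatus` and no `case` handles it, the `never` assignment fails typecheck instead of silently falling through at runtime.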
Test status: 1452/1452 across 125 files, all four gates green
(typecheck/build/biome/vitest). 5 new tests in refresher.test.ts cover the new
error class, invalidate(), and the 'expiring' status emission. auth.test.ts
rewritten to assert the new always-overwrite + diff output behavior.
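
For orientation, the always-overwrite plus diff output that `auth.test.ts` now asserts could look roughly like the sketch below. Everything here is illustrative: `BootstrapResult` and `formatBootstrapResult()` are hypothetical names, and the exact wording of the printed lines is an assumption, since this PR only shows the output in elided form.

```typescript
// Hypothetical result shape; the real printBootstrapResult() may differ.
type BootstrapResult = {
  oldExpiresAt: number | null; // epoch ms; null when no credentials file existed
  newExpiresAt: number; // epoch ms
  dryRun: boolean;
};

// One code path for both dry-run and real writes; only the verb changes.
function formatBootstrapResult(r: BootstrapResult): string[] {
  const iso = (ms: number): string => new Date(ms).toISOString();
  const lines: string[] = [];
  if (r.oldExpiresAt === null) {
    lines.push(r.dryRun ? "Would write new credentials" : "Wrote new credentials");
  } else {
    lines.push(r.dryRun ? "Would replace existing credentials" : "Replaced existing credentials");
    lines.push(`Old expiry: ${iso(r.oldExpiresAt)}`);
  }
  lines.push(`New expiry: ${iso(r.newExpiresAt)}`);
  return lines;
}

// Usage: an existing token replaced by one that expires an hour later.
const example = formatBootstrapResult({ oldExpiresAt: 0, newExpiresAt: 3_600_000, dryRun: false });
console.log(example.join("\n"));
```

Sharing one formatter between the dry-run and write paths keeps the two outputs identical apart from the leading verb, which is the property the diff assertions rely on.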
Deferred:
- K1.5 — `DiscordClient.sendDirectMessage(userId, content)` so the proactive
expiry alert reaches the owner directly (claudeclaw-os does this via
Telegram; MC's Discord adapter currently only has channel send).
- K3 — research spike on cabinet-style migration (delete the entire
packages/core/src/auth/ tree and shell out to `claude` CLI). Both
hilash/cabinet and earlyaidopters/claudeclaw-os reviewed; both delete
100% of MC's OAuth refresh code.
Summary
Three bugs were keeping `claude login` + `mc auth bootstrap` from being a reliable recovery path. K1 fixes all three and adds a proactive expiry signal.
- `mc auth bootstrap` always overwrites by default and prints the old → new expiry diff. The earlier silent skip when the file existed was a footgun: users saw "Already bootstrapped" and assumed the rotation had applied, but the stale token stayed in place. `--dry-run` still works and shows the same diff without writing.
- HTTP 400 `invalid_grant` now throws `RefreshTokenRevokedError` immediately instead of falling through the generic non-200 path that silently returned the cached token. Anthropic uses 400 for permanent revocation; the previous code masked it as a transient "network error" until the access token expired 5–10 min later. The body is JSON-parsed (not regex-matched) for a narrow surface: other 400 bodies (e.g. `invalid_request`) still hit the existing transient branch.
- New `'expiring'` `RefresherStatus` value (claudeclaw-os pattern). The daemon maps it to `statusRegistry.setUnverified('claudeCode', …)` plus a WARN log so it shows up in `~/.mc/logs/` before the chat lane hangs.
- `OAuthRefresher.invalidate()` is exposed for future credentials-watch wiring (deferred to its own slice; the existing config-watcher → respawn path still recovers correctly today).
Why this shape
- JSON parse, not regex. A regex could match `error: invalid_grant` anywhere (nested fields, error descriptions). The endpoint returns well-formed JSON; matching the success path's parse approach eliminates the false-positive surface entirely.
- `'expiring'` maps to `setUnverified`, not a new dashboard status. "Still works, but act now" is exactly the existing `'unverified'` semantic. No type-system or render-path changes are needed; a `switch` with an explicit `case 'disabled':` plus `_exhaustive: never` guards future status additions at compile time.
- Single `clock()` snapshot in the fast path. The Efficiency reviewer noted that the original called `clock()` twice in the cached-fresh branch, a stale-snapshot risk under non-injected clocks for what should be the same value.
- Removed the `--force` flag. It is dead now that overwrite is the default. The Quality reviewer flagged the dead `force?` field on `RunAuthBootstrapOpts.flags`.
- Credentials-watch wiring deferred. When `~/.mc/credentials.json` mtime changes, `invalidate()` would skip the respawn round-trip. Held back because (a) the respawn path works today, (b) the watcher's one-shot semantics need a small redesign to support both "DB change → respawn" and "credentials change → invalidate", and (c) the user's reported bug is fully addressed by bugs 1 + 2.
Prior art reviewed
- `hilash/cabinet`: shells out to the `claude` CLI binary and lets the CLI own auth/refresh/keychain. Zero OAuth code.
- `earlyaidopters/claudeclaw-os`: reads `~/.claude/.credentials.json` for monitoring only (~120 lines) and never refreshes; sends a Telegram DM at <2h to expiry. K1's `'expiring'` status mirrors this UX.
K3 (delete `packages/core/src/auth/` entirely and adopt cabinet's pattern) is on the followups; a research spike is planned next week.
Test status
- 5 new tests in `refresher.test.ts` (`RefreshTokenRevokedError` on 400 invalid_grant; fall-through on 400 other; `invalidate()`; `'expiring'` emission; `EXPIRING_THRESHOLD_MS` exported value)
- `auth.test.ts` rewritten: silent-skip / `--force` cases replaced with always-overwrite + diff assertions
- `/simplify` ran with 3 parallel reviewers (Reuse / Quality / Efficiency); 5 real findings, all addressed
Test plan
- Refresher tests (`vitest run packages/core/src/auth/refresher.test.ts`)
- Auth command tests (`vitest run packages/cli/src/commands/auth.test.ts`)
- Full suite (`pnpm test`): 1452/1452
- `RefreshTokenRevokedError` surfaces at boot probe with the actionable "run claude login then mc auth bootstrap" message (verified via existing test fixtures)
- `mc auth bootstrap` with existing creds prints `Replaced … Old expiry: … New expiry: …` lines
Deferred / additive
- K1.5: `DiscordClient.sendDirectMessage(userId, content)`. claudeclaw-os does this via Telegram; MC's Discord adapter currently only has `sendInChannel(channelId, content)`. Adding the DM-by-userId surface is its own small slice (~30 lines + tests). It wires the proactive `'expiring'` alert directly to Joe's DM.
- `RefreshTokenRejectedError` and now also `RefreshTokenRevokedError` are thrown, but the chat-bridge doesn't yet produce a Discord-friendly "re-auth needed" reply. Holding for K2.
- K3: delete `packages/core/src/auth/` entirely. Research spike planned: verify whether the Claude Agent SDK can fall back to the `claude` CLI's auth state when `CLAUDE_CODE_OAUTH_TOKEN` is unset. If yes, delete; if no, weigh the subprocess approach.
- `invalidate()` wiring: split out as its own slice for the reasons above.