feat: ship Slice K2 — chat-lane fail-fast on Claude Code auth errors #89

Merged
joedanz merged 1 commit into main from slice/K2-chat-lane-auth-fail-fast
May 6, 2026

@joedanz joedanz commented May 6, 2026

Summary

When the OAuth refresh token is revoked (HTTP 400 invalid_grant) or rejected (HTTP 401), the chat lane now fails fast within the first SDK roundtrip with an actionable re-auth reply instead of stalling for ~5 minutes. K1 fixed the refresher's classification of these errors; K2 wires the structured failure all the way through to a user-visible Discord reply.
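The fail-fast behaviour can be illustrated with a minimal sketch. It is not the code in this PR: `RefreshTokenRejectedError` and `callWithRetry` are hypothetical names introduced for illustration, and only `RefreshTokenRevokedError`, `AUTH_REQUIRED_ERROR_CTORS`, and `isAuthRequiredError` appear in the description. The idea is that transient failures are retried, while anything classified as "user must re-authenticate" is rethrown on the first attempt.

```typescript
// Sketch only. RefreshTokenRevokedError is named in the PR; the 401 class name
// below is an assumption made for illustration.
class RefreshTokenRevokedError extends Error {} // HTTP 400 invalid_grant
class RefreshTokenRejectedError extends Error {} // HTTP 401 (hypothetical name)

// One list of constructors that mean "user must re-authenticate".
const AUTH_REQUIRED_ERROR_CTORS = [
  RefreshTokenRevokedError,
  RefreshTokenRejectedError,
] as const;

function isAuthRequiredError(err: unknown): boolean {
  return AUTH_REQUIRED_ERROR_CTORS.some((ctor) => err instanceof ctor);
}

// Hypothetical retry wrapper: retries transient failures up to maxAttempts,
// but rethrows auth-required errors immediately instead of looping on them.
async function callWithRetry<T>(
  attempt: () => Promise<T>,
  maxAttempts = 3,
): Promise<T> {
  let lastError: unknown;
  for (let i = 0; i < maxAttempts; i++) {
    try {
      return await attempt();
    } catch (err) {
      if (isAuthRequiredError(err)) throw err; // fail fast, no retry
      lastError = err; // treated as transient: try again
    }
  }
  throw lastError;
}
```

With this shape, a revoked token surfaces after one call rather than after the retry budget is exhausted.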

What changed

  • @mc/core/auth — isAuthRequiredError(err) helper + exported AUTH_REQUIRED_ERROR_CTORS const as the single source of truth for "user must re-authenticate" classification. Used by both sdk-source's rethrow branch and runner's failed-event tagging.
  • @mc/core/agent/sdk-source — broadens the rethrow-don't-retry clause to include RefreshTokenRevokedError. Previously a 400/invalid_grant slipped through to the transient-retry branch and looped until the access token expired.
  • @mc/core/agent/types — AgentEvent.failed.subtype is now AgentFailureSubtype (string union) instead of string. A typo at any callsite is a compile error.
  • @mc/core/agent/runner — catch block tags failed events with subtype: 'auth_required' when the thrown error is an auth-required class.
  • @mc/core/messaging/chat-bridge — exports CHAT_AUTH_REQUIRED_REPLY ("Claude Code authentication expired. Run `claude login`, then `mc auth bootstrap` to restore access."); a private ChatAuthRequiredError carries the auth_required discriminator from the inner switch arm to the outer catch's reply branch.
  • @mc/core/orchestrator — prefixes failureMsg with auth_revoked: when the agent failure is auth_required. Dashboards / CLIs can match the prefix to render an actionable error.
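The path from a thrown error to the user-visible reply can be sketched roughly as follows. This is an illustration under assumptions, not the PR's implementation: the `'unknown'` subtype, `AgentFailedEvent`, `GENERIC_FAILURE_REPLY`, and the three function names are hypothetical, and the exact spacing after the `auth_revoked:` prefix is a guess. Only `AgentFailureSubtype`, `'auth_required'`, `CHAT_AUTH_REQUIRED_REPLY`, and the prefix itself come from the description.

```typescript
// Assumed union members: the PR only names 'auth_required'; 'unknown' is a placeholder.
type AgentFailureSubtype = 'auth_required' | 'unknown';

// Hypothetical shape of a failed agent event.
interface AgentFailedEvent {
  type: 'failed';
  subtype: AgentFailureSubtype; // a misspelled literal here fails to compile
  message: string;
}

// Reply text as quoted in the PR description.
const CHAT_AUTH_REQUIRED_REPLY =
  'Claude Code authentication expired. Run `claude login`, then `mc auth bootstrap` to restore access.';
const GENERIC_FAILURE_REPLY = 'Something went wrong. Please try again.'; // hypothetical

// Runner side: tag the failed event based on an injected classifier.
function toFailedEvent(
  err: unknown,
  isAuthRequired: (e: unknown) => boolean,
): AgentFailedEvent {
  const message = err instanceof Error ? err.message : String(err);
  return {
    type: 'failed',
    subtype: isAuthRequired(err) ? 'auth_required' : 'unknown',
    message,
  };
}

// Chat-bridge side: pick the reply from the subtype. The switch is exhaustive
// over the union, so adding a member without handling it is a compile error.
function chatReplyFor(event: AgentFailedEvent): string {
  switch (event.subtype) {
    case 'auth_required':
      return CHAT_AUTH_REQUIRED_REPLY;
    case 'unknown':
      return GENERIC_FAILURE_REPLY;
  }
}

// Orchestrator side: prefix the failure message for auth failures
// (the space after the colon is an assumption).
function failureMsgFor(event: AgentFailedEvent): string {
  return event.subtype === 'auth_required'
    ? `auth_revoked: ${event.message}`
    : event.message;
}
```

Typing `subtype` as a union rather than `string` is what lets the compiler catch a misspelled `'auth_required'` at every site that produces or consumes the event.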

Test plan

  • Gate 1 — pnpm typecheck clean
  • Gate 2 — pnpm build clean (all 7 workspaces)
  • Gate 3 — biome check . clean (388 files)
  • Gate 4 — pnpm test 1463/1463 passing
  • /simplify ran 3 reviewers in parallel — applied: tightened subtype to typed union (kills 4 stringly-typed sites), shared AUTH_REQUIRED_ERROR_CTORS const between refresher and runner.test.ts, exported CHAT_AUTH_REQUIRED_REPLY from messaging barrel
  • RED→GREEN tests added: 3 parametrized auth-error rethrow tests in sdk-source-401.test.ts, 3 parametrized subtype-tagging tests + 1 negative case in runner.test.ts, 1 friendly-reply test in chat-bridge.test.ts, 1 task-lane prefix test in orchestrator.test.ts
  • Manual smoke: with deliberately revoked token in ~/.mc/credentials.json, send a Discord chat message → reply within ~30s saying re-auth needed (not the generic 5-min stall)

Deferred

  • Structured errorKind field on TaskPatch (would require a schema migration; one writer, zero readers today).
  • Credentials-watch → refresher.invalidate() wiring so re-auth recovery doesn't need a daemon respawn.
  • K3 cabinet-style migration spike — separate slice.

When the OAuth refresh token is revoked or rejected, the chat lane now
fails fast within the first SDK roundtrip with an actionable re-auth
reply instead of stalling for ~5 minutes.

- packages/core/src/auth/refresher.ts: AUTH_REQUIRED_ERROR_CTORS const
  + isAuthRequiredError(err) helper as single source of truth for
  "user must re-authenticate" classification.
- packages/core/src/agent/sdk-source.ts: broadens rethrow-don't-retry
  to also catch RefreshTokenRevokedError (HTTP 400 invalid_grant). Was
  previously masked by the transient-retry branch.
- packages/core/src/agent/types.ts: failed.subtype is now narrowly
  typed as AgentFailureSubtype — typo at any callsite is a compile
  error, not a silent fallthrough.
- packages/core/src/agent/runner.ts: tags failed events with
  subtype 'auth_required' when isAuthRequiredError(err) matches.
- packages/core/src/messaging/chat-bridge.ts: routes auth_required
  failures to CHAT_AUTH_REQUIRED_REPLY ("run claude login then mc auth
  bootstrap") via a private ChatAuthRequiredError carrier class.
- packages/core/src/orchestrator/orchestrator.ts: prefixes failureMsg
  with auth_revoked: so dashboards can render an actionable error.
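
On the consuming side, a dashboard or CLI could match the prefix along these lines. This is a hypothetical sketch: `classifyFailureMsg` and its return shape are not part of this PR, and only the `auth_revoked:` prefix is taken from the commit message.

```typescript
// Prefix written by the orchestrator, per the commit message.
const AUTH_REVOKED_PREFIX = 'auth_revoked:';

// Hypothetical result shape for a consumer of failureMsg.
interface ClassifiedFailure {
  kind: 'auth_revoked' | 'other';
  detail: string;
}

// Split a failureMsg into a kind and the remaining detail text.
function classifyFailureMsg(failureMsg: string): ClassifiedFailure {
  if (failureMsg.startsWith(AUTH_REVOKED_PREFIX)) {
    return {
      kind: 'auth_revoked',
      detail: failureMsg.slice(AUTH_REVOKED_PREFIX.length).trim(),
    };
  }
  return { kind: 'other', detail: failureMsg };
}
```

A consumer could then show re-auth instructions when `kind` is `'auth_revoked'` and fall back to the raw message otherwise.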
@joedanz joedanz merged commit d7055b8 into main May 6, 2026
3 checks passed
@joedanz joedanz deleted the slice/K2-chat-lane-auth-fail-fast branch May 6, 2026 11:50