Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
28 changes: 28 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -19,6 +19,34 @@ Each slice entry has:

## Released

### Slice K2 — chat-lane fail-fast on Claude Code auth errors · 2026-05-06 · PR (pending)

**Shipped**

- `packages/core/src/auth/refresher.ts` — exported `AUTH_REQUIRED_ERROR_CTORS` (the three error classes that mean "user must re-authenticate") and `isAuthRequiredError(err)` helper. Single source of truth for "is this an unrecoverable auth error" used by both sdk-source and the runner.
- `packages/core/src/agent/sdk-source.ts` — broadened the rethrow-don't-retry clause to also catch `RefreshTokenRevokedError` (HTTP 400 invalid_grant). Previously only `NotBootstrappedError` and `RefreshTokenRejectedError` (401) bypassed the transient-retry path, so a server-side revocation looped through the buffer-and-discard retry until the access token expired.
- `packages/core/src/agent/types.ts` — `failed` event's `subtype?` is now narrowly typed as `AgentFailureSubtype` (`'auth_required' | 'error_max_turns' | 'error_during_execution' | 'error_max_budget_usd' | 'error_max_structured_output_retries'`). A typo at any callsite is now a compile error, not a silent fallthrough to the generic error reply.
- `packages/core/src/agent/runner.ts` — catch block tags `failed` events with `subtype: 'auth_required'` when `isAuthRequiredError(err)` matches. Downstream lanes can branch on the structured tag.
- `packages/core/src/messaging/chat-bridge.ts` — exported `CHAT_AUTH_REQUIRED_REPLY` and added a private `ChatAuthRequiredError` carrier class. When the agent-event stream emits `failed` with `subtype: 'auth_required'`, the user receives `Claude Code authentication expired. Run \`claude login\`, then \`mc auth bootstrap\` to restore access.` instead of the generic `CHAT_ERROR_REPLY`.
- `packages/core/src/orchestrator/orchestrator.ts` — the task lane prefixes `failureMsg` with `auth_revoked: ` when the agent failure is `auth_required`. Dashboard and CLI consumers can match on the prefix to render an actionable error.
- `packages/core/src/messaging/index.ts` — re-exports `CHAT_AUTH_REQUIRED_REPLY` alongside the existing `CHAT_ERROR_REPLY` / `IN_FLIGHT_REPLY`.

**Why it matters**

K1 fixed the OAuth refresher's classification of HTTP 400 `invalid_grant` as a permanent failure, but the chat lane still hung on it: sdk-source's retry loop buffered messages and discarded them, then re-issued query() against the now-broken refresher; the `RefreshTokenRevokedError` wasn't in the rethrow list so it fell through to the transient-retry branch. With K1.5's Discord DM alerting, the user knows auth is broken — but if they sent a chat message before re-authenticating, the bot would silently hang for ~5 minutes before timing out. Now the failure is surfaced within the first SDK roundtrip with an actionable reply.

**Notable code**

- `refresher.ts:54–67` — `AUTH_REQUIRED_ERROR_CTORS` const + `isAuthRequiredError` helper. The const is also consumed by `runner.test.ts` for parametrized `it.each` coverage.
- `sdk-source.ts:115` — replaces a bespoke `instanceof X || instanceof Y` chain with the single helper call. Adding a fourth error class would require zero changes in this file.
- `runner.ts:42–47` — conditional spread (`...(isAuthRequiredError(err) && { subtype: 'auth_required' })`) keeps `exactOptionalPropertyTypes` happy and ensures non-auth errors don't carry the tag.
- `chat-bridge.ts:60–67, 302–305, 366` — the `ChatAuthRequiredError` discriminator hops the auth-tag from the inner switch arm to the outer catch's reply branch in one declarative throw, no boolean flag plumbing.

**Deferred**

- **Structured `errorKind` field on TaskPatch.** /simplify Quality reviewer flagged the `auth_revoked: ` string prefix as a leaky cross-layer contract. For this slice there is exactly one writer (orchestrator) and zero readers; lifting to a structured field would require a schema migration (the `tasks.error` column is persisted) and is over-engineering until the dashboard actually parses it. Revisit when the dashboard renders failed-task error states.
- **Re-auth recovery without daemon respawn.** When the user runs `mc auth bootstrap` after an `auth_required` failure, the running daemon's in-memory cached bundle is stale until the next config-watcher mtime poll triggers `triggerRestart()` → launchd respawn. K1's `invalidate()` method exists for a credentials-watch → invalidate wiring but is not yet wired; deferred to its own follow-up.

### Slice K1.5 — Discord DM on auth `'expiring'` / `'error'` (claudeclaw-os UX parity) · 2026-05-06 · PR (pending)

**Shipped**
Expand Down
2 changes: 1 addition & 1 deletion README.md
Original file line number Diff line number Diff line change
Expand Up @@ -293,7 +293,7 @@ graph TB
| `@mc/core/config` | `resolveConfig(store, registry)` — reads secrets and operational settings from SQLite (`integration_credentials` + `settings`), validates, returns deep-frozen `ResolvedConfig` with discriminated unions on `discord`, `memory.vec`, and `skills` (Slice J2: env-driven `MC_SKILLS_ROOT`, default `~/.claude/skills/`, validated for existence). `pbrainCli: boolean` and `neonCli: boolean` are sibling top-level fields set by `resolvePbrainCli()` and `resolveNeonCli()` respectively (both walk `$PATH` with `X_OK` checks via a private `isOnPath` helper). The Neon gate trusts the CLI's own auth cache (`neonctl auth` → `~/.config/neonctl/`, inherited via `$HOME` by the launchd-spawned daemon) — no parallel `NEON_API_KEY` env var required. Skills availability, pbrain CLI presence, and Neon CLI presence are independent capabilities. Invalid setting values surface via the `StatusRegistry` as `system` errors and fall back to defaults. |
| `@mc/core/git` | `simple-git` ops — branch fork before query, diff stat at completion. |
| `@mc/core/logger` | pino `makeLogger` — structured JSON, child loggers per module, redaction for secrets. |
| `@mc/core/messaging` | `MessagingAdapter` interface (`start({ onInbound, onCommand })`, `stop`, `send`, optional `react`) — every transport adapter implements it; Discord is the first concrete impl as of Slice H4a. `InboundDispatcher` — two-stage `accept(msg)` (sync-fast: single-statement `INSERT OR IGNORE` dedup, conversation upsert, fresh trace span, `inbound.accepted`/`inbound.deduped` event) → `process(msg, traceId)` (re-enters the trace and calls the injected `onProcess` handler — `makeProcessRouter` is the production handler as of Slice I1b). `resolveRouteForScope(channelId, cleanText, deps)` returns a discriminated `RouteResolution` (`explicit-task` / `chat-with-project` / `chat-no-project` / `rejected`); `makeProcessRouter({ orchestrator, chatBridge, resolveProject*, sendReply, adapterId })` dispatches by kind: explicit `work on … in <slug>` → `orchestrator.create()`, free-form text in routed/default-DM channels → chat-bridge with the route's project, free-form text in unrouted DMs → chat-bridge with no project, unknown explicit slug → `sendReply` with the unknown-project hint. `makeChatBridge` runs the chat lane against the same `AgentRunner` the orchestrator uses but with a brainstorming system prompt (read-only `Read`/`Glob`/`Grep` filesystem tools sandboxed via `additionalDirectories: [project.rootPath]`, optional `mc-memory` MCP server for the `memory_search` tool); inserts a `runs` row with NULL `task_id`, captures `started.sessionId` into `conversations.last_session_id`, strips the `<<READY_TO_PROMOTE>>` sentinel and emits `chat.ready_to_promote`, and on a thrown stream error emits `chat.failed` plus a user-visible reply. `makeRunCoordinator()` enforces single-run-per-conversation across both lanes. `makeDoCommandHandler` implements `/do` promotion (refuses on no-active-conversation, null lastSessionId, in-flight run, or unknown explicit slug; otherwise builds a `CreateTaskCommand` whose orchestrator path resumes the SDK session). `makeDispatcherHandlers(dispatcher, { onError })` is the shared handler shape compose + tests both wire. `command(cmd)` switches on `cmd.name`: `/new`/`/clear` close the active conversation and enqueue a `summarize_conversation` job (refused with `REPLY_RESET_REFUSED` when `isRunInFlight` says a run is mid-flight); `/do` calls the wired-in promotion handler and does NOT close the conversation. `stripPrivate(text)` removes `<private>...</private>` regions with depth-tracked nesting; `persistScopedMessage` is the shared assistant/user-row writer used by both lanes (single source of truth for the redaction policy). `trim(messages, budget)` — pure turn-group-atomic trimmer (drops whole atoms oldest-first, never orphans tool_use/tool_result pairs); ships locked-in for the future non-Claude provider slice. |
| `@mc/core/messaging` | `MessagingAdapter` interface (`start({ onInbound, onCommand })`, `stop`, `send`, optional `react`) — every transport adapter implements it; Discord is the first concrete impl as of Slice H4a. `InboundDispatcher` — two-stage `accept(msg)` (sync-fast: single-statement `INSERT OR IGNORE` dedup, conversation upsert, fresh trace span, `inbound.accepted`/`inbound.deduped` event) → `process(msg, traceId)` (re-enters the trace and calls the injected `onProcess` handler — `makeProcessRouter` is the production handler as of Slice I1b). `resolveRouteForScope(channelId, cleanText, deps)` returns a discriminated `RouteResolution` (`explicit-task` / `chat-with-project` / `chat-no-project` / `rejected`); `makeProcessRouter({ orchestrator, chatBridge, resolveProject*, sendReply, adapterId })` dispatches by kind: explicit `work on … in <slug>` → `orchestrator.create()`, free-form text in routed/default-DM channels → chat-bridge with the route's project, free-form text in unrouted DMs → chat-bridge with no project, unknown explicit slug → `sendReply` with the unknown-project hint. `makeChatBridge` runs the chat lane against the same `AgentRunner` the orchestrator uses but with a brainstorming system prompt (read-only `Read`/`Glob`/`Grep` filesystem tools sandboxed via `additionalDirectories: [project.rootPath]`, optional `mc-memory` MCP server for the `memory_search` tool); inserts a `runs` row with NULL `task_id`, captures `started.sessionId` into `conversations.last_session_id`, strips the `<<READY_TO_PROMOTE>>` sentinel and emits `chat.ready_to_promote`, and on a thrown stream error emits `chat.failed` plus a user-visible reply (Slice K2: when the agent failure carries `subtype: 'auth_required'` from the runner's auth-error tagging, the bridge sends `CHAT_AUTH_REQUIRED_REPLY` with the exact recovery commands instead of the generic `CHAT_ERROR_REPLY`). `makeRunCoordinator()` enforces single-run-per-conversation across both lanes. `makeDoCommandHandler` implements `/do` promotion (refuses on no-active-conversation, null lastSessionId, in-flight run, or unknown explicit slug; otherwise builds a `CreateTaskCommand` whose orchestrator path resumes the SDK session). `makeDispatcherHandlers(dispatcher, { onError })` is the shared handler shape compose + tests both wire. `command(cmd)` switches on `cmd.name`: `/new`/`/clear` close the active conversation and enqueue a `summarize_conversation` job (refused with `REPLY_RESET_REFUSED` when `isRunInFlight` says a run is mid-flight); `/do` calls the wired-in promotion handler and does NOT close the conversation. `stripPrivate(text)` removes `<private>...</private>` regions with depth-tracked nesting; `persistScopedMessage` is the shared assistant/user-row writer used by both lanes (single source of truth for the redaction policy). `trim(messages, budget)` — pure turn-group-atomic trimmer (drops whole atoms oldest-first, never orphans tool_use/tool_result pairs); ships locked-in for the future non-Claude provider slice. |
| `@mc/core/observability` | `withRootSpan(traceId, fn)` / `withSpan(fn)` / `withTraceSpan(fn)` — `AsyncLocalStorage`-backed trace propagation. `withTraceSpan` opens a child span when a parent context exists, otherwise a fresh root — used by the orchestrator so dispatcher-rooted runs join the inbound trace and CLI/scheduler runs auto-root. `EventBus.publish` is synchronous, so handlers see the publisher's trace context with zero per-listen plumbing. `EventWriter` (`info` / `warn` / `error` / `denied` / `compacted`) — façade over `AgentEventsRepo` that auto-pulls `traceId`/`spanId`/`parentSpanId` from the current context; throws if called outside any trace as a deliberate boundary. |
| `@mc/core/testing` | `makeOrchestratorIntegration`, `silentLogger`, `captureLogs` — real DataStore + real bus + faked SDK. |
| `@mc/memory` | `MemoryProvider` interface + `memoryProviderContract` shared spec runnable against any provider impl. |
Expand Down
29 changes: 29 additions & 0 deletions packages/core/src/agent/runner.test.ts
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,7 @@ import { mkdtempSync, rmSync, writeFileSync } from 'node:fs';
import { tmpdir } from 'node:os';
import { join } from 'node:path';
import { afterEach, beforeEach, describe, expect, it } from 'vitest';
import { AUTH_REQUIRED_ERROR_CTORS } from '../auth/refresher.js';
import { makeAgentRunnerFromSource } from './runner.js';
import type { RawSdkMessage } from './sdk-stream-adapter.js';
import type { AgentEvent } from './types.js';
Expand Down Expand Up @@ -130,6 +131,34 @@ describe('makeAgentRunnerFromSource', () => {
});
});

it.each(AUTH_REQUIRED_ERROR_CTORS)('tags %s as failed with subtype: auth_required', async (ErrorCtor) => {
async function* failing(): AsyncIterable<RawSdkMessage> {
yield initMsg;
throw new ErrorCtor('hi');
}
const runner = makeAgentRunnerFromSource(() => failing());
const evs = await collect(runner.run({ runId: 'r1', cwd: '/tmp', prompt: 'hi', model: MODEL }));
const last = evs.at(-1);
expect(last?.type).toBe('failed');
if (last?.type === 'failed') {
expect(last.subtype).toBe('auth_required');
}
});

it('non-auth errors do NOT carry subtype: auth_required', async () => {
async function* failing(): AsyncIterable<RawSdkMessage> {
yield initMsg;
throw new Error('something else');
}
const runner = makeAgentRunnerFromSource(() => failing());
const evs = await collect(runner.run({ runId: 'r1', cwd: '/tmp', prompt: 'hi', model: MODEL }));
const last = evs.at(-1);
expect(last?.type).toBe('failed');
if (last?.type === 'failed') {
expect(last.subtype).toBeUndefined();
}
});

it('emits cancelled when abortSignal fires before completion', async () => {
const ac = new AbortController();
async function* slow(): AsyncIterable<RawSdkMessage> {
Expand Down
8 changes: 7 additions & 1 deletion packages/core/src/agent/runner.ts
Original file line number Diff line number Diff line change
@@ -1,3 +1,4 @@
import { isAuthRequiredError } from '../auth/refresher.js';
import { readJsonlMessages } from './replay.js';
import { type RawSdkMessage, SdkStreamAdapter } from './sdk-stream-adapter.js';
import type { AgentEvent, AgentRunInput, AgentRunner } from './types.js';
Expand Down Expand Up @@ -37,10 +38,15 @@ export function makeAgentRunnerFromSource(factory: RawSourceFactory): AgentRunne
for (const ev of adapter.translate(raw)) yield ev;
}
} catch (err) {
const message = err instanceof Error ? err.message : String(err);
// Auth errors propagated from sdk-source → refresher must be tagged so
// downstream lanes (chat-bridge, orchestrator) can surface a re-auth
// prompt instead of a generic "something went wrong" reply.
yield {
type: 'failed',
runId: input.runId,
error: err instanceof Error ? err.message : String(err),
error: message,
...(isAuthRequiredError(err) && { subtype: 'auth_required' }),
};
}
},
Expand Down
32 changes: 32 additions & 0 deletions packages/core/src/agent/sdk-source-401.test.ts
Original file line number Diff line number Diff line change
Expand Up @@ -98,4 +98,36 @@ describe('makeSdkSourceFactory — 401 retry', () => {
expect(queryMock).toHaveBeenCalledTimes(1);
expect(out.find((m) => (m as { type: string; is_error?: boolean }).is_error === true)).toBeDefined();
});

// Resolve the ctor list inside each test because beforeEach calls
// vi.resetModules() — the freshly loaded sdk-source binds against a fresh
// refresher module, so any statically-imported class identity here would
// miss the instanceof check.
it.each([
'NotBootstrappedError',
'RefreshTokenRejectedError',
'RefreshTokenRevokedError',
])('rethrows %s from refresher.ensureFresh — never retries', async (name) => {
queryMock.mockReturnValue(
messages({ type: 'result', subtype: 'success', is_error: true, api_error_status: 401 }),
);
const auth = await import('../auth/refresher.js');
const ErrorCtor = (auth as unknown as Record<string, new (detail?: string) => Error>)[name];
const refresher = refresherStub();
refresher.ensureFresh = vi.fn(async () => {
throw new ErrorCtor('hi');
});
const { makeSdkSourceFactory } = await import('./sdk-source.js');
const factory = makeSdkSourceFactory({ refresher });
let caught: Error | undefined;
try {
for await (const _ of factory({ prompt: 'hi', cwd: '/tmp' } as never)) {
// unused
}
} catch (err) {
caught = err as Error;
}
expect(caught).toBeInstanceOf(ErrorCtor);
expect(queryMock).toHaveBeenCalledTimes(1);
});
});
11 changes: 6 additions & 5 deletions packages/core/src/agent/sdk-source.ts
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
import { query } from '@anthropic-ai/claude-agent-sdk';
import { NotBootstrappedError, type OAuthRefresher, RefreshTokenRejectedError } from '../auth/refresher.js';
import { isAuthRequiredError, type OAuthRefresher } from '../auth/refresher.js';
import type { RecordingSink } from './replay.js';
import type { RawSourceFactory } from './runner.js';
import type { RawSdkMessage } from './sdk-stream-adapter.js';
Expand Down Expand Up @@ -109,10 +109,11 @@ export function makeSdkSourceFactory(opts: MakeSdkSourceFactoryOpts = {}): RawSo
try {
await opts.refresher.ensureFresh();
} catch (err) {
// Non-transient errors (NotBootstrappedError, RefreshTokenRejectedError)
// mean the user MUST act — surface them directly so the agent runner
// can produce a meaningful error rather than another opaque 401.
if (err instanceof NotBootstrappedError || err instanceof RefreshTokenRejectedError) {
// Non-transient auth errors mean the user MUST act — surface them
// directly so the agent runner can produce a meaningful error rather
// than another opaque 401. (HTTP 400/`invalid_grant` → revoked,
// HTTP 401 → rejected, missing creds → not bootstrapped.)
if (isAuthRequiredError(err)) {
throw err;
}
// Transient (network, 5xx): fall through, mark attempted, and let the
Expand Down
20 changes: 19 additions & 1 deletion packages/core/src/agent/types.ts
Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,24 @@

import type { ConversationScope } from '../types/index.js';

/**
* Discriminator on the `failed` event used by downstream consumers (chat-bridge,
* orchestrator, dashboard) to render an actionable error instead of the generic
* "something went wrong" path. Keep this as a string-union so a typo at any
* call-site is a compile error.
*
* `auth_required` is emitted by the runner when sdk-source surfaces an
* unrecoverable auth error (revoked / rejected refresh token, missing creds).
* The other members are passthroughs from the SDK's `result` error message
* subtypes — see `sdk-stream-adapter.ts`'s `RawSdkResultError`.
*/
export type AgentFailureSubtype =
| 'auth_required'
| 'error_max_turns'
| 'error_during_execution'
| 'error_max_budget_usd'
| 'error_max_structured_output_retries';

export type AgentEvent =
| { type: 'started'; runId: string; sessionId: string; model: string }
| { type: 'assistant_text_delta'; runId: string; turn: number; text: string }
Expand Down Expand Up @@ -34,7 +52,7 @@ export type AgentEvent =
usage: TokenUsage;
costUsd?: number;
}
| { type: 'failed'; runId: string; error: string; subtype?: string }
| { type: 'failed'; runId: string; error: string; subtype?: AgentFailureSubtype }
| { type: 'cancelled'; runId: string };

export interface TokenUsage {
Expand Down
Loading
Loading