Fix provider reconnect to open the OAuth flow instead of failing by Th0rgal · Pull Request #472 · Th0rgal/sandboxed.sh

Th0rgal · 2026-05-30T06:34:07Z

Problem

On settings/providers, clicking Reconnect for an OAuth provider (xAI/Grok, Anthropic) with an expired/revoked token produced:

Re-authenticated, but provider check still fails: xAI OAuth token expired; reconnect Grok Build

instead of opening the OAuth link.

Root cause

The Reconnect button calls POST /api/ai/providers/:id/auth, whose only check is has_credentials() — which returns true whenever an oauth blob merely exists, even if expired/revoked. So the endpoint returned {success: true, auth_url: null}, the frontend skipped opening the link, ran a live usage probe, and surfaced the probe's failure.
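The gap between "a credential exists" and "a credential is usable" can be sketched as follows. This is a minimal illustration, not the project's code: the `Provider` and `OAuthBlob` types, the `expires_at` and `revoked` fields, and the `has_usable_oauth` helper are all hypothetical; only the name `has_credentials()` and its presence-only behavior come from the description above.

```rust
/// Hypothetical, simplified stored OAuth credential.
#[derive(Debug, Clone)]
struct OAuthBlob {
    /// Expiry as seconds since the Unix epoch (assumed field).
    expires_at: u64,
    /// True once the provider has revoked the token (assumed field).
    revoked: bool,
}

/// Hypothetical provider record.
#[derive(Debug, Clone, Default)]
struct Provider {
    api_key: Option<String>,
    oauth: Option<OAuthBlob>,
}

impl Provider {
    /// Presence-only check: true whenever any credential is stored,
    /// regardless of whether the OAuth token still works.
    fn has_credentials(&self) -> bool {
        self.api_key.is_some() || self.oauth.is_some()
    }

    /// Hypothetical stricter check: the OAuth token must be present,
    /// not revoked, and not yet expired at time `now`.
    fn has_usable_oauth(&self, now: u64) -> bool {
        matches!(&self.oauth, Some(o) if !o.revoked && o.expires_at > now)
    }
}
```

With an expired token, `has_credentials()` still reports true while the stricter check reports false, which is the mismatch that let the endpoint answer `success` without an `auth_url`.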

Fix

Frontend

New ReconnectProviderModal that drives the real oauthAuthorize → confirm-code → oauthCallback flow already used by the add-provider modal.
OAuth-backed providers (uses_oauth && !has_api_key) route to it; API-key providers keep the legacy path.
Method indices are pinned to the backend ProviderType::auth_methods() ordering (Anthropic Pro/Max vs console mode resolves correctly); single-method providers (xAI) auto-start.
Post-auth health probe factored into a shared helper.
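The routing rule above is small enough to state as code. The sketch below is written in Rust for consistency with the backend, although the real check lives in the dashboard's TypeScript; `ProviderSummary`, `ReconnectPath`, `reconnect_path`, and `auto_start_index` are hypothetical names, and only the `uses_oauth && !has_api_key` condition and the single-method auto-start behavior come from the description.

```rust
/// Hypothetical view of a provider as the settings page sees it.
struct ProviderSummary {
    uses_oauth: bool,
    has_api_key: bool,
}

/// Which reconnect flow the button should start.
#[derive(Debug, PartialEq, Eq)]
enum ReconnectPath {
    /// New modal driving authorize -> confirm-code -> callback.
    OAuthModal,
    /// Legacy POST .../auth path, kept for API-key providers.
    LegacyAuth,
}

/// OAuth-backed providers (OAuth in use and no API key) go to the modal;
/// everything else keeps the legacy path.
fn reconnect_path(p: &ProviderSummary) -> ReconnectPath {
    if p.uses_oauth && !p.has_api_key {
        ReconnectPath::OAuthModal
    } else {
        ReconnectPath::LegacyAuth
    }
}

/// Providers with exactly one auth method start it immediately;
/// otherwise the user has to pick a method first.
fn auto_start_index(method_count: usize) -> Option<usize> {
    if method_count == 1 {
        Some(0)
    } else {
        None
    }
}
```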

Backend

oauth_callback updates the existing provider in place when the path id is a known UUID (what Reconnect sends), instead of always inserting a new row — prevents duplicate provider entries on reconnect. The add-provider flow passes a type id (not a UUID) and still falls through to add().
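A minimal sketch of the update-or-insert decision, assuming an in-memory store keyed by row id. The `Store`, `ProviderRow`, and `Outcome` types are hypothetical, and the `row-N` ids stand in for the real UUIDs; only the rule itself (known id updates in place, anything else falls through to `add()`) is taken from the description.

```rust
use std::collections::HashMap;

/// Hypothetical stored provider row.
#[derive(Debug, Clone, PartialEq, Eq)]
struct ProviderRow {
    provider_type: String,
    token: String,
}

/// Hypothetical in-memory provider store.
#[derive(Default)]
struct Store {
    rows: HashMap<String, ProviderRow>,
    next_id: u32,
}

#[derive(Debug, PartialEq, Eq)]
enum Outcome {
    Updated(String),
    Added(String),
}

impl Store {
    /// Insert a new row and return its id ("row-N" stands in for a UUID).
    fn add(&mut self, provider_type: &str, token: &str) -> String {
        self.next_id += 1;
        let id = format!("row-{}", self.next_id);
        let row = ProviderRow {
            provider_type: provider_type.to_string(),
            token: token.to_string(),
        };
        self.rows.insert(id.clone(), row);
        id
    }

    /// If `path_id` names an existing row (the reconnect case), refresh it
    /// in place; otherwise treat `path_id` as a provider type and add a row.
    fn oauth_callback(&mut self, path_id: &str, fresh_token: &str) -> Outcome {
        if let Some(row) = self.rows.get_mut(path_id) {
            row.token = fresh_token.to_string();
            return Outcome::Updated(path_id.to_string());
        }
        let id = self.add(path_id, fresh_token);
        Outcome::Added(id)
    }
}
```

Reconnecting an existing row leaves the row count unchanged, while the add-provider flow (which passes a type id such as `anthropic`) still creates a new row.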

Testing

Deployed to the dev backend and verified against the live xAI provider stuck in needs_reauth:

Path	Endpoint	Result
Old (bug)	`POST /:id/auth`	`{"success":true,...,"auth_url":null}`
New (fix)	`POST /:id/oauth/authorize`	xAI → `accounts.x.ai/oauth2/device?user_code=…`; Anthropic → `claude.ai/oauth/authorize` (Pro/Max) and `console.anthropic.com/oauth/authorize` (API key)

cargo check + cargo fmt --all --check clean; backend builds on Linux, deploys to dev, service healthy.
tsc --noEmit + eslint clean; full Next.js production build passes with /settings/providers present.
No OAuth callback was completed during testing, so no credentials/provider state changed.

Note: the literal button click and the duplicate-row dedup end-to-end were not automated (browser auth gate / would mint real tokens).

Note

Medium Risk
Touches OAuth credential persistence and in-place provider updates (auth-critical), plus Anthropic request rewriting and session reset behavior on the inference path.

Overview
Reconnect on the providers settings page now runs the real OAuth authorize → callback path for OAuth-only providers (xAI, Anthropic, etc.) instead of POST …/auth, which treated expired tokens as “authenticated” and never opened the auth link. A dedicated ReconnectProviderModal mirrors the add-provider OAuth UX (method indices aligned with the backend); post-reconnect health probing is shared via probeProviderHealth.

On the backend, OAuth callback accepts the provider UUID from reconnect and updates that row in place (including xAI Grok upsert by target_id), avoiding duplicate provider rows. API-key reconnect still uses the legacy auth endpoint.

Separately, the Anthropic proxy and mission runner gain handling for stale extended-thinking blocks when the model changes or blocks are replayed: strip thinking on model rewrite, preserve thinking in the OpenAI→Anthropic adapter, one-shot retry with thinking disabled after the specific 400, and Claude Code transport recovery that resets to a fresh session instead of resuming when that error appears in turn output.

Reviewed by Cursor Bugbot for commit 51e62f1.

The Reconnect button on settings/providers called POST /:id/auth, whose only check is has_credentials(), which is true whenever an oauth blob merely exists, even if the token is expired or revoked. For OAuth providers (xAI/Grok, Anthropic) this returned success without an auth_url, so the frontend skipped the OAuth link, ran a live usage probe, and surfaced "Re-authenticated, but provider check still fails: xAI OAuth token expired…".

Frontend: route OAuth-backed providers (uses_oauth && !has_api_key) to a new ReconnectProviderModal that drives the real oauthAuthorize -> confirm-code -> oauthCallback flow already used by the add-provider modal. Method indices are pinned to ProviderType::auth_methods() so Anthropic's Pro/Max vs console mode resolves correctly; single-method providers (xAI) auto-start. API-key providers keep the legacy path. The post-auth health probe is factored into a shared helper.

Backend: oauth_callback now updates the existing provider in place when the path id is a known UUID (what Reconnect sends) instead of always inserting a new row, preventing duplicate provider entries on reconnect. The add-provider flow passes a type id (not a UUID) and still falls through to add().

vercel · 2026-05-30T06:34:12Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
sandboxed-dashboard	Ready	Preview, Comment	May 30, 2026 2:03pm
sandboxed-sh	Ready	Preview, Comment	May 30, 2026 2:03pm

Anthropic binds `thinking`/`redacted_thinking` signatures to the exact model that produced them. The proxy rewrites `model` on every forwarded request (fallback chains, default-model changes), so continuing a conversation after the model changed replayed thinking blocks signed by the old model. Anthropic rejected it with: "`thinking` or `redacted_thinking` blocks in the latest assistant message cannot be modified. These blocks must remain as they were in the original response." (Surfaced after switching the default to claude-opus-4-8 while missions started under opus-4-6/4-7 were resumed.)

Fixes:
- add strip_thinking_blocks(): drop thinking/redacted_thinking from assistant turns, never producing an empty content array
- rewrite_model_for_anthropic_cli_proxy: when the rewritten model differs from the original request model, strip stale thinking before forwarding
- build_anthropic_upstream_request (OpenAI->Anthropic adapter): same model-change strip
- anthropic_content_blocks_from_openai: preserve thinking/redacted_thinking (text + signature) instead of silently dropping them, so same-model replays keep working

Adds unit tests for strip-on-change, keep-on-same-model, and block preservation.
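The strip-on-model-change idea can be sketched over a simplified message model. The `Block`, `Role`, and `Message` types, the `strip_on_model_change` wrapper, and the placeholder string are hypothetical; the real code operates on JSON request bodies in src/api/proxy.rs. Substituting a placeholder text block for a thinking-only turn is one way to avoid an empty content array, and it matches what a later commit on this branch describes.

```rust
/// Simplified content block (the real proxy works on JSON values).
#[derive(Debug, Clone, PartialEq, Eq)]
enum Block {
    Text(String),
    Thinking { text: String, signature: String },
    RedactedThinking(String),
}

#[derive(Debug, Clone, PartialEq, Eq)]
enum Role {
    User,
    Assistant,
}

#[derive(Debug, Clone, PartialEq, Eq)]
struct Message {
    role: Role,
    content: Vec<Block>,
}

fn is_thinking(b: &Block) -> bool {
    matches!(b, Block::Thinking { .. } | Block::RedactedThinking(_))
}

/// Drop thinking/redacted_thinking from assistant turns. A turn left with
/// no blocks gets a placeholder text block so content is never empty.
fn strip_thinking_blocks(messages: &mut [Message]) {
    for m in messages.iter_mut() {
        if m.role != Role::Assistant {
            continue;
        }
        m.content.retain(|b| !is_thinking(b));
        if m.content.is_empty() {
            // Placeholder wording is an assumption, not the project's text.
            m.content.push(Block::Text("(thinking omitted)".to_string()));
        }
    }
}

/// Strip only when the forwarded model differs from the requested one;
/// same-model replays keep their signed thinking blocks.
fn strip_on_model_change(original: &str, rewritten: &str, messages: &mut [Message]) {
    if original != rewritten {
        strip_thinking_blocks(messages);
    }
}
```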

Th0rgal · 2026-05-30T06:42:17Z

Added commit 05e58c34 (proxy: strip stale thinking/redacted_thinking blocks when the request model changes) onto this branch at Thomas's request, so it ships together with the provider-reconnect fix. It's an isolated change to src/api/proxy.rs only (no overlap with the OAuth-reconnect files). Fixes the Anthropic 400 "thinking blocks ... cannot be modified" that surfaced when resuming missions after the default model switched to claude-opus-4-8. Deployed to prod from this branch (commit 05e58c3).

- proxy: strip_thinking_blocks now drops thinking from a thinking-only assistant turn too, substituting a placeholder text block (the previous guard left stale cross-model thinking on such turns -> Anthropic 400)
- ai_providers: oauth_callback resolves the provider type via the store when reconnect passes a UUID, so the row's credentials are actually refreshed instead of keeping expired tokens
- reconnect modal: guard oauthAuthorize against stale/late responses via a monotonic request token (close/switch supersedes in-flight requests)
- reconnect modal: drop the premature success toast; handleReconnectSuccess now owns the success/failure message after the usage probe, so users no longer see "reconnected" + "check still fails" for one action

Adds a proxy unit test for the thinking-only model-switch case.

The model-rewrite strip only covers in-request model changes, but missions can carry thinking blocks in stored history that were produced under an earlier model while the current request already matches the chain model. Those replays still get rejected by Anthropic with "thinking ... blocks ... cannot be modified".

Add a reactive recovery in the proxy chain loop: on a 400 from an Anthropic adapter (OAuth CLI-proxy or direct), if the error body is the stale-thinking rejection, strip all thinking/redacted_thinking from the request, disable extended thinking for that turn, and retry once against the same upstream. Non-thinking 400s and non-Anthropic providers are unaffected.

- anthropic_error_is_stale_thinking(): classify the 400 body
- anthropic_body_drop_thinking_and_disable(): strip + set thinking disabled
- guarded inline retry in the chain loop (mutable upstream_resp/status)
- unit tests for detection and strip/disable
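The one-shot reactive retry can be sketched with a closure standing in for the upstream call. Everything here is illustrative: `Request`, `Response`, `drop_thinking_and_disable`, and `send_with_stale_thinking_retry` are hypothetical, and the substring match in `anthropic_error_is_stale_thinking` is an assumed heuristic rather than the project's actual classifier.

```rust
/// Minimal stand-in for an upstream HTTP response.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Response {
    status: u16,
    body: String,
}

/// Minimal stand-in for the forwarded request.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Request {
    model: String,
    has_thinking_blocks: bool,
    thinking_enabled: bool,
}

/// Assumed heuristic: a 400 whose body quotes the stale-thinking rejection.
fn anthropic_error_is_stale_thinking(status: u16, body: &str) -> bool {
    status == 400 && body.contains("thinking") && body.contains("cannot be modified")
}

/// Copy of the request with thinking blocks removed and thinking disabled.
fn drop_thinking_and_disable(req: &Request) -> Request {
    Request {
        has_thinking_blocks: false,
        thinking_enabled: false,
        ..req.clone()
    }
}

/// Send once; on the stale-thinking 400, retry exactly once with thinking
/// stripped and disabled. Any other response is returned unchanged.
fn send_with_stale_thinking_retry<F>(req: &Request, mut send: F) -> Response
where
    F: FnMut(&Request) -> Response,
{
    let first = send(req);
    if anthropic_error_is_stale_thinking(first.status, &first.body) {
        let retry = drop_thinking_and_disable(req);
        return send(&retry);
    }
    first
}
```

Because the retry result is returned without re-classification, a second stale-thinking rejection cannot loop: the upstream is called at most twice per request.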

Claude Code's LLM calls go through the external cli-proxy, so the proxy-side thinking strip/retry never sees them. When a resumed claudecode mission replays a session transcript whose thinking blocks were signed under a different model, Anthropic returns 400 "thinking ... cannot be modified" and the mission hard-fails.

Route that error into the existing ResetSessionFresh transport-recovery path:

- is_stale_thinking_error(): detect the rejection in the turn output
- claudecode_transport_recovery_strategy: on stale-thinking, escalate straight to a fresh session (skip same-session resume, which would replay the same rejected blocks); the existing reset path rebuilds context as text and drops the signed thinking, so the turn succeeds.

Adds a unit test.
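A rough sketch of the escalation rule follows. The `RecoveryStrategy` enum, the `prior_attempts` parameter, and the default "resume first, then reset" ladder are assumptions made for illustration; only the stale-thinking shortcut to a fresh session and the names `is_stale_thinking_error` and `ResetSessionFresh` come from the commit message. The substring match is likewise an assumed heuristic.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum RecoveryStrategy {
    /// Resume the same session (replays the stored transcript).
    ResumeSameSession,
    /// Start a fresh session with context rebuilt as plain text.
    ResetSessionFresh,
}

/// Assumed heuristic: look for the rejection wording in the turn output.
fn is_stale_thinking_error(turn_output: &str) -> bool {
    let t = turn_output.to_lowercase();
    t.contains("thinking") && t.contains("cannot be modified")
}

/// Stale-thinking errors skip the resume step, because resuming would
/// replay the same rejected blocks. Other transport errors follow an
/// assumed ladder: resume on the first attempt, reset afterwards.
fn recovery_strategy(turn_output: &str, prior_attempts: u32) -> RecoveryStrategy {
    if is_stale_thinking_error(turn_output) {
        return RecoveryStrategy::ResetSessionFresh;
    }
    if prior_attempts == 0 {
        RecoveryStrategy::ResumeSameSession
    } else {
        RecoveryStrategy::ResetSessionFresh
    }
}
```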

- oauth_callback (UUID reconnect): don't clobber stored api_key/oauth with None when the callback produced no fresh credentials (e.g. a failed auth.json sync that still reported success); only replace when fresh creds were actually extracted.
- oauth_callback (UUID reconnect): never fall through to `add` when an existing UUID was targeted; a missing row or failed update now returns an explicit 404/500 instead of inserting a duplicate account for the same OAuth completion.
- upsert_grok_oauth_provider: accept a target_id and prefer that row, so an xAI reconnect updates the clicked row (which the health probe checks) instead of the first enabled OAuth xAI account.
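The first two rules above can be sketched together. The `Credentials` struct, `CallbackError`, and both function names are hypothetical, and merging field by field is an assumption: the commit message only says stored values are replaced when fresh credentials were actually extracted.

```rust
use std::collections::HashMap;

/// Hypothetical credential pair stored on a provider row.
#[derive(Debug, Clone, PartialEq, Eq, Default)]
struct Credentials {
    api_key: Option<String>,
    oauth: Option<String>,
}

#[derive(Debug, PartialEq, Eq)]
enum CallbackError {
    /// The targeted row does not exist (maps to a 404 in this sketch).
    NotFound,
}

/// Overwrite a stored field only when the callback produced a fresh value,
/// so a `None` from a failed sync never clobbers a stored credential.
fn merge_fresh_credentials(stored: &mut Credentials, fresh: Credentials) {
    if fresh.api_key.is_some() {
        stored.api_key = fresh.api_key;
    }
    if fresh.oauth.is_some() {
        stored.oauth = fresh.oauth;
    }
}

/// Reconnect against an existing row id. A missing row is an error;
/// this path never inserts a new row.
fn reconnect_existing(
    rows: &mut HashMap<String, Credentials>,
    id: &str,
    fresh: Credentials,
) -> Result<(), CallbackError> {
    let row = rows.get_mut(id).ok_or(CallbackError::NotFound)?;
    merge_fresh_credentials(row, fresh);
    Ok(())
}
```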


cursor

Cursor Bugbot has reviewed your changes and found 2 potential issues.


cursor · 2026-05-30T14:05:19Z

+                        chain_length,
+                    });
+                    client_error_count += 1;
+                    continue;


CLI proxy 400s trigger cooldown

Medium Severity

For Anthropic OAuth CLI-proxy routing, any HTTP 400 that is not classified as stale-thinking now calls record_entry_failure with ClientError and skips the rest of the chain entry. That path used to fall through to the generic 4xx handler, which intentionally avoids cooldowns. Unrelated validation 400s can temporarily sideline otherwise healthy OAuth accounts.


cursor · 2026-05-30T14:05:20Z

+                } else {
+                    build_anthropic_upstream_request(&body, &entry.model_id, is_stream)
+                };
+                base.and_then(|b| anthropic_body_drop_thinking_and_disable(&b))


CLI proxy retry wrong body format

Medium Severity

On a stale-thinking HTTP 400, the OAuth CLI-proxy branch builds a retry from rewrite_model_for_anthropic_cli_proxy (OpenAI /v1/chat/completions JSON) but then passes it through anthropic_body_drop_thinking_and_disable, which injects Anthropic Messages API fields such as top-level thinking. That retry is posted back to the CLI proxy, so the recovery path for Anthropic OAuth CLI routing cannot reliably fix stale thinking blocks.



Th0rgal added 2 commits May 30, 2026 15:01

Merge remote-tracking branch 'origin/master' into fix-provider-oauth-reconnect

7a5ad0f

cargo fmt after merge (system.rs adopt_hermes_assistant)

51e62f1

Th0rgal merged commit f41c069 into master May 30, 2026
7 of 10 checks passed


Th0rgal merged 8 commits into master from fix-provider-oauth-reconnect
