Fix provider reconnect to open the OAuth flow instead of failing #472
The Reconnect button on settings/providers called POST /:id/auth, whose only check is has_credentials(), which is true whenever an oauth blob merely exists, even if the token is expired or revoked. For OAuth providers (xAI/Grok, Anthropic) this returned success without an auth_url, so the frontend skipped the OAuth link, ran a live usage probe, and surfaced "Re-authenticated, but provider check still fails: xAI OAuth token expired…".

Frontend: route OAuth-backed providers (uses_oauth && !has_api_key) to a new ReconnectProviderModal that drives the real oauthAuthorize -> confirm-code -> oauthCallback flow already used by the add-provider modal. Method indices are pinned to ProviderType::auth_methods() so Anthropic's Pro/Max vs console mode resolves correctly; single-method providers (xAI) auto-start. API-key providers keep the legacy path. The post-auth health probe is factored into a shared helper.

Backend: oauth_callback now updates the existing provider in place when the path id is a known UUID (what Reconnect sends) instead of always inserting a new row, preventing duplicate provider entries on reconnect. The add-provider flow passes a type id (not a UUID) and still falls through to add().
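To make the root cause concrete, here is a minimal Rust sketch, not the project's actual code. The `Provider` and `OAuthBlob` structs, the `expires_at_unix` field, and the helpers `oauth_is_fresh` and `routes_to_oauth_modal` are hypothetical; only `has_credentials`, `uses_oauth`, and `has_api_key` are names taken from the commit message. It shows why a presence-only check reports success for an expired token, and the routing predicate the frontend now applies.

```rust
/// Hypothetical stored OAuth blob; the real schema is not shown in this PR.
pub struct OAuthBlob {
    pub access_token: String,
    pub expires_at_unix: u64,
}

/// Hypothetical provider row.
pub struct Provider {
    pub api_key: Option<String>,
    pub oauth: Option<OAuthBlob>,
}

impl Provider {
    /// Presence-only check: true as soon as any credential blob exists,
    /// regardless of whether the token inside it still works.
    pub fn has_credentials(&self) -> bool {
        self.api_key.is_some() || self.oauth.is_some()
    }

    /// Assumed stricter check based on a local expiry timestamp.
    /// It still cannot detect a token that was revoked server-side.
    pub fn oauth_is_fresh(&self, now_unix: u64) -> bool {
        self.oauth
            .as_ref()
            .map_or(false, |blob| blob.expires_at_unix > now_unix)
    }
}

/// Routing predicate from the commit message: OAuth-backed providers go to
/// the OAuth reconnect modal, everything else keeps the legacy auth call.
pub fn routes_to_oauth_modal(uses_oauth: bool, has_api_key: bool) -> bool {
    uses_oauth && !has_api_key
}
```

Because revocation is invisible to any local check, always driving the full authorize flow for OAuth-backed providers avoids having to decide whether the stored token is still valid.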
Anthropic binds `thinking`/`redacted_thinking` signatures to the exact model that produced them. The proxy rewrites `model` on every forwarded request (fallback chains, default-model changes), so continuing a conversation after the model changed replayed thinking blocks signed by the old model, and Anthropic rejected it with: "`thinking` or `redacted_thinking` blocks in the latest assistant message cannot be modified. These blocks must remain as they were in the original response." (Surfaced after switching the default to claude-opus-4-8 while missions started under opus-4-6/4-7 were resumed.)

Fixes:
- add strip_thinking_blocks(): drop thinking/redacted_thinking from assistant turns, never producing an empty content array
- rewrite_model_for_anthropic_cli_proxy: when the rewritten model differs from the original request model, strip stale thinking before forwarding
- build_anthropic_upstream_request (OpenAI->Anthropic adapter): same model-change strip
- anthropic_content_blocks_from_openai: preserve thinking/redacted_thinking (text + signature) instead of silently dropping them, so same-model replays keep working

Adds unit tests for strip-on-change, keep-on-same-model, and block preservation.
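The strip-on-model-change rule can be sketched as follows. This is an illustrative model, not the proxy's code: the real functions operate on JSON request bodies, whereas the `Block` and `Message` types, the `prepare_for_upstream` helper, and the placeholder string here are assumptions. Substituting a placeholder text block is one possible way to guarantee that a turn's content array is never left empty.

```rust
/// Hypothetical typed view of an Anthropic content block.
#[derive(Debug, Clone, PartialEq)]
pub enum Block {
    Text(String),
    Thinking { text: String, signature: String },
    RedactedThinking { data: String },
}

/// Hypothetical typed view of one conversation turn.
#[derive(Debug, Clone, PartialEq)]
pub struct Message {
    pub role: String,
    pub content: Vec<Block>,
}

/// Drop thinking/redacted_thinking from assistant turns. A turn that held
/// only thinking gets a placeholder text block so its content is not empty.
pub fn strip_thinking_blocks(messages: &mut [Message]) {
    for msg in messages.iter_mut().filter(|m| m.role == "assistant") {
        msg.content.retain(|b| matches!(b, Block::Text(_)));
        if msg.content.is_empty() {
            // Placeholder wording is an assumption for this sketch.
            msg.content.push(Block::Text("(thinking omitted)".to_string()));
        }
    }
}

/// Strip only when the forwarded model differs from the requested one;
/// same-model replays keep their signed thinking blocks untouched.
pub fn prepare_for_upstream(
    messages: &mut [Message],
    original_model: &str,
    rewritten_model: &str,
) {
    if original_model != rewritten_model {
        strip_thinking_blocks(messages);
    }
}
```

Keeping the blocks on same-model replays matters because the signatures are still valid there, so stripping unconditionally would discard usable reasoning context.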
- proxy: strip_thinking_blocks now drops thinking from a thinking-only assistant turn too, substituting a placeholder text block (the previous guard left stale cross-model thinking on such turns -> Anthropic 400)
- ai_providers: oauth_callback resolves the provider type via the store when reconnect passes a UUID, so the row's credentials are actually refreshed instead of keeping expired tokens
- reconnect modal: guard oauthAuthorize against stale/late responses via a monotonic request token (close/switch supersedes in-flight requests)
- reconnect modal: drop the premature success toast; handleReconnectSuccess now owns the success/failure message after the usage probe, so users no longer see "reconnected" + "check still fails" for one action

Adds a proxy unit test for the thinking-only model-switch case.
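The monotonic request token mentioned for the reconnect modal is a general pattern. The modal itself is frontend code, so the Rust sketch below only illustrates the idea; `RequestGate` and its methods are hypothetical names. Each new request takes the next token, and a response is applied only if its token is still the latest one.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical guard that lets only the most recent request apply its result.
pub struct RequestGate {
    latest: AtomicU64,
}

impl RequestGate {
    pub fn new() -> Self {
        Self { latest: AtomicU64::new(0) }
    }

    /// Start a request: bump the counter and return the new token.
    /// Any token handed out earlier is superseded from this point on.
    pub fn begin(&self) -> u64 {
        self.latest.fetch_add(1, Ordering::SeqCst) + 1
    }

    /// Invalidate in-flight requests without starting a new one
    /// (for example when the modal closes or the method is switched).
    pub fn cancel(&self) {
        self.latest.fetch_add(1, Ordering::SeqCst);
    }

    /// A late response should be dropped unless its token is still current.
    pub fn is_current(&self, token: u64) -> bool {
        self.latest.load(Ordering::SeqCst) == token
    }
}
```

A caller would capture the token from `begin()` before awaiting the authorize call and check `is_current(token)` before touching any UI state with the response.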
The model-rewrite strip only covers in-request model changes, but missions can carry thinking blocks in stored history that were produced under an earlier model while the current request already matches the chain model. Those replays still get rejected by Anthropic with "thinking ... blocks ... cannot be modified".

Add a reactive recovery in the proxy chain loop: on a 400 from an Anthropic adapter (OAuth CLI-proxy or direct), if the error body is the stale-thinking rejection, strip all thinking/redacted_thinking from the request, disable extended thinking for that turn, and retry once against the same upstream. Non-thinking 400s and non-Anthropic providers are unaffected.

- anthropic_error_is_stale_thinking(): classify the 400 body
- anthropic_body_drop_thinking_and_disable(): strip + set thinking disabled
- guarded inline retry in the chain loop (mutable upstream_resp/status)
- unit tests for detection and strip/disable
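The classify-then-retry-once flow described above might look roughly like this. The substring match, the `Response` struct, and `send_with_stale_thinking_retry` are assumptions made for illustration; the PR does not show how `anthropic_error_is_stale_thinking` actually inspects the body. The `send` closure stands in for the upstream call, and its boolean argument says whether thinking should be stripped and disabled for that attempt.

```rust
/// Hypothetical minimal upstream response.
pub struct Response {
    pub status: u16,
    pub body: String,
}

/// Assumed classifier: a 400 whose body carries the stale-thinking wording.
/// "redacted_thinking" also contains "thinking", so one check covers both.
pub fn anthropic_error_is_stale_thinking(status: u16, body: &str) -> bool {
    status == 400 && body.contains("thinking") && body.contains("cannot be modified")
}

/// Send once as-is; on the specific rejection, resend exactly once with
/// thinking stripped and disabled. Any other outcome is returned unchanged.
pub fn send_with_stale_thinking_retry<F>(mut send: F) -> Response
where
    F: FnMut(bool) -> Response,
{
    let first = send(false);
    if anthropic_error_is_stale_thinking(first.status, &first.body) {
        // Single guarded retry: the second response is final, even if it fails.
        return send(true);
    }
    first
}
```

Bounding the recovery to one retry keeps an upstream that keeps rejecting the request from looping inside a single chain entry.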
Claude Code's LLM calls go through the external cli-proxy, so the proxy-side thinking strip/retry never sees them. When a resumed claudecode mission replays a session transcript whose thinking blocks were signed under a different model, Anthropic returns 400 "thinking ... cannot be modified" and the mission hard-fails.

Route that error into the existing ResetSessionFresh transport-recovery path:
- is_stale_thinking_error(): detect the rejection in the turn output
- claudecode_transport_recovery_strategy: on stale-thinking, escalate straight to a fresh session (skip same-session resume, which would replay the same rejected blocks); the existing reset path rebuilds context as text and drops the signed thinking, so the turn succeeds.

Adds a unit test.
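A minimal sketch of the escalation rule, under stated assumptions: `ResetSessionFresh` and `is_stale_thinking_error` are names from the commit message, while the `ResumeSameSession` variant, the `choose_recovery` helper, and the substring-based detection are hypothetical. The caller passes in whatever strategy it would otherwise have chosen.

```rust
/// Recovery options; only ResetSessionFresh is named in the commit message.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RecoveryStrategy {
    ResumeSameSession,
    ResetSessionFresh,
}

/// Assumed detection: look for the rejection wording in the turn output.
pub fn is_stale_thinking_error(turn_output: &str) -> bool {
    turn_output.contains("thinking") && turn_output.contains("cannot be modified")
}

/// On a stale-thinking rejection, skip same-session resume entirely, since
/// resuming would replay the same rejected blocks. Otherwise keep the
/// strategy the caller had already selected.
pub fn choose_recovery(turn_output: &str, default: RecoveryStrategy) -> RecoveryStrategy {
    if is_stale_thinking_error(turn_output) {
        RecoveryStrategy::ResetSessionFresh
    } else {
        default
    }
}
```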
- oauth_callback (UUID reconnect): don't clobber stored api_key/oauth with None when the callback produced no fresh credentials (e.g. a failed auth.json sync that still reported success); only replace when fresh creds were actually extracted.
- oauth_callback (UUID reconnect): never fall through to `add` when an existing UUID was targeted; a missing row or failed update now returns an explicit 404/500 instead of inserting a duplicate account for the same OAuth completion.
- upsert_grok_oauth_provider: accept a target_id and prefer that row, so an xAI reconnect updates the clicked row (which the health probe checks) instead of the first enabled OAuth xAI account.
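The first two rules above can be sketched together. This is a simplified model, not the handler itself: the in-memory `HashMap` store, the `Credentials` struct, the `ReconnectError` enum, and `reconnect_update` are hypothetical stand-ins, and the failed-update (500) case is omitted.

```rust
use std::collections::HashMap;

/// Hypothetical credential pair stored on a provider row.
#[derive(Debug, Clone, PartialEq, Default)]
pub struct Credentials {
    pub api_key: Option<String>,
    pub oauth: Option<String>,
}

/// Hypothetical error; NotFound would map to the explicit 404.
#[derive(Debug, PartialEq)]
pub enum ReconnectError {
    NotFound,
}

/// Update the targeted row in place. Returns Ok(true) if credentials were
/// replaced and Ok(false) if the stored ones were kept. Never inserts.
pub fn reconnect_update(
    store: &mut HashMap<String, Credentials>,
    target_id: &str,
    fresh: Option<Credentials>,
) -> Result<bool, ReconnectError> {
    // A missing row is an error, not a cue to add a new account.
    let row = store.get_mut(target_id).ok_or(ReconnectError::NotFound)?;
    match fresh {
        Some(creds) => {
            *row = creds;
            Ok(true)
        }
        // No fresh credentials were extracted: leave the stored ones alone.
        None => Ok(false),
    }
}
```

Returning an error for a missing row keeps the reconnect path from ever growing the provider list, which is the duplicate-account case the second bullet rules out.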
Cursor Bugbot has reviewed your changes and found 2 potential issues.
Reviewed by Cursor Bugbot for commit 51e62f1.
        chain_length,
    });
    client_error_count += 1;
    continue;
CLI proxy 400s trigger cooldown
Medium Severity
For Anthropic OAuth CLI-proxy routing, any HTTP 400 that is not classified as stale-thinking now calls record_entry_failure with ClientError and skips the rest of the chain entry. That path used to fall through to the generic 4xx handler, which intentionally avoids cooldowns. Unrelated validation 400s can temporarily sideline otherwise healthy OAuth accounts.
} else {
    build_anthropic_upstream_request(&body, &entry.model_id, is_stream)
};
base.and_then(|b| anthropic_body_drop_thinking_and_disable(&b))
CLI proxy retry wrong body format
Medium Severity
On a stale-thinking HTTP 400, the OAuth CLI-proxy branch builds a retry from rewrite_model_for_anthropic_cli_proxy (OpenAI /v1/chat/completions JSON) but then passes it through anthropic_body_drop_thinking_and_disable, which injects Anthropic Messages API fields such as top-level thinking. That retry is posted back to the CLI proxy, so the recovery path for Anthropic OAuth CLI routing cannot reliably fix stale thinking blocks.


Problem
On settings/providers, clicking Reconnect for an OAuth provider (xAI/Grok, Anthropic) with an expired/revoked token produced:

"Re-authenticated, but provider check still fails: xAI OAuth token expired…"

instead of opening the OAuth link.
Root cause
The Reconnect button calls POST /api/ai/providers/:id/auth, whose only check is has_credentials(), which returns true whenever an oauth blob merely exists, even if expired/revoked. So the endpoint returned {success: true, auth_url: null}, the frontend skipped opening the link, ran a live usage probe, and surfaced the probe's failure.
Fix
Frontend
- New ReconnectProviderModal that drives the real oauthAuthorize → confirm-code → oauthCallback flow already used by the add-provider modal.
- OAuth-backed providers (uses_oauth && !has_api_key) route to it; API-key providers keep the legacy path.
- Method indices are pinned to ProviderType::auth_methods() ordering (Anthropic Pro/Max vs console mode resolves correctly); single-method providers (xAI) auto-start.
Backend
- oauth_callback updates the existing provider in place when the path id is a known UUID (what Reconnect sends), instead of always inserting a new row. This prevents duplicate provider entries on reconnect. The add-provider flow passes a type id (not a UUID) and still falls through to add().
Testing
Deployed to the dev backend and verified against the live xAI provider stuck in needs_reauth:
- POST /:id/auth: {"success":true,...,"auth_url":null}
- POST /:id/oauth/authorize: accounts.x.ai/oauth2/device?user_code=…; Anthropic → claude.ai/oauth/authorize (Pro/Max) and console.anthropic.com/oauth/authorize (API key)
- cargo check + cargo fmt --all --check clean; backend builds on Linux, deploys to dev, service healthy.
- tsc --noEmit + eslint clean; full Next.js production build passes with /settings/providers present.

Note: the literal button click and the duplicate-row dedup end-to-end were not automated (browser auth gate / would mint real tokens).
Note
Medium Risk
Touches OAuth credential persistence and in-place provider updates (auth-critical), plus Anthropic request rewriting and session reset behavior on the inference path.
Overview
Reconnect on the providers settings page now runs the real OAuth authorize → callback path for OAuth-only providers (xAI, Anthropic, etc.) instead of POST …/auth, which treated expired tokens as “authenticated” and never opened the auth link. A dedicated ReconnectProviderModal mirrors the add-provider OAuth UX (method indices aligned with the backend); post-reconnect health probing is shared via probeProviderHealth.

On the backend, OAuth callback accepts the provider UUID from reconnect and updates that row in place (including xAI Grok upsert by target_id), avoiding duplicate provider rows. API-key reconnect still uses the legacy auth endpoint.

Separately, the Anthropic proxy and mission runner gain handling for stale extended-thinking blocks when the model changes or blocks are replayed: strip thinking on model rewrite, preserve thinking in the OpenAI→Anthropic adapter, one-shot retry with thinking disabled after the specific 400, and Claude Code transport recovery that resets to a fresh session instead of resuming when that error appears in turn output.