feat(provider): add native Google Vertex AI Gemini provider #1428
Open
gitdevine wants to merge 6 commits into
Adds a `gemini-vertex` transport that routes requests directly to the Google Vertex AI Gemini API (`generateContent`) instead of going through an OpenAI-compatible shim.
Key additions:
- `src/integrations/vendors/gemini-vertex.ts`: vendor descriptor with ADC auth, regional/global endpoint routing, model catalog
- `src/services/api/geminiVertexClient.ts`: native Vertex client with Anthropic-compatible surface (messages.create, beta.messages.create, withResponse streaming contract); extracts usageMetadata token counts
- `src/utils/geminiAuth.ts`: getGeminiVertexLocation/Model/ProjectId helpers and DEFAULT_GEMINI_VERTEX_MODEL constant
- `src/utils/providerProfile.ts`: buildGeminiVertexProfileEnv(), gemini-vertex branch in buildLaunchEnv()
- `src/utils/providerProfiles.ts`: resolveProfileCompatibility, applyProviderProfileToProcessEnv, isProcessEnvAlignedWithProfile all handle the gemini-vertex case
- `src/services/api/client.ts`: routes to GeminiVertexClient when CLAUDE_CODE_USE_GEMINI_VERTEX=1 is set
- `src/services/api/providerConfig.ts`: exposes vertexProject and vertexLocation in ResolvedProviderRequest
- `src/integrations/generated/integrationArtifacts.generated.ts`: registers gemini-vertex vendor and preset
- UI: provider wizard, ProviderManager, buildCurrentProviderSummary all show/handle the Google Vertex AI Gemini option
- 193 unit tests pass; real Vertex AI call verified (Paris, tokens 15/1)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
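As a rough illustration of the regional/global endpoint routing listed above, the sketch below builds a `generateContent` URL. The helper name `buildVertexGenerateContentUrl` and the `VertexTarget` type are hypothetical and not taken from this PR; the host pattern assumes Google's documented Vertex AI REST layout, where the `global` location uses the un-prefixed `aiplatform.googleapis.com` host.

```typescript
// Hypothetical helper (not from the PR): builds the generateContent URL
// for a regional or global Vertex AI location.
interface VertexTarget {
  project: string;
  location: string; // e.g. "us-central1" or "global"
  model: string; // e.g. "gemini-2.5-pro"
}

function buildVertexGenerateContentUrl(target: VertexTarget): string {
  // Assumption: regional endpoints are prefixed with the location,
  // while the global endpoint uses the bare host.
  const host =
    target.location === "global"
      ? "aiplatform.googleapis.com"
      : `${target.location}-aiplatform.googleapis.com`;

  return (
    `https://${host}/v1` +
    `/projects/${encodeURIComponent(target.project)}` +
    `/locations/${encodeURIComponent(target.location)}` +
    `/publishers/google/models/${encodeURIComponent(target.model)}` +
    `:generateContent`
  );
}
```

The request itself would still need an ADC-derived bearer token in the `Authorization` header, which this sketch leaves out.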
…point only)
gemini-3.5-flash is available on the global Vertex AI endpoint but not on regional ones. Verified live: works with location=global. It is a thinking model that needs roughly 1000 or more output tokens. Also added contextWindow/maxOutputTokens metadata to gemini-2.5-pro.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Adds gemini-vertex to recognized profiles in provider-launch.ts
- Validates that GEMINI_VERTEX_PROJECT is set before launching
- Adds printSummary message for gemini-vertex
- Adds dev:gemini-vertex npm script in package.json
Usage: bun run dev:gemini-vertex
Requires: GEMINI_VERTEX_PROJECT set in env or saved profile
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Build on the native gemini-vertex provider with the user-facing and defensive fixes needed to actually use it end-to-end.
ProviderManager.tsx
- New preset-gemini-vertex-project screen: ask for the Google Cloud project ID before the model step (pre-fills from GOOGLE_CLOUD_PROJECT/GCLOUD_PROJECT/GOOGLE_PROJECT_ID).
- Skip preset-api-key for gemini-vertex (it uses ADC, not an API key).
- Esc navigation flows: choose -> project -> model -> choose.
providerProfiles.ts
- applyProviderProfileToProcessEnv now uses profile.baseUrl as the GEMINI_VERTEX_PROJECT when activating a gemini-vertex profile mid session (matches how startup applies it).
client.ts (intent guard)
- Read the active provider profile and force the Vertex Gemini branch when its provider is 'gemini-vertex', even when CLAUDE_CODE_USE_VERTEX is set in the shell. Conversely, the Anthropic Vertex branch refuses to fire when the active profile is gemini-vertex. Prevents a stale shell `setx CLAUDE_CODE_USE_VERTEX=1` from routing Gemini requests to publishers/anthropic/models/* (404).
providerProfile.ts
- Add CLOUD_ML_REGION and ANTHROPIC_VERTEX_PROJECT_ID to PROFILE_ENV_KEYS so they get cleared on every profile switch, never letting a stale region (e.g. typoed 'glogal') leak across profiles.
geminiVertexClient.ts
- Pass openclaude's system prompt to Vertex as systemInstruction instead of dropping it.
- Raise maxOutputTokens floor to 8192 for thinking-capable models (gemini-3.x, gemini-2.5-pro) so internal thinking doesn't eat the entire budget on short turns.
- Surface every silent-empty-response path as a descriptive error: prompt-level block, MAX_TOKENS with thinking-budget hint, SAFETY/RECITATION/BLOCKLIST refusals, and the empty-text fallback. Without this the empty assistant message is filtered downstream and the user sees nothing.
model.ts
- gemini-vertex cases in getSmallFastModel, getDefaultHaikuModel, getDefaultSonnetModel, getDefaultOpusModel, getDefaultMainLoopModelSetting, getUserSpecifiedModelSetting. Without these the background helpers fell back to claude-* names (which don't exist on Vertex Gemini).
- Defensive guard at the top of firstPartyNameToCanonical.
tokenEstimation.ts
- Defensive guard in getTokenizerConfig so an unexpected undefined model doesn't take the render layer down.
messages.ts
- default branch in normalizeMessages so unrecognised message types no longer leak undefined slots into the result array.
- Defensive guard at the top of isNotEmptyMessage.
provider.tsx
- Legacy /provider wizard kept consistent with the new 3-step flow.
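The maxOutputTokens floor described for geminiVertexClient.ts could look roughly like the sketch below. The function names and the prefix-based model check are assumptions made for illustration; the PR only states that thinking-capable models (gemini-3.x, gemini-2.5-pro) get a floor of 8192.

```typescript
// Floor taken from the commit message; everything else here is illustrative.
const THINKING_MODEL_MIN_OUTPUT_TOKENS = 8192;

// Assumption: thinking-capable models can be recognised by name prefix.
function isThinkingModel(model: string): boolean {
  return model.startsWith("gemini-3") || model.startsWith("gemini-2.5-pro");
}

// Raises the requested budget to the floor for thinking models so that
// internal thinking cannot consume the whole budget on a short turn.
function effectiveMaxOutputTokens(model: string, requested: number): number {
  return isThinkingModel(model)
    ? Math.max(requested, THINKING_MODEL_MIN_OUTPUT_TOKENS)
    : requested;
}
```

With this shape, a request for 321 tokens on gemini-3.5-flash is raised to 8192, while larger budgets and non-thinking models pass through unchanged.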
…Calls
Without tools the Vertex Gemini client could chat but every agentic prompt
(which always declares tools) crashed with finishReason=MALFORMED_FUNCTION_CALL:
the model tried to emit a function call, Vertex had no declared tools to
match it against, and finished with malformed output and no text. The user
saw nothing or a useless error.
This adds round-trip tool support so the same client actually works for
openclaude's agent loop.
Request side
- toVertexSchema(): translate Anthropic input_schema (lowercase JSON
Schema) to Vertex's UPPERCASE OpenAPI-style schema. Recursively walks
properties/items/anyOf and drops fields Vertex rejects ($schema, $ref,
additionalProperties, allOf collapsed to anyOf, etc.) instead of
failing the whole request on one odd field.
- toGeminiTools(): build `tools: [{ functionDeclarations: [...] }]` from
params.tools. Drops tools without an input_schema so server-side
connector tools don't pollute the declaration list.
- toGeminiToolConfig(): map Anthropic tool_choice (auto/any/tool/none)
to Vertex's toolConfig.functionCallingConfig with mode + allowedFunctionNames.
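A minimal sketch of the request-side translation, assuming a small allowlist of pass-through keys. The names ending in `Sketch` are hypothetical stand-ins for toVertexSchema() and toGeminiToolConfig(), and the allowlist contents are an assumption rather than the PR's actual list.

```typescript
type JsonSchema = { [key: string]: unknown };

function isObject(v: unknown): v is JsonSchema {
  return typeof v === "object" && v !== null && !Array.isArray(v);
}

// Assumed allowlist: keys copied through unchanged. Anything else
// ($schema, $ref, additionalProperties, ...) is silently dropped.
const PASS_THROUGH = ["description", "enum", "required", "format", "nullable"];

function toVertexSchemaSketch(schema: JsonSchema): JsonSchema {
  const out: JsonSchema = {};
  const type = schema["type"];
  if (typeof type === "string") out["type"] = type.toUpperCase();
  for (const key of PASS_THROUGH) {
    if (schema[key] !== undefined) out[key] = schema[key];
  }
  const properties = schema["properties"];
  if (isObject(properties)) {
    const translated: JsonSchema = {};
    for (const [name, sub] of Object.entries(properties)) {
      if (isObject(sub)) translated[name] = toVertexSchemaSketch(sub);
    }
    out["properties"] = translated;
  }
  const items = schema["items"];
  if (isObject(items)) out["items"] = toVertexSchemaSketch(items);
  // allOf is collapsed to anyOf, as the commit message describes.
  const anyOf = schema["anyOf"];
  const allOf = schema["allOf"];
  const variants = Array.isArray(anyOf) ? anyOf : Array.isArray(allOf) ? allOf : undefined;
  if (variants) {
    out["anyOf"] = variants.filter(isObject).map((v) => toVertexSchemaSketch(v));
  }
  return out;
}

type ToolChoice =
  | { type: "auto" }
  | { type: "any" }
  | { type: "none" }
  | { type: "tool"; name: string };

interface FunctionCallingConfig {
  mode: "AUTO" | "ANY" | "NONE";
  allowedFunctionNames?: string[];
}

// A specific tool is expressed as mode ANY restricted to one function name.
function toGeminiToolConfigSketch(choice: ToolChoice): { functionCallingConfig: FunctionCallingConfig } {
  if (choice.type === "tool") {
    return { functionCallingConfig: { mode: "ANY", allowedFunctionNames: [choice.name] } };
  }
  const mode = choice.type === "any" ? "ANY" : choice.type === "none" ? "NONE" : "AUTO";
  return { functionCallingConfig: { mode } };
}
```

Dropping unknown keys instead of throwing matches the stated goal of not failing the whole request on one odd field.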
History side (toGeminiContents)
- assistant tool_use blocks → model functionCall parts. Keeps a
toolUseId → name map so the matching tool_result wires up correctly.
- user tool_result blocks → user functionResponse parts using the
remembered function name.
- Stringifies array-shaped tool_result content (each {text} block joined)
so structured tool output reaches the model.
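The history-side mapping above can be sketched as follows. The types are simplified, `toGeminiContentsSketch` is a hypothetical stand-in for toGeminiContents, and both the newline separator and the `unknown_tool` fallback name are assumptions.

```typescript
type AnthropicBlock =
  | { type: "text"; text: string }
  | { type: "tool_use"; id: string; name: string; input: Record<string, unknown> }
  | { type: "tool_result"; tool_use_id: string; content: string | Array<{ type: "text"; text: string }> };

interface AnthropicMessage {
  role: "user" | "assistant";
  content: AnthropicBlock[];
}

type GeminiPart =
  | { text: string }
  | { functionCall: { name: string; args: Record<string, unknown> } }
  | { functionResponse: { name: string; response: { content: string } } };

interface GeminiContent {
  role: "user" | "model";
  parts: GeminiPart[];
}

function toGeminiContentsSketch(messages: AnthropicMessage[]): GeminiContent[] {
  // Remembers which function each tool_use id referred to, so the later
  // tool_result can be sent back under the right function name.
  const nameById = new Map<string, string>();

  return messages.map((msg): GeminiContent => {
    const parts = msg.content.map((block): GeminiPart => {
      if (block.type === "text") {
        return { text: block.text };
      }
      if (block.type === "tool_use") {
        nameById.set(block.id, block.name);
        return { functionCall: { name: block.name, args: block.input } };
      }
      // tool_result: array-shaped content is joined into a single string.
      const text =
        typeof block.content === "string"
          ? block.content
          : block.content.map((c) => c.text).join("\n");
      const name = nameById.get(block.tool_use_id) ?? "unknown_tool";
      return { functionResponse: { name, response: { content: text } } };
    });
    return { role: msg.role === "assistant" ? "model" : "user", parts };
  });
}
```

The id-to-name map is needed because Anthropic's tool_result carries only the tool_use id, while Vertex's functionResponse is keyed by function name.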
Response side
- extractContentBlocks(): build Anthropic-shaped content blocks (text +
tool_use) preserving order. Synthesises stable toolu_vertex_* ids so
the agent loop can wire the follow-up tool_result back.
- GeminiVertexMessage.content now allows tool_use blocks; stop_reason is
'tool_use' on function-call turns so the agent loop runs the tool
instead of treating it as a final reply.
- toAnthropicStream loops over all blocks and emits one
start/delta/stop trio per block (text_delta or input_json_delta).
- Empty-response guards no longer fire on function-call-only turns —
those are valid agent output.
- Added MALFORMED_FUNCTION_CALL to the descriptive error list (it should
no longer fire now that tools are forwarded, but if a custom tool
produces an unparseable schema we surface a useful message).
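For the response side, a simplified sketch of turning Vertex parts into Anthropic-shaped content blocks. `extractContentBlocksSketch` is a hypothetical stand-in for extractContentBlocks(), and the counter-based id suffix is an assumption; the PR only says the ids are stable and start with toolu_vertex_.

```typescript
interface VertexPart {
  text?: string;
  functionCall?: { name: string; args?: Record<string, unknown> };
}

type ContentBlock =
  | { type: "text"; text: string }
  | { type: "tool_use"; id: string; name: string; input: Record<string, unknown> };

interface ExtractedMessage {
  content: ContentBlock[];
  stop_reason: "tool_use" | "end_turn";
}

function extractContentBlocksSketch(parts: VertexPart[]): ExtractedMessage {
  const content: ContentBlock[] = [];
  let toolIndex = 0;

  // Walk the parts in order so text and tool calls keep their sequence.
  for (const part of parts) {
    if (part.functionCall) {
      content.push({
        type: "tool_use",
        id: `toolu_vertex_${toolIndex++}`, // assumed id scheme
        name: part.functionCall.name,
        input: part.functionCall.args ?? {},
      });
    } else if (typeof part.text === "string" && part.text.length > 0) {
      content.push({ type: "text", text: part.text });
    }
  }

  // A turn containing any function call must stop with 'tool_use' so the
  // agent loop runs the tool instead of treating the turn as a final reply.
  const hasToolUse = content.some((block) => block.type === "tool_use");
  return { content, stop_reason: hasToolUse ? "tool_use" : "end_turn" };
}
```

A function-call-only response yields a single tool_use block and no text, which is why the empty-response guards must skip such turns.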
Tests
- New: tools are translated with UPPERCASE types and additionalProperties
is dropped.
- New: a functionCall-only response becomes a tool_use content block with
stop_reason='tool_use'.
- Updated: maxOutputTokens=321 boosted to 8192 for the gemini-3.5-flash
thinking-model floor (existing behaviour, now reflected in the test).
- translate tool schemas via an allowlist (drop propertyNames/$schema/…, const→enum, exclusiveMinimum→minimum) to satisfy Vertex's OpenAPI subset
- round-trip Gemini thinking thoughtSignature through tool_use ids
- clamp thinking-model temperature to 1.0 (Google warns <1.0 degrades/loops)
- keep functionResponse turns pure (split trailing text) so the model does not return an empty STOP response after a tool result
- add a non-sensitive diagnostic to empty-response errors
Provider path tested: gemini-3.5-flash on Vertex (global endpoint), real greeting + Skill tool call.
Checks: bun test geminiVertexClient.test.ts (12 pass), bun run build.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
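Keeping functionResponse turns pure could be done along the lines of the sketch below. The function name is hypothetical, and as a simplification it moves every non-functionResponse part into a follow-up turn, whereas the commit message speaks only of trailing text.

```typescript
type Part =
  | { text: string }
  | { functionResponse: { name: string; response: Record<string, unknown> } };

interface Content {
  role: "user" | "model";
  parts: Part[];
}

// Splits any turn that mixes functionResponse parts with other parts into
// two turns: the pure functionResponse turn first, then the rest.
function splitFunctionResponseTurns(contents: Content[]): Content[] {
  const out: Content[] = [];
  for (const turn of contents) {
    const responses = turn.parts.filter((p) => "functionResponse" in p);
    const others = turn.parts.filter((p) => !("functionResponse" in p));
    if (responses.length > 0 && others.length > 0) {
      out.push({ role: turn.role, parts: responses });
      out.push({ role: turn.role, parts: others });
    } else {
      out.push(turn); // already pure, or contains no tool results
    }
  }
  return out;
}
```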
a16f0bb to
653434a
Compare
|
The failing Proof:
The same commit also has a passing run, i.e. these are flaky: they pass in None of these tests are related to the Gemini Vertex provider. This PR's own |
What
Adds a native Google Vertex AI Gemini provider that calls Vertex's
`publishers/google/models/{model}:generateContent` directly (ADC auth), with a
full Anthropic-shaped request/response + streaming contract, tool calling, and
support for Gemini 3 "thinking" models.
Why
Routing Gemini through the Anthropic Vertex path (`publishers/anthropic`) only
works for Claude models. A native transport lets OpenClaude drive Gemini on
Vertex end-to-end (text + tools), including the `gemini-3.x` / `gemini-2.5-pro`
thinking models.
Highlights / impact
- Native request body (`contents`, `systemInstruction`, `tools`, `toolConfig`, `generationConfig`) + streaming.
- Tool schema translation (UPPERCASE types; `const`→`enum`; `exclusiveMinimum`→`minimum`; drops `propertyNames`/`$schema`/`additionalProperties`/…).
- Thinking-model support: `thoughtSignature` round-trip, `temperature` floored to 1.0 (Google warns <1.0 degrades/loops), `maxOutputTokens` floored so the model can think and answer.
- `functionResponse` turns kept pure (trailing reminder text split into its own turn); otherwise Vertex returns an empty `STOP` response after a tool result.
Providers affected
Only the new `gemini-vertex` first-party path. Existing providers (Anthropic,
OpenAI, Gemini-via-shim, MiniMax, Ollama, …) are untouched; the Anthropic
Vertex (Claude) path is explicitly guarded so Gemini routing can't hijack it.
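The guard between the two Vertex paths might be sketched like this. `selectRoute` and the `Route` type are hypothetical; the environment variable names come from the commit messages, while the "1"-only check and the precedence when both variables are set without an active profile are assumptions.

```typescript
type Route = "gemini-vertex" | "anthropic-vertex" | "default";

function selectRoute(
  env: Record<string, string | undefined>,
  activeProfileProvider?: string,
): Route {
  // The active profile wins: a gemini-vertex profile forces the Gemini
  // branch even if a stale CLAUDE_CODE_USE_VERTEX=1 is set in the shell.
  if (activeProfileProvider === "gemini-vertex") return "gemini-vertex";
  if (env["CLAUDE_CODE_USE_GEMINI_VERTEX"] === "1") return "gemini-vertex";
  // Reached only when the active profile is not gemini-vertex, so the
  // Anthropic Vertex branch cannot fire for a Gemini profile.
  if (env["CLAUDE_CODE_USE_VERTEX"] === "1") return "anthropic-vertex";
  return "default";
}
```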
Tested
- `gemini-3.5-flash` on Vertex (global endpoint): greeting + Skill tool-call round-trip working end-to-end.
- `bun test src/services/api/geminiVertexClient.test.ts` (12 pass)
- `bun test src/services/api/client.test.ts` (20 pass)
- `bun run smoke`
Follow-up / limitations
- `thoughtSignature` is smuggled through the `tool_use` id to survive the Anthropic message round-trip; a dedicated field would be cleaner if the message schema gains one.
- Streaming is synthesised from the completed response rather than incremental SSE from Vertex.
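One way to carry a thoughtSignature inside a tool_use id is sketched below. The id layout, the `_sig_` marker, and both function names are hypothetical, not the PR's actual encoding. The sketch also assumes the signature arrives as a standard base64 string and converts it to the URL-safe alphabet so the id stays within letters, digits, `_` and `-`.

```typescript
const ID_PREFIX = "toolu_vertex_";
const SIG_MARKER = "_sig_"; // hypothetical separator

// Builds an id such as "toolu_vertex_0_sig_<base64url signature>".
function encodeToolUseId(index: number, thoughtSignature?: string): string {
  const base = `${ID_PREFIX}${index}`;
  if (!thoughtSignature) return base;
  const urlSafe = thoughtSignature
    .replace(/\+/g, "-")
    .replace(/\//g, "_")
    .replace(/=+$/, ""); // padding is restored on decode
  return `${base}${SIG_MARKER}${urlSafe}`;
}

// Recovers the original base64 signature, or undefined if none was embedded.
function decodeThoughtSignature(id: string): string | undefined {
  // The numeric prefix cannot contain the marker, so the first match is ours.
  const at = id.indexOf(SIG_MARKER);
  if (at === -1) return undefined;
  const urlSafe = id.slice(at + SIG_MARKER.length);
  const base64 = urlSafe.replace(/-/g, "+").replace(/_/g, "/");
  return base64 + "=".repeat((4 - (base64.length % 4)) % 4);
}
```

Because the id is echoed back verbatim in the matching tool_result, the signature survives the round-trip without any change to the Anthropic message schema.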
🤖 Generated with Claude Code