feat(provider): add native Google Vertex AI Gemini provider by gitdevine · Pull Request #1428 · Gitlawb/openclaude

gitdevine · 2026-05-29T12:35:19Z

What

Adds a native Google Vertex AI Gemini provider that calls Vertex's
publishers/google/models/{model}:generateContent directly (ADC auth), with a
full Anthropic-shaped request/response + streaming contract, tool calling, and
support for Gemini 3 "thinking" models.
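To make the endpoint shape concrete, here is a minimal sketch of how the `publishers/google/models/{model}:generateContent` URL might be assembled. The helper name `buildGenerateContentUrl` is hypothetical and not taken from the PR. The host layout (an un-prefixed host for `global`, a `{location}-` prefix for regional endpoints) is an assumption based on Vertex AI's public REST conventions. Obtaining the ADC bearer token is out of scope here.

```typescript
// Hypothetical helper (not from the PR): builds the generateContent URL
// for a Gemini model on Vertex AI.
// Assumption: "global" uses the un-prefixed aiplatform.googleapis.com host,
// while regional locations use "{location}-aiplatform.googleapis.com".
function buildGenerateContentUrl(
  project: string,
  location: string,
  model: string,
): string {
  const host =
    location === "global"
      ? "aiplatform.googleapis.com"
      : `${location}-aiplatform.googleapis.com`;
  return (
    `https://${host}/v1/projects/${encodeURIComponent(project)}` +
    `/locations/${encodeURIComponent(location)}` +
    `/publishers/google/models/${encodeURIComponent(model)}:generateContent`
  );
}
```

The request itself would be a POST to this URL with an `Authorization: Bearer <token>` header obtained through ADC.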

Why

Routing Gemini through the Anthropic Vertex path (publishers/anthropic) only
works for Claude models. A native transport lets OpenClaude drive Gemini on
Vertex end-to-end (text + tools), including the gemini-3.x / gemini-2.5-pro
thinking models.

Highlights / impact

- Provider wizard step for project + location (uses ADC, no API key).
- Native client: Anthropic ⇄ Vertex translation (contents,
  systemInstruction, tools, toolConfig, generationConfig) + streaming.
- Tool schemas translated via an allowlist into Vertex's OpenAPI subset
  (UPPERCASE types; const→enum; exclusiveMinimum→minimum; drops
  propertyNames/$schema/additionalProperties/…).
- Gemini "thinking" handling: thoughtSignature round-trip, temperature
  floored to 1.0 (Google warns <1.0 degrades/loops), maxOutputTokens floored
  so the model can think and answer.
- functionResponse turns kept pure (trailing reminder text split into its
  own turn); otherwise Vertex returns an empty STOP response after a tool
  result.
- Explicit, diagnosable empty-response errors (no silent failures).
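The allowlist-based schema translation in the list above can be sketched roughly as follows. This is an illustration, not the PR's `toVertexSchema()`: the name `toVertexSchemaSketch`, the exact set of passthrough keys, and the choice to stringify `const` values are all assumptions.

```typescript
type JsonSchema = { [key: string]: unknown };

function isRecord(value: unknown): value is JsonSchema {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Assumed allowlist: keys copied through unchanged. Anything not handled
// here ($schema, propertyNames, additionalProperties, ...) is dropped.
const PASSTHROUGH_KEYS = [
  "description", "enum", "required", "nullable", "format", "minimum", "maximum",
];

function toVertexSchemaSketch(schema: JsonSchema): JsonSchema {
  const out: JsonSchema = {};
  for (const key of PASSTHROUGH_KEYS) {
    if (schema[key] !== undefined) out[key] = schema[key];
  }
  // JSON Schema uses lowercase type names; Vertex expects UPPERCASE.
  const type = schema["type"];
  if (typeof type === "string") out["type"] = type.toUpperCase();
  // const -> single-value enum (stringified here, which is an assumption).
  const constValue = schema["const"];
  if (constValue !== undefined) out["enum"] = [String(constValue)];
  // exclusiveMinimum -> minimum (the bound becomes inclusive).
  const exclusiveMin = schema["exclusiveMinimum"];
  if (typeof exclusiveMin === "number" && out["minimum"] === undefined) {
    out["minimum"] = exclusiveMin;
  }
  // Recurse into nested object properties and array items.
  const properties = schema["properties"];
  if (isRecord(properties)) {
    const translated: JsonSchema = {};
    for (const [name, child] of Object.entries(properties)) {
      if (isRecord(child)) translated[name] = toVertexSchemaSketch(child);
    }
    out["properties"] = translated;
  }
  const items = schema["items"];
  if (isRecord(items)) out["items"] = toVertexSchemaSketch(items);
  return out;
}
```

Building the output from an allowlist, rather than deleting known-bad keys, means an unfamiliar keyword is silently dropped instead of failing the whole request.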

Providers affected

Only the new gemini-vertex first-party path. Existing providers (Anthropic,
OpenAI, Gemini-via-shim, MiniMax, Ollama, …) are untouched; the Anthropic
Vertex (Claude) path is explicitly guarded so Gemini routing can't hijack it.
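The guard can be pictured as a small routing decision in which the active profile takes precedence over shell flags. The sketch below is hypothetical: `pickTransport` and the `Transport` type are illustrative names. The environment variable names are the ones used in this PR's commit messages, and the relative order of the two environment checks is an assumption.

```typescript
type Transport = "gemini-vertex" | "anthropic-vertex" | "default";

interface RoutingInput {
  // Provider of the active profile, e.g. "gemini-vertex" (may be absent).
  activeProfileProvider?: string;
  env: Record<string, string | undefined>;
}

function pickTransport(input: RoutingInput): Transport {
  const { activeProfileProvider, env } = input;
  // An active gemini-vertex profile wins, even if a stale
  // CLAUDE_CODE_USE_VERTEX=1 is still set in the shell.
  if (activeProfileProvider === "gemini-vertex") return "gemini-vertex";
  // Assumption: the explicit Gemini flag is checked before the Claude flag.
  if (env["CLAUDE_CODE_USE_GEMINI_VERTEX"] === "1") return "gemini-vertex";
  // The Anthropic Vertex branch is only reachable when the active profile
  // is not gemini-vertex.
  if (env["CLAUDE_CODE_USE_VERTEX"] === "1") return "anthropic-vertex";
  return "default";
}
```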

Tested

- Real run: gemini-3.5-flash on Vertex (global endpoint), greeting + Skill
  tool-call round-trip working end-to-end.
- bun test src/services/api/geminiVertexClient.test.ts (12 pass)
- bun test src/services/api/client.test.ts (20 pass)
- bun run smoke

Follow-up / limitations

- thoughtSignature is smuggled through the tool_use id to survive the
  Anthropic message round-trip; a dedicated field would be cleaner if the
  message schema gains one.
- Streaming currently emits the aggregated response as Anthropic stream events
  rather than incremental SSE from Vertex.
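One way to carry a thoughtSignature inside a tool_use id is sketched below. The PR does not show its actual encoding, so everything here is an assumption: the `__sig_` separator, the use of `encodeURIComponent`, and the helper names. Only the `toolu_vertex_` prefix is mentioned in the commit messages.

```typescript
const ID_PREFIX = "toolu_vertex_";
// Hypothetical separator; callId values must not contain it.
const SIG_SEPARATOR = "__sig_";

// Packs an optional thoughtSignature into the tool_use id.
function encodeToolUseId(callId: string, thoughtSignature?: string): string {
  if (!thoughtSignature) return `${ID_PREFIX}${callId}`;
  return `${ID_PREFIX}${callId}${SIG_SEPARATOR}${encodeURIComponent(thoughtSignature)}`;
}

// Recovers the call id and, if present, the original thoughtSignature.
function decodeToolUseId(id: string): { callId: string; thoughtSignature?: string } {
  const body = id.startsWith(ID_PREFIX) ? id.slice(ID_PREFIX.length) : id;
  const at = body.indexOf(SIG_SEPARATOR);
  if (at === -1) return { callId: body };
  return {
    callId: body.slice(0, at),
    thoughtSignature: decodeURIComponent(body.slice(at + SIG_SEPARATOR.length)),
  };
}
```

When the matching tool_result comes back, the client can decode the id and attach the signature to the functionCall part it replays to Vertex.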

🤖 Generated with Claude Code

Adds a `gemini-vertex` transport that routes requests directly to the Google Vertex AI Gemini API (`generateContent`) instead of going through an OpenAI-compatible shim.

Key additions:
- `src/integrations/vendors/gemini-vertex.ts`: vendor descriptor with ADC auth, regional/global endpoint routing, model catalog
- `src/services/api/geminiVertexClient.ts`: native Vertex client with Anthropic-compatible surface (messages.create, beta.messages.create, withResponse streaming contract); extracts usageMetadata token counts
- `src/utils/geminiAuth.ts`: getGeminiVertexLocation/Model/ProjectId helpers and DEFAULT_GEMINI_VERTEX_MODEL constant
- `src/utils/providerProfile.ts`: buildGeminiVertexProfileEnv(), gemini-vertex branch in buildLaunchEnv()
- `src/utils/providerProfiles.ts`: resolveProfileCompatibility, applyProviderProfileToProcessEnv, isProcessEnvAlignedWithProfile all handle the gemini-vertex case
- `src/services/api/client.ts`: routes to GeminiVertexClient when CLAUDE_CODE_USE_GEMINI_VERTEX=1 is set
- `src/services/api/providerConfig.ts`: exposes vertexProject and vertexLocation in ResolvedProviderRequest
- `src/integrations/generated/integrationArtifacts.generated.ts`: registers gemini-vertex vendor and preset
- UI: provider wizard, ProviderManager, buildCurrentProviderSummary all show/handle Google Vertex AI Gemini option
- 193 unit tests pass; real Vertex AI call verified (Paris, tokens 15/1)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
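As an illustration of the usageMetadata extraction mentioned above, the sketch below maps Vertex token counts onto an Anthropic-style usage object. `promptTokenCount` and `candidatesTokenCount` are fields of the Vertex response. The helper name `toAnthropicUsage` and the decision to ignore thinking-token counts are assumptions.

```typescript
interface VertexUsageMetadata {
  promptTokenCount?: number;
  candidatesTokenCount?: number;
}

interface AnthropicUsage {
  input_tokens: number;
  output_tokens: number;
}

// Missing metadata or missing counts fall back to 0 rather than undefined,
// so downstream accounting code never sees a hole.
function toAnthropicUsage(meta: VertexUsageMetadata | undefined): AnthropicUsage {
  return {
    input_tokens: meta?.promptTokenCount ?? 0,
    output_tokens: meta?.candidatesTokenCount ?? 0,
  };
}
```

Under this mapping, the verified call noted above ("tokens 15/1") would be reported as 15 input tokens and 1 output token.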

…point only)

gemini-3.5-flash is available on the global Vertex AI endpoint but not on regional ones. Verified live: works with location=global. It is a thinking model and needs roughly 1000 or more output tokens. Also added contextWindow/maxOutputTokens metadata to gemini-2.5-pro.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

- Adds gemini-vertex to recognized profiles in provider-launch.ts
- Validates that GEMINI_VERTEX_PROJECT is set before launching
- Adds printSummary message for gemini-vertex
- Adds dev:gemini-vertex npm script in package.json

Usage: bun run dev:gemini-vertex
Requires: GEMINI_VERTEX_PROJECT set in env or saved profile

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
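The pre-launch validation could look roughly like the sketch below. `resolveGeminiVertexProject` is a hypothetical name, and giving the environment variable precedence over the saved profile is an assumption. The commit only states that one of the two must supply the project.

```typescript
// Returns the project ID to launch with, or throws a descriptive error
// when neither the environment nor the saved profile provides one.
function resolveGeminiVertexProject(
  env: Record<string, string | undefined>,
  savedProfileProject?: string,
): string {
  const fromEnv = env["GEMINI_VERTEX_PROJECT"]?.trim();
  const fromProfile = savedProfileProject?.trim();
  // Empty strings are treated the same as "not set".
  const project = fromEnv || fromProfile || "";
  if (project === "") {
    throw new Error(
      "GEMINI_VERTEX_PROJECT is not set; export it or save it in a gemini-vertex profile.",
    );
  }
  return project;
}
```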

Build on the native gemini-vertex provider with the user-facing and defensive fixes needed to actually use it end-to-end.

ProviderManager.tsx
- New preset-gemini-vertex-project screen: ask for the Google Cloud project ID before the model step (pre-fills from GOOGLE_CLOUD_PROJECT/GCLOUD_PROJECT/GOOGLE_PROJECT_ID).
- Skip preset-api-key for gemini-vertex (it uses ADC, not an API key).
- Esc navigation flows: choose -> project -> model -> choose.

providerProfiles.ts
- applyProviderProfileToProcessEnv now uses profile.baseUrl as the GEMINI_VERTEX_PROJECT when activating a gemini-vertex profile mid-session (matches how startup applies it).

client.ts (intent guard)
- Read the active provider profile and force the Vertex Gemini branch when its provider is 'gemini-vertex', even when CLAUDE_CODE_USE_VERTEX is set in the shell. Conversely, the Anthropic Vertex branch refuses to fire when the active profile is gemini-vertex. Prevents a stale shell setx CLAUDE_CODE_USE_VERTEX=1 from routing Gemini requests to publishers/anthropic/models/* (404).

providerProfile.ts
- Add CLOUD_ML_REGION and ANTHROPIC_VERTEX_PROJECT_ID to PROFILE_ENV_KEYS so they get cleared on every profile switch, never letting a stale region (e.g. typoed 'glogal') leak across profiles.

geminiVertexClient.ts
- Pass openclaude's system prompt to Vertex as systemInstruction instead of dropping it.
- Raise maxOutputTokens floor to 8192 for thinking-capable models (gemini-3.x, gemini-2.5-pro) so internal thinking doesn't eat the entire budget on short turns.
- Surface every silent-empty-response path as a descriptive error: prompt-level block, MAX_TOKENS with thinking-budget hint, SAFETY/RECITATION/BLOCKLIST refusals, and the empty-text fallback. Without this the empty assistant message is filtered downstream and the user sees nothing.

model.ts
- gemini-vertex cases in getSmallFastModel, getDefaultHaikuModel, getDefaultSonnetModel, getDefaultOpusModel, getDefaultMainLoopModelSetting, getUserSpecifiedModelSetting. Without these the background helpers fell back to claude-* names (which don't exist on Vertex Gemini).
- Defensive guard at the top of firstPartyNameToCanonical.

tokenEstimation.ts
- Defensive guard in getTokenizerConfig so an unexpected undefined model doesn't take the render layer down.

messages.ts
- default branch in normalizeMessages so unrecognised message types no longer leak undefined slots into the result array.
- Defensive guard at the top of isNotEmptyMessage.

provider.tsx
- Legacy /provider wizard kept consistent with the new 3-step flow.
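The thinking-model floors described for geminiVertexClient.ts can be sketched as below. The 8192 and 1.0 values and the model families (gemini-3.x, gemini-2.5-pro) come from this PR. The prefix-matching rule, the helper names, and the choice to leave an unset temperature alone are assumptions.

```typescript
interface GenerationSettings {
  maxOutputTokens?: number;
  temperature?: number;
}

const THINKING_MAX_OUTPUT_FLOOR = 8192;
const THINKING_TEMPERATURE_FLOOR = 1.0;

// Assumption: thinking-capable models are detected by name prefix.
function isThinkingModel(model: string): boolean {
  return model.startsWith("gemini-3") || model.startsWith("gemini-2.5-pro");
}

function applyThinkingFloors(
  model: string,
  requested: GenerationSettings,
): GenerationSettings {
  const result: GenerationSettings = { ...requested };
  if (!isThinkingModel(model)) return result;
  // Leave enough budget for internal thinking plus the visible answer.
  result.maxOutputTokens = Math.max(
    requested.maxOutputTokens ?? 0,
    THINKING_MAX_OUTPUT_FLOOR,
  );
  // Only raise an explicitly requested temperature; an unset one is left
  // to the API default.
  if (requested.temperature !== undefined) {
    result.temperature = Math.max(requested.temperature, THINKING_TEMPERATURE_FLOOR);
  }
  return result;
}
```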

…Calls

Without tools the Vertex Gemini client could chat but every agentic prompt (which always declares tools) crashed with finishReason=MALFORMED_FUNCTION_CALL: the model tried to emit a function call, Vertex had no declared tools to match it against, and finished with malformed output and no text. The user saw nothing or a useless error. This adds round-trip tool support so the same client actually works for openclaude's agent loop.

Request side
- toVertexSchema(): translate Anthropic input_schema (lowercase JSON Schema) to Vertex's UPPERCASE OpenAPI-style schema. Recursively walks properties/items/anyOf and drops fields Vertex rejects ($schema, $ref, additionalProperties, allOf collapsed to anyOf, etc.) instead of failing the whole request on one odd field.
- toGeminiTools(): build `tools: [{ functionDeclarations: [...] }]` from params.tools. Drops tools without an input_schema so server-side connector tools don't pollute the declaration list.
- toGeminiToolConfig(): map Anthropic tool_choice (auto/any/tool/none) to Vertex's toolConfig.functionCallingConfig with mode + allowedFunctionNames.

History side (toGeminiContents)
- assistant tool_use blocks → model functionCall parts. Keeps a toolUseId → name map so the matching tool_result wires up correctly.
- user tool_result blocks → user functionResponse parts using the remembered function name.
- Stringifies array-shaped tool_result content (each {text} block joined) so structured tool output reaches the model.

Response side
- extractContentBlocks(): build Anthropic-shaped content blocks (text + tool_use) preserving order. Synthesises stable toolu_vertex_* ids so the agent loop can wire the follow-up tool_result back.
- GeminiVertexMessage.content now allows tool_use blocks; stop_reason is 'tool_use' on function-call turns so the agent loop runs the tool instead of treating it as a final reply.
- toAnthropicStream loops over all blocks and emits one start/delta/stop trio per block (text_delta or input_json_delta).
- Empty-response guards no longer fire on function-call-only turns; those are valid agent output.
- Added MALFORMED_FUNCTION_CALL to the descriptive error list (it should no longer fire now that tools are forwarded, but if a custom tool produces an unparseable schema we surface a useful message).

Tests
- New: tools are translated with UPPERCASE types and additionalProperties is dropped.
- New: a functionCall-only response becomes a tool_use content block with stop_reason='tool_use'.
- Updated: maxOutputTokens=321 boosted to 8192 for the gemini-3.5-flash thinking-model floor (existing behaviour, now reflected in the test).
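The tool_choice mapping described for toGeminiToolConfig() might look like the sketch below. The function name `toToolConfigSketch` and the type names are illustrative, and mapping a named tool to mode ANY restricted by allowedFunctionNames is an assumption about how the PR implements it.

```typescript
type AnthropicToolChoice =
  | { type: "auto" }
  | { type: "any" }
  | { type: "none" }
  | { type: "tool"; name: string };

interface VertexToolConfig {
  functionCallingConfig: {
    mode: "AUTO" | "ANY" | "NONE";
    allowedFunctionNames?: string[];
  };
}

function toToolConfigSketch(
  choice: AnthropicToolChoice | undefined,
): VertexToolConfig | undefined {
  // No tool_choice: omit toolConfig and let Vertex use its default.
  if (!choice) return undefined;
  switch (choice.type) {
    case "auto":
      return { functionCallingConfig: { mode: "AUTO" } };
    case "any":
      return { functionCallingConfig: { mode: "ANY" } };
    case "none":
      return { functionCallingConfig: { mode: "NONE" } };
    case "tool":
      // Force a call, restricted to the single named function.
      return {
        functionCallingConfig: { mode: "ANY", allowedFunctionNames: [choice.name] },
      };
  }
}
```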

- translate tool schemas via an allowlist (drop propertyNames/$schema/…, const→enum, exclusiveMinimum→minimum) to satisfy Vertex's OpenAPI subset
- round-trip Gemini thinking thoughtSignature through tool_use ids
- clamp thinking-model temperature to 1.0 (Google warns <1.0 degrades/loops)
- keep functionResponse turns pure (split trailing text) so the model does not return an empty STOP response after a tool result
- add a non-sensitive diagnostic to empty-response errors

Provider path tested: gemini-3.5-flash on Vertex (global endpoint), real greeting + Skill tool call.
Checks: bun test geminiVertexClient.test.ts (12 pass), bun run build.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
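Keeping functionResponse turns pure can be sketched as splitting one mixed turn into two. The types below are simplified and `splitFunctionResponseTurn` is a hypothetical name. The sketch assumes, as the commit message describes, that the extra text trails the tool results.

```typescript
type Part =
  | { text: string }
  | { functionResponse: { name: string; response: Record<string, unknown> } };

interface Content {
  role: "user" | "model";
  parts: Part[];
}

// Splits a turn that mixes functionResponse parts with other parts into
// a pure functionResponse turn followed by a turn holding the rest.
// Turns that are already pure are returned unchanged.
function splitFunctionResponseTurn(turn: Content): Content[] {
  const responses = turn.parts.filter((part) => "functionResponse" in part);
  const others = turn.parts.filter((part) => !("functionResponse" in part));
  if (responses.length === 0 || others.length === 0) return [turn];
  return [
    { role: turn.role, parts: responses },
    { role: turn.role, parts: others },
  ];
}
```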

openbrandev · 2026-05-29T13:34:22Z

The failing smoke-and-tests check is pre-existing flakiness on main, not
introduced by this PR.

Proof: main's own PR Checks run at the exact commit this branch is based on
(70b4b07), with none of my changes, fails on the same four tests:
https://github.com/Gitlawb/openclaude/actions/runs/26583255103

getAttributionTexts > preserves includeCoAuthoredBy true as an explicit old-default opt-in
active auto-compact cooldown blocks before model call with cooldown guidance
auto-compact cooldown tracking is carried into the next query call
breaker metadata tracking callback publishes a fresh object

The same commit also has a passing run, i.e. these are flaky: they pass in
isolation but fail when the whole suite runs in one process
(bun test --max-concurrency=1), which points to cross-test state leakage
(shared process.env / the shared mutation lock), not a real regression.

None of these tests are related to the Gemini Vertex provider. This PR's own
tests pass (geminiVertexClient.test.ts, plus the gemini-vertex cases in
client.test.ts).

gitdevine and others added 6 commits May 29, 2026 14:32

gitdevine force-pushed the feat/native-gemini-vertex branch from a16f0bb to 653434a Compare May 29, 2026 12:58

gitdevine closed this May 29, 2026

gitdevine reopened this May 29, 2026

gitdevine wants to merge 6 commits into Gitlawb:main from gitdevine:feat/native-gemini-vertex
