v1.15.3: P10 defensive integration test for rescue payload contract #55
Merged
Adds env-gated integration smoke test in test/integration/real-gemini.smoke.test.ts
that exercises the EXACT v1.14.4 G2-fix payload shape against the live
Gemini API:
toolConfig: { functionCallingConfig: { mode: NONE } }, no `tools`.
Closes the last v1.14.4-cycle followup (gemini-chat F4 Round-1: unit
test verifies request SHAPE only). Per Google docs: "NONE: equivalent
to sending a request without any function declarations." If Google
flips the contract, this test fails loudly during release validation
before customer impact.
Test uses a single-turn plain-text conversation (not synthetic
multi-turn tool-call history) because Google's API now rejects
unsigned synthetic functionCall parts (thought_signature contract).
Production rescue paths replay GENUINE conversations built by prior
loop iterations — their tool calls are correctly signed.
Coverage: 756 pass | 9 skipped (was 755). +1 integration test (skipped
without GEMINI_API_KEY; runs locally when it is set).
Pure test addition. No API/schema/code change. dist/ byte-identical
to v1.15.2 except version string.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
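The payload shape described above can be sketched as follows. This is a minimal illustration, not the repository's code: the type names, `buildRescueRequest`, and `hasRescueShape` are hypothetical, and nesting `toolConfig` under a `config` key is an assumption about how the request is assembled.

```typescript
// Hypothetical local types mirroring the fields named in this PR.
// They are NOT imported from any SDK.
type FunctionCallingMode = 'AUTO' | 'ANY' | 'NONE';

interface TextPart {
  text: string;
}

interface UserTurn {
  role: 'user' | 'model';
  parts: TextPart[];
}

interface RescueRequest {
  model: string;
  contents: UserTurn[];
  // Assumption: toolConfig lives under `config`. There is deliberately
  // no `tools` field here; that omission is the contract under test.
  config: {
    toolConfig: { functionCallingConfig: { mode: FunctionCallingMode } };
  };
}

// Build a single-turn, plain-text request in the rescue shape.
function buildRescueRequest(model: string, prompt: string): RescueRequest {
  return {
    model,
    contents: [{ role: 'user', parts: [{ text: prompt }] }],
    config: { toolConfig: { functionCallingConfig: { mode: 'NONE' } } },
  };
}

// Shape check: mode is NONE and no `tools` key is present in config.
function hasRescueShape(req: RescueRequest): boolean {
  const mode = req.config.toolConfig.functionCallingConfig.mode;
  return mode === 'NONE' && !('tools' in req.config);
}
```

A check like `hasRescueShape` only verifies request shape, which is what the existing unit test covers; the integration test adds the step of sending such a request to the live API.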
Pull request overview
Adds a new env-gated integration smoke test to defensively pin the Gemini API contract for the ask_agentic forced-finalization “rescue” payload shape (toolConfig.functionCallingConfig.mode = NONE with no tools field), plus bumps the package version to v1.15.3 and records the release in the changelog.
Changes:
- Add a new live-API integration test asserting `generateContent` accepts `toolConfig: NONE` while omitting `tools`.
- Bump versions to `1.15.3` (`package.json`, `server.json`).
- Document the release in `CHANGELOG.md`.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| test/integration/real-gemini.smoke.test.ts | Adds the new env-gated smoke test covering the “no tools + NONE” contract. |
| CHANGELOG.md | Adds the 1.15.3 release entry describing the new defensive integration test. |
| package.json | Version bump to 1.15.3. |
| server.json | Version bump to 1.15.3 for the MCP server manifest. |
Comment on lines +200 to +216
// P10 (v1.15.3): defensive smoke test for ask_agentic's forced-finalization
// rescue payload contract. v1.14.4 review surfaced 2-of-3 reviewers
// claiming `toolConfig: { mode: NONE }` without `tools` would 400 — refuted
// empirically by 5+ successful rescue runs. The unit test for the rescue
// config (G2 fix in ask-agentic.test.ts:782+) verifies request SHAPE only;
// it can't catch a future Google API contract regression.
//
// This integration test exercises the EXACT payload shape the rescue uses
// against the live API. CI without the GEMINI_API_KEY env skips silently;
// local / release-validation runs catch a contract flip before customer
// impact. Per Google docs (ai.google.dev/gemini-api/docs/function-calling):
// "NONE: equivalent to sending a request without any function declarations."
it('forced-finalization rescue payload (toolConfig: NONE without tools) is accepted (P10/v1.15.3)', async () => {
  const resolved = await resolveModel('latest-flash', client);
  // The load-bearing claim under test: Gemini API accepts a `generateContent`
  // request with `toolConfig: { mode: NONE }` AND no `tools` field. v1.14.4
  // review surfaced 2-of-3 reviewers (gemini-cli + gemini-chat) claiming this
Comment on lines +204 to +205
// config (G2 fix in ask-agentic.test.ts:782+) verifies request SHAPE only;
// it can't catch a future Google API contract regression.
Comment on lines +12 to +16
Closes the last queued v1.14.4 review followup: gemini-chat F4 (Round-1) flagged that the unit test for the rescue config (G2 fix at `test/unit/ask-agentic.test.ts:782+`) verifies request SHAPE only and would not catch a future Google API contract regression on the rescue payload. v1.15.3 adds an env-gated integration smoke test that exercises the EXACT load-bearing payload shape against the live API.

- **New test case** in `test/integration/real-gemini.smoke.test.ts`: *"forced-finalization rescue payload (toolConfig: NONE without tools) is accepted (P10/v1.15.3)"*. Skips silently in CI (no `GEMINI_API_KEY`) so the standard `npm test` run is unaffected; runs locally / during release validation when the key is present.
- **What it pins**: the v1.14.4 G2 contract, namely that `generateContent` with `toolConfig: { functionCallingConfig: { mode: NONE } }` AND no `tools` field MUST be accepted by the live API. Per Google docs (`ai.google.dev/gemini-api/docs/function-calling`): *"NONE: equivalent to sending a request without any function declarations."* If Google ever flips this contract, CI on a release-validation run catches the regression before customer impact.
- **Single-turn conversation shape** (not multi-turn synthetic tool-call history): Google's API recently started rejecting synthetic `functionCall` parts without a `thought_signature` (per `ai.google.dev/gemini-api/docs/thought-signatures`, replay-safety contract). Production rescue paths replay GENUINE accumulated conversations built by prior loop iterations, so their tool-call parts are correctly signed. The smoke test uses a plain user-text turn instead, which is sufficient to exercise the load-bearing toolConfig + no-tools claim without the synthetic-history signing complication.
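To make the signing concern in the last bullet concrete, here is a hedged sketch of a pre-flight check that flags `functionCall` parts lacking a signature. The types and `findUnsignedFunctionCalls` are hypothetical, the camelCase field name `thoughtSignature` is an assumption about how the signature appears on a part, and the rule "every functionCall part must be signed" is a simplification; the live API's actual requirement may be narrower.

```typescript
// Hypothetical minimal shapes for conversation history; not SDK types.
interface HistoryPart {
  text?: string;
  functionCall?: { name: string; args?: Record<string, unknown> };
  // Assumed field name for the signature attached to a model-emitted part.
  thoughtSignature?: string;
}

interface HistoryTurn {
  role: string;
  parts: HistoryPart[];
}

interface UnsignedHit {
  turn: number; // index into the history array
  part: number; // index into that turn's parts array
}

// Simplified rule: report every functionCall part with no signature.
// Synthetic (hand-built) tool-call history would typically trip this check,
// while history replayed from genuine model output should not.
function findUnsignedFunctionCalls(history: HistoryTurn[]): UnsignedHit[] {
  const hits: UnsignedHit[] = [];
  history.forEach((turn, turnIndex) => {
    turn.parts.forEach((part, partIndex) => {
      const isCall = part.functionCall !== undefined;
      const isSigned =
        typeof part.thoughtSignature === 'string' && part.thoughtSignature.length > 0;
      if (isCall && !isSigned) {
        hits.push({ turn: turnIndex, part: partIndex });
      }
    });
  });
  return hits;
}
```

A plain user-text turn contains no `functionCall` parts at all, so it passes this check trivially, which is the reason the smoke test can avoid the signing question.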
Summary
Closes the last v1.14.4-cycle followup. Adds an env-gated integration smoke test that exercises the v1.14.4 G2-fix payload shape (`toolConfig: { mode: NONE }` + no `tools`) against the live Gemini API. Catches a future Google API contract regression on this specific shape before customer impact.
Scope
Testing
- 756 passed | 9 skipped (was 755 in v1.15.2; +1 integration test).
- Skipped without `GEMINI_API_KEY` (CI-safe). Runs locally with `GEMINI_API_KEY=AIza... npm run test:integration`.
- Pure additive test; `dist/` byte-identical to v1.15.2 except version string.
- Pre-publish audit: clean.
- Unit tests added/updated (no; purely integration)
- Integration tests added/updated
- `npm run lint` passes
- `npm run typecheck` passes
- `npm run test` passes
Backwards compatibility
- `tools/list` returns same field shapes.
🤖 Generated with Claude Code
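The env gating described under Testing could look roughly like the sketch below. It is an illustration under stated assumptions rather than the suite's actual code: `liveGeminiKey` and `integrationMode` are hypothetical helpers, and the environment is passed in as a plain record so the logic can be exercised without touching the real process environment.

```typescript
// A plain record standing in for the process environment.
type Env = Record<string, string | undefined>;

// Return the trimmed key if one is set and non-empty, otherwise undefined.
function liveGeminiKey(env: Env): string | undefined {
  const key = env['GEMINI_API_KEY']?.trim();
  return key ? key : undefined;
}

// Decide whether live-API tests should run or be skipped.
function integrationMode(env: Env): 'run' | 'skip' {
  return liveGeminiKey(env) !== undefined ? 'run' : 'skip';
}
```

In a test runner that exposes `describe.skip` (Vitest and Jest both do), the result can select the suite function, for example `const maybeDescribe = integrationMode(process.env) === 'run' ? describe : describe.skip;`. Treating a whitespace-only value as missing is a design choice of this sketch, made so that an accidentally blank variable does not trigger a live call with an unusable key.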