v1.15.3 — P10 defensive integration test for rescue payload contract#55

Merged
qmt merged 1 commit into main from v1.15.3-integration-smoke-test
Apr 30, 2026

@qmt qmt commented Apr 30, 2026

Summary

Closes the last v1.14.4-cycle followup. Adds an env-gated integration smoke test that exercises the v1.14.4 G2-fix payload shape (toolConfig: { functionCallingConfig: { mode: NONE } } with no tools field) against the live Gemini API. The test catches a future Google API contract regression on this specific shape before customer impact.
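
For readers unfamiliar with the shape, here is a minimal sketch of what such a request body could look like, assuming the REST-style `generateContent` JSON layout. The `buildRescueRequest` helper, the `RescueRequest` type, and the prompt text are hypothetical names for illustration, not the project's actual code.

```typescript
// Hypothetical types for illustration; field names follow the shape quoted above.
type Part = { text: string };
type Content = { role: 'user' | 'model'; parts: Part[] };

interface RescueRequest {
  contents: Content[];
  toolConfig: { functionCallingConfig: { mode: 'NONE' } };
  // Deliberately no `tools` field: that omission is the contract under test.
}

// Builds a single-turn request that disables function calling and omits `tools`.
function buildRescueRequest(prompt: string): RescueRequest {
  return {
    contents: [{ role: 'user', parts: [{ text: prompt }] }],
    toolConfig: { functionCallingConfig: { mode: 'NONE' } },
  };
}

const request = buildRescueRequest('Reply with the single word: ok');
console.log(JSON.stringify(request, null, 2));
```

The point of the sketch is only that `toolConfig` is present while `tools` is absent; how the real test hands this to the client is not shown here.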

Scope

  • Bug fix (defensive test infrastructure)
  • New feature
  • Refactor
  • Documentation
  • Breaking change

Testing

  • 756 passed | 9 skipped (was 755 in v1.15.2 — +1 integration test).

  • Skipped without GEMINI_API_KEY (CI-safe). Runs locally with GEMINI_API_KEY=AIza... npm run test:integration.

  • Pure additive test; dist/ byte-identical to v1.15.2 except version string.

  • Pre-publish audit: clean.

  • Unit tests added/updated (no — purely integration)

  • Integration tests added/updated

  • npm run lint passes

  • npm run typecheck passes

  • npm run test passes
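
The env gating described above could be sketched roughly as follows. This assumes a Vitest-style runner (suggested by the `passed | skipped` output format, but not confirmed here); `hasLiveApiKey` is a hypothetical helper, not the name used in the repository.

```typescript
// Hypothetical helper: decides whether live-API tests should run.
// `env` is passed in so the decision can be checked without touching process.env.
function hasLiveApiKey(env: Record<string, string | undefined>): boolean {
  const key = env['GEMINI_API_KEY'];
  return typeof key === 'string' && key.trim().length > 0;
}

// In a Vitest suite (an assumption), the result would typically gate the test:
//   it.skipIf(!hasLiveApiKey(process.env))('rescue payload is accepted', async () => { ... });

console.log(hasLiveApiKey({}));                                // false
console.log(hasLiveApiKey({ GEMINI_API_KEY: 'placeholder' })); // true
```

Keeping the check in a pure function means the skip decision itself can be unit-tested, while the live call stays out of the default CI run.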

Backwards compatibility

  • No API/schema/code change.
  • tools/list returns same field shapes.

🤖 Generated with Claude Code

Adds env-gated integration smoke test in test/integration/real-gemini.smoke.test.ts
that exercises the EXACT v1.14.4 G2-fix payload shape against the live
Gemini API:
  toolConfig: { functionCallingConfig: { mode: NONE } }, no `tools`.

Closes the last v1.14.4-cycle followup (gemini-chat F4 Round-1: unit
test verifies request SHAPE only). Per Google docs: "NONE: equivalent
to sending a request without any function declarations." If Google
flips the contract, this test fails loudly during release validation
before customer impact.

Test uses a single-turn plain-text conversation (not synthetic
multi-turn tool-call history) because Google's API now rejects
unsigned synthetic functionCall parts (thought_signature contract).
Production rescue paths replay GENUINE conversations built by prior
loop iterations — their tool calls are correctly signed.

Coverage: 756 pass | 9 skipped (was 755). +1 integration test (skipped
without GEMINI_API_KEY; runs locally when the key is set).

Pure test addition. No API/schema/code change. dist/ byte-identical
to v1.15.2 except version string.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@qmt qmt requested a review from Copilot April 30, 2026 20:31
@qmt qmt merged commit 362eb06 into main Apr 30, 2026
6 checks passed
@qmt qmt deleted the v1.15.3-integration-smoke-test branch April 30, 2026 20:32

Copilot AI left a comment

Pull request overview

Adds a new env-gated integration smoke test to defensively pin the Gemini API contract for the ask_agentic forced-finalization “rescue” payload shape (toolConfig.functionCallingConfig.mode = NONE with no tools field), plus bumps the package version to v1.15.3 and records the release in the changelog.

Changes:

  • Add a new live-API integration test asserting generateContent accepts toolConfig: NONE while omitting tools.
  • Bump versions to 1.15.3 (package.json, server.json).
  • Document the release in CHANGELOG.md.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| test/integration/real-gemini.smoke.test.ts | Adds the new env-gated smoke test covering the “no tools + NONE” contract. |
| CHANGELOG.md | Adds the 1.15.3 release entry describing the new defensive integration test. |
| package.json | Version bump to 1.15.3. |
| server.json | Version bump to 1.15.3 for the MCP server manifest. |

Comment on lines +200 to +216
// P10 (v1.15.3): defensive smoke test for ask_agentic's forced-finalization
// rescue payload contract. v1.14.4 review surfaced 2-of-3 reviewers
// claiming `toolConfig: { mode: NONE }` without `tools` would 400 — refuted
// empirically by 5+ successful rescue runs. The unit test for the rescue
// config (G2 fix in ask-agentic.test.ts:782+) verifies request SHAPE only;
// it can't catch a future Google API contract regression.
//
// This integration test exercises the EXACT payload shape the rescue uses
// against the live API. CI without the GEMINI_API_KEY env skips silently;
// local / release-validation runs catch a contract flip before customer
// impact. Per Google docs (ai.google.dev/gemini-api/docs/function-calling):
// "NONE: equivalent to sending a request without any function declarations."
it('forced-finalization rescue payload (toolConfig: NONE without tools) is accepted (P10/v1.15.3)', async () => {
const resolved = await resolveModel('latest-flash', client);
// The load-bearing claim under test: Gemini API accepts a `generateContent`
// request with `toolConfig: { mode: NONE }` AND no `tools` field. v1.14.4
// review surfaced 2-of-3 reviewers (gemini-cli + gemini-chat) claiming this
Comment on lines +204 to +205
// config (G2 fix in ask-agentic.test.ts:782+) verifies request SHAPE only;
// it can't catch a future Google API contract regression.
Comment thread CHANGELOG.md
Comment on lines +12 to +16
Closes the last queued v1.14.4 review followup — gemini-chat F4 (Round-1) flagged that the unit test for the rescue config (G2 fix at `test/unit/ask-agentic.test.ts:782+`) verifies request SHAPE only and would not catch a future Google API contract regression on the rescue payload. v1.15.3 adds an env-gated integration smoke test that exercises the EXACT load-bearing payload shape against the live API.

- **New test case** in `test/integration/real-gemini.smoke.test.ts`: *"forced-finalization rescue payload (toolConfig: NONE without tools) is accepted (P10/v1.15.3)"*. Skips silently in CI (no `GEMINI_API_KEY`) so the standard `npm test` run is unaffected; runs locally / during release validation when the key is present.
- **What it pins**: the v1.14.4 G2 contract — `generateContent` with `toolConfig: { functionCallingConfig: { mode: NONE } }` AND no `tools` field MUST be accepted by the live API. Per Google docs (`ai.google.dev/gemini-api/docs/function-calling`): *"NONE: equivalent to sending a request without any function declarations."* If Google ever flips this contract, CI on a release-validation run catches the regression before customer impact.
- **Single-turn conversation shape** (not multi-turn synthetic tool-call history): Google's API recently started rejecting synthetic `functionCall` parts without a `thought_signature` (per `ai.google.dev/gemini-api/docs/thought-signatures`, replay-safety contract). Production rescue paths replay GENUINE accumulated conversations built by prior loop iterations, so their tool-call parts are correctly signed. The smoke test uses a plain user-text turn instead — sufficient to exercise the load-bearing toolConfig + no-tools claim without the synthetic-history signing complication.
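
To illustrate the difference between the two conversation shapes discussed in the last bullet, here is a small sketch that counts `functionCall` parts lacking a signature. The `thoughtSignature` field name is an assumption based on camelCase JSON conventions (the changelog refers to it as `thought_signature`), and `countUnsignedFunctionCalls` and `list_files` are hypothetical names for illustration only.

```typescript
type TextPart = { text: string };
type FunctionCallPart = {
  functionCall: { name: string; args: Record<string, unknown> };
  thoughtSignature?: string; // assumed field name; absent on hand-written history
};
type Part = TextPart | FunctionCallPart;
type Content = { role: 'user' | 'model'; parts: Part[] };

// Counts functionCall parts that carry no signature, i.e. the kind of
// synthetic history the changelog says the API now rejects.
function countUnsignedFunctionCalls(contents: Content[]): number {
  let unsigned = 0;
  for (const content of contents) {
    for (const part of content.parts) {
      if ('functionCall' in part && !part.thoughtSignature) unsigned += 1;
    }
  }
  return unsigned;
}

// Shape used by the smoke test: one plain user-text turn.
const singleTurn: Content[] = [{ role: 'user', parts: [{ text: 'Reply with: ok' }] }];

// Shape the test avoids: a hand-written model tool call with no signature.
const syntheticHistory: Content[] = [
  { role: 'user', parts: [{ text: 'List the files.' }] },
  { role: 'model', parts: [{ functionCall: { name: 'list_files', args: {} } }] },
];

console.log(countUnsignedFunctionCalls(singleTurn));       // 0
console.log(countUnsignedFunctionCalls(syntheticHistory)); // 1
```

A single-turn text conversation has nothing to sign, which is why it is enough to exercise the toolConfig claim without tripping the signature check.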