v1.16.2 — T33 TRIM: ask_agentic loop streaming + stallMs heartbeat watchdog by qmt · Pull Request #58 · qmediat/gemini-code-context-mcp

qmt · 2026-04-30T21:28:29Z

Summary

ask_agentic loop iterations now use generateContentStream + collectStream (closes T33 — the last open item in the v1.6/v1.7 streaming refactor track).
New per-call stallMs schema field (1s–10min) + GEMINI_CODE_CONTEXT_AGENTIC_STALL_MS env var. Heartbeat-aware watchdog that resets on every chunk; fires only when the stream goes silent. Does NOT fire while the model is actively thinking.
TIMEOUT errorResult surfaces timeoutKind: 'stall' | 'total' so wrappers can route retry policy. Mirrors ask / code v1.12.0 conventions.
Live thinking: … progress events during the 30-120s thinking phase. UX parity with ask/code.

Scope-exclusion: rescue stays non-streaming

The post-loop forced-finalization rescue intentionally keeps generateContent (not streaming). After 2-of-2 cross-reviewer consultation (gcc-ask: GO full scope; grok: TRIM), TRIM was adopted because:

The rescue's apiCallSucceeded contract was stabilized in v1.15.0 (P3+P4+P5) through 4 review rounds + /6step cross-corroboration — re-opening that recently-stabilized surface for marginal liveness gain on a once-per-call path is the wrong trade.
Mid-stream failures cannot be retried (Gemini's generateContentStream has no resume) — the v1.6/v1.7 plan §11 explicitly cites this. The rescue's transient-failure budget routing depends on knowing whether the API was billed; partial-stream-then-fail blurs that signal.
Empirical wall-clock dominance is in the LOOP, not the rescue. Streaming where the wall-clock lives gets 80-90% of the value at half the effort and zero risk to recently-stabilized code.
Rescue cost is ONE call. Stalls there are bounded by iterationTimeoutMs.

docs/FOLLOW-UP-PRS.md T33 entry updated with ACCEPTED-DEFERRED rationale + revisit criteria.

Backward compatibility

No breaking change. stallMs is an optional new schema field. Existing callers that don't set it see identical behavior to v1.16.1.
structuredContent.timeoutMs now reflects the LIMIT THAT FIRED (mirrors ask/code). Pre-v1.16.2 only iterationTimeoutMs could fire, so that field always held the iterationTimeoutMs value. Wrappers that branch on timeoutKind get unambiguous routing; wrappers that read only timeoutMs see the active cap.
Rescue path completely unchanged. Operators who hit the rescue see byte-identical behavior to v1.16.1 (convergenceForced, rescueErrorCode, etc.).

Test plan

🤖 Generated with Claude Code

…tchdog Loop iterations now use generateContentStream + collectStream instead of non-streaming generateContent. New per-call `stallMs` schema field + `GEMINI_CODE_CONTEXT_AGENTIC_STALL_MS` env var. TIMEOUT errorResult surfaces `timeoutKind: 'stall' | 'total'` so wrappers can route retry policy. Rescue at the post-loop forced-finalization site is intentionally NOT migrated — the v1.15.0 P3+P4+P5 `apiCallSucceeded` contract was stabilized through 4 review rounds + /6step cross-corroboration; mid-stream failures cannot be retried (Gemini has no resume), so re-opening that surface for a once-per-call path is the wrong trade. 3 new tests + buildCtx mock factory updated to support generateContentStream. 762 tests pass (759 → 762). Lint + typecheck clean. Closes T33 — last open item in the v1.6/v1.7 streaming refactor track. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…ode parity Round-1 cross-review (gemini-code-context + grok + codex). Codex + grok clean; gemini flagged two empirically verifiable issues. Both folded: Finding #1 (HIGH) — collectStream candidate preservation src/tools/shared/stream-collector.ts:174 — pre-fix gate was outer-array length only. If Gemini's stream protocol ever emits a final terminator shape `{ candidates: [{ content: { parts: [] }, finishReason: 'STOP' }] }`, the previous chunk's functionCall Part would be silently overwritten — ask_agentic's loop would treat the iteration as final-text and never dispatch the tool. Defensive fix: gate also requires non-empty parts. Empirical current-protocol behaviour (functionCall + finishReason packed into one chunk) preserved either way; this hardens against fragmentation patterns thinkingLevel=HIGH or future SDK versions could legitimately introduce. Finding #2 (MEDIUM) — timeoutMs payload contract divergence from ask/code src/tools/ask-agentic.tool.ts — initial commit collapsed timeoutMs to "the limit that fired". ask.tool.ts:1132 always reports the configured total wall-clock cap regardless of which watchdog fired — discriminator is timeoutKind. Reverted to ask/code parity. Two new regression tests: - test/unit/stream-collector.test.ts — fragmentation-guard pin at the collector level - test/unit/ask-agentic.test.ts — fragmentation-guard pin at the agentic-loop level (verifies read_file dispatches + organic final-text) 764 tests pass (762 → 764). Lint + typecheck + build clean. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Copilot

Pull request overview

This PR completes the ask_agentic loop’s migration to streaming, adding a heartbeat-aware stall watchdog (stallMs) and live “thinking” progress events to match ask/code behavior, while keeping the post-loop rescue path non-streaming.

Changes:

Switch ask_agentic loop iterations to generateContentStream + collectStream, wiring chunk heartbeats to the stall watchdog and emitting live thought previews.
Harden collectStream so terminator-only chunks with empty parts don’t overwrite earlier functionCall candidates (fragmentation guard).
Add/extend unit tests for stall-vs-total timeout behavior and fragmented functionCall dispatch; bump version/docs/changelog for v1.16.2.

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.

Show a summary per file

File	Description
`src/tools/ask-agentic.tool.ts`	Streams loop iterations, adds `stallMs` input + env fallback, surfaces `timeoutKind`, emits live thought progress.
`src/tools/shared/stream-collector.ts`	Prevents empty-`parts` terminator candidates from overwriting earlier candidates (preserves functionCalls).
`test/unit/ask-agentic.test.ts`	Adds stall watchdog + timeoutKind regression tests and a fragmented functionCall dispatch test.
`test/unit/stream-collector.test.ts`	Adds a collector-level regression test for the empty-`parts` terminator overwrite case.
`docs/FOLLOW-UP-PRS.md`	Marks T33 as shipped and documents TRIM scope/rationale.
`CHANGELOG.md`	Adds v1.16.2 release notes for streaming + stall watchdog changes.
`package.json`	Bumps package version to `1.16.2`.
`server.json`	Bumps server/package version references to `1.16.2`.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

+
+- **`stallMs` per-call schema field** (1s–10min, optional) — heartbeat-aware stall watchdog that resets on every chunk yielded by `generateContentStream` (text or thought). Fires ONLY when the stream goes silent for this long. Does NOT fire while the model is actively thinking. Documented as the right knob for "kill dead sockets quickly without penalising deep reasoning."
+- **`GEMINI_CODE_CONTEXT_AGENTIC_STALL_MS` env var** — fallback path identical to `ask`/`code` (`GEMINI_CODE_CONTEXT_ASK_STALL_MS`, `GEMINI_CODE_CONTEXT_CODE_STALL_MS`). Recommended: `60_000` (60s).
+- **`structuredContent.timeoutKind`** on `errorCode: 'TIMEOUT'` — `'stall' | 'total'` discriminates which watchdog fired. Surfaces the active limit in `timeoutMs` and the configured stall in `stallMs` (mirrors the ask/code pattern). Wrappers can apply different retry policies — `'stall'` is usually safe to retry (dead socket); `'total'` means raise `iterationTimeoutMs` or narrow the prompt.


 - Existing `iterationTimeoutMs` stays orthogonal (wall-clock cap; both can be set, whichever fires first wins).
- Update `createTimeoutController` call site to pass composite `{ totalMs, stallMs }` opts (already supported since v1.12.0; just call with the new field).
+- `createTimeoutController` loop call site now passes composite `{ totalMs, stallMs }` opts.
+- TIMEOUT errorResult now surfaces `timeoutKind: 'stall' | 'total'` + both `timeoutMs` (active limit that fired) and `stallMs` (configured stall watchdog) — mirrors ask/code conventions.


… timeoutMs semantics Copilot Round-1 review on PR #58 flagged 2 stale wording residues from the Round-1 fold of gemini Finding #2. Both doc-only: - CHANGELOG.md "What's new" bullet (line 24) — claimed `timeoutMs` surfaces "the active limit", contradicting the actual implementation (always-configured-total mirroring ask.tool.ts:1132). - docs/FOLLOW-UP-PRS.md T33 entry (line 577) — same stale "active limit that fired" wording. Both reworded to match the implementation: `timeoutMs` reports the configured TOTAL wall-clock cap (or null when disabled); `stallMs` reports the configured stall watchdog (or null when disabled); `timeoutKind` is the discriminator. Both findings are TRUE POSITIVE LOW per /6step. Empirical verification: ask.tool.ts:1132 = `timeoutMs: ms` (always configured total). Implementation in ask-agentic.tool.ts post-Round-1-fold mirrors verbatim. Doc text was the only residue left from the initial commit's collapsed-to-active-limit shape. No code or test changes — pure doc-correctness fold. 764 tests still pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

qmt and others added 2 commits April 30, 2026 23:27

qmt requested a review from Copilot April 30, 2026 21:45

Copilot started reviewing on behalf of qmt April 30, 2026 21:46 View session

Copilot AI reviewed Apr 30, 2026

View reviewed changes

qmt merged commit 371bd8f into main Apr 30, 2026
4 checks passed

qmt deleted the v1.16.2-t33-trim-loop-streaming branch April 30, 2026 21:56

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

v1.16.2 — T33 TRIM: ask_agentic loop streaming + stallMs heartbeat watchdog#58

v1.16.2 — T33 TRIM: ask_agentic loop streaming + stallMs heartbeat watchdog#58
qmt merged 3 commits into
mainfrom
v1.16.2-t33-trim-loop-streaming

qmt commented Apr 30, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

qmt commented Apr 30, 2026

Summary

Scope-exclusion: rescue stays non-streaming

Backward compatibility

Test plan

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants