
fix(streaming): suppress trailing empty STOP chunks with zero parts in SSE streaming#426

Open
kalenkevich wants to merge 1 commit into main from fix/empty-text-on-stream-mode


@kalenkevich

Collaborator

Please ensure you have read the contribution guide before creating a pull request.

Link to Issue or Description of Change

1. Link to an existing issue (if applicable):

2. Or, if no issue exists, describe the change:

Problem:
In StreamingMode.SSE (Server-Sent Events streaming mode), two bugs were observed when using Gemini models (e.g., gemini-3.5-flash):

  1. Extra Empty Bubbles in UI: Under every successful bot response, a blank/empty bubble was rendered.
  2. Premature Agent Loop Termination: After a tool call successfully completed, the agent failed to call the model again to deliver the final response.

Both symptoms share the same root cause:

  • At the end of every successful LLM stream, the model sends a final empty chunk containing finishReason: STOP and no content (e.g., parts.length === 0).
  • If this empty chunk is not suppressed, createLlmResponse converts it to an event with errorCode: FinishReason.STOP and content: undefined.
  • The runner yields this empty event to the client with a new event ID, causing the UI to render a blank response bubble.
  • Furthermore, when this empty event is yielded right after a tool call, isFinalResponse(event) returns true, since the event is empty, non-partial, and has no function calls. This prematurely terminates the runner's execution loop (runAsyncImpl in LlmAgent), so the agent never calls the model with the tool results.
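
The chain described above can be reproduced in isolation. This is a minimal sketch under stated assumptions: the types and the `toEvent` and `looksFinal` functions are simplified stand-ins written for illustration, not the actual ADK `createLlmResponse` or `isFinalResponse` implementations, and they only mirror the behavior described in this PR.

```typescript
// Simplified stand-ins (assumptions for illustration), not the real ADK types.
enum FinishReason {
  STOP = 'STOP',
  SAFETY = 'SAFETY',
}

interface Part {
  text?: string;
  functionCall?: {name: string; args: Record<string, unknown>};
}

interface Chunk {
  parts?: Part[];
  finishReason?: FinishReason;
}

interface SketchEvent {
  content?: {role: string; parts: Part[]};
  errorCode?: FinishReason;
  partial?: boolean;
}

// Mirrors the described conversion: a chunk without parts becomes an event
// with no content and the finish reason stored as errorCode.
function toEvent(chunk: Chunk): SketchEvent {
  if (chunk.parts && chunk.parts.length > 0) {
    return {content: {role: 'model', parts: chunk.parts}, partial: false};
  }
  return {content: undefined, errorCode: chunk.finishReason, partial: false};
}

// Mirrors the described check: a non-partial event with no function calls
// is treated as the final response.
function looksFinal(event: SketchEvent): boolean {
  const hasFunctionCall = (event.content?.parts ?? []).some(
    (part) => part.functionCall !== undefined,
  );
  return !event.partial && !hasFunctionCall;
}

// A tool-call chunk followed by the trailing empty STOP chunk.
const toolCallEvent = toEvent({
  parts: [{functionCall: {name: 'get_weather', args: {city: 'New York'}}}],
});
const trailingEvent = toEvent({parts: [], finishReason: FinishReason.STOP});
```

In this sketch, `looksFinal(toolCallEvent)` is false, but `looksFinal(trailingEvent)` is true, so a loop that stops on the first final-looking event would exit before the tool results reach the model.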

Solution:
We updated StreamingResponseAggregator.processResponse in streaming_utils.ts to comprehensively suppress all non-error empty chunks (chunks with no meaningful content and a finish reason of undefined or STOP).

  • This prevents yielding empty events that cause empty bubbles in the UI, and prevents premature agent termination after tool calls.
  • Metadata (e.g. usage, grounding, citation) is saved before the early return and delivered with partial: false in the aggregator's final consolidated response from close().
  • Error streams (e.g. safety blocks with finishReason: SAFETY) are not suppressed, ensuring errors are still correctly reported to the user.
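
The suppression rule can be sketched as follows. This is not the code from streaming_utils.ts: `SketchAggregator`, `isSuppressibleEmptyChunk`, and the simplified types are hypothetical names introduced for illustration. Treating a part with empty text as "no meaningful content" is also an assumption, and metadata-only streams are deliberately left out of this sketch.

```typescript
// Simplified stand-ins (assumptions for illustration), not the real ADK types.
enum FinishReason {
  STOP = 'STOP',
  SAFETY = 'SAFETY',
}

interface Part {
  text?: string;
  functionCall?: {name: string};
}

interface UsageMetadata {
  totalTokenCount?: number;
}

interface Chunk {
  parts?: Part[];
  finishReason?: FinishReason;
  usageMetadata?: UsageMetadata;
}

interface FinalSketch {
  text: string;
  usageMetadata?: UsageMetadata;
  finishReason?: FinishReason;
  partial: false;
}

// Hypothetical predicate: empty content and a finish reason of undefined or STOP.
function isSuppressibleEmptyChunk(chunk: Chunk): boolean {
  const hasContent = (chunk.parts ?? []).some(
    (part) =>
      (part.text !== undefined && part.text !== '') ||
      part.functionCall !== undefined,
  );
  const isError =
    chunk.finishReason !== undefined &&
    chunk.finishReason !== FinishReason.STOP;
  return !hasContent && !isError;
}

class SketchAggregator {
  private text = '';
  private usageMetadata?: UsageMetadata;
  private finishReason?: FinishReason;

  // Returns the chunk to yield, or undefined when the chunk is suppressed.
  processChunk(chunk: Chunk): Chunk | undefined {
    // Save metadata first so the early return below does not drop it.
    if (chunk.usageMetadata) this.usageMetadata = chunk.usageMetadata;
    if (chunk.finishReason) this.finishReason = chunk.finishReason;
    if (isSuppressibleEmptyChunk(chunk)) return undefined;
    for (const part of chunk.parts ?? []) this.text += part.text ?? '';
    return chunk;
  }

  // Consolidated, non-partial response carrying the saved metadata.
  close(): FinalSketch | undefined {
    if (this.text === '') return undefined;
    return {
      text: this.text,
      usageMetadata: this.usageMetadata,
      finishReason: this.finishReason,
      partial: false,
    };
  }
}

const aggregator = new SketchAggregator();
const yielded = [
  aggregator.processChunk({parts: [{text: 'Hello'}]}),
  aggregator.processChunk({
    parts: [],
    finishReason: FinishReason.STOP,
    usageMetadata: {totalTokenCount: 12},
  }),
];
const finalResponse = aggregator.close();

// A SAFETY chunk is empty too, but it is an error and must pass through.
const safetyChunk = new SketchAggregator().processChunk({
  parts: [],
  finishReason: FinishReason.SAFETY,
});
```

Here the second chunk is suppressed while its usage metadata still appears on `finalResponse`, and the SAFETY chunk is returned unchanged.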

Testing Plan

Unit Tests:

  • I have added or updated unit tests for my change.
  • All unit tests pass locally.

Summary of passed npm test results:

 Test Files  172 passed | 15 skipped (187)
      Tests  1872 passed | 37 skipped (1909)
   Start at  16:02:37
   Duration  48.73s

We added the following unit tests in streaming_utils_test.ts:

  1. should capture metadata on trailing empty chunk with zero parts early return: Verifies that, in progressive streaming mode, the trailing empty STOP chunk is suppressed and its metadata is preserved.
  2. should suppress trailing empty STOP chunk with zero parts after a function call in non-progressive mode: Verifies that the trailing empty chunk of a tool-call stream is suppressed in non-progressive mode.
  3. should suppress trailing empty STOP chunk for normal text streams in non-progressive mode: Verifies that the trailing empty chunk of a normal text stream is suppressed.
  4. should NOT suppress trailing empty chunk with non-STOP finish reason in non-progressive mode: Verifies that chunks with error finish reasons (e.g. SAFETY) are not suppressed and are propagated correctly.

Manual End-to-End (E2E) Tests:

Manually tested using the weather_time_agent sample in the dev package.

  1. Run the local dev server using npx adk web ./samples in the dev directory.
  2. Under streaming mode, ask the agent "Hello, what can you do?" and "weather in New York".
  3. Verify that:
    • No blank/empty response bubbles are rendered after any agent messages.
    • After the get_weather tool call finishes, the agent successfully receives the tool results and continues the execution loop to deliver the final response: "The weather in New York is currently sunny and 25°C (77°F)."

Checklist

  • I have read the CONTRIBUTING.md document.
  • I have performed a self-review of my own code.
  • I have commented my code, particularly in hard-to-understand areas.
  • I have added tests that prove my fix is effective or that my feature works.
  • New and existing unit tests pass locally with my changes.
  • I have manually tested my changes end-to-end.
  • Any dependent changes have been merged and published in downstream modules.

Additional context

@AmaadMartin left a comment

Collaborator


I performed a review of the changes:

  1. Metadata Loss Bug in Non-Progressive Mode (Pre-existing but aggravated by PR changes):
    In StreamingResponseAggregator.close(), if finalParts is undefined, the method returns undefined immediately.
    When the model only returns a function call (and no text parts) in non-progressive mode, the function call is yielded immediately in processResponse. The trailing empty STOP chunk containing the final usageMetadata is suppressed, and its metadata is saved to this.usageMetadata. When the stream finishes and close() is called, finalParts is undefined (since no text was accumulated), so close() returns undefined and the saved usageMetadata is completely lost.

    Proposed Fix:
    Allow close() to return an LlmResponse with an empty parts array (parts: []) if any metadata (usageMetadata, groundingMetadata, or citationMetadata) is present:

    close(): LlmResponse | undefined {
      const finalParts = this.strategy.close();
      const hasMetadata =
        this.usageMetadata !== undefined ||
        this.groundingMetadata !== undefined ||
        this.citationMetadata !== undefined;
    
      if (!finalParts && !hasMetadata) {
        return undefined;
      }
    
      const candidate = this.response?.candidates?.[0];
      const finishReason = this.finishReason ?? candidate?.finishReason;
    
      return {
        content: {
          role: 'model',
          parts: finalParts ?? [],
        },
        groundingMetadata: this.groundingMetadata,
        citationMetadata: this.citationMetadata,
        errorCode: finishReason === FinishReason.STOP ? undefined : finishReason,
        errorMessage:
          finishReason === FinishReason.STOP
            ? undefined
            : candidate?.finishMessage,
        usageMetadata: this.usageMetadata,
        finishReason: finishReason,
        partial: false,
      };
    }

    Note: This bug existed before this PR, but since this PR changes the suppression logic, it is a great place to fix it.

  2. Code Clean-up in streaming_utils_test.ts:
    In the new tests added, GenerateContentResponse is created manually:

    const response2 = new GenerateContentResponse();
    response2.candidates = [{ finishReason: FinishReason.STOP }];

    We should use the existing helper function createResponse(...) defined at the top of the test file instead, to keep the test code clean and consistent:

    const response2 = createResponse({ finishReason: FinishReason.STOP });
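
To make item 1 of the review concrete, the sketch below contrasts the two close() behaviors for a stream that yielded only a function call. It is a simplified illustration, not the aggregator's real code: `AggregatorState`, `closeBefore`, and `closeAfter` are hypothetical names, and only usageMetadata is modeled.

```typescript
// Simplified stand-ins (assumptions for illustration), not the real ADK types.
interface UsageMetadata {
  totalTokenCount?: number;
}

interface Part {
  text?: string;
}

interface FinalResponse {
  parts: Part[];
  usageMetadata?: UsageMetadata;
  partial: false;
}

// Hypothetical snapshot of what the aggregator holds when the stream ends.
interface AggregatorState {
  finalParts?: Part[];
  usageMetadata?: UsageMetadata;
}

// Behavior described as pre-existing: no accumulated parts means no response,
// so any saved metadata is dropped.
function closeBefore(state: AggregatorState): FinalResponse | undefined {
  if (!state.finalParts) return undefined;
  return {
    parts: state.finalParts,
    usageMetadata: state.usageMetadata,
    partial: false,
  };
}

// Behavior proposed in the review: still return a response with empty parts
// when metadata was saved.
function closeAfter(state: AggregatorState): FinalResponse | undefined {
  const hasMetadata = state.usageMetadata !== undefined;
  if (!state.finalParts && !hasMetadata) return undefined;
  return {
    parts: state.finalParts ?? [],
    usageMetadata: state.usageMetadata,
    partial: false,
  };
}

// Function-call-only stream: no text was accumulated, but the suppressed
// trailing STOP chunk carried usage metadata.
const functionCallOnlyStream: AggregatorState = {
  finalParts: undefined,
  usageMetadata: {totalTokenCount: 42},
};

const before = closeBefore(functionCallOnlyStream);
const after = closeAfter(functionCallOnlyStream);
```

With this input, `before` is undefined and the token count is lost, while `after` carries the usage metadata with an empty parts array. A stream with neither parts nor metadata still returns undefined from `closeAfter`.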


Development

Successfully merging this pull request may close these issues.

gemini 3.x empty text on stream mode, no response from tool call
