feat: add SSE event normalizer module #49

Open
ashwing wants to merge 9 commits into vllm-project:main from ashwing:feat/sse-event-normalizer
Conversation

@ashwing

@ashwing ashwing commented Jun 10, 2026

Contributor

Summary

Adds events/ module to agentic-core — a pure parsing library that normalizes raw SSE data lines into typed EventFrame structs. This is the foundation for tool dispatch, streaming tee, and loop control (Phases 2–4 of the core API design in PR #44).

  • 20-variant SSEEventType enum covering all Responses API event types (text, function_call, reasoning, file_search, web_search)
  • EventPayload enum with typed extraction per event (no downstream JSON access needed)
  • normalize_sse_line(&str) → Option<EventFrame> — pure function, no state, no async
  • Handles both vLLM (response.done) and OpenAI (response.completed) wire formats
  • #[non_exhaustive] on enums for forward compatibility
  • No dependency on the executor module, so it lands on main independently of PR #46 (feat: agentic-core conversation/responses hydration (ADR-03))
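
To illustrate the wire-format point above, here is a minimal sketch of how both terminal event names could map onto one variant. It is not the code from this PR: `SseEventKind`, its `Unknown` variant, and `classify_event_type` are hypothetical names, and only a few of the 20 event types are shown.

```rust
/// Hypothetical, reduced stand-in for the PR's event-type enum.
/// `#[non_exhaustive]` forces downstream crates to keep a wildcard arm,
/// so adding variants later is not a breaking change.
#[non_exhaustive]
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum SseEventKind {
    ResponseCreated,
    OutputTextDelta,
    FunctionCallArgumentsDelta,
    /// Terminal event: `response.completed` (OpenAI) or `response.done` (vLLM).
    ResponseCompleted,
    /// Assumption of this sketch: unrecognized type strings are preserved.
    Unknown(String),
}

/// Maps the `type` field of an event onto the sketch enum.
pub fn classify_event_type(raw: &str) -> SseEventKind {
    match raw {
        "response.created" => SseEventKind::ResponseCreated,
        "response.output_text.delta" => SseEventKind::OutputTextDelta,
        "response.function_call_arguments.delta" => {
            SseEventKind::FunctionCallArgumentsDelta
        }
        // Both wire formats collapse into a single variant.
        "response.completed" | "response.done" => SseEventKind::ResponseCompleted,
        other => SseEventKind::Unknown(other.to_string()),
    }
}
```

With this shape, a consumer matching on `SseEventKind::ResponseCompleted` would not need to know which server produced the stream.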

Per discussion with @maralbahari on PR #46: this is a separate core module to avoid bloating the accumulator. Once PR #46 merges, a follow-up refactors the accumulator to consume EventFrame instead of inline JSON parsing.

Validated against live vLLM (google/gemma-4-26B-A4B-it, v0.21.0) — cassettes recorded from real streaming responses including function_call tool use.

Test Plan

  • 33 integration tests in tests/event_normalizer_test.rs:
    • Per-event-type parsing (text, function_call, reasoning, response lifecycle, content_part, file/web search)
    • Edge cases: [DONE], empty lines, malformed JSON, unknown events, empty deltas, unicode
    • Full streaming sessions: text-only, function_call, parallel calls, mixed text+tool
    • Real vLLM cassette replay (recorded from live gemma-4-26B-A4B-it)
    • YAML cassette-driven tests (2 cassette files)
  • cargo clippy --workspace --all-targets -- -D warnings — clean
  • cargo fmt --check — clean
  • All 117 workspace tests pass (no regressions)
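
The edge cases listed in the test plan ([DONE], empty lines, non-data lines) can be sketched as a small pre-parse step. This is an assumption about how such filtering might look, not the PR's implementation: `sse_data_payload` is a hypothetical helper, and it assumes the input line still carries its `data:` prefix.

```rust
/// Returns the payload of an SSE `data:` line, or `None` when there is
/// nothing worth handing to a JSON parser.
pub fn sse_data_payload(line: &str) -> Option<&str> {
    // Drop a trailing line terminator if the caller left one in place.
    let line = line.trim_end_matches(|c: char| c == '\r' || c == '\n');
    // Anything that is not a `data:` field (blank lines, `event:`, comments) is skipped.
    let rest = line.strip_prefix("data:")?;
    // Remove the optional space after the colon, then any other padding.
    let payload = rest.strip_prefix(' ').unwrap_or(rest).trim();
    // An empty payload and the `[DONE]` sentinel carry no event.
    if payload.is_empty() || payload == "[DONE]" {
        return None;
    }
    Some(payload)
}
```

Malformed JSON is a separate concern: in this sketch it would pass through here and be rejected by the deserialization step that follows.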

@ashwing ashwing marked this pull request as draft June 10, 2026 00:26
@ashwing ashwing marked this pull request as ready for review June 10, 2026 01:51
return None;
}

let json: Value = serde_json::from_str(data_str).ok()?;

Collaborator

@ashwing could use deserialize_from_str_opt from utils/common.rs

Contributor Author

Good call — fixed, using deserialize_from_str_opt from utils. Will push shortly.

@@ -0,0 +1,37 @@
# Recorded from google/gemma-4-26B-A4B-it via vLLM v0.21.0
# Request: {"model":"google/gemma-4-26B-A4B-it","input":"What is the weather in San Francisco?","tools":[...],"tool_choice":"required","stream":true}
# Date: 2026-06-09

Collaborator

I think the cassettes could record the whole response, not only the SSE lines, so that once function_call support lands you can use them directly to verify the model output. What do you think? Unless you plan to test function_call with the multi-turn stateful features, in which case we don't need it here and can add new cassette recordings for non-text output in the multi-turn tests.

Contributor Author

Makes sense for full-response cassettes when testing the executor end-to-end (your format in PR #46 already handles that well). For this module the input is just the raw data: line — the normalizer has no concept of HTTP headers or request/response envelope. I'll record full cassettes in your format when we add function_call support to the executor integration tests.

@maralbahari maralbahari left a comment

Collaborator

LGTM

ashwing added 8 commits June 10, 2026 11:41
Introduces `events/` module in agentic-core with:
- SSEEventType enum (20 variants covering all Responses API events)
- EventPayload enum (typed extraction per event type)
- EventFrame struct as the normalized output
- normalize_sse_line() pure function: raw SSE data line → typed frame

Handles both vLLM (response.done) and OpenAI (response.completed)
wire formats. No dependency on the executor module — lands on main
independently of PR vllm-project#46.

Signed-off-by: Ashwin Giridharan <girida@amazon.com>
Recorded from google/gemma-4-26B-A4B-it (vLLM v0.21.0). Validates
the normalizer handles vLLM's function_call streaming format where
delta events omit call_id (only present in output_item.done).

Signed-off-by: Ashwin Giridharan <girida@amazon.com>
Adds cassette files recorded from live vLLM (gemma-4-26B-A4B-it):
- text-only-vllm-gemma4.yaml: single text turn
- function-call-vllm-gemma4.yaml: tool use with streaming args

Tests load cassettes via serde_yaml (dev-dependency) and validate
normalized output matches expected_text / expected_function_call.

Signed-off-by: Ashwin Giridharan <girida@amazon.com>
- Split ReasoningDelta/ReasoningDone (done event reads "text" not "delta")
- Add name + call_id to OutputItemAdded (vLLM only provides call_id here)
- Make call_id Option<String> in FunctionCallArgsDelta/Done (absent in vLLM)
- Add #[non_exhaustive] to SSEEventType and EventPayload
- Replace deprecated serde_yaml with serde_yml

Signed-off-by: Ashwin Giridharan <girida@amazon.com>
Covers previously untested paths:
- ReasoningSummaryTextDelta + ReasoningSummaryTextDone (validates ISSUE-1 fix)
- Parallel function calls (multiple output_index in one response)
- Mixed text + function_call in same response
- call_id recovery from OutputItemAdded
- response.failed and response.incomplete
- Empty delta, unicode in deltas
- File search and web search event classification

Signed-off-by: Ashwin Giridharan <girida@amazon.com>
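
The call_id recovery and parallel-call cases in the commit above suggest a downstream pattern roughly like the following. Since the normalizer itself is described as stateless, this sketch assumes a separate consumer holds the state. `Frame`, `PendingCall`, and `CallTracker` are hypothetical types, not the PR's `EventFrame` or `EventPayload`.

```rust
use std::collections::HashMap;

/// Hypothetical, reduced frame type carrying only the fields this sketch needs.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Frame {
    OutputItemAdded { output_index: usize, call_id: Option<String>, name: Option<String> },
    FunctionCallArgsDelta { output_index: usize, call_id: Option<String>, delta: String },
}

/// State accumulated for one function call.
#[derive(Debug, Default, Clone, PartialEq, Eq)]
pub struct PendingCall {
    pub call_id: Option<String>,
    pub name: Option<String>,
    pub arguments: String,
}

/// Tracks in-flight calls keyed by `output_index`, so parallel calls stay separate.
#[derive(Debug, Default)]
pub struct CallTracker {
    calls: HashMap<usize, PendingCall>,
}

impl CallTracker {
    pub fn observe(&mut self, frame: &Frame) {
        match frame {
            Frame::OutputItemAdded { output_index, call_id, name } => {
                let call = self.calls.entry(*output_index).or_default();
                // Remember identifiers so later deltas without call_id can be attributed.
                if call_id.is_some() {
                    call.call_id = call_id.clone();
                }
                if name.is_some() {
                    call.name = name.clone();
                }
            }
            Frame::FunctionCallArgsDelta { output_index, call_id, delta } => {
                let call = self.calls.entry(*output_index).or_default();
                // Use the delta's call_id only if none was recorded earlier.
                if call.call_id.is_none() {
                    call.call_id = call_id.clone();
                }
                call.arguments.push_str(delta);
            }
        }
    }

    pub fn get(&self, output_index: usize) -> Option<&PendingCall> {
        self.calls.get(&output_index)
    }
}
```

Keying on `output_index` rather than `call_id` is the design choice that matters here: it remains usable when the deltas carry no call_id at all.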
Move inline unit tests from normalize.rs to
tests/event_normalizer_test.rs. All tests use the public API only
(normalize_sse_line) so they don't need module-private access.
Matches the repo convention established by PR vllm-project#46's cassette tests.

Signed-off-by: Ashwin Giridharan <girida@amazon.com>
Per review feedback from @maralbahari — use the crate's own
deserialization helper instead of calling serde_json::from_str
directly.

Signed-off-by: Ashwin Giridharan <girida@amazon.com>
Reduces repetition in extraction functions — consistent pattern for
accessing JSON fields with default or optional semantics.

Signed-off-by: Ashwin Giridharan <girida@amazon.com>
@ashwing ashwing force-pushed the feat/sse-event-normalizer branch from f56b923 to a016c6f Compare June 10, 2026 18:43
Signed-off-by: Ashwin Giridharan <girida@amazon.com>
3 participants