
POC: MessageMaterializer + partial JSON for structured streaming#2182

Draft
mattheworiordan wants to merge 2 commits into main from poc/message-materializer

Conversation

@mattheworiordan (Member)

Summary

A proof of concept exploring two SDK-level conveniences motivated by real customer asks around message.append and structured data streaming. This is a POC for discussion, not a proposed API surface - it is intended to inform wider AIT thinking about what the SDK should offer.

Background and motivation

Structured data streaming (the Expedia ask)

Expedia asked how to handle updates for things that aren't plain text - specifically, structured JSON objects streamed over message.append. Today we don't have a good answer for this.

The pattern that emerged: if you structure your JSON so fixed keys come first and the streaming text content comes last, a partial JSON parser can render a valid object at every step:

// Mid-stream - the content field is still being appended to:
{
  "model": "claude-opus-4-20250514",
  "usage": { "input_tokens": 150 },
  "choices": [{
    "message": {
      "role": "assistant",
      "content": "The quick brown fox jumps over"
    }
  }]
}

This is a viable pattern for any structured streaming use case (AI/LLM responses, document updates, form data) and toPartialJSON() makes it trivial for developers.
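The core mechanic can be illustrated with a deliberately naive sketch - this is not the vendored recursive-descent parser from the POC, just the underlying idea of repairing a truncated document before handing it to JSON.parse:

```typescript
// Naive sketch of partial JSON parsing (NOT the vendored parser): scan the
// truncated input for an unterminated string and unclosed containers, repair
// them, then delegate to JSON.parse.
function parsePartialJSON(src: string): unknown {
  const closers: string[] = [];
  let inString = false;
  let escaped = false;
  for (const ch of src) {
    if (inString) {
      if (escaped) escaped = false;
      else if (ch === '\\') escaped = true;
      else if (ch === '"') inString = false;
    } else if (ch === '"') inString = true;
    else if (ch === '{') closers.push('}');
    else if (ch === '[') closers.push(']');
    else if (ch === '}' || ch === ']') closers.pop();
  }
  let fixed = src;
  if (inString) fixed += '"'; // terminate a string cut off mid-stream
  // Drop a dangling `"key":` that has no value yet, then any trailing comma.
  fixed = fixed.replace(/"[^"]*"\s*:\s*$/, '').replace(/,\s*$/, '');
  while (closers.length) fixed += closers.pop()!; // close containers inside-out
  return JSON.parse(fixed);
}
```

Run against the mid-stream example above, this yields a valid object whose content field holds exactly the text received so far - which is why putting fixed keys first and streaming text last makes the pattern work.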

Message materialization (everyone's problem)

While exploring the structured streaming question, the other half of the problem became obvious: every developer using message.append writes the same boilerplate - track serials, accumulate data, handle late join via getMessage(), manage cache eviction. This has come up repeatedly - see the real-world accumulation discussion in #ait-sdk, where the team ran into exactly this while building with append.

The MessageMaterializer handles all of it. Just as annotations emit the full summary so users don't have to apply incremental updates themselves, the materializer emits the full materialized message at each step.

Related issues

What the POC demonstrates

Before (what every developer writes today)

const messages = new Map();
channel.subscribe((msg) => {
  if (msg.action === 'message.create') {
    messages.set(msg.serial, msg.data);
  } else if (msg.action === 'message.append') {
    const existing = messages.get(msg.serial);
    if (existing) {
      messages.set(msg.serial, existing + msg.data);
    } else {
      // Late join: fetch via getMessage(), queue appends,
      // apply version watermarks... gets complex fast
    }
  }
});

After (with MessageMaterializer)

const materializer = new MessageMaterializer(channel);

materializer.subscribe((msg) => {
  console.log(msg.data);            // full accumulated string
  console.log(msg.toPartialJSON()); // parsed structured data, even mid-stream
});

What's included

  • MessageMaterializer: subscribes to the channel, accumulates appends, handles late join via getMessage(), manages cache eviction
  • toPartialJSON(): parses incomplete JSON mid-stream into valid partial objects
  • partial-json parser: vendored (~280 lines, MIT) recursive-descent parser for incomplete JSON
  • Tests: 27 passing (14 parser unit + 13 integration against the Ably sandbox)
  • Demo: npx tsx examples/materializer-demo.ts - shows progressive streaming and late-join catch-up
  • Build: esbuild plugin config following the LiveObjects pattern
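The late-join handling the materializer wraps can be sketched roughly as follows. All names here are hypothetical (not the POC's actual API), and it assumes versions compare lexicographically, as Ably serials do:

```typescript
// Hypothetical sketch of late-join catch-up: appends that arrive before the
// baseline message is fetched are buffered, then replayed against a version
// watermark so appends already folded into the baseline are skipped.
type Msg = { serial: string; version: string; action: string; data: string };

class Accumulator {
  private messages = new Map<string, Msg>();
  private pending = new Map<string, Msg[]>();

  // fetchBaseline stands in for a REST call such as getMessage().
  constructor(private fetchBaseline: (serial: string) => Promise<Msg>) {}

  async handle(msg: Msg): Promise<void> {
    if (msg.action === 'message.create') {
      this.messages.set(msg.serial, { ...msg });
      this.flush(msg.serial);
    } else if (msg.action === 'message.append') {
      const existing = this.messages.get(msg.serial);
      if (existing) {
        existing.data += msg.data;
        existing.version = msg.version;
      } else if (this.pending.has(msg.serial)) {
        // A fetch is already in flight; queue behind it.
        this.pending.get(msg.serial)!.push(msg);
      } else {
        this.pending.set(msg.serial, [msg]);
        const baseline = await this.fetchBaseline(msg.serial);
        this.messages.set(msg.serial, { ...baseline });
        this.flush(msg.serial);
      }
    }
  }

  private flush(serial: string): void {
    const base = this.messages.get(serial)!;
    for (const m of this.pending.get(serial) ?? []) {
      // Version watermark: skip appends the baseline already contains.
      if (m.version > base.version) {
        base.data += m.data;
        base.version = m.version;
      }
    }
    this.pending.delete(serial);
  }

  get(serial: string): Msg | undefined {
    return this.messages.get(serial);
  }
}
```

Even this simplified version has to juggle an in-flight fetch, a pending queue, and a watermark check - which is the boilerplate the materializer is meant to absorb.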

Caveats

  • POC only - not production-ready, not a proposed API surface
  • partial-json parser is vendored inline; dependency decisions TBD if this progresses
  • Requires mutableMessages enabled on the channel namespace
  • The two features (materializer + partial JSON) are coupled here because they address the same use case, but could be separated

Discussion

This POC is meant to prompt conversation about:

  1. Should the SDK handle message materialization natively (like it does for annotation summaries)?
  2. Is structured JSON streaming a pattern we should explicitly support and document?
  3. How does this fit into wider AIT SDK thinking around the developer experience for append-based use cases?

🤖 Generated with Claude Code

Proof of concept for a convenience layer over message.append that
automatically accumulates appended data, handles late-join via
getMessage(), and provides toPartialJSON() for rendering incomplete
JSON during AI/LLM token streaming.

Includes vendored partial-json parser, integration tests (27 passing),
esbuild config, and a runnable demo script.
@coderabbitai

coderabbitai bot commented Mar 13, 2026

Review skipped: draft detected. To trigger a single review, invoke the @coderabbitai review command.


@github-actions github-actions bot temporarily deployed to staging/pull/2182/features March 13, 2026 19:17 Inactive
Rewrote the vendored partial-json parser to match the upstream
(promplate/partial-json-parser-js) approach: delegate string/number
parsing to JSON.parse, use closures instead of a class, and share
helpers for keyword matching and error throwing.

Added 20 new tests matching upstream test cases (Allow flags,
edge cases, error paths). Total: 47 tests, all passing.

Size reduction (minified): 6,357 → 1,488 bytes (77% smaller).

---

Here are the optimization results across 5 parallel iterations:

┌────────────┬────────────────────────────────────────────────────────────────────┬────────────────┬───────────┐
│ Iteration  │                              Strategy                              │ Minified bytes │ Reduction │
├────────────┼────────────────────────────────────────────────────────────────────┼────────────────┼───────────┤
│ Baseline   │ Original class-based parser                                        │ 6,357          │ -         │
├────────────┼────────────────────────────────────────────────────────────────────┼────────────────┼───────────┤
│ 1          │ Upstream approach (closures + JSON.parse delegation)               │ 2,080          │ 67%       │
├────────────┼────────────────────────────────────────────────────────────────────┼────────────────┼───────────┤
│ 2          │ Merged container parsing + shared error helper                     │ 1,601          │ 75%       │
├────────────┼────────────────────────────────────────────────────────────────────┼────────────────┼───────────┤
│ 3 (winner) │ Single recursive parse fn + fail()/J aliases + s[i]<'!' whitespace │ 1,488          │ 77%       │
├────────────┼────────────────────────────────────────────────────────────────────┼────────────────┼───────────┤
│ 4          │ Lookup table for keywords + guard helper                           │ 1,628          │ 74%       │
├────────────┼────────────────────────────────────────────────────────────────────┼────────────────┼───────────┤
│ 5          │ Micro-optimizations (slice, charCodeAt, cached JSON.parse)         │ 2,000          │ 69%       │
└────────────┴────────────────────────────────────────────────────────────────────┴────────────────┴───────────┘

Key techniques in the winning version:
- JSON.parse for string escape handling and number parsing (instead of manual character scanning)
- Closures over local i/s/len instead of a class with this.pos/this.str
- fail() and J aliases deduplicate throw Error and JSON.parse references
- s[i] < '!' for whitespace (all whitespace chars are below ! in ASCII)
- lit() helper handles all 6 keyword types (null/true/false/NaN/Infinity/-Infinity) in ~3 lines

Source went from 422 lines → 105 lines. Gzipped from 1,347 → 743 bytes.
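Two of those techniques can be shown standalone (illustrative only, not the vendored source):

```typescript
// JSON whitespace is space, tab, \n and \r, and each of those characters sorts
// below '!' (0x21), so a single comparison replaces a four-way check.
// Callers only ever pass a single character taken from the input, s[i].
const isJsonWhitespace = (c: string): boolean => c < '!';

// One table-driven helper covers all six literal keywords, instead of a
// hand-written matcher per keyword.
const LITERALS: Record<string, unknown> = {
  null: null, true: true, false: false,
  NaN: NaN, Infinity: Infinity, '-Infinity': -Infinity,
};

function lit(s: string, i: number): { value: unknown; next: number } | undefined {
  for (const k of Object.keys(LITERALS)) {
    if (s.startsWith(k, i)) return { value: LITERALS[k], next: i + k.length };
  }
  return undefined;
}
```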
