fix(js): stabilize HTML islands when streaming content type switches #196

Open
gadenbuie wants to merge 3 commits into main from fix/stabilize-html-islands-during-streaming


@gadenbuie gadenbuie commented Apr 14, 2026

Summary

When an agentic turn contains both a raw HTML island (e.g. <btw-run-r-result> wrapped in <shinychat-raw-html>) and assistant markdown text, the streaming message's contentType flipped to "html" when the HTML island was added and then back to "markdown" when more assistant text was streamed. As a result, the HTML island was destroyed and recreated mid-stream, losing scroll position, tooltip state, and any other ephemeral element state.

Two root causes, two fixes:

processors.ts — move rehypeHighlight before rehypeRaw

After remarkRehype, raw HTML blocks are still opaque text nodes in the HAST. Running rehypeHighlight at this point, before rehypeRaw, means it only sees code fences from markdown (```r etc.) and never reaches content inside islands. Previously rehypeHighlight ran after rehypeRaw, which had already parsed islands into HAST elements, so highlight.js added <span> tags to class="language-r" blocks inside islands. The shinychat-raw-html adapter serializes island children back to HTML via toHtml(), so the changed HAST produced a different html prop and RawHTML's useEffect([html]) reset el.innerHTML, destroying the custom element.
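
The ordering argument above can be illustrated with a small, self-contained model. This is only a sketch: `highlight` and `parseRaw` are hypothetical stand-ins for rehypeHighlight and rehypeRaw, and the `Node` type is a simplification rather than real HAST. It assumes nothing about the actual contents of processors.ts beyond the plugin order described above.

```typescript
// Simplified stand-in for HAST nodes (assumption: real HAST is richer).
type Node =
  | { type: "raw"; value: string }
  | { type: "code"; lang: string; highlighted: boolean; fromIsland: boolean };

type Pass = (nodes: Node[]) => Node[];

// Hypothetical model of rehypeHighlight: marks every parsed code node it can see.
const highlight: Pass = (nodes) =>
  nodes.map((n): Node => (n.type === "code" ? { ...n, highlighted: true } : n));

// Hypothetical model of rehypeRaw: turns opaque raw nodes into parsed code nodes.
const parseRaw: Pass = (nodes) =>
  nodes.map((n): Node =>
    n.type === "raw"
      ? { type: "code", lang: "r", highlighted: false, fromIsland: true }
      : n,
  );

// Apply passes left to right, like plugins attached in order.
const run = (passes: Pass[], nodes: Node[]): Node[] =>
  passes.reduce((acc, pass) => pass(acc), nodes);

// One markdown code fence plus one raw HTML island, as modeled after remarkRehype.
const input: Node[] = [
  { type: "code", lang: "r", highlighted: false, fromIsland: false },
  { type: "raw", value: '<pre><code class="language-r">runif(5)</code></pre>' },
];

const oldOrder = run([parseRaw, highlight], input); // island parsed first, then highlighted
const newOrder = run([highlight, parseRaw], input); // island still opaque when highlight runs

// True when the highlighter modified content that came from an island.
const islandTouched = (nodes: Node[]): boolean =>
  nodes.some((n) => n.type === "code" && n.fromIsland && n.highlighted);
```

In this model `islandTouched(oldOrder)` is true and `islandTouched(newOrder)` is false, while the markdown code fence is highlighted in both orders.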

state.ts — lock contentType to the value set at chunk_start

The "chunk" reducer was updating contentType on every chunk, causing the processor to switch from htmlProcessor to markdownProcessor mid-stream. This changed the overall HAST structure enough that React unmounted and remounted the shinychat-raw-html component, causing a second innerHTML reset. Since markdownProcessor already handles raw HTML via rehypeRaw, the chunk_start content type is sufficient for the full stream.
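
A minimal sketch of the locking behavior, assuming a reducer shaped roughly like the one described. The `StreamState` shape, the `chunk_end` action, and the name `streamReducer` are hypothetical and only serve the illustration; only the "chunk_start" and "chunk" actions and the contentType field come from the description above.

```typescript
type ContentType = "markdown" | "html" | "text";

// Hypothetical state shape for a single streaming message.
interface StreamState {
  content: string;
  contentType: ContentType;
  streaming: boolean;
}

type StreamAction =
  | { type: "chunk_start"; contentType: ContentType }
  | { type: "chunk"; content: string; contentType?: ContentType }
  | { type: "chunk_end" };

function streamReducer(state: StreamState, action: StreamAction): StreamState {
  switch (action.type) {
    case "chunk_start":
      // The content type is chosen once, when the stream begins.
      return { content: "", contentType: action.contentType, streaming: true };
    case "chunk":
      // action.contentType is deliberately ignored: the type stays locked.
      return { ...state, content: state.content + action.content };
    case "chunk_end":
      return { ...state, streaming: false };
  }
}

// Usage: a chunk that claims "html" does not change the locked type.
const initial: StreamState = { content: "", contentType: "text", streaming: false };
let streamed = streamReducer(initial, { type: "chunk_start", contentType: "markdown" });
streamed = streamReducer(streamed, { type: "chunk", content: "<x-card></x-card>", contentType: "html" });
streamed = streamReducer(streamed, { type: "chunk", content: " Done.", contentType: "markdown" });
```

After these three actions, `streamed.contentType` is still "markdown", so in this model the processor choice cannot change mid-stream.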

Verification

In a btw shinychat app, ask a question that causes the model to call btw_tool_run_r() and then respond with follow-up text (e.g. "What are 5 random numbers between 1 and 100?"). Before this fix, the <btw-run-r-result> card was destroyed and recreated when the assistant text arrived, and the message was often parsed incorrectly once it flipped back to "markdown". After this fix, the card remains stable throughout the stream.

When an agentic turn contains both HTML islands (e.g. <btw-run-r-result>
via <shinychat-raw-html>) and assistant markdown text, the streaming
message's contentType flipped from "html" to "markdown" on the first
text chunk. This caused two problems:

1. rehypeHighlight (in markdownProcessor but not htmlProcessor) traversed
   the full HAST tree including nodes inside raw HTML islands. Because
   btw uses class="language-r" on source blocks, highlight.js added <span>
   tags to those blocks. The shinychat-raw-html adapter serializes island
   children back to an HTML string via toHtml(), so the changed HAST
   produced a different string — RawHTML's useEffect([html]) fired and
   reset el.innerHTML, destroying and recreating the custom element.

2. Switching from htmlProcessor to markdownProcessor changed the overall
   HAST structure enough that React unmounted and remounted the
   shinychat-raw-html component, causing a second innerHTML reset.

Fix — two changes:

- processors.ts: move rehypeHighlight before rehypeRaw in markdownProcessor.
  After remarkRehype, raw HTML blocks are still opaque text nodes in the
  HAST. rehypeHighlight running here only sees code fences from markdown
  (```r etc.) and never reaches content inside islands. rehypeRaw then
  parses the raw nodes into HAST elements, after which rehypeHighlight has
  already passed.

- state.ts: stop updating contentType on individual "chunk" actions.
  The processor is now locked to the contentType set at chunk_start for
  the entire stream. markdownProcessor already handles raw HTML via
  rehypeRaw, so the chunk_start contentType is sufficient. This prevents
  the processor from switching mid-stream regardless of per-chunk
  content_type values.
Compiled output of fix(js): stabilize HTML islands when streaming
content type switches.
@gadenbuie gadenbuie marked this pull request as ready for review April 14, 2026 14:23
…art behavior

The reducer no longer updates contentType mid-stream, so update the test
description and assertion to document the new invariant.

gadenbuie commented Apr 14, 2026

I asked Claude, in a fresh session, to re-evaluate the choice to disallow changing contentType mid-stream:


Why locking contentType at chunk_start is safe

The state.ts change drops mid-stream contentType updates, and a test was updated to reflect this: "ignores contentType provided by a chunk (locked to chunk_start)". Here's the reasoning for why removing this capability is safe and correct.

The original mid-stream switching pattern was: a response begins as "markdown", then a chunk arrives with content_type: "html" (signaling a UI element appeared), and the reducer switched from markdownProcessor to htmlProcessor. The intent was to let the server "upgrade" the renderer on-the-fly when HTML islands appeared in the stream.

This was unnecessary because markdownProcessor is a strict superset of htmlProcessor. It runs remarkRehype (markdown → HAST) plus rehypeRaw (parses raw HTML text nodes into HAST elements). Any content htmlProcessor handles, markdownProcessor handles too — so there is never a reason to switch away from it.

The mid-stream switch was actually harmful in two ways:

  1. rehypeHighlight traversed island content. In the old pipeline order (remarkRehype → rehypeRaw → rehypeHighlight), highlight.js ran after rehypeRaw had already parsed raw HTML blocks into HAST nodes, so it could see inside <shinychat-raw-html> islands. btw uses class="language-r" on source blocks inside those islands, so highlight.js added <span> tags to them. The shinychat-raw-html adapter serializes island children back to an HTML string via toHtml(), so the mutated HAST produced a different string → RawHTML's useEffect([html]) fired → el.innerHTML reset → custom element destroyed and recreated. Fixed by moving rehypeHighlight before rehypeRaw.

  2. Processor switching caused React unmount/remount. Even with the highlighting fixed, switching from htmlProcessor to markdownProcessor mid-stream changes the overall HAST structure (different wrapping, different node types), which React interprets as a component tree change and unmounts/remounts shinychat-raw-html — a second innerHTML reset. Fixed by locking the processor to chunk_start.

The chunk_start contentType is therefore sufficient for the entire stream.
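
Why a changed serialization is enough to destroy the element can be sketched without React. The class below is a hypothetical stand-in for RawHTML that mimics the dependency comparison of useEffect([html]): it "resets" only when the html string differs from the previous render. The `highlighted` string is made-up markup for illustration, not actual highlight.js output.

```typescript
// Hypothetical stand-in for RawHTML: resets only when the html string changes,
// mirroring how an effect keyed on [html] re-runs.
class RawHtmlHost {
  private lastHtml: string | undefined;
  resets = 0;

  render(html: string): void {
    if (html !== this.lastHtml) {
      this.lastHtml = html;
      this.resets += 1; // the real component would assign el.innerHTML here
    }
  }
}

// Illustrative serializations of the same island before and after highlighting.
const plain = '<code class="language-r">runif(5)</code>';
const highlighted =
  '<code class="language-r"><span class="hl">runif</span>(5)</code>';

// Stable case: the island serializes identically on every render.
const stable = new RawHtmlHost();
stable.render(plain);
stable.render(plain);

// Unstable case: a later render sees highlighted markup, so the string differs.
const unstable = new RawHtmlHost();
unstable.render(plain);
unstable.render(highlighted);
```

In this model `stable.resets` ends at 1 (the initial mount only), while `unstable.resets` ends at 2, which corresponds to the custom element being torn down and rebuilt.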

@gadenbuie gadenbuie requested a review from cpsievert April 14, 2026 14:43