fix(rtmg/web): bulk sends freeze all reads — chunked sends + write_audio recv hardening by gioelecerati · Pull Request #245 · daydreamlive/DEMON

gioelecerati · 2026-06-11T09:34:26Z

Stacked on the rt-input branch (#235). Server half of the deadlock that made live sequencer splices dead end-to-end — client half is rtmg-vst#33. Both were running as hot-patches on the :dev pod (40431735) during today's debugging with @gioelecerati and validated live (paint → splice → write_audio_applied → ack → audible, ~1.5 s + emergence); this PR makes them durable before the next bake wipes them.

1. Chunked (fragmented) bulk sends

websockets-sync holds protocol_mutex across socket.sendall, and recv_events — the thread that reads every inbound frame — needs that same mutex. One 11 MB stem send froze all reads until the peer drained it; against a VST mid write_audio upload (whose own reads were gated behind its sends — ixwebsocket bug, fixed in rtmg-vst#33) the two sides deadlocked permanently: params dead, splices never received, keepalive killed the session (1011 via the CF tunnel).

Thread dump of the live wedge:

conn_handler: send_stem_payload → socket.sendall holding protocol_mutex
recv_events: blocked acquiring protocol_mutex
keepalive: blocked acquiring protocol_mutex

Fix: stems, the post-swap source mirror, and slice frames go out as fragmented messages in ~256 KiB pieces (chunked_ws_send in audio_codec.py) — the mutex releases between fragments so reads always interleave. Fragmentation is invisible at the message layer; payload bytes are identical (verified with the web SDK and the VST's ixwebsocket).
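A minimal sketch of the chunking idea, not the actual `chunked_ws_send` from audio_codec.py. It assumes `ws` is a `websockets` connection, whose `send()` treats an iterable of bytes as one fragmented message. The names `iter_chunks`, `chunked_send_sketch`, and `CHUNK_SIZE` are hypothetical.

```python
from typing import Iterator

CHUNK_SIZE = 256 * 1024  # ~256 KiB, the piece size quoted in the description


def iter_chunks(payload: bytes, chunk_size: int = CHUNK_SIZE) -> Iterator[bytes]:
    """Yield consecutive slices of payload, each at most chunk_size bytes."""
    view = memoryview(payload)  # slice the view so only one chunk is copied at a time
    for start in range(0, len(view), chunk_size):
        yield bytes(view[start:start + chunk_size])


def chunked_send_sketch(ws, payload: bytes, chunk_size: int = CHUNK_SIZE) -> None:
    """Send payload as one message, fragmented when it exceeds chunk_size.

    Assumption: ws.send() accepts either bytes or an iterable of bytes and
    writes each item of the iterable as its own frame of a single message.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if len(payload) <= chunk_size:
        ws.send(payload)  # small payloads go out as a single frame
        return
    ws.send(iter_chunks(payload, chunk_size))
```

The receiver still gets one message with the same bytes; only the framing on the wire changes, which is why the peer needs no protocol change.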

Note: this half also applies to main — stem delivery freezes reads there too (e.g. against a swap upload in flight) — and is worth cherry-picking independently of rt-input.

2. `write_audio` payload read hardening

The binary payload was read with a bare blocking recv() and no type check:

- an orphan write_audio header consumed the next JSON command as its payload (probe-reproduced: audio_write_failed: a bytes-like object is required, not 'str'),
- a payload that never arrived blocked the recv loop forever, wedging the whole session.

The read now has a 10 s timeout and a bytes type check; both failure modes answer audio_write_failed and keep the session alive.
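The hardened read could look roughly like the sketch below. It is an illustration rather than the PR's code: it assumes `ws.recv(timeout=...)` raises `TimeoutError` when nothing arrives in time (as the `websockets` sync connection does), and the function name and return shape are hypothetical.

```python
from typing import Optional, Tuple

PAYLOAD_TIMEOUT_S = 10.0  # the 10 s timeout named in the description


def recv_binary_payload_sketch(
    ws, timeout: float = PAYLOAD_TIMEOUT_S
) -> Tuple[Optional[bytes], Optional[str]]:
    """Read the binary frame that should follow a header.

    Returns (payload, None) on success or (None, reason) on failure, so the
    caller can answer with a *_failed event instead of wedging the session.
    """
    try:
        frame = ws.recv(timeout=timeout)
    except TimeoutError:
        # The payload never arrived: give up rather than block the recv loop.
        return None, f"payload not received within {timeout:g} s"
    if not isinstance(frame, (bytes, bytearray)):
        # An orphan header swallowed a text frame; keep a short preview so
        # the dropped JSON command can be traced in the logs.
        preview = repr(frame)[:80]
        return None, f"expected binary payload, got {type(frame).__name__}: {preview}"
    return bytes(frame), None
```

Returning a reason string instead of raising keeps both failure modes on the same path: the caller reports the failure and carries on reading the next frame.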

Remaining latency levers (not in this PR)

Working end-to-end latency is ~1.5 s transport + 2–5 s emergence. The bar ships as f32 (1.49 MB) — f16/s16/zstd would cut upload 2–10×; #240's near-playhead repatch attacks the emergence delay.
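As a rough worked example of the sample-format part of that estimate: halving the sample width halves the upload. The sketch assumes the 1.49 MB figure is decimal megabytes and does not model zstd, whose ratio depends on the audio. All names are hypothetical.

```python
F32_BAR_BYTES = 1_490_000  # 1.49 MB bar as f32, assuming decimal megabytes

# Bytes per sample for each raw sample format mentioned above.
BYTES_PER_SAMPLE = {"f32": 4, "f16": 2, "s16": 2}


def reencoded_size(f32_bytes: int, sample_format: str) -> int:
    """Size of the same samples stored in another raw sample format."""
    return f32_bytes * BYTES_PER_SAMPLE[sample_format] // BYTES_PER_SAMPLE["f32"]


f16_bytes = reencoded_size(F32_BAR_BYTES, "f16")  # 745_000 bytes, a 2x cut
```

Anything beyond that 2× would have to come from compression on top of the narrower samples.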

🤖 Generated with Claude Code

…io recv hardening

Two server-side halves of the rt-input deadlock (the client halves are rtmg-vst#33), both reproduced and validated live against the :dev pod:

1. websockets-sync holds protocol_mutex across socket.sendall, and recv_events (the thread that reads EVERY inbound frame) needs the same mutex. A single 11 MB stem send therefore froze all reads until the peer drained it; against a VST mid write_audio upload (its own reads gated behind its sends, see rtmg-vst#33) the two sides wedged permanently: params dead, splices never received, keepalive killed the session (1011 via the tunnel).

Thread dump of the live wedge: conn_handler in sendall holding protocol_mutex; recv_events and keepalive blocked acquiring it.

Big payloads (stems, the post-swap source mirror, slice frames) now go out as fragmented messages in ~256 KiB pieces; the mutex releases between fragments, so reads interleave and the cycle cannot form. Fragmentation is invisible at the message layer; payload bytes are identical.

2. write_audio's binary payload was read with a bare blocking recv and no type check: an orphan header consumed the NEXT JSON command as its payload (audio_write_failed: "a bytes-like object is required, not 'str'"), and a payload that never arrived blocked the recv loop forever, wedging the whole session. The read now has a 10 s timeout and a bytes type check; both failure modes answer audio_write_failed and keep the session alive.

The chunked-send half also applies to main (stem delivery freezes reads there too, e.g. against a swap upload in flight) and is worth cherry-picking independently of rt-input.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

…d reads

Review fixes for #245:

- The post-swap source mirror in _serialize_swap_ready was still a plain ws.send of a full-length f16 buffer (tens of MB): the largest single payload on the wire and exactly the read-freezing sendall this PR exists to eliminate. It now goes through chunked_ws_send.
- The 10 s timeout + binary type check added for write_audio now covers set_timbre_source, set_structure_source, and the client-upload arm of swap_source via a shared _recv_binary_payload helper: same orphan-header wedge class, same graceful *_failed answer. The not-binary log includes a preview of the consumed frame so a dropped JSON command is traceable.
- The control-bus recv thunk accepts (and ignores) the timeout kwarg, so the TypeError fallback around recv_audio(timeout=10) is gone. It fired on every MCP-injected write_audio in production and could mask a genuine TypeError from inside ws.recv.
- chunked_ws_send: rename the _chunk param to chunk_size and annotate.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

gioelecerati and others added 2 commits June 11, 2026 11:33

leszko approved these changes Jun 11, 2026

leszko merged commit 0c13e3c into ryanontheinside/feat/models/rt-input Jun 11, 2026

leszko deleted the gio/fix/rt-input-ws-read-starvation branch June 11, 2026 10:07
