streaming: accept optional zstd/float16-compressed source frames #225
Draft
gioelecerati wants to merge 1 commit into
The binary source decoder (_decode_audio_msg) only accepted raw float32 PCM. Add an optional, backward-compatible compact framing: a 0xFFFFFFFF sentinel (never a valid channel count) + a flags byte selecting zstd compression and/or float16 samples. Legacy float32 frames are unchanged. The source is only a conditioning reference (it gets encoded to a latent), so float16 + zstd shrink uploads substantially with no audible cost - especially for repetitive/looped material. The accepted encodings are advertised in the ready payload (src_encodings) so clients can gate.
What
Make the streaming binary source decoder (_decode_audio_msg) accept an
optional, backward-compatible compact frame alongside the existing
raw-float32 frame.
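Compact frame layout: a 0xFFFFFFFF sentinel (never a valid channel count) followed by a flags byte selecting zstd compression and/or float16 samples.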
The legacy frame (<II channels,samples> + float32) is unchanged; the
channel count is always 1 or 2, so it can never collide with the sentinel.
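To make the sentinel dispatch concrete, here is a minimal sketch of how a decoder could tell the two frames apart. It is not the PR's implementation. The function name `decode_source_frame`, the flag bit values, and the field order after the flags byte (`<II` channels,samples, then the payload) are assumptions for illustration. The zstd decompressor is passed in as a callable so the sketch stays stdlib-only.

```python
import struct

SENTINEL = 0xFFFFFFFF  # from the PR: never a valid channel count
FLAG_ZSTD = 0x01       # assumed bit assignment (hypothetical)
FLAG_F16 = 0x02        # assumed bit assignment (hypothetical)


def decode_source_frame(buf, zstd_decompress=None):
    """Return (channels, samples, flat list of sample values).

    zstd_decompress is a caller-supplied bytes -> bytes callable,
    required only when the zstd flag is set.
    """
    (first,) = struct.unpack_from("<I", buf, 0)
    if first != SENTINEL:
        # Legacy frame: <II channels,samples> followed by float32 samples.
        channels, samples = struct.unpack_from("<II", buf, 0)
        payload = bytes(buf[8:])
        sample_fmt = "f"
    else:
        # Compact frame (assumed layout): sentinel, flags byte,
        # then <II channels,samples>, then the payload.
        flags, channels, samples = struct.unpack_from("<BII", buf, 4)
        payload = bytes(buf[13:])
        if flags & FLAG_ZSTD:
            if zstd_decompress is None:
                raise ValueError("zstd frame received but no decompressor given")
            payload = zstd_decompress(payload)
        sample_fmt = "e" if flags & FLAG_F16 else "f"
    count = channels * samples
    # struct.unpack raises struct.error if the payload size does not match.
    values = struct.unpack("<%d%s" % (count, sample_fmt), payload)
    return channels, samples, list(values)
```

Because the first four bytes of a legacy frame are the channel count (1 or 2), a single `<I` read is enough to choose the branch without ambiguity.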
Why
The uploaded source is only a conditioning reference (it's encoded straight
to a latent), so full float32 sample precision is not needed. float16 halves
the payload size; zstd then crushes repetitive/looped material. Net: much
smaller source uploads, no audible change, faster connect.
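As a rough illustration of the size argument, the sketch below computes the uncompressed payload size for both sample widths and round-trips one sample through float16. The clip parameters (stereo, 48 kHz, 30 s) and the helper names are assumptions chosen for the example, not values from the PR.

```python
import struct


def payload_bytes(channels, samples, bytes_per_sample):
    """Uncompressed payload size, excluding any frame header."""
    return channels * samples * bytes_per_sample


def f16_roundtrip(x):
    """Quantize a Python float to IEEE 754 half precision and back."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


# Assumed example clip: stereo, 48 kHz, 30 seconds.
samples = 48_000 * 30
f32_size = payload_bytes(2, samples, 4)  # 11_520_000 bytes
f16_size = payload_bytes(2, samples, 2)  # 5_760_000 bytes
quantization_error = abs(f16_roundtrip(0.3) - 0.3)
```

The additional zstd gain depends on the material and is not modeled here.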
Compatibility
- Accepted encodings are advertised in ready.src_encodings so clients can feature-gate (and fall back to float32 against older servers).
- swap_source / timbre / structure all benefit.

Draft: opening for review of the wire format.
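The feature-gating described under Compatibility could look roughly like this on the client side. This is a sketch under stated assumptions: it assumes the ready payload is a dict, that src_encodings is a list of strings, and that the encoding tokens are named "zstd+f16" and "f16". The PR text shown here does not give the actual token names, so these are hypothetical.

```python
def choose_source_encoding(ready, preferred=("zstd+f16", "f16")):
    """Pick the most compact encoding the server advertises.

    Falls back to legacy float32 when src_encodings is absent,
    which is what an older server would send.
    The token names in `preferred` are hypothetical.
    """
    advertised = ready.get("src_encodings") or []
    for name in preferred:
        if name in advertised:
            return name
    return "float32"
```

An older server that omits src_encodings simply yields "float32", so the client keeps sending legacy frames.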