Live browser demo:
https://wavey.ai/code/encodec-rs/browser-smoke/
encodec-rs is a Rust EnCodec runtime with native and browser .ecdc
encode/decode paths.
Native execution is implemented in Rust on top of ONNX Runtime and has no
Python runtime dependency. It does not require a Python bridge or external codec
subprocess. The browser path runs the EnCodec ONNX frame models with
onnxruntime-web and uses Rust wasm for .ecdc packaging, parsing,
overlap-add, and deterministic LM arithmetic coding. It also has no Python
runtime dependency.
The native path loads EnCodec-compatible ONNX bundles, encodes 48 kHz stereo
WAV to .ecdc, decodes .ecdc back to WAV, and supports CPU, CUDA, CoreML,
and TensorRT execution targets. LM-assisted entropy coding is implemented in
Rust.
The browser path supports the current q8 LM .ecdc bitstream (acv=2):
- encode a full audio file in the browser with
encode_frame.onnx - package q8 LM arithmetic-coded chunks with Rust wasm
- decode q8
.ecdcpayloads withdecode_frame.onnx - overlap-add decoded frames in Rust wasm
- run ONNX frame models through WebGPU, with WASM available for unsupported nodes
Build the wasm package:
rustup target add wasm32-unknown-unknown
cargo check --lib --no-default-features --features wasm --target wasm32-unknown-unknown
cargo install wasm-pack
wasm-pack build --target web --no-default-features --features wasmRun the local browser encode/decode/playback page:
npm install --prefix browser-smoke
python3 browser-smoke/serve.pyThen open:
http://127.0.0.1:8787/browser-smoke/
The scripted WebGPU matrix runner is:
node scripts/webgpu-matrix.mjsIt writes browser WebGPU artifacts under target/webgpu-matrix/. See
MATRIX.md for the current full-track matrix output folders.
scripts/westside-chunk-wasm-roundtrip.mjs exercises the full wasm
encode/decode path on the Lori Asha - Westside track in independent
fixed-chunk mode. It uses only the exported wasm helpers (no native runtime):
- reads the source WAV from
target/lori-asha-wasm-native/wav/02 - Lori Asha - Westside.48k-stereo.wav - splits it, soundkit-style, into non-overlapping
1.333sPCM chunks (one chunk perencodec_48khz_12kbps_1333msowned hop,64,000samples) - wasm-encodes each chunk to its own standalone
.ecdcintestdata/out/ecdc/ - wasm-decodes each
.ecdc(read back from disk) to PCM intestdata/out/pcm/ - concatenates the per-chunk PCM into one contiguous
testdata/out/westside.contiguous.wav
# full track (~4.5 min, 158 chunks)
node scripts/westside-chunk-wasm-roundtrip.mjs
# quick smoke test over the first N chunks
WESTSIDE_MAX_CHUNKS=3 node scripts/westside-chunk-wasm-roundtrip.mjsChatty progress is written to stderr; a JSON summary is written to stdout
(node scripts/westside-chunk-wasm-roundtrip.mjs 2>/dev/null to keep only the
summary). Each chunk re-uses lmEcdcFixedHeaderForWeights, so the chunk size is
the bundle's 63,520-frame non-overlapping stride (~1.323s); this tiles the
track gaplessly and reconstructs every source frame.
Because wasm-bindgen --target web does not emit a package.json, Node treats
the generated pkg/encodec_rs.js as CommonJS. If the import fails, add the ESM
marker to the gitignored build output:
echo '{ "type": "module" }' > pkg/package.jsonSafari requires Safari 26 or newer for WebGPU, or Safari Technology Preview
with the WebGPU feature enabled. Apple Silicon hardware is not enough by itself;
the browser must expose navigator.gpu to the page. In Safari, enable
Show features for web developers, then open Develop > Feature Flags, search
for WebGPU, and enable it. If present, also enable GPU Process: DOM Rendering
and GPU Process: Canvas Rendering, then quit and reopen Safari.
The exported wasm helpers used by the q8 matrix path are:
ecdcMetadata(payload)ecdcOverlapAdd(bundleJson, audioLength, decodedFrames)lmEcdcHeaderForWeights(bundleJson, audioLength, 2, weights)lmEcdcFixedHeaderForWeights(bundleJson, audioLength, 2, weights)lmEcdcChunk(payload)lmEcdcDecodeChunks(bundleJson, payload)QuantizedLmChunkEncoderQuantizedLmChunkDecoderstableHashHex(bytes)
Use lmEcdcHeaderForWeights for dynamic bundles. Use
lmEcdcFixedHeaderForWeights when writing ECDC against a fixed-length ONNX
graph; it records the fixed chunk samples, stride, and LM frame length (fl) so
decoders pull the full graph width for every chunk, including the final chunk.
For fixed graph chunks, finish LM packet encoding with
QuantizedLmChunkEncoder.finishPadded(frameLength) so encodec-rs writes zero-code
padding for any short final segment before the ECDC packet is wrapped.
Model bundles are hosted on Hugging Face:
Download them into the checkout before running ONNX/browser model paths:
scripts/download-onnx-bundles.shThe hosted bundles target the 48 kHz stereo model family:
onnx-bundles/encodec_48khz_6kbpsonnx-bundles/encodec_48khz_12kbps
Both bundles include:
encode_frame.onnxdecode_frame.onnxlm_weights_q8.binbundle.json
So LM-assisted .ecdc compression works after the bundle download step.
Native and browser LM entropy coding use the q8 Rust/wasm LM backend. Older raw and f32/ONNX-LM bitstreams are intentionally not supported.
The dynamic bundles are the default native bundles. Their frame models accept a variable final frame, so ECDC can derive each chunk's LM frame length from the actual sample count:
| Bundle | Bandwidth | Nominal chunk | Samples | Stride | LM frames | Codebooks |
|---|---|---|---|---|---|---|
encodec_48khz_6kbps |
6 kbps | 1000ms | 48,000 | 47,520 | 150 | 4 |
encodec_48khz_12kbps |
12 kbps | 1000ms | 48,000 | 47,520 | 150 | 8 |
Fixed bundles trace the ONNX graph at one chunk size. ECDC written for these
bundles should include cs, cst, and fl, and should entropy-code the full
fl steps. The PCM input segment is already zero-padded before EnCodec encode;
the ECDC writer must not shorten the LM stream for the final partial chunk.
Fixed bundles are guarded: each logical chunk is encoded with ±10 ms (480
samples) of real neighbouring source context on each side. The model window is
owned + 2 × 480; the guard samples are codec context only and are cropped
after decode, leaving the exact owned-sample timeline. Adjacent decoded chunks
are then joined with a deterministic cubic-hermite-v1 0.5 ms (24-sample) seam
repair (see chunk-continuity.md).
| Fixed chunk | Owned | Model window | LM frames | Bundle suffix |
|---|---|---|---|---|
| 1333ms | 64,000 | 64,960 | 203 | _1333ms |
| 1800ms | 86,400 | 87,360 | 273 | _1800ms |
The default wasm fixed-bundle package ships the 1333ms and
1800ms variants for both 6 kbps and 12 kbps.
- Pure Rust
.ecdccontainer logic - Pure Rust arithmetic coding
- Pure Rust deterministic LM-driven entropy path
- No Python bridge
- No external codec subprocess
The only non-Rust runtime dependency is ONNX Runtime for the neural frame encoder/decoder.
The .ecdc layer is now model-runtime agnostic. Build it without ONNX Runtime:
cargo check --features ecdcNative callers can keep the Rust bitstream path and provide only the neural frame runtime:
ecdc::FrameCodec: metadata plusencode_frame/decode_frameecdc::LmCodec: LM logits for portable arithmetic-coded chunksportable_lm::PortableLmCodec: loadsbundle.json+lm_weights_q8.binwithout ONNX Runtime
The ONNX runtime implements those traits through OnnxFrameCodec
and OnnxLmCodec, so existing CLI/browser parity remains the validation
harness. For iOS/macOS product code, the intended final shape is a Swift/MLX
frame backend, with Core ML or ONNX Runtime used only as transitional parity
checks.
Apple MLX support now lives in this repository under apple/. The Swift package
loads MLX Swift .safetensors archives for frame encode_frame /
decode_frame, while the Rust crate owns .ecdc, portable q8 LM coding, and
the C ABI bridge in src/mlx_bridge.rs. See apple/README.md for Swift package
build, test, and Westside benchmark commands.
After downloading the bundles, convert them with:
target/quant-venv/bin/python scripts/export-mlx-frame-archive.py \
onnx-bundles/encodec_48khz_6kbps \
target/mlx-bundles/encodec_48khz_6kbps
target/quant-venv/bin/python scripts/export-mlx-frame-archive.py \
onnx-bundles/encodec_48khz_12kbps \
target/mlx-bundles/encodec_48khz_12kbps
scripts/create_mlx_fixed_bundles.shEach MLX bundle contains bundle.json, lm_weights_q8.bin,
encode_frame.safetensors, decode_frame.safetensors, and
mlx-manifest.json. The Python step is offline conversion tooling only; the
native app path is Swift/MLX plus the Rust .ecdc/portable-LM boundary.
The fixed-bundle helper exports from the fixed ONNX bundles, so the standard
1333ms and 1800ms MLX bundles use the same 300-step q8 LM weights as ONNX. It
does not create application-specific compatibility bundles.
cargo build --release --features onnxRun tests:
cargo test --features onnxInspect a bundle:
encodec-rs onnx-inspect onnx-bundles/encodec_48khz_6kbpsSmoke-test model execution:
encodec-rs onnx-smoke onnx-bundles/encodec_48khz_6kbpsEncode WAV to .ecdc:
encodec-rs onnx-encode \
onnx-bundles/encodec_48khz_6kbps \
input.wav \
output.ecdcDecode .ecdc to WAV:
encodec-rs onnx-decode \
onnx-bundles/encodec_48khz_6kbps \
input.ecdc \
output.wavDirect frame roundtrip without .ecdc:
encodec-rs onnx-roundtrip-wav \
onnx-bundles/encodec_48khz_6kbps \
input.wav \
output.wavCPU is the default.
Use CUDA:
encodec-rs onnx-encode \
onnx-bundles/encodec_48khz_6kbps \
input.wav \
output.ecdc \
--cudaSelect a GPU explicitly:
encodec-rs onnx-encode \
onnx-bundles/encodec_48khz_6kbps \
input.wav \
output.ecdc \
--cuda \
--device-id 0Use TensorRT:
encodec-rs onnx-encode \
onnx-bundles/encodec_48khz_6kbps \
input.wav \
output.ecdc \
--tensorrt \
--fp16Use CoreML on Apple Silicon:
encodec-rs onnx-encode \
onnx-bundles/encodec_48khz_6kbps \
input.wav \
output.ecdc \
--coreml \
--coreml-compute-units cpu-and-gpuCoreML caches compiled model artifacts under bundle_dir/.coreml-cache/ by
default. Override that with --coreml-cache-dir if needed.
LM chunk payloads are CRC-wrapped by default. The CRC is stored next to each length-prefixed chunk and lets decoders identify corrupted recovered chunks before arithmetic decoding.
Adjust frame batching:
encodec-rs onnx-encode \
onnx-bundles/encodec_48khz_6kbps \
input.wav \
output.ecdc \
--batch-size 16onnx-encodecurrently expects WAV input- input sample rate must match the bundle sample rate
- the hosted bundles are for
48 kHzstereo audio - CLI resampling is not implemented yet
If your source is not already 48 kHz stereo WAV, normalize it first.
encodec-rs writes only the minimal metadata needed to decode the payload:
- model name
- audio length
- codebook count
- LM / arithmetic settings
- q8 bitstream version (
acv=2) - q8 LM weight hash
- fixed chunk sample count (
cs), stride (cst), and LM frame length (fl) when the payload targets a fixed-length graph
An .ecdc file is one self-contained container. The file header is written
once, followed by one or more framed chunk payloads:
4 bytes magic: "ECDC"
1 byte version: 0
4 bytes metadata JSON byte length, big-endian u32
N bytes metadata JSON
repeated chunks:
4 bytes chunk payload length, big-endian u32
4 bytes CRC32 of the chunk payload, big-endian u32
M bytes chunk payload
The normal q8 LM .ecdc path always writes CRC-wrapped chunks. Chunk count is
not stored as a separate top-level field; decoders read chunk frames after the
metadata header and validate the count against the audio length and chunk
layout implied by metadata (al, cs, cst, fl).
Do not concatenate multiple .ecdc files to make one record payload. A record
spiral carries one complete .ecdc byte stream. That stream may contain many
framed chunks internally, but each independently playable record needs its own
container header and metadata.
Add the crate:
encodec-rs = { git = "https://github.com/wavey-ai/encodec-rs.git", features = ["onnx"] }Load the frame codec:
use encodec_rs::onnx::{ExecutionTarget, OnnxFrameCodec};
let mut codec = OnnxFrameCodec::from_dir(
"onnx-bundles/encodec_48khz_6kbps",
ExecutionTarget::Cpu,
)?;
println!("{:#?}", codec.metadata());On the Lori Asha - Westside premix test track, using LM-assisted .ecdc
encoding on both runtimes, the latest local comparison was:
| Codec | Bitrate | Encode | Decode | .ecdc size |
|---|---|---|---|---|
| upstream | 6 kbps | 39.97s | 42.77s | 112,942 bytes |
| upstream | 12 kbps | 44.73s | 49.30s | 239,325 bytes |
encodec-rs |
6 kbps | 27.74s | 26.41s | 116,454 bytes |
encodec-rs |
12 kbps | 31.46s | 30.13s | 243,944 bytes |
So the current Rust runtime is materially faster than upstream on both encode and decode, while payload size is still slightly larger than upstream.
On April 26, 2026, the same Lori Asha - Westside track was also tested on an
Apple M4 host using the new CoreML execution target and LM-assisted 6 kbps
.ecdc encode/decode:
| Runtime | Bitrate | Encode | Decode | .ecdc size |
|---|---|---|---|---|
encodec-rs CoreML (--coreml --coreml-compute-units cpu-and-gpu) |
6 kbps | 163.84s | 157.26s | 115,572 bytes |
That is roughly 5.9x slower than the current encodec-rs benchmark snapshot
above (27.74s encode / 26.41s decode at 6 kbps), so CoreML support is
functional on Apple Silicon but not yet competitive with the current Linux /
NVIDIA path.
On May 19, 2026, after splitting .ecdc from the concrete ONNX runtime, the
same Lori Asha - Westside 48 kHz stereo fixture was measured on an Apple M1
host using ONNX Runtime 1.25.1 CPU, release build, batch size 8, and
LM-assisted .ecdc with chunk CRC enabled:
| Runtime | Bitrate | Encode | Decode | .ecdc size |
vs native snapshot |
|---|---|---|---|---|---|
encodec-rs ONNX CPU on Apple M1 |
6 kbps | 101.44s | 105.67s | 121,816 bytes | 3.66x / 4.00x slower |
encodec-rs ONNX CPU on Apple M1 |
12 kbps | 126.48s | 143.18s | 255,061 bytes | 4.02x / 4.75x slower |
This confirms the trait/backend split did not change the neural runtime: Apple native performance still needs a real MLX/Metal frame backend rather than the current ONNX CPU path.
On the same frame models, the MLX archive export keeps only the initializers needed by the Swift/MLX runtime and the manifest needed to rebuild the graph:
| Bundle | Model | Initializers | Parameters | ONNX file | MLX safetensors |
|---|---|---|---|---|---|
6 kbps |
encode frame | 81 | 8,345,360 | 32M | 32M |
6 kbps |
decode frame | 78 | 7,951,766 | 31M | 30M |
12 kbps |
encode frame | 89 | 9,393,936 | 36M | 36M |
12 kbps |
decode frame | 82 | 8,476,054 | 33M | 32M |
The exported graphs still contain the same neural work as the ONNX benchmark:
convolutions, transposed convolutions, instance normalization, LSTMs, and RVQ
math. The Apple MLX runtime now loads these archives, evaluates native
encode_frame and decode_frame, and bridges q8 LM-assisted .ecdc
encode/decode through Rust with Swift/MLX frame callbacks.
On the same Apple M1 host as the ONNX CPU check above, the full Lori Asha - Westside fixture (208.509s, 48 kHz stereo) was measured through the Release
Apple test bundle with q8 LM entropy coding:
| Runtime | Mode | Bitrate | Encode | Decode | .ecdc size |
|---|---|---|---|---|---|
| Swift/MLX + Rust bridge | q8 LM | 6 kbps | 36.55s | 42.02s | 107,327 bytes |
| Swift/MLX + Rust bridge | q8 LM | 12 kbps | 43.89s | 46.76s | 232,944 bytes |
The q8 LM path is the only supported .ecdc payload path in this checkout.
What is done:
- pure Rust runtime path
- pure Rust
.ecdc - hosted LM-capable
6 kbpsand12 kbpsbundles - CPU / CUDA / CoreML / TensorRT execution targets
What is still missing:
- CLI resampling
- broader model coverage beyond the current
48 kHzstereo family - further compression-ratio tuning versus upstream