mossnano-rs

Rust-first tooling for MOSS-Audio-Tokenizer-Nano RVQ artifacts, with a small WASM layer for browser playback through ONNX Runtime Web.

The crate owns the deterministic codec-adjacent work:

.mossnano container parsing and writing
10-bit RVQ token packing and unpacking
metadata, duration, and bitrate accounting
streaming decode chunk scheduling
ONNX token layout conversion from [quantizer, frame] to [time, quantizer]
decoded PCM accumulation
PCM16 WAV writing
wasm-bindgen exports for browser and Node.js use
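
As an illustration of the PCM16 WAV writing item in the list above, here is a minimal sketch of turning planar float audio into a 16-bit WAV buffer. The function name `planar_f32_to_pcm16_wav` is hypothetical and is not the crate's actual API; the sketch assumes every channel holds the same number of samples and uses the standard 44-byte RIFF/WAVE header.

```rust
/// Hypothetical helper (not the crate's API): encode planar f32 audio,
/// laid out as [channel][frame], into an interleaved PCM16 WAV byte buffer.
/// Assumes all channels have the same length.
pub fn planar_f32_to_pcm16_wav(planar: &[Vec<f32>], sample_rate: u32) -> Vec<u8> {
    let channels = planar.len() as u16;
    let frames = planar.first().map_or(0, |c| c.len());
    let block_align = channels * 2; // bytes per interleaved frame
    let data_len = (frames * block_align as usize) as u32;

    let mut out = Vec::with_capacity(44 + data_len as usize);
    out.extend_from_slice(b"RIFF");
    out.extend_from_slice(&(36 + data_len).to_le_bytes());
    out.extend_from_slice(b"WAVE");
    out.extend_from_slice(b"fmt ");
    out.extend_from_slice(&16u32.to_le_bytes()); // fmt chunk size
    out.extend_from_slice(&1u16.to_le_bytes()); // format tag 1 = integer PCM
    out.extend_from_slice(&channels.to_le_bytes());
    out.extend_from_slice(&sample_rate.to_le_bytes());
    out.extend_from_slice(&(sample_rate * block_align as u32).to_le_bytes()); // byte rate
    out.extend_from_slice(&block_align.to_le_bytes());
    out.extend_from_slice(&16u16.to_le_bytes()); // bits per sample
    out.extend_from_slice(b"data");
    out.extend_from_slice(&data_len.to_le_bytes());

    // Interleave: for each frame, write one sample per channel.
    for frame in 0..frames {
        for channel in planar {
            let clamped = channel[frame].clamp(-1.0, 1.0);
            let sample = (clamped * 32767.0).round() as i16;
            out.extend_from_slice(&sample.to_le_bytes());
        }
    }
    out
}
```

Clamping before scaling keeps out-of-range decoder output from wrapping around when it is cast to `i16`.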

The neural model still runs through the official MOSS Nano ONNX graphs using onnxruntime-web. This keeps the Rust/WASM surface small and predictable while preserving the path to browser playback.

Status

This is an early experimental repo. Decode playback uses the official moss_audio_tokenizer_decode_step.onnx graph with transformer offsets and attention cache tensors carried across chunks. The cache position tensors must start at -1, matching the native model reset path.

moss_audio_tokenizer_decode_full.onnx is still useful for whole-file reference decodes. Do not run the full graph from a fresh state on each playback chunk: doing so creates audible artifacts at chunk boundaries and does not match native output.

Tested locally with MOSS-Audio-Tokenizer-Nano RVQ16 stereo artifacts at 48 kHz.

Streaming Decode

MOSS Nano emits one RVQ token frame per 3,840 decoded samples. At 48 kHz this is an 80 ms quantum, so second-based chunk targets must snap to whole token frames.

Target	Token frames	Actual duration
1.333 s	17	1.36 s
1.8 s	23	1.84 s
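
The snapping arithmetic in the table can be sketched as follows. The function names are hypothetical, and rounding up is an assumption: it reproduces both rows of the table, but the crate may use a different rounding rule. Targets are taken in whole milliseconds to keep the arithmetic in integers.

```rust
/// Decoded samples produced per RVQ token frame, as stated above.
pub const SAMPLES_PER_TOKEN_FRAME: u64 = 3_840;

/// Hypothetical helper: snap a chunk target (in milliseconds) up to a
/// whole number of token frames, never returning fewer than one frame.
pub fn snap_chunk_frames(target_ms: u64, sample_rate: u64) -> u64 {
    let target_samples = target_ms * sample_rate / 1_000;
    // Ceiling division: round any partial frame up to a full one.
    let frames = (target_samples + SAMPLES_PER_TOKEN_FRAME - 1) / SAMPLES_PER_TOKEN_FRAME;
    frames.max(1)
}

/// Hypothetical helper: actual chunk duration in milliseconds for a
/// given number of token frames.
pub fn chunk_duration_ms(frames: u64, sample_rate: u64) -> u64 {
    frames * SAMPLES_PER_TOKEN_FRAME * 1_000 / sample_rate
}
```

With a 48 kHz sample rate, a 1333 ms target snaps to 17 frames (1360 ms) and an 1800 ms target snaps to 23 frames (1840 ms), matching the table.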

The WASM API exposes MossNanoDecodeStream:

const stream = new MossNanoDecodeStream(artifactBytes, 17);

while (stream.hasNext()) {
  const start = stream.nextStartFrame();
  const tokenFrames = stream.nextTokenFrames();
  const codes = stream.nextCodesTqI32();

  // Run decode_step.onnx with:
  // audio_codes: [1, tokenFrames, quantizers], int32
  // audio_code_lengths: [1], int32
  // plus the carried transformer/attention state tensors

  stream.pushDecodedPlanar(decodedPlanarF32, channels, decodedFrames);
}

const wavBytes = stream.finishPcm16Wav();

Rust handles chunk scheduling, token slicing, token transposition, decoded audio assembly, and WAV writing. JavaScript loads ONNX Runtime Web, invokes the stateful decoder graph for each chunk, and feeds every state output back into the next chunk.
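
The token slicing and transposition step can be pictured with the sketch below. It is illustrative only: `chunk_codes_tq_i32` is a hypothetical free function, not the method exposed by MossNanoDecodeStream, and it assumes the unpacked codes are held as a flat `u16` buffer in [quantizer, frame] order.

```rust
/// Hypothetical helper: copy token frames [start, start + count) out of a
/// flat [quantizer, frame] buffer into a flat [time, quantizer] i32 buffer,
/// the layout the audio_codes input expects.
pub fn chunk_codes_tq_i32(
    codes_qf: &[u16],
    quantizers: usize,
    frames: usize,
    start: usize,
    count: usize,
) -> Vec<i32> {
    assert_eq!(codes_qf.len(), quantizers * frames, "buffer size mismatch");
    assert!(start + count <= frames, "chunk exceeds available frames");

    let mut out = Vec::with_capacity(count * quantizers);
    for t in start..start + count {
        for q in 0..quantizers {
            // Source index is row-major over [quantizer, frame].
            out.push(i32::from(codes_qf[q * frames + t]));
        }
    }
    out
}
```

For a chunk of `count` frames, the result has `count * quantizers` elements and can be wrapped in a tensor of shape [1, count, quantizers].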

Container Format

The current .mossnano container is intentionally tiny:

Bytes	Field
0..8	ASCII magic `MOSSNANO`
8..12	`sample_rate`, little-endian `u32`
12..16	`channels`, little-endian `u32`
16..20	`original_samples`, little-endian `u32`
20..24	`quantizers`, little-endian `u32`
24..28	`frames`, little-endian `u32`
28..32	`codebook_size`, little-endian `u32`
32..	LSB-first packed RVQ codes
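
A minimal sketch of reading the 32-byte header laid out in the table above. The `Header` struct, `parse_header`, and `duration_seconds` are hypothetical names used for illustration and may not match the crate's types. Deriving duration as `original_samples / sample_rate` is also an assumption.

```rust
/// Hypothetical mirror of the fixed header fields in the table above.
#[derive(Debug, PartialEq, Eq)]
pub struct Header {
    pub sample_rate: u32,
    pub channels: u32,
    pub original_samples: u32,
    pub quantizers: u32,
    pub frames: u32,
    pub codebook_size: u32,
}

/// Parse the header; returns None on a short buffer or a wrong magic.
pub fn parse_header(bytes: &[u8]) -> Option<Header> {
    if bytes.len() < 32 || &bytes[0..8] != b"MOSSNANO" {
        return None;
    }
    // Field `i` is the i-th little-endian u32 after the 8-byte magic.
    let field = |i: usize| {
        let o = 8 + 4 * i;
        u32::from_le_bytes([bytes[o], bytes[o + 1], bytes[o + 2], bytes[o + 3]])
    };
    Some(Header {
        sample_rate: field(0),
        channels: field(1),
        original_samples: field(2),
        quantizers: field(3),
        frames: field(4),
        codebook_size: field(5),
    })
}

impl Header {
    /// Duration of the original audio in seconds (assumed definition).
    pub fn duration_seconds(&self) -> f64 {
        f64::from(self.original_samples) / f64::from(self.sample_rate)
    }
}
```

The packed RVQ codes begin at byte 32, immediately after these fields.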

For MOSS Nano RVQ16, codebook_size = 1024, so each token is packed into 10 bits. The packed code order is [quantizer, frame].
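
One common reading of "LSB-first" packing is that each code's low bits fill the lowest unused bit positions of the byte stream. The sketch below implements that reading for code widths up to 16 bits. The function names are hypothetical, and the exact bit order used by the crate is an assumption here, so verify against real artifacts before relying on it.

```rust
/// Hypothetical helper: pack `bits`-wide codes LSB-first into bytes.
/// The final byte is zero-padded in its high bits.
pub fn pack_codes_lsb(codes: &[u16], bits: u32) -> Vec<u8> {
    assert!((1..=16).contains(&bits));
    let mask = (1u32 << bits) - 1;
    let mut out = Vec::with_capacity((codes.len() * bits as usize + 7) / 8);
    let (mut acc, mut filled) = (0u32, 0u32);
    for &code in codes {
        acc |= (u32::from(code) & mask) << filled;
        filled += bits;
        // Flush every completed byte, lowest bits first.
        while filled >= 8 {
            out.push((acc & 0xFF) as u8);
            acc >>= 8;
            filled -= 8;
        }
    }
    if filled > 0 {
        out.push((acc & 0xFF) as u8);
    }
    out
}

/// Hypothetical helper: unpack `count` codes of `bits` width.
/// Returns None if the byte slice is too short.
pub fn unpack_codes_lsb(bytes: &[u8], bits: u32, count: usize) -> Option<Vec<u16>> {
    assert!((1..=16).contains(&bits));
    let mask = (1u32 << bits) - 1;
    let mut out = Vec::with_capacity(count);
    let (mut acc, mut filled) = (0u32, 0u32);
    let mut iter = bytes.iter();
    for _ in 0..count {
        // Refill the accumulator until a whole code is available.
        while filled < bits {
            acc |= u32::from(*iter.next()?) << filled;
            filled += 8;
        }
        out.push((acc & mask) as u16);
        acc >>= bits;
        filled -= bits;
    }
    Some(out)
}
```

At 10 bits per code, four codes occupy exactly five bytes, so a payload of `quantizers * frames` codes needs `ceil(quantizers * frames * 10 / 8)` bytes.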

Native CLI

Inspect a .mossnano artifact:

cargo run -- info path/to/file.mossnano

Unpack codes to little-endian u16 values:

cargo run -- unpack-u16le path/to/file.mossnano target/codes.u16le

Browser And Node Setup

Install Rust and Node dependencies:

rustup target add wasm32-unknown-unknown
cargo install wasm-bindgen-cli --version 0.2.121
cd web
npm install
cd ..

Build the WASM package:

scripts/build-wasm.sh

Run the Rust/WASM smoke test:

node scripts/wasm-smoke.mjs

ONNX Weights

Weights are intentionally not committed. Download the official browser-oriented ONNX bundle into weights/:

scripts/download-onnx.sh

Expected files:

moss_audio_tokenizer_decode_full.onnx
moss_audio_tokenizer_decode_step.onnx
moss_audio_tokenizer_decode_shared.data
moss_audio_tokenizer_encode.onnx
moss_audio_tokenizer_encode.data
codec_browser_onnx_meta.json

Decode-only playback needs the decoder graph and shared decoder data, about 45 MB total. Encode plus decode needs about 90 MB.

Decode From Node

Decode a .mossnano artifact with the 1.333-second target. This uses the stateful decode_step graph by default:

node scripts/decode-node.mjs \
  --input path/to/file.mossnano \
  --output target/decoded.wav \
  --chunk-seconds 1.333

Decode with the 1.8-second target:

node scripts/decode-node.mjs \
  --input path/to/file.mossnano \
  --output target/decoded-1p8.wav \
  --chunk-seconds 1.8

You can also pass exact token-frame chunks:

node scripts/decode-node.mjs --input path/to/file.mossnano --chunk-frames 23

For a whole-file reference pass through decode_full.onnx, pass the artifact's full token frame count as --chunk-frames (50 in this example) and opt into the full decoder:

node scripts/decode-node.mjs \
  --input path/to/file.mossnano \
  --output target/decoded-full.wav \
  --chunk-frames 50 \
  --decoder full

Compare a chunked output against a reference and inspect chunk joins:

node scripts/compare-wav-boundaries.mjs \
  --reference target/decoded-full.wav \
  --candidate target/decoded.wav \
  --chunk-frames 17

Browser Prototype

Start a local server:

cd web
npm run serve

Open http://localhost:8765/web/, choose a .mossnano file, and leave the model root as:

../weights/MOSS-Audio-Tokenizer-Nano-ONNX/

The page loads the Rust WASM package, fetches the ONNX decoder graph and shared external data, decodes chunk by chunk, and creates a playable WAV blob in the browser.

Development

Run the native tests:

cargo test

Run formatting and JS syntax checks:

cargo fmt --check
node --check scripts/decode-node.mjs
node --check web/mossnano-player.js

Generated files and downloaded weights are ignored by git:

target/
weights/
web/node_modules/
web/pkg/

License

MIT
