Shuffle coding is a general method for optimal compression of unordered objects using bits-back coding. Data structures that can be compressed with our method include sets, multisets, graphs, hypergraphs, and others. Shuffle coding achieves state-of-the-art compression rates for unordered arrays (multisets), various molecule datasets and large network graphs, at practical, competitive speeds.
This implementation can be easily adapted to different data types and statistical models.
We published "Practical Shuffle Coding" at NeurIPS 2024, building on our earlier ICLR 2024 publication "Entropy Coding of Unordered Data Structures". This is the official implementation for both papers.
The library has these optional features, disabled by default:

- `experimental`: Enables experimental algorithms, including complete shuffle coding on graphs based on nauty and Traces. Requires a C compiler on your system.
- `bench`: Enables benchmarks used for the research experiments. Includes `experimental`.
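As a sketch of how these feature flags are passed to Cargo (standard `--features` syntax; the feature names come from the list above):

```shell
# Build with the experimental algorithms enabled (needs a C compiler for nauty/Traces):
cargo build --release --features experimental

# bench includes experimental, so this enables both:
cargo build --release --features bench
```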
The binary lets you run benchmark experiments and requires the `bench` feature. To see the available commands, run:

```shell
cargo run --release --features bench -- --help
```

Graph datasets (TU, SZIP and REC) are downloaded automatically as needed.
To replicate experiments from our "Practical Shuffle Coding" paper, run `./practical.sh`.
To replicate experiments from our earlier paper "Entropy Coding of Unordered Data Structures", run `./complete.sh`.
The code has been cleaned up and optimized since publication. Compression speeds are higher than in the published results; compression rates are unchanged.
If you find this code useful, please cite it in your paper:
```bibtex
@article{kunze2024shuffle,
  title={Practical Shuffle Coding},
  author={Kunze, Julius and Severo, Daniel and van de Meent, Jan-Willem and Townsend, James},
  journal={NeurIPS},
  year={2024}
}

@article{kunze2024entropy,
  title={Entropy Coding of Unordered Data Structures},
  author={Kunze, Julius and Severo, Daniel and Zani, Giulio and van de Meent, Jan-Willem and Townsend, James},
  journal={ICLR},
  year={2024}
}
```