Summary
Phase 6.6 of #28: complete the legacy frame decoder set with v0.5, v0.6, v0.7 support. These three formats are closer to the final v1.0 spec than v0.1-v0.4, so the dispatcher infra + shared bitstream/Huffman/FSE primitives from 6.5 (#65) carry over with relatively small additions per version.
After this issue lands, libzstd.so.1 decodes every published zstd frame format from 2014 (v0.1) through current v1.5.7 — matching upstream's full ZSTD_LEGACY_SUPPORT=1 configuration.
Format versions covered by this issue
| Version | Year | Magic | Notable differences from v1.0 |
|---|---|---|---|
| v0.5 | 2015 | 0xFD2FB525 | Block format converged toward v1.0 layout. Dictionary improvements. Skippable frames introduced. |
| v0.6 | 2016 | 0xFD2FB526 | Frame Content Size encoding refined. Single-segment flag. |
| v0.7 | 2016 | 0xFD2FB527 | Last format before v1.0 stabilization. Almost-identical block + sequence layouts; primary differences in frame header reserved bits and checksum placement. |
By v0.5 the format had converged enough that the decoders from v0.5 through v1.0 share a substantial fraction of their bitstream / Huffman / FSE entropy reader implementation. Expectation: v0.5 is the largest of the three (~700 LoC); v0.6 and v0.7 are smaller deltas (~500-600 LoC each) on top of v0.5's primitives.
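The magic numbers in the table are enough to route a frame to the right decoder. A minimal sketch of that lookup follows; the helper name `legacy_version` and its signature are assumptions for illustration, not the crate's actual API. The magic is read as a little-endian 32-bit value from the first four bytes of the frame.

```rust
// Magic numbers from the table above (stored little-endian at the start of a frame).
const MAGIC_V05: u32 = 0xFD2F_B525;
const MAGIC_V06: u32 = 0xFD2F_B526;
const MAGIC_V07: u32 = 0xFD2F_B527;

/// Hypothetical helper (name and signature are assumptions): returns the
/// legacy minor version (5, 6, or 7) if `src` starts with one of the magic
/// numbers above, or `None` otherwise, including inputs shorter than 4 bytes.
fn legacy_version(src: &[u8]) -> Option<u8> {
    // `get(..4)` returns None for short inputs instead of panicking.
    let b = src.get(..4)?;
    match u32::from_le_bytes([b[0], b[1], b[2], b[3]]) {
        MAGIC_V05 => Some(5),
        MAGIC_V06 => Some(6),
        MAGIC_V07 => Some(7),
        _ => None,
    }
}
```

A real dispatcher would also cover the v0.1-v0.4 magics from #65; they are left out here because this issue does not list them.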
Deliverables
1. Extend decoding/legacy/
decoding/legacy/
├── mod.rs — dispatcher updated to route v0.5/v0.6/v0.7 magic
├── v01..v04.rs — from #65
├── v05.rs — v0.5 decoder
├── v06.rs — v0.6 decoder (delta on v05)
└── v07.rs — v0.7 decoder (delta on v06)
Where possible, share primitives between v0.5/v0.6/v0.7 via legacy/shared/ (FSE table reader, Huffman tree decoder if the layout stabilized by v0.5, sequence section parser). Don't try to share with v0.1-v0.4 — those formats diverged enough that the shared layer would be a tangle of match version { ... } arms.
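One way to share code across v0.5/v0.6/v0.7 without `match version { ... }` arms is to make the shared helpers generic over a per-version marker type. The sketch below only illustrates that idea: `LegacyFormat`, `V05`, `V06`, `V07`, `strip_magic`, and `label` are hypothetical names, and the real shared layer may be organized differently.

```rust
// Sketch only: every name here is hypothetical, not part of the crate or upstream zstd.

/// Per-version constants that a shared primitive can be generic over.
trait LegacyFormat {
    const VERSION: u8;
    const MAGIC: u32;
}

struct V05;
struct V06;
struct V07;

impl LegacyFormat for V05 {
    const VERSION: u8 = 5;
    const MAGIC: u32 = 0xFD2F_B525;
}
impl LegacyFormat for V06 {
    const VERSION: u8 = 6;
    const MAGIC: u32 = 0xFD2F_B526;
}
impl LegacyFormat for V07 {
    const VERSION: u8 = 7;
    const MAGIC: u32 = 0xFD2F_B527;
}

/// Shared helper written once and monomorphized per version: checks the
/// little-endian magic and returns the bytes that follow it.
fn strip_magic<F: LegacyFormat>(src: &[u8]) -> Option<&[u8]> {
    let b = src.get(..4)?;
    if u32::from_le_bytes([b[0], b[1], b[2], b[3]]) == F::MAGIC {
        Some(&src[4..])
    } else {
        None
    }
}

/// Human-readable label for diagnostics, again shared across versions.
fn label<F: LegacyFormat>() -> String {
    format!("v0.{}", F::VERSION)
}
```

With this shape, `v06.rs` and `v07.rs` would call `strip_magic::<V06>` or `strip_magic::<V07>` rather than passing a version number around at runtime.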
2. C FFI surface (no new symbols)
The dispatcher in 6.5 (#65) already exports ZSTD_isLegacy / ZSTD_decompressLegacy. This issue extends the dispatcher's per-version arm but doesn't add new FFI symbols.
3. Tests
- decoding/legacy/tests/v05_corpus.rs: vendored v0.5 archive corpus.
- decoding/legacy/tests/v06_corpus.rs: vendored v0.6 archive corpus.
- decoding/legacy/tests/v07_corpus.rs: vendored v0.7 archive corpus.
- Cross-version dispatch: a single test that feeds an archive of each version (concatenated frames) and asserts each segment decodes correctly.
- cli/tests/legacy_dispatch.rs extended to cover v0.5/v0.6/v0.7.
4. Coverage of ZSTD_LEGACY_SUPPORT=N levels
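Match upstream's ZSTD_LEGACY_SUPPORT macro values:
- ZSTD_LEGACY_SUPPORT=1: all legacy versions (v0.1 through v0.7) decoded. This is what this issue plus #65 deliver.
- ZSTD_LEGACY_SUPPORT=N (2..7): decodes v0.N and newer; older versions are excluded.
- ZSTD_LEGACY_SUPPORT=0: legacy support fully disabled, same as --no-default-features from #65.
Cargo features:
- Currently: legacy (default: on) enables all versions.
- After this issue: legacy-v1, legacy-v2, ..., legacy-v7 granular features, matching upstream's per-version exclusion. Default-on still bundles them all.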
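The relationship between an upstream ZSTD_LEGACY_SUPPORT level and the granular features can be sketched as a small mapping. This assumes upstream's threshold semantics, where ZSTD_LEGACY_SUPPORT=N keeps decoders for v0.N and newer and 0 disables legacy support. The function name is hypothetical, and treating values above 7 as "nothing enabled" is an assumption.

```rust
/// Hypothetical helper: names of the granular Cargo features that would be
/// equivalent to building upstream with ZSTD_LEGACY_SUPPORT=n.
fn features_for_legacy_support(n: u32) -> Vec<String> {
    match n {
        // 0 disables legacy decoding entirely (like --no-default-features).
        0 => Vec::new(),
        // n in 1..=7 keeps v0.n and every newer legacy version.
        1..=7 => (n..=7).map(|v| format!("legacy-v{}", v)).collect(),
        // Assumption: thresholds above 7 leave no legacy decoder enabled.
        _ => Vec::new(),
    }
}
```

For example, a threshold of 5 maps to legacy-v5, legacy-v6, and legacy-v7, while a threshold of 1 maps to all seven features, which is what the default-on legacy feature bundles.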
Out of scope
- Legacy encoding — upstream doesn't expose legacy encoders. We don't either.
- v0.8 / v1.0-beta — never published as stable, no archives exist in the wild.
Acceptance criteria
- Granular features (legacy-v1 ... legacy-v7) compose correctly: each feature flag includes/excludes the right module.
- ZSTD_decompress dispatches correctly for all seven legacy magic numbers (combined with the #65 corpus).
- [...] --features legacy (all on) and --no-default-features (all off).
Estimate
~10-12 working days (~1700 LoC including shared primitives + cross-version tests). Smaller than #65 because v0.5-v0.7 are closer to v1.0 and share more infra.
Blocked by
References
- lib/legacy/zstd_v0{5,6,7}.{c,h} (v1.5.7)