Skip to content

[DRAFT] fix: cap decompress size hint when caller passes None#15

Draft
speckhard wants to merge 3 commits intoLeMaterial:mainfrom
speckhard:fix/decompress-cap
Draft

[DRAFT] fix: cap decompress size hint when caller passes None#15
speckhard wants to merge 3 commits intoLeMaterial:mainfrom
speckhard:fix/decompress-cap

Conversation

@speckhard
Copy link
Copy Markdown

Summary

atompack::compression::decompress (re-exported as atompack::decompress_bytes) previously had an unchecked compressed.len() * 100 (LZ4) / * 10 (Zstd) multiplication in the path where the caller passes expected_size = None:

  • The multiplication can overflow usize and panic on large inputs.
  • Even when it doesn't overflow, allocating up to 100× the compressed size on attacker-supplied or otherwise-large inputs is a memory-blowup hazard.
  • For LZ4 specifically, the result was cast to i32 unconditionally — values above i32::MAX (~2.1 GB) silently wrapped.

What changes

  • decompress uses saturating_mul for the None-hint heuristic and caps it at a new DEFAULT_MAX_DECOMPRESSED_SIZE = 1 << 30 (1 GiB).
  • LZ4 path now try_into()s the size hint to i32 and returns a clean Error::Compression if it doesn't fit.
  • Added two unit tests:
    • test_decompress_without_expected_size_does_not_panic — both codecs round-trip a small payload with None hint without panic.
    • test_decompress_lz4_rejects_size_hint_exceeding_i32 — passing Some(i32::MAX + 1) errors cleanly instead of wrapping.

Backward compat

  • atompack's own callers (database.rs, database_flat.rs) always pass Some(uncompressed_size) from the index entry. Internal happy-path behavior is unchanged.
  • External callers using the public decompress_bytes re-export with None:
    • Previously: panic on overflow (unhandled), or huge allocation that may OOM.
    • Now: capped at 1 GiB; if the actual decompressed output exceeds 1 GiB, the codec errors cleanly. Callers who need >1 GiB outputs should pass an explicit Some(...).
  • No change to on-disk format, no API signature change.

Test plan

  • cargo test -p atompack compression — 4 passed (2 existing + 2 new):
running 4 tests
test compression::tests::test_decompress_without_expected_size_does_not_panic ... ok
test compression::tests::test_compression_roundtrip ... ok
test compression::tests::test_decompress_lz4_rejects_size_hint_exceeding_i32 ... ok
test compression::tests::test_atom_compression ... ok

test result: ok. 4 passed; 0 failed
  • pytest atompack-py/tests/ -q after maturin develop --release — 139 passed, 1 skipped (no regression).

The decompress() function previously computed an implicit size hint as
compressed.len() * 100 (LZ4) or * 10 (Zstd) when the caller passed
expected_size = None. The unchecked multiplication can overflow usize
(panic) and even when it doesn't, allocating up to 100x the compressed
size on attacker-supplied or otherwise-large inputs is a clear
memory-blowup hazard.

atompack's own callers always pass Some(uncompressed_size) from the
on-disk index entry, so this is a defensive fix for external callers of
the public lib.rs::decompress_bytes re-export, not a regression in any
internal path.

Changes:
- Use saturating_mul + a 1 GiB cap (DEFAULT_MAX_DECOMPRESSED_SIZE) when
  expected_size = None.
- For LZ4, validate the hint fits in i32 before passing to lz4::block,
  returning a clear error if it doesn't (previously cast wrapped).
- Added two unit tests: one that exercises the None-hint path on both
  codecs, and one that confirms an oversized i32 hint errors cleanly.
- Extract auto_max_size() helper so the saturating + cap math is unit-
  testable without round-tripping a multi-MiB payload.
- New test_auto_max_size_caps_at_default_max asserts:
  - small inputs pass through unchanged,
  - usize::MAX saturates to the 1 GiB cap,
  - len*multiplier just above the cap clamps to the cap exactly.
  This actually exercises the saturate+clamp guard the PR claims to add.
- test_decompress_lz4_rejects_size_hint_exceeding_i32 now asserts the
  specific Error::Compression variant and string-matches "exceeds
  i32::MAX" so a future change that breaks the try_into guard but errors
  elsewhere will fail loudly instead of pass.
- Renamed test_decompress_without_expected_size_does_not_panic to
  ..._roundtrips_small_input to be honest: it's a happy-path roundtrip,
  not a guard test.
The new test_auto_max_size_caps_at_default_max test had two assert_eq!
calls that exceeded max_width=100 once DEFAULT_MAX_DECOMPRESSED_SIZE
was inlined. rustfmt's canonical form splits them across multiple lines.
Caught proactively by mirroring the cargo fmt --all --check pass that
PR LeMaterial#12 needed for its CI.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant