Conversation
Walkthrough

Introduces comprehensive core ZSTD compression and decompression infrastructure: entropy coding (FSE, Huffman), compression strategies (double-fast, LDM, optimal, multithreaded), dictionary training (FASTCOVER, suffix arrays), seekable format support, benchmarking tools, and cross-platform threading primitives.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant User
    participant Compressor as ZSTD Compression<br/>Pipeline
    participant MatchFinder as Match Finder<br/>(Double-Fast/LDM/Opt)
    participant SeqEncoder as Sequence<br/>Encoder
    participant Entropy as Entropy Coding<br/>(FSE/Huffman)
    User->>Compressor: Input data + params
    activate Compressor
    Compressor->>MatchFinder: Block data
    activate MatchFinder
    MatchFinder-->>Compressor: Match sequences
    deactivate MatchFinder
    Compressor->>SeqEncoder: Sequences
    activate SeqEncoder
    SeqEncoder->>SeqEncoder: Build FSE CTables<br/>Select encoding type
    SeqEncoder-->>Compressor: Encoded sequences
    deactivate SeqEncoder
    Compressor->>Entropy: Literals + encoded seqs
    activate Entropy
    Entropy->>Entropy: Huffman compress<br/>literals
    Entropy->>Entropy: FSE compress<br/>sequence tables
    Entropy-->>Compressor: Compressed block
    deactivate Entropy
    Compressor-->>User: Compressed frame
    deactivate Compressor
```
```mermaid
sequenceDiagram
    participant Dict as Dictionary
    participant Matcher as Match Finder
    participant MTPool as Thread Pool
    participant Worker as Worker<br/>Compression Job
    participant Output as Output Buffer
    Dict->>Matcher: Load/train dictionary
    Matcher->>MTPool: Submit jobs for blocks
    activate MTPool
    par Worker 1
        MTPool->>Worker: Block + dict
        activate Worker
        Worker->>Worker: Compress block<br/>with match finding
        Worker-->>Output: Compressed chunk
        deactivate Worker
    and Worker 2
        MTPool->>Worker: Block + dict
        activate Worker
        Worker->>Worker: Compress block<br/>with match finding
        Worker-->>Output: Compressed chunk
        deactivate Worker
    and Worker N
        MTPool->>Worker: Block + dict
        activate Worker
        Worker->>Worker: Compress block<br/>with match finding
        Worker-->>Output: Compressed chunk
        deactivate Worker
    end
    deactivate MTPool
    Output-->>Output: Aggregate results
```
Estimated code review effort: 🎯 5 (Critical) | ⏱️ ~120+ minutes
Areas requiring extra attention:
Pre-merge checks and finishing touches: ❌ Failed checks (2 warnings)
✅ Passed checks (1 passed)
```c
size_t ssnb = 0;
for (size_t sliceNb=0; sliceNb < nbSlices; ) {
    ssnb = (ssnb + 1) % srcSlices.nbSlices;
```

Bug: First Slice Skipped by Loop Logic

The `splitSlices` function skips the first source slice, index 0. The variable `ssnb` is initialized to 0 but immediately incremented to 1 at the top of the loop, before it is used, so source slice 0 never contributes any blocks. Either initialize `ssnb` to the last valid index or move the increment to the end of the loop iteration.
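A minimal, self-contained sketch of the suggested fix (the types and function name here are illustrative, not the benchmark's real code): advancing the source index at the end of the iteration means index 0 is visited first.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative sketch: record the order in which source slices are
 * visited when distributing nbSlices blocks round-robin across
 * nbSrc sources. Incrementing ssnb at the END of the body (unlike
 * the reviewed code) ensures source 0 is visited first. */
static void visitOrder(size_t nbSrc, size_t nbSlices, size_t *order)
{
    size_t ssnb = 0;
    for (size_t sliceNb = 0; sliceNb < nbSlices; sliceNb++) {
        order[sliceNb] = ssnb;
        ssnb = (ssnb + 1) % nbSrc;   /* advance AFTER using the index */
    }
}
```

With 3 sources and 4 slices, the visit order becomes 0, 1, 2, 0 instead of 1, 2, 0, 1.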
```c
size_t nbSrcBlocks = 0;
for (size_t ssnb=0; ssnb < srcSlices.nbSlices; ssnb++) {
    size_t pos = 0;
    while (pos <= srcSlices.capacities[ssnb]) {
```

Bug: Inconsistent Loop Condition Overcounts, Risks Overflow

The two loops in `splitSlices` use inconsistent conditions. The first (counting) loop uses `pos <= srcSlices.capacities[ssnb]` to count blocks, but the second (slicing) loop uses `pos < srcSlices.capacities[ssnb]`. When a capacity is an exact multiple of the block size, the first loop counts an extra block that the second loop never produces, leading to incorrect allocation sizes and potential buffer overflows.
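A self-contained sketch of the consistent counting rule (illustrative helper, not the benchmark's real code): with strict `<`, an exact-multiple capacity yields exactly capacity/blockSize blocks, matching what the slicing loop emits.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative sketch: count how many blocks of size blockSize a region
 * of `capacity` bytes splits into. Using `pos < capacity` matches the
 * slicing loop, so an exact multiple (e.g. 8 bytes, 4-byte blocks)
 * yields 2 blocks, not the 3 that `pos <= capacity` would report. */
static size_t countBlocks(size_t capacity, size_t blockSize)
{
    size_t n = 0;
    for (size_t pos = 0; pos < capacity; pos += blockSize)
        n++;
    return n;
}
```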
```c
 * The combination step won't work otherwise */
qsort(state->matches, state->nbMatches, sizeof(ZSTD_eDist_match), ZSTD_eDist_matchComp);
...
memcpy(combinedMatches, state->matches, sizeof(ZSTD_eDist_match));
```

Bug: Uninitialized Data Corrupts Match State

The `ZSTD_eDist_combineMatches` function misbehaves when `state->nbMatches` is 0. It initializes `nbCombinedMatches` to 1 and copies one uninitialized match from `state->matches`, then writes this garbage data back, incorrectly setting `state->nbMatches` to 1 instead of 0. The function needs an early return (or an explicit check) when `nbMatches` is 0.
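A minimal sketch of the suggested guard, with simplified stand-in types (not the module's actual `ZSTD_eDist_state`): returning early when there is nothing to combine keeps the count at 0 and never reads uninitialized entries.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for ZSTD_eDist_match; fields are illustrative. */
typedef struct { int srcIdx; int dstIdx; int matchLength; } match_t;

/* Illustrative combine step with the suggested early return: when
 * *nbMatches == 0 there is nothing to merge, so we skip the copy that
 * would otherwise read an uninitialized match. */
static void combineMatches(match_t *matches, size_t *nbMatches,
                           match_t *scratch)
{
    if (*nbMatches == 0) return;          /* early return: nothing to combine */
    size_t nbCombined = 1;
    memcpy(scratch, matches, sizeof(match_t));
    /* ... merge adjacent/overlapping matches into scratch here ... */
    memcpy(matches, scratch, nbCombined * sizeof(match_t));
    *nbMatches = nbCombined;
}
```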
Actionable comments posted: 25
🧹 Nitpick comments (24)
lib/decompress/huf_decompress.c (2)
192-304: Fast 4X args init and BIT_DStream reconstruction look consistent with upstream; call out key preconditions
`HUF_DecompressFastArgs_init()` and `HUF_initRemainingDStream()` correctly enforce:

- 64-bit little-endian only (`!MEM_isLittleEndian() || MEM_32bits()` → fallback).
- Minimum input sizes (jump table + ≥8 bytes per stream) and `dtLog == HUF_DECODER_FAST_TABLELOG`.
- Strict bounds checks when reconstructing the `BIT_DStream_t` (`ip[stream] >= iend[stream] - 8`, `op[stream] <= segmentEnd`).

Given how tight this logic is, it's important not to locally tweak the layout (jump table, stream boundaries, or `HUF_DECODER_FAST_TABLELOG`) without simultaneously updating the matching compressor and the header format. As imported, this section looks correct and consistent with the bitstream helpers shown in `lib/legacy/zstd_v03.c`.
721-894: 4X1 fast path: invariants and fallback behavior are sane

The 4-stream X1 fast path (`HUF_decompress4X1_usingDTable_internal_fast[_c_loop]` and wrapper) maintains good safety properties:

- Only enabled when `HUF_ENABLE_FAST_DECODE` and the fast flags allow it; otherwise falls back to the conservative decoder via `fallbackFn`.
- Enforces segment sizes and `op[3] <= oend` both in the fast loop and in the final per-segment finish phase.
- Uses `HUF_initRemainingDStream()` + `HUF_decodeStreamX1()` with `HUF_DECODER_FAST_TABLELOG`, so the fast path and finish path share the same table log and corruption checks.

I don't see correctness or bounds issues in this region given those checks, and it's beneficial that the fast loop exits cleanly with `ret == 0` to fall back when its preconditions aren't met.

lib/compress/huf_compress.c (1)
669-792: Huffman tree build and max-height enforcement mirror canonical zstd implementation
`HUF_buildTree()`, `HUF_setMaxHeight()`, and `HUF_buildCTableFromTree()` together:

- Build a full Huffman tree from sorted nodes, then propagate `nbBits` from parents to leaves.
- Enforce a maximum depth (`maxNbBits`) by clamping too-deep nodes, computing the excess cost, and then redistributing bits using `rankLast[]` and `ZSTD_highbit32()` to preserve the Kraft–McMillan constraint.
- Derive canonical code values per rank in `HUF_buildCTableFromTree()` and write `nbBits`/values back into the `CTable`, followed by a header write.

The invariants (`sum 2^-nbBits == 1`, no symbol deeper than `maxNbBits`, and `tableLog <= HUF_TABLELOG_MAX`) are respected and enforced, and there's a hard error if `maxNbBits` can't be made to fit. This is exactly the sort of logic that's best kept identical to upstream; I wouldn't recommend refactoring it locally.

contrib/seekable_format/zstdseek_decompress.c (3)
161-169: Guard against potential `size_t` overflow in `ZSTD_seekable_read_buff`.

`ZSTD_seekable_read_buff()` uses `if (buff->pos + n > buff->size)` to bounds-check. In theory, if `buff->pos` is close to `SIZE_MAX`, `buff->pos + n` can wrap and incorrectly pass the check. The rest of the code makes this unlikely, but the fix is trivial.

Suggested tweak:

```diff
-    if (buff->pos + n > buff->size) return -1;
+    if (n > buff->size - buff->pos) return -1;
```
489-582: Handle extreme `offset + len` edge cases more defensively.

`ZSTD_seekable_decompress()` clamps `len` when `offset + len > eos`, but the sum is computed in `unsigned long long` and can wrap for extremely large `offset`/`len` combinations. In such cases, the clamp may not trigger and the loop condition `zs->decompressedOffset < offset + len` can misbehave (e.g., a potential infinite loop on pathological inputs).

This is a corner case, but easy to harden by detecting overflow and/or rejecting requests where `offset > eos` explicitly. Example pattern:

```c
unsigned long long const eos = zs->seekTable.entries[zs->seekTable.tableLen].dOffset;
unsigned long long end;
/* offset past end: treat as empty range or error */
if (offset > eos) {
    return ERROR(frameIndex_tooLarge);  /* or 0, depending on API contract */
}
/* detect overflow: end < offset implies wraparound */
end = offset + (unsigned long long)len;
if (end < offset || end > eos) {
    end = eos;
    len = (size_t)(eos - offset);
} else {
    len = (size_t)(end - offset);
}
```

This keeps internal arithmetic well-defined even for extreme caller inputs.
374-455: Optional: free any existing seek table before loading a new one.

`ZSTD_seekable_loadSeekTable()` assigns `zs->seekTable.entries` without freeing any prior allocation. If `ZSTD_seekable_initAdvanced()` can be called multiple times on the same `ZSTD_seekable` (even if that's not the primary intended usage), this will leak the old table.

You could defensively free any existing table before assigning the new one:

```diff
-    zs->seekTable.entries = entries;
+    free(zs->seekTable.entries);
+    zs->seekTable.entries = entries;
     zs->seekTable.tableLen = numFrames;
     zs->seekTable.checksumFlag = checksumFlag;
```
384-391: Consider simplifying when no dictionary is needed.

The current implementation allocates and frees an empty dictionary on every call to `ZSTD_decompress`, even when dictionary support is not required. For an educational decoder, consider either passing `NULL` directly to `ZSTD_decompress_with_dict` (and handling it there), or documenting this design choice.
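A tiny self-contained sketch of the "NULL means no dictionary" convention; the function names and the `memcpy` stand-in are illustrative, not the educational decoder's actual code:

```c
#include <stddef.h>
#include <string.h>

/* Illustrative sketch: decode_with_dict() treats dict == NULL as
 * "skip dictionary setup", so the convenience wrapper no longer needs
 * to allocate an empty dictionary object on every call. */
static size_t decode_with_dict(char *dst, const char *src, size_t n,
                               const char *dict, size_t dict_len)
{
    if (dict != NULL && dict_len > 0) {
        /* ... parse and apply the dictionary here ... */
    }
    memcpy(dst, src, n);   /* stand-in for real decoding */
    return n;
}

static size_t decode(char *dst, const char *src, size_t n)
{
    /* No per-call allocation: NULL signals "no dictionary". */
    return decode_with_dict(dst, src, n, NULL, 0);
}
```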
contrib/match_finders/zstd_edist.c (3)
390-395: Comparator is correct but could avoid subtraction to be extra-safe

`ZSTD_eDist_matchComp()` orders matches by `srcIdx` via `return (l - r);`. This is functionally correct as long as `srcIdx` values fit comfortably in `S32` and their difference cannot overflow, which appears to be an implicit constraint in this module. If you ever relax those constraints, consider rewriting as:

```diff
-static int ZSTD_eDist_matchComp(const void* p, const void* q)
-{
-    S32 const l = ((ZSTD_eDist_match*)p)->srcIdx;
-    S32 const r = ((ZSTD_eDist_match*)q)->srcIdx;
-    return (l - r);
-}
+static int ZSTD_eDist_matchComp(const void* p, const void* q)
+{
+    S32 const l = ((const ZSTD_eDist_match*)p)->srcIdx;
+    S32 const r = ((const ZSTD_eDist_match*)q)->srcIdx;
+    if (l < r) return -1;
+    if (l > r) return 1;
+    return 0;
+}
```

Not urgent, but it future-proofs against overflow if sizes ever grow.
440-460: Sequence conversion assumes caller provides enough capacity

`ZSTD_eDist_convertMatchesToSequences()` writes up to `state->nbMatches` `ZSTD_Sequence` entries into `sequences`, and `state->nbMatches` is bounded by `srcSize` given the recursion structure. That's logically consistent, but the public API doesn't carry a capacity parameter, so the only way this is safe is if the contract in `zstd_edist.h` explicitly requires `sequences` to have capacity ≥ `srcSize`.

If that isn't already documented, it would be good to make that requirement explicit in the header comment to avoid accidental buffer overruns by callers.
466-505: Helper distance functions are fine for tests; watch recursion depth on large inputs

`ZSTD_eDist_hamingDist()` (typo in the name; probably should be "hamming") and the recursive `ZSTD_eDist_levenshteinDist()` are clearly marked as test-oriented helpers. The implementations are straightforward and correct for small `n`, but the Levenshtein function grows exponentially in time and linearly in recursion depth, so it will blow the stack on large inputs.

Given the comment already warns against use on big files, no change is strictly required; just be sure these remain test-only helpers and are not wired into production paths.
lib/dictBuilder/zdict.c (1)
109-128: Consider reusing workspace for `ZDICT_getDictHeaderSize()` in hot paths.

`ZDICT_getDictHeaderSize()` allocates a fresh `ZSTD_compressedBlockState_t` and a `HUF_WORKSPACE_SIZE` buffer on every call, then immediately frees them. This is fine functionally, but if this API is used in tight loops it will add noticeable malloc/free overhead.

If this is on a hot path in your test harness, consider:

- Allowing the caller to provide a reusable scratch workspace, or
- Caching a single static workspace guarded by a simple mutex for test-only scenarios.

Not required for correctness, but an easy win if this ends up being called frequently.
lib/common/threading.c (2)
37-112: Stack-allocated `ZSTD_thread_params_t` pattern is sound but tightly coupled to the init handshake

The `worker()` thread dereferences a pointer to the stack-allocated `ZSTD_thread_params_t` in `ZSTD_pthread_create()`, but only until it copies out `start_routine`/`arg` and signals `initialized`. Because `ZSTD_pthread_create()` waits on `initialized` before returning and only then destroys the mutex/cond, the stack lifetime is sufficient and there is no use-after-return.

This pattern is easy to break if future changes reorder the wait/destruction or add new fields used after the signal. It's worth documenting this dependency next to the struct or the handshake block to prevent subtle lifetime regressions.
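For readers unfamiliar with the pattern, here is a minimal sketch of the handshake (not zstd's actual code; names are simplified). The invariant is: the worker copies everything it needs out of the stack struct, then signals, and never touches the struct again; the creator blocks on the signal, so its stack frame outlives every access.

```c
#include <assert.h>
#include <pthread.h>

typedef struct {
    void *(*start_routine)(void *);
    void *arg;
    int initialized;
    pthread_mutex_t mutex;
    pthread_cond_t cond;
} thread_params_t;

static void *worker(void *p)
{
    thread_params_t *params = (thread_params_t *)p;
    void *(*fn)(void *) = params->start_routine;   /* copy out first */
    void *arg = params->arg;
    pthread_mutex_lock(&params->mutex);
    params->initialized = 1;                       /* then signal... */
    pthread_cond_signal(&params->cond);
    pthread_mutex_unlock(&params->mutex);
    return fn(arg);                                /* ...never touch params again */
}

static int create_thread(pthread_t *t, void *(*fn)(void *), void *arg)
{
    /* Stack-allocated: safe ONLY because we wait below before returning. */
    thread_params_t params = { fn, arg, 0,
        PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER };
    int const err = pthread_create(t, NULL, worker, &params);
    if (err) return err;
    pthread_mutex_lock(&params.mutex);
    while (!params.initialized)                    /* keep the frame alive */
        pthread_cond_wait(&params.cond, &params.mutex);
    pthread_mutex_unlock(&params.mutex);
    return 0;
}

static void *setFlag(void *arg) { *(int *)arg = 42; return NULL; }
```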
140-180: Consider nulling mutex/cond pointers on destroy to harden against accidental double-destroy

The malloc-backed `ZSTD_pthread_mutex_destroy()`/`ZSTD_pthread_cond_destroy()` free the underlying `pthread_*` objects but leave `*mutex`/`*cond` unchanged. A second destroy on the same wrapper would then call `pthread_*_destroy()` on freed memory.

Current call sites appear to follow a strict init/destroy discipline, so this is not a functional bug, but setting `*mutex = NULL`/`*cond = NULL` after `ZSTD_free()` would make the API more robust against accidental reuse or future refactors.

lib/compress/zstd_ldm.c (3)
32-58: Guard against oversized `hashRateLog` to avoid undefined shifts

`ZSTD_ldm_gear_init()` shifts `1ULL` by `hashRateLog` (and uses it in mask computations). The code relies on external invariants (e.g., `hashRateLog < 32` asserted in `ZSTDMT_serialState_reset()`) to avoid shifting by ≥ 64 bits.

Given this is a low-level primitive, consider adding an internal `assert(hashRateLog < 64);` (or even `< 32` to match callers) here. That makes the precondition explicit and protects against future call sites that might bypass `ZSTD_ldm_adjustParameters()`.
169-182: `ZSTD_ldm_getTableSize`/`ZSTD_ldm_getMaxNbSeq` assume parameters have been normalized first

Both helpers assume `params` has already gone through `ZSTD_ldm_adjustParameters()`:

- `ZSTD_ldm_getTableSize()` depends on `hashLog >= bucketSizeLog`.
- `ZSTD_ldm_getMaxNbSeq()` divides by `params.minMatchLength`, which would be 0 if callers haven't normalized params.

Current callers in this PR appear to respect that order, but it would be safer to either document this requirement in a comment or add defensive `assert(params->minMatchLength > 0)` / `assert(params->hashLog >= params->bucketSizeLog)` to catch misuse early.
526-600: Chunked LDM generation and leftover handling are correct but a bit opaque

`ZSTD_ldm_generateSequences()`'s chunk loop properly:

- Performs window overflow correction before generating sequences.
- Enforces max distance via `ZSTD_window_enforceMaxDist()`.
- Uses `ZSTD_ldm_generateSequences_internal()` per chunk and threads `leftoverSize` through the first new sequence's `litLength`.

The logic for `leftoverSize` (lines 592-598) is subtle but consistent: when a chunk generates no sequences, it accumulates as additional literals; when it does, the previous leftover is folded into the first new sequence.

Given how easy it is to regress this, consider a short comment near the `leftoverSize` update clarifying that it represents unmaterialized literals from prior chunks.
90-141: Buffer pool reuse logic is correct; bitwise `&` is intentional but non-idiomatic

`ZSTDMT_getBuffer()` uses:

```c
if ((availBufferSize >= bSize) & ((availBufferSize>>3) <= bSize)) { ... }
```

Because both sub-expressions are pure and yield 0 or 1, the bitwise `&` works like a non-short-circuiting `&&`. Functionally this is fine, and the size window check (reuse only when `availBufferSize` lies roughly within `[bSize, 8*bSize]`) is a standard zstd heuristic for reusing buffers without too much slack.

If you ever touch this again, consider switching to `&&` (or adding a short comment) to make it clearer to readers that this is not a bitmask operation.
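A self-contained sketch of the same predicate written with short-circuiting `&&` (illustrative helper, not zstd's code), which makes the reuse window explicit:

```c
#include <assert.h>
#include <stddef.h>

/* A cached buffer of `avail` bytes is reused for a request of `want`
 * bytes when it is big enough but not more than ~8x too big, i.e.
 * roughly avail in [want, 8*want]. Same logic as the bitwise form,
 * just written with idiomatic `&&`. */
static int canReuse(size_t avail, size_t want)
{
    return (avail >= want) && ((avail >> 3) <= want);
}
```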
487-567: SerialState init/reset correctly serializes LDM and checksum work, but allocator identity must be stable

`ZSTDMT_serialState_reset()`:

- Adjusts LDM parameters and sizes the seq pool via `ZSTD_ldm_getMaxNbSeq()`.
- Lazily reallocates `ldmState.hashTable` and `bucketOffsets` with `params.customMem`.
- Resets the window and optionally pre-loads the LDM hash table with raw dict content.
- Copies `ldmState.window` into `ldmWindow` for readers under `ldmWindowMutex`.

`ZSTDMT_serialState_init()`/`_free()` initialize/destroy the associated mutexes/conds and free the LDM buffers using `serialState->params.customMem`.

This design assumes `params.customMem` is consistent across resets so that allocations and frees use the same allocator. Given all resets in this file pass the same `ZSTD_CCtx_params` hierarchy, that assumption holds. If you ever support changing `customMem` between frames, you'll need to either free with the old `cMem` before overwriting `serialState->params` or store the allocator alongside each allocation.

lib/common/fse.h (1)
231-275: Workspace sizing macros for build/decompress are conservative and consistent

The static-linking-only section defines:

- `FSE_BUILD_CTABLE_WORKSPACE_SIZE[_U32]`
- `FSE_BUILD_DTABLE_WKSP_SIZE[_U32]`
- `FSE_DECOMPRESS_WKSP_SIZE[_U32]`

These macros over-approximate the worst case and are referenced in the doc comments for `FSE_buildCTable_wksp()`, `FSE_buildDTable_wksp()`, and `FSE_decompress_wksp_bmi2()`. That's appropriate for callers who want static allocation.

One small nit: the comment for `FSE_buildDTable_wksp()` still mentions `FSE_BUILD_DTABLE_WKSP_SIZE_U32(maxSymbolValue)` while the macro's signature includes both `maxTableLog` and `maxSymbolValue`. Consider updating the comment for clarity, but behavior is correct.

contrib/pzstd/Pzstd.cpp (5)
52-109: Threading and lifetime in `handleOneInput` are sound, but consider zero-initializing counters

The read/compress pipeline and the use of two `ThreadPool`s with a `WorkQueue` that outlives them are consistent. `bytesRead`/`bytesWritten` are always assigned before use in the normal flow, but they are currently left uninitialized. Initializing them to 0 would make the code more robust to future changes and early-return paths.

```diff
-  std::uint64_t bytesRead;
-  std::uint64_t bytesWritten;
+  std::uint64_t bytesRead = 0;
+  std::uint64_t bytesWritten = 0;
```
111-130: Minor log message nit in `openInputFile`

When the input path is a directory you set the error to "Output file is a directory -- ignored", but this is in the input-file open routine. Consider adjusting the message to avoid confusion.

```diff
-    errorHolder.setError("Output file is a directory -- ignored");
+    errorHolder.setError("Input file is a directory -- ignored");
```
266-329: `compress` streaming logic is correct; consider guarding `ZSTD_compressBound` failures explicitly

The streaming pattern with a shared `outBuffer`, `ZSTD_compressStream`, and `ZSTD_endStream` is idiomatic. You already check for `ZSTD_isError()` on each call and for `outBuffer.empty()` to detect an undersized buffer.

One small robustness improvement: if `ZSTD_compressBound(maxInputSize)` ever returns 0 (which is an error case), you'd currently construct a zero-length `Buffer` and later trigger the `"ZSTD_compressBound() was too small"` path. You could detect this earlier and set an explicit error:

```diff
-  auto outBuffer = Buffer(ZSTD_compressBound(maxInputSize));
+  const size_t bound = ZSTD_compressBound(maxInputSize);
+  if (!errorHolder.check(bound != 0, "ZSTD_compressBound() failed")) {
+    return;
+  }
+  auto outBuffer = Buffer(bound);
```
351-384: `FileStatus`/`readData` behavior is reasonable; note large temporary buffer size

`readData()`'s `Buffer buffer(size);` plus chunked `fread` and `fileStatus()` checks are functionally correct. Just be aware that for larger `size` values this allocates a single contiguous buffer of that size; if you ever relax the `calculateStep()` constraint, you may want to revisit this to avoid large allocations.
432-502: `decompress` is structured correctly; minor suggestion on `returnCode` handling

The streaming decompression loop with `ZSTD_DCtx_reset`, `ZSTD_decompressStream`, and `ZSTD_initDStream` when `returnCode == 0` looks correct and matches typical zstd patterns. The final flush loop with `returnCode == 1` is also standard.

You already validate `returnCode <= 1` at the end of the main loop. If you want slightly clearer intent, you could return early on `returnCode == 0` and avoid the second loop altogether, but that's stylistic and not required.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (20)
- contrib/largeNbDicts/largeNbDicts.c (1 hunks)
- contrib/match_finders/zstd_edist.c (1 hunks)
- contrib/pzstd/Pzstd.cpp (1 hunks)
- contrib/seekable_format/zstdseek_decompress.c (1 hunks)
- doc/educational_decoder/zstd_decompress.c (1 hunks)
- lib/common/fse.h (1 hunks)
- lib/common/threading.c (1 hunks)
- lib/compress/fse_compress.c (1 hunks)
- lib/compress/hist.c (1 hunks)
- lib/compress/huf_compress.c (1 hunks)
- lib/compress/zstd_compress_sequences.c (1 hunks)
- lib/compress/zstd_double_fast.c (1 hunks)
- lib/compress/zstd_ldm.c (1 hunks)
- lib/compress/zstd_opt.c (1 hunks)
- lib/compress/zstdmt_compress.c (1 hunks)
- lib/decompress/huf_decompress.c (1 hunks)
- lib/deprecated/zbuff_compress.c (1 hunks)
- lib/dictBuilder/divsufsort.c (1 hunks)
- lib/dictBuilder/fastcover.c (1 hunks)
- lib/dictBuilder/zdict.c (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (17)
lib/dictBuilder/fastcover.c (1)
lib/dictBuilder/zdict.c (1)
ZDICT_finalizeDictionary(862-941)
lib/dictBuilder/zdict.c (4)
lib/legacy/zstd_v03.c (6)
`ERR_isError` (525-525), `U32` (121-124), `U32` (161-164), `U32` (166-175), `HUF_isError` (1451-1451), `FSE_isError` (1106-1106)
lib/compress/zstd_compress.c (11)
`ZSTD_reset_compressedBlockState` (1926-1935), `ZSTD_loadCEntropy` (5069-5158), `ZSTD_compressBegin_usingCDict_deprecated` (5881-5885), `ZSTD_compressBlock_deprecated` (4891-4898), `ZSTD_getSeqStore` (222-222), `ZSTD_seqToCodes` (2701-2727), `ZSTD_getParams` (7821-7825), `ZSTD_createCDict_advanced` (5650-5666), `ZSTD_createCCtx` (97-100), `ZSTD_freeCDict` (5740-5751), `ZSTD_freeCCtx` (181-192)
lib/compress/huf_compress.c (2)
`HUF_buildCTable_wksp` (756-792), `HUF_writeCTable_wksp` (248-289)
lib/compress/fse_compress.c (2)
`FSE_normalizeCount` (465-525), `FSE_writeNCount` (330-340)
contrib/largeNbDicts/largeNbDicts.c (3)
programs/util.c (6)
`UTIL_getFileSize` (545-558), `UTIL_isDirectory` (357-365), `UTIL_getTotalFileSize` (630-645), `UTIL_createExpandedFNT` (1379-1423), `UTIL_assembleFileNamesTable` (811-815), `UTIL_freeFileNamesTable` (817-823)
lib/compress/zstd_compress.c (12)
`ZSTD_createCDict_advanced2` (5668-5716), `ZSTD_createCCtx` (97-100), `ZSTD_compressCCtx` (5495-5503), `ZSTD_compress_usingCDict` (5920-5927), `ZSTD_CCtx_setParametersUsingCCtxParams` (1186-1198), `ZSTD_freeCCtx` (181-192), `ZSTD_CCtx_refCDict` (1336-1344), `ZSTD_sizeof_CDict` (5550-5557), `ZSTD_createCCtxParams` (349-352), `ZSTD_CCtxParams_init` (366-372), `ZSTD_CCtxParams_setParameter` (778-1029), `ZSTD_freeCCtxParams` (354-359)
lib/decompress/zstd_decompress.c (3)
`ZSTD_createDCtx` (310-314), `ZSTD_freeDCtx` (324-343), `ZSTD_decompress_usingDDict` (1656-1665)
doc/educational_decoder/zstd_decompress.c (4)
contrib/pzstd/Pzstd.cpp (1)
`state` (169-169)
lib/decompress/zstd_decompress.c (1)
`ZSTD_decompress` (1203-1217)
lib/legacy/zstd_v03.c (1)
`ZSTD_decompress` (2905-2910)
lib/legacy/zstd_v02.c (1)
`ZSTD_decompress` (3266-3271)
lib/compress/zstd_compress_sequences.c (2)
lib/compress/fse_compress.c (5)
`FSE_optimalTableLog` (371-374), `FSE_normalizeCount` (465-525), `FSE_writeNCount` (330-340), `FSE_buildCTable_rle` (528-548), `FSE_buildCTable_wksp` (68-214)
lib/legacy/zstd_v03.c (3)
`ZSTD_isError` (2373-2373), `ERR_isError` (525-525), `MEM_32bits` (107-107)
lib/compress/zstd_ldm.c (2)
lib/compress/zstd_double_fast.c (1)
`ZSTD_fillDoubleHashTable` (90-100)
lib/compress/zstd_compress.c (1)
`ZSTD_selectBlockCompressor` (3077-3160)
contrib/pzstd/Pzstd.cpp (2)
lib/compress/zstd_compress.c (5)
`ZSTD_CCtx_reset` (1375-1390), `ZSTD_compressBound` (70-74), `ZSTD_compressStream` (6309-6313), `ZSTD_endStream` (7662-7675), `ZSTD_CStreamInSize` (5960-5960)
lib/decompress/zstd_decompress.c (5)
`ZSTD_DCtx_reset` (1947-1962), `ZSTD_DStreamOutSize` (1697-1697), `ZSTD_decompressStream` (2086-2390), `ZSTD_initDStream` (1750-1756), `ZSTD_DStreamInSize` (1696-1696)
lib/deprecated/zbuff_compress.c (1)
lib/compress/zstd_compress.c (14)
`ZSTD_createCStream` (5935-5939), `ZSTD_createCStream_advanced` (5946-5949), `ZSTD_freeCStream` (5951-5954), `ZSTD_CCtx_reset` (1375-1390), `ZSTD_CCtx_setPledgedSrcSize` (1238-1245), `ZSTD_checkCParams` (1396-1406), `ZSTD_CCtx_setParameter` (716-776), `ZSTD_CCtx_loadDictionary` (1329-1333), `ZSTD_initCStream` (6085-6092), `ZSTD_compressStream` (6309-6313), `ZSTD_flushStream` (7655-7660), `ZSTD_endStream` (7662-7675), `ZSTD_CStreamInSize` (5960-5960), `ZSTD_CStreamOutSize` (5962-5965)
lib/compress/zstd_double_fast.c (1)
lib/legacy/zstd_v03.c (3)
`U32` (121-124), `U32` (161-164), `U32` (166-175)
lib/compress/zstdmt_compress.c (3)
lib/common/threading.c (4)
`ZSTD_pthread_mutex_destroy` (149-159), `ZSTD_pthread_mutex_init` (140-147), `ZSTD_pthread_cond_init` (161-168), `ZSTD_pthread_cond_destroy` (170-180)
lib/compress/zstd_compress.c (12)
`ZSTD_freeCCtx` (181-192), `ZSTD_createCCtx_advanced` (114-124), `ZSTD_sizeof_CCtx` (206-214), `ZSTD_referenceExternalSequences` (4788-4797), `ZSTD_compressBegin_advanced_internal` (5293-5309), `ZSTD_compressContinue_public` (4861-4867), `ZSTD_compressEnd_public` (5415-5439), `ZSTD_CCtx_trace` (5391-5413), `ZSTD_freeCDict` (5740-5751), `ZSTD_checkCParams` (1396-1406), `ZSTD_compressBound` (70-74), `ZSTD_writeLastEmptyBlock` (4778-4786)
lib/compress/zstd_ldm.c (4)
`ZSTD_ldm_adjustParameters` (135-167), `ZSTD_ldm_getMaxNbSeq` (179-182), `ZSTD_ldm_fillHashTable` (285-323), `ZSTD_ldm_generateSequences` (526-601)
lib/common/fse.h (2)
lib/compress/fse_compress.c (9)
`FSE_compressBound` (623-623), `FSE_optimalTableLog` (371-374), `FSE_normalizeCount` (465-525), `FSE_NCountWriteBound` (223-231), `FSE_writeNCount` (330-340), `FSE_compress_usingCTable` (610-620), `FSE_optimalTableLog_internal` (357-369), `FSE_buildCTable_rle` (528-548), `FSE_buildCTable_wksp` (68-214)
lib/legacy/zstd_v03.c (5)
`FSE_isError` (1106-1106), `FSE_readNCount` (1117-1227), `FSE_endOfDState` (682-685), `BYTE` (660-669), `BYTE` (671-680)
lib/compress/fse_compress.c (1)
lib/common/fse.h (5)
`BYTE` (526-530), `BYTE` (540-549), `BYTE` (553-562), `U32` (475-479), `U32` (485-500)
contrib/seekable_format/zstdseek_decompress.c (2)
lib/decompress/zstd_decompress.c (5)
`ZSTD_createDStream` (1672-1676), `ZSTD_freeDStream` (1688-1691), `ZSTD_initDStream` (1750-1756), `ZSTD_DCtx_reset` (1947-1962), `ZSTD_decompressStream` (2086-2390)
lib/legacy/zstd_v03.c (4)
`ZSTD_isError` (2373-2373), `U32` (121-124), `U32` (161-164), `U32` (166-175)
lib/compress/huf_compress.c (3)
lib/legacy/zstd_v03.c (7)
`U32` (121-124), `U32` (161-164), `U32` (166-175), `BYTE` (660-669), `BYTE` (671-680), `HUF_readStats` (1477-1553), `MEM_32bits` (107-107)
lib/compress/hist.c (2)
`HIST_count_simple` (39-64), `HIST_count_wksp` (164-174)
lib/compress/fse_compress.c (6)
`FSE_optimalTableLog` (371-374), `FSE_normalizeCount` (465-525), `FSE_writeNCount` (330-340), `FSE_buildCTable_wksp` (68-214), `FSE_compress_usingCTable` (610-620), `FSE_optimalTableLog_internal` (357-369)
lib/compress/zstd_opt.c (5)
lib/common/fse.h (8)
`U32` (475-479), `U32` (485-500), `void` (428-437), `void` (443-452), `void` (454-461), `void` (463-467), `void` (517-524), `void` (532-538)
lib/legacy/zstd_v03.c (4)
`U32` (121-124), `U32` (161-164), `U32` (166-175), `MEM_isLittleEndian` (110-114)
lib/compress/huf_compress.c (8)
`HUF_getNbBitsFromCTable` (345-352), `void` (556-569), `void` (878-910), `void` (912-916), `void` (922-929), `void` (938-958), `void` (985-989), `void` (991-1043)
lib/compress/hist.c (1)
`HIST_count_simple` (39-64)
lib/compress/zstd_compress.c (1)
`ZSTD_resetSeqStore` (3169-3174)
lib/decompress/huf_decompress.c (2)
lib/legacy/zstd_v03.c (14)
`MEM_isLittleEndian` (110-114), `MEM_32bits` (107-107), `U64` (126-129), `U64` (177-187), `BYTE` (660-669), `BYTE` (671-680), `U32` (121-124), `U32` (161-164), `U32` (166-175), `size_t` (190-196), `size_t` (325-368), `size_t` (369-373), `size_t` (377-381), `BIT_endOfDStream` (439-442)
lib/common/fse.h (11)
`BYTE` (526-530), `BYTE` (540-549), `BYTE` (553-562), `U32` (475-479), `U32` (485-500), `void` (428-437), `void` (443-452), `void` (454-461), `void` (463-467), `void` (517-524), `void` (532-538)
lib/compress/hist.c (1)
lib/legacy/zstd_v03.c (1)
ERR_isError(525-525)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Cursor Bugbot
```c
#include <stddef.h>   /* size_t */
#include <stdlib.h>   /* malloc, free, abort, qsort */
#include <stdio.h>    /* fprintf */
#include <limits.h>   /* UINT_MAX */
#include <assert.h>   /* assert */

#include "util.h"
#include "benchfn.h"
#define ZSTD_STATIC_LINKING_ONLY
#include "zstd.h"
#include "zdict.h"
```
🧩 Analysis chain
Include <string.h> for string/memory APIs to avoid implicit declarations
This file uses memcpy, strlen, strcmp, strncmp, strcpy, and strcat but never includes <string.h>. In C99+ this is at least a warning (often an error with -Werror), and because implicitly declared functions are assumed to return int, it can cause undefined behavior on 64-bit platforms.
Please add the missing header:
```diff
 #include <stddef.h>   /* size_t */
 #include <stdlib.h>   /* malloc, free, abort, qsort */
 #include <stdio.h>    /* fprintf */
 #include <limits.h>   /* UINT_MAX */
 #include <assert.h>   /* assert */
+#include <string.h>   /* memcpy, strlen, strcmp, strncmp, strcpy, strcat */
```

🏁 Script executed:

```bash
#!/bin/bash
# Verify string function usage in largeNbDicts.c
echo "=== Checking for string function calls ==="
rg -n '(memcpy|strlen|strcmp|strncmp|strcpy|strcat)\s*\(' contrib/largeNbDicts/largeNbDicts.c
echo ""
echo "=== Checking all #include statements ==="
rg -n '^\s*#include' contrib/largeNbDicts/largeNbDicts.c
echo ""
echo "=== Checking if <string.h> is included ==="
grep -c '#include\s*[<"]string\.h[>"]' contrib/largeNbDicts/largeNbDicts.c && echo "Found" || echo "Not found"
```

Length of output: 1191
Add <string.h> for string/memory function declarations
The file uses memcpy, strlen, strcmp, strncmp, strcpy, and strcat (lines 361, 747–749, 968–969, 1033–1037) but does not include <string.h>. This causes implicit function declarations, which triggers compilation warnings or errors with strict compiler flags (e.g., -Werror or C99+ mode), and can lead to undefined behavior on 64-bit platforms.
Add the missing header:
```diff
 #include <stddef.h>   /* size_t */
 #include <stdlib.h>   /* malloc, free, abort, qsort */
 #include <stdio.h>    /* fprintf */
 #include <limits.h>   /* UINT_MAX */
 #include <assert.h>   /* assert */
+#include <string.h>   /* memcpy, strlen, strcmp, strncmp, strcpy, strcat */
```
+#include <string.h> /* memcpy, strlen, strcmp, strncmp, strcpy, strcat */📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```suggestion
#include <stddef.h>   /* size_t */
#include <stdlib.h>   /* malloc, free, abort, qsort */
#include <stdio.h>    /* fprintf */
#include <limits.h>   /* UINT_MAX */
#include <assert.h>   /* assert */
#include <string.h>   /* memcpy, strlen, strcmp, strncmp, strcpy, strcat */

#include "util.h"
#include "benchfn.h"
#define ZSTD_STATIC_LINKING_ONLY
#include "zstd.h"
#include "zdict.h"
```
🤖 Prompt for AI Agents

```
In contrib/largeNbDicts/largeNbDicts.c around lines 21 to 32, standard
string/memory functions (memcpy, strlen, strcmp, strncmp, strcpy, strcat) are
used but <string.h> is not included; add #include <string.h> to the header
section alongside the other standard includes (e.g., after <stdio.h> or with
the other system headers) so the functions are properly declared and
implicit-declaration warnings/errors are eliminated.
```
```c
static slice_collection_t
splitSlices(slice_collection_t srcSlices, size_t blockSize, size_t nbSlices)
{
    if (blockSize==0) blockSize = (size_t)(-1);   /* means "do not cut" */
    size_t nbSrcBlocks = 0;
    for (size_t ssnb=0; ssnb < srcSlices.nbSlices; ssnb++) {
        size_t pos = 0;
        while (pos <= srcSlices.capacities[ssnb]) {
            nbSrcBlocks++;
            pos += blockSize;
        }
    }

    if (nbSlices == 0) nbSlices = nbSrcBlocks;

    void** const sliceTable = (void**)malloc(nbSlices * sizeof(*sliceTable));
    size_t* const capacities = (size_t*)malloc(nbSlices * sizeof(*capacities));
    if (sliceTable == NULL || capacities == NULL) {
        free(sliceTable);
        free(capacities);
        return kNullCollection;
    }

    size_t ssnb = 0;
    for (size_t sliceNb=0; sliceNb < nbSlices; ) {
        ssnb = (ssnb + 1) % srcSlices.nbSlices;
        size_t pos = 0;
        char* const ptr = (char*)srcSlices.slicePtrs[ssnb];
        while (pos < srcSlices.capacities[ssnb] && sliceNb < nbSlices) {
            size_t const size = MIN(blockSize, srcSlices.capacities[ssnb] - pos);
            sliceTable[sliceNb] = ptr + pos;
            capacities[sliceNb] = size;
            sliceNb++;
            pos += blockSize;
        }
    }

    slice_collection_t result;
    result.nbSlices = nbSlices;
    result.slicePtrs = sliceTable;
    result.capacities = capacities;
    return result;
}
```
🧩 Analysis chain

Fix off-by-one in splitSlices() block counting to avoid extra / duplicated slices

In splitSlices() the pre-allocation pass counts blocks with:

```c
while (pos <= srcSlices.capacities[ssnb]) {
    nbSrcBlocks++;
    pos += blockSize;
}
```

but the actual slicing loop uses `while (pos < srcSlices.capacities[ssnb] && sliceNb < nbSlices)`. When a slice size is an exact multiple of blockSize (or zero), nbSrcBlocks ends up one larger than the number of distinct blocks produced, so the tail of the data gets duplicated to fill the extra entries. That inflates nbSlices/nbDicts and changes the distribution of data across blocks.

You likely just want strict `<` in the counting loop:

```diff
-        while (pos <= srcSlices.capacities[ssnb]) {
+        while (pos < srcSlices.capacities[ssnb]) {
             nbSrcBlocks++;
             pos += blockSize;
         }
```

This keeps the pre-computed block count consistent with the actual slicing logic.

I need to gather information from the codebase to verify this off-by-one claim. Let me inspect the file and trace through the logic.

---

🏁 Script executed:

```shell
#!/bin/bash
# Find and display the splitSlices function to verify the code matches
fd -t f "largeNbDicts.c" --exec cat {} \;
```

Length of output: 37508

Fix off-by-one in splitSlices() block counting loop

The counting loop incorrectly uses `<=` while the slicing loop uses `<`. When a slice size is an exact multiple of blockSize, this causes nbSrcBlocks to be overcounted by one. When nbSlices defaults to nbSrcBlocks, extra entries are allocated and filled via round-robin cycling, duplicating or misplacing data blocks.

Change line ~243 from:

```c
while (pos <= srcSlices.capacities[ssnb]) {
```

to:

```c
while (pos < srcSlices.capacities[ssnb]) {
```

This aligns the counting phase with the actual slicing logic.
🤖 Prompt for AI Agents
In contrib/largeNbDicts/largeNbDicts.c around lines 237 to 279, the
block-counting loop uses <= causing an off-by-one when a slice capacity is an
exact multiple of blockSize; change the loop condition to use < so the counted
nbSrcBlocks matches the actual slices produced in the later slicing loop,
preventing overallocation and duplicate/misplaced blocks.
```c
static size_t compressBlocks(size_t* cSizes,   /* optional (can be NULL). If present, must contain at least nbBlocks fields */
                             slice_collection_t dstBlockBuffers,
                             slice_collection_t srcBlockBuffers,
                             ZSTD_CDict* cdict, int cLevel)
{
    size_t const nbBlocks = srcBlockBuffers.nbSlices;
    assert(dstBlockBuffers.nbSlices == srcBlockBuffers.nbSlices);

    ZSTD_CCtx* const cctx = ZSTD_createCCtx();
    assert(cctx != NULL);

    size_t totalCSize = 0;
    for (size_t blockNb=0; blockNb < nbBlocks; blockNb++) {
        size_t cBlockSize;
        if (cdict == NULL) {
            cBlockSize = ZSTD_compressCCtx(cctx,
                            dstBlockBuffers.slicePtrs[blockNb], dstBlockBuffers.capacities[blockNb],
                            srcBlockBuffers.slicePtrs[blockNb], srcBlockBuffers.capacities[blockNb],
                            cLevel);
        } else {
            cBlockSize = ZSTD_compress_usingCDict(cctx,
                            dstBlockBuffers.slicePtrs[blockNb], dstBlockBuffers.capacities[blockNb],
                            srcBlockBuffers.slicePtrs[blockNb], srcBlockBuffers.capacities[blockNb],
                            cdict);
        }
        CONTROL(!ZSTD_isError(cBlockSize));
        if (cSizes) cSizes[blockNb] = cBlockSize;
        totalCSize += cBlockSize;
    }
    return totalCSize;
}
```
🧩 Analysis chain

Free ZSTD_CCtx in compressBlocks() to avoid leaking the compression context

compressBlocks() allocates a ZSTD_CCtx with ZSTD_createCCtx() but never frees it. Since compressBlocks() is called at least twice (cTotalSizeNoDict and cTotalSize), this leaks context memory each time the tool runs.

You can fix the leak and make allocation failure fatal using CONTROL:

```diff
 static size_t compressBlocks(size_t* cSizes,
                              slice_collection_t dstBlockBuffers,
                              slice_collection_t srcBlockBuffers,
                              ZSTD_CDict* cdict, int cLevel)
 {
     size_t const nbBlocks = srcBlockBuffers.nbSlices;
     assert(dstBlockBuffers.nbSlices == srcBlockBuffers.nbSlices);
-    ZSTD_CCtx* const cctx = ZSTD_createCCtx();
-    assert(cctx != NULL);
+    ZSTD_CCtx* const cctx = ZSTD_createCCtx();
+    CONTROL(cctx != NULL);
     size_t totalCSize = 0;
     for (size_t blockNb=0; blockNb < nbBlocks; blockNb++) {
         ...
     }
-    return totalCSize;
+    ZSTD_freeCCtx(cctx);
+    return totalCSize;
 }
```

I'll verify the memory leak claim by examining the actual code and confirming the function call sites.
---

🏁 Script executed:

```shell
#!/bin/bash
# Verify the compressBlocks function and find all call sites
rg -n "compressBlocks" contrib/largeNbDicts/largeNbDicts.c -A2 -B2
```

Length of output: 963

🏁 Script executed:

```shell
#!/bin/bash
# Extract the actual code at lines 538-568
sed -n '538,568p' contrib/largeNbDicts/largeNbDicts.c
```

Length of output: 1424

🏁 Script executed:

```shell
#!/bin/bash
# Search for ZSTD_freeCCtx usage to understand the pattern
rg -n "ZSTD_freeCCtx|ZSTD_createCCtx" contrib/largeNbDicts/ -B1 -A1
```

Length of output: 981
Memory leak in compressBlocks() confirmed—add ZSTD_freeCCtx() before return
The verification confirms the memory leak. compressBlocks() allocates a ZSTD_CCtx at line 546 but never frees it. The function is called twice per execution (lines 855 and 864), causing context memory to leak each run. ZSTD_freeCCtx() is already used in the codebase (line 595), and CONTROL() is the appropriate error check pattern here (see line 584).
Apply the proposed fix: replace assert(cctx != NULL) with CONTROL(cctx != NULL) and add ZSTD_freeCCtx(cctx) before the return statement.
🤖 Prompt for AI Agents
In contrib/largeNbDicts/largeNbDicts.c around lines 538 to 568, compressBlocks()
creates a ZSTD_CCtx with ZSTD_createCCtx() then returns without freeing it
causing a memory leak; replace the assert(cctx != NULL) with CONTROL(cctx !=
NULL) to use the project's error handling and ensure you call
ZSTD_freeCCtx(cctx) before the function returns (i.e., just before the final
return totalCSize) so the context is always freed after compression loop
completes.
```c
/* benched function */
size_t compress(const void* src, size_t srcSize, void* dst, size_t dstCapacity, void* payload)
{
    compressInstructions* const ci = (compressInstructions*) payload;
    (void)dstCapacity;

    ZSTD_CCtx_refCDict(ci->cctx, ci->dictionaries.cdicts[ci->dictNb]);
    ZSTD_compress2(ci->cctx,
                   dst, srcSize,
                   src, srcSize);

    ci->dictNb = ci->dictNb + 1;
    if (ci->dictNb >= ci->nbDicts) ci->dictNb = 0;

    return srcSize;
}
```
🧩 Analysis chain

Use dstCapacity and check Zstd error codes in compress() to avoid silent failures

compress() currently:

- Ignores the `dstCapacity` argument from the benchmark harness.
- Passes `srcSize` as the destination capacity to `ZSTD_compress2()`.
- Ignores the return values of both `ZSTD_CCtx_refCDict()` and `ZSTD_compress2()` and always returns `srcSize`.

For incompressible data, srcSize can be smaller than the actual compressed size (which was previously sized into dstBlocks.capacities), causing ZSTD_compress2() to fail with dstSizeTooSmall while the benchmark still reports success and uses the elapsed time.

A safer implementation keeps the throughput calculation (returning srcSize) but uses the real capacity and checks Zstd errors:

```diff
 size_t compress(const void* src, size_t srcSize, void* dst, size_t dstCapacity, void* payload)
 {
     compressInstructions* const ci = (compressInstructions*) payload;
-    (void)dstCapacity;
-
-    ZSTD_CCtx_refCDict(ci->cctx, ci->dictionaries.cdicts[ci->dictNb]);
-    ZSTD_compress2(ci->cctx,
-                   dst, srcSize,
-                   src, srcSize);
+    size_t const refResult =
+        ZSTD_CCtx_refCDict(ci->cctx, ci->dictionaries.cdicts[ci->dictNb]);
+    CONTROL(!ZSTD_isError(refResult));
+
+    size_t const cSize =
+        ZSTD_compress2(ci->cctx,
+                       dst, dstCapacity,
+                       src, srcSize);
+    CONTROL(!ZSTD_isError(cSize));

     ci->dictNb = ci->dictNb + 1;
     if (ci->dictNb >= ci->nbDicts) ci->dictNb = 0;

-    return srcSize;
+    /* Benchmark speed is in terms of input bytes per second. */
+    return srcSize;
 }
```

This preserves the intended metric (MB/s over input size) while ensuring any Zstd errors fail fast instead of silently skewing results.

I'll verify this review comment by examining the codebase and the context around the identified issues.
---

🏁 Script executed:

```shell
#!/bin/bash
# Verify the actual code at the specified location
cat contrib/largeNbDicts/largeNbDicts.c | sed -n '621,636p'
```

Length of output: 474

🏁 Script executed:

```shell
#!/bin/bash
# Find the CONTROL macro definition
rg -n "define CONTROL" contrib/largeNbDicts/
```

Length of output: 82

🏁 Script executed:

```shell
#!/bin/bash
# Find other usages of ZSTD_compress2 for comparison
rg -n "ZSTD_compress2" contrib/largeNbDicts/ -C 3
```

Length of output: 457

🏁 Script executed:

```shell
#!/bin/bash
# Search for ZSTD_CCtx_refCDict usage patterns
rg -n "ZSTD_CCtx_refCDict" contrib/largeNbDicts/ -C 2
```

Length of output: 349

🏁 Script executed:

```shell
#!/bin/bash
# Find dstBlocks setup to understand capacity allocation
rg -n "dstBlocks" contrib/largeNbDicts/largeNbDicts.c -C 2
```

Length of output: 793

🏁 Script executed:

```shell
#!/bin/bash
# Check how dstCapacity is used in the benchmark harness
rg -n "dstCapacity" contrib/largeNbDicts/largeNbDicts.c -B 2 -A 2
```

Length of output: 811
Let me verify the Zstd API behavior to confirm the silent failure scenario:
🌐 Web query:
ZSTD_compress2 destination buffer size smaller than worst case compression error behavior
💡 Result:
Short answer: If dstCapacity < required worst-case (ZSTD_compressBound(srcSize)), ZSTD_compress2 fails and returns an error code (a size_t that makes ZSTD_isError() true). It will not silently overflow the destination buffer — you must detect the error with ZSTD_isError() or ZSTD_getErrorName() (which will report "Destination buffer is too small"). Use ZSTD_compressBound(srcSize) to size dst to guarantee success. [1][2]
Sources:
- Zstd API manual: ZSTD_compressBound recommendation and return/error behavior. [1]
- Function docs / bindings showing ZSTD_compress2 returns compressed size or an error code (check with ZSTD_isError). Also community note about "Destination buffer is too small". [2]
Use dstCapacity and check Zstd error codes in compress() to avoid silent failures

The review identifies a confirmed bug in compress() (lines 621–636):

- `dstCapacity` is cast away (line 625), and `srcSize` is passed as the destination capacity to `ZSTD_compress2()` (line 629)
- For incompressible data, worst-case compressed output can exceed `srcSize`
- `ZSTD_compress2()` will fail with a "Destination buffer is too small" error, but that error is silently ignored, skewing benchmark results
- The `ZSTD_CCtx_refCDict()` return value is also unchecked

The proposed fix is correct:

- Use the actual `dstCapacity` allocated by the benchmark harness
- Check error returns with the `CONTROL` macro (already defined at line 55)
- Preserve the throughput metric by returning `srcSize`
- The `decompress()` function (line 639) already demonstrates the correct pattern

```diff
 size_t compress(const void* src, size_t srcSize, void* dst, size_t dstCapacity, void* payload)
 {
     compressInstructions* const ci = (compressInstructions*) payload;
-    (void)dstCapacity;
-
-    ZSTD_CCtx_refCDict(ci->cctx, ci->dictionaries.cdicts[ci->dictNb]);
-    ZSTD_compress2(ci->cctx,
-                   dst, srcSize,
-                   src, srcSize);
+    size_t const refResult =
+        ZSTD_CCtx_refCDict(ci->cctx, ci->dictionaries.cdicts[ci->dictNb]);
+    CONTROL(!ZSTD_isError(refResult));
+
+    size_t const cSize =
+        ZSTD_compress2(ci->cctx,
+                       dst, dstCapacity,
+                       src, srcSize);
+    CONTROL(!ZSTD_isError(cSize));

     ci->dictNb = ci->dictNb + 1;
     if (ci->dictNb >= ci->nbDicts) ci->dictNb = 0;

-    return srcSize;
+    /* Benchmark speed is in terms of input bytes per second. */
+    return srcSize;
 }
```

🤖 Prompt for AI Agents
In contrib/largeNbDicts/largeNbDicts.c around lines 621 to 636, the compress()
function ignores dstCapacity and fails to check ZSTD API error returns; change
the ZSTD_CCtx_refCDict call to capture and check its return value with the
CONTROL macro, call ZSTD_compress2 using the actual dstCapacity (not srcSize) as
the destination capacity, capture the return value of ZSTD_compress2 and
validate it with CONTROL (preserving the benchmark's throughput metric by still
returning srcSize), and handle any errors via CONTROL so silent "destination
buffer too small" or refCDict failures are not ignored.
```c
double *const speedPerRound = (double *)malloc(nbRounds * sizeof(double));

BMK_timedFnState_t* const benchState =
        BMK_createTimedFnState(total_time_ms, ms_per_round);

decompressInstructions di = createDecompressInstructions(ddictionaries);
compressInstructions ci =
        createCompressInstructions(cdictionaries, cctxParams);
void* payload = benchCompression ? (void*)&ci : (void*)&di;
BMK_benchParams_t const bp = {
    .benchFn = benchCompression ? compress : decompress,
    .benchPayload = payload,
    .initFn = NULL,
    .initPayload = NULL,
    .errorFn = ZSTD_isError,
    .blockCount = dstBlocks.nbSlices,
    .srcBuffers = (const void* const*) srcBlocks.slicePtrs,
    .srcSizes = srcBlocks.capacities,
    .dstBuffers = dstBlocks.slicePtrs,
    .dstCapacities = dstBlocks.capacities,
    .blockResults = NULL
};

size_t roundNb = 0;
for (;;) {
    BMK_runOutcome_t const outcome = BMK_benchTimedFn(benchState, bp);
    CONTROL(BMK_isSuccessful_runOutcome(outcome));

    BMK_runTime_t const result = BMK_extract_runTime(outcome);
    double const dTime_ns = result.nanoSecPerRun;
    double const dTime_sec = (double)dTime_ns / 1000000000;
    size_t const srcSize = result.sumOfReturn;
    double const speed_MBps = (double)srcSize / dTime_sec / (1 MB);
    speedPerRound[roundNb] = speed_MBps;
    if (benchCompression)
        DISPLAY("Compression Speed : %.1f MB/s \r", speed_MBps);
    else
        DISPLAY("Decompression Speed : %.1f MB/s \r", speed_MBps);

    fflush(stdout);
    if (BMK_isCompleted_TimedFn(benchState)) break;
    roundNb++;
}
DISPLAY("\n");
/* BMK_benchTimedFn may not run exactly nbRounds iterations */
double speedAggregated =
        aggregateData(speedPerRound, roundNb + 1, metricAggregatePref);
if (metricAggregatePref == fastest)
    DISPLAY("Fastest Speed : %.1f MB/s \n", speedAggregated);
else
    DISPLAY("Median Speed : %.1f MB/s \n", speedAggregated);

char* csvFileName = malloc(strlen(exeName) + 5);
strcpy(csvFileName, exeName);
strcat(csvFileName, ".csv");
FILE* csvFile = fopen(csvFileName, "r");
if (!csvFile) {
    csvFile = fopen(csvFileName, "wt");
    assert(csvFile);
    fprintf(csvFile, "%s\n", exeName);
    /* Print table headers */
    fprintf(
        csvFile,
        "Compression/Decompression,Level,nbDicts,dictAttachPref,metricAggregatePref,Speed\n");
} else {
    fclose(csvFile);
    csvFile = fopen(csvFileName, "at");
    assert(csvFile);
}

int cLevel = -1;
int dictAttachPref = -1;
if (benchCompression) {
    ZSTD_CCtxParams_getParameter(cctxParams, ZSTD_c_compressionLevel,
                                 &cLevel);
    ZSTD_CCtxParams_getParameter(cctxParams, ZSTD_c_forceAttachDict,
                                 &dictAttachPref);
}
fprintf(csvFile, "%s,%d,%ld,%d,%d,%.1f\n",
        benchCompression ? "Compression" : "Decompression", cLevel,
        benchCompression ? ci.nbDicts : di.nbDicts, dictAttachPref,
        metricAggregatePref, speedAggregated);
fclose(csvFile);
free(csvFileName);

freeDecompressInstructions(di);
freeCompressInstructions(ci);
BMK_freeTimedFnState(benchState);
```
🧩 Analysis chain

Harden benchMem() heap usage and clean up speedPerRound / CSV filename allocations

In benchMem():

- `double *speedPerRound = malloc(nbRounds * sizeof(double));` is never checked for `NULL` before indexing, which will segfault on OOM.
- `speedPerRound` is never freed, leaking memory for each run.
- `char* csvFileName = malloc(strlen(exeName) + 5);` is also used without a NULL check before `strcpy`/`strcat`.

You can make these allocations robust and release them at the end:

```diff
-    double *const speedPerRound = (double *)malloc(nbRounds * sizeof(double));
+    double *const speedPerRound = (double *)malloc(nbRounds * sizeof(double));
+    CONTROL(speedPerRound != NULL);
@@
-    char* csvFileName = malloc(strlen(exeName) + 5);
+    char* csvFileName = malloc(strlen(exeName) + 5);
+    CONTROL(csvFileName != NULL);
@@
     fclose(csvFile);
     free(csvFileName);

     freeDecompressInstructions(di);
     freeCompressInstructions(ci);
     BMK_freeTimedFnState(benchState);
+    free(speedPerRound);

     return 0;   /* success */
```

I'll verify the review comment by examining the actual code and checking for NULL validation and memory cleanup.
---

🏁 Script executed:

```shell
#!/bin/bash
# Extract the relevant section of code around lines 695-783
sed -n '695,783p' contrib/largeNbDicts/largeNbDicts.c
```

Length of output: 3391

🏁 Script executed:

```shell
#!/bin/bash
# Search for NULL checks on speedPerRound in the entire benchMem function
rg -n "speedPerRound" contrib/largeNbDicts/largeNbDicts.c -A2 -B2
```

Length of output: 788

🏁 Script executed:

```shell
#!/bin/bash
# Search for free() calls related to speedPerRound
rg -n "free.*speedPerRound|speedPerRound.*free" contrib/largeNbDicts/largeNbDicts.c
```

Length of output: 0

🏁 Script executed:

```shell
#!/bin/bash
# Search for the benchMem function signature to understand return paths
rg -n "benchMem\s*\(" contrib/largeNbDicts/largeNbDicts.c -A1 -B1
```

Length of output: 488

🏁 Script executed:

```shell
#!/bin/bash
# Search for CONTROL macro definition to understand its behavior
rg -n "define CONTROL|#define CONTROL" contrib/largeNbDicts/largeNbDicts.c
```

Length of output: 46

🏁 Script executed:

```shell
#!/bin/bash
# Search for csvFileName NULL checks in the code
rg -n "csvFileName" contrib/largeNbDicts/largeNbDicts.c -A2 -B2
```

Length of output: 691

🏁 Script executed:

```shell
#!/bin/bash
# Look for free(csvFileName) calls
rg -n "free.*csvFileName|csvFileName.*free" contrib/largeNbDicts/largeNbDicts.c
```

Length of output: 26
Add NULL checks for speedPerRound and csvFileName allocations; free speedPerRound at cleanup

The code at lines 695–783 has the following confirmed memory-safety issues:

- Line 695: the `speedPerRound` malloc lacks a NULL check before indexing at line 728, causing a segfault on OOM.
- `speedPerRound` is never freed, creating a memory leak.
- Line 747: the `csvFileName` malloc lacks a NULL check before `strcpy`/`strcat` at lines 748–749.

Add CONTROL() checks for both allocations, and free(speedPerRound) before function cleanup:

```diff
 double *const speedPerRound = (double *)malloc(nbRounds * sizeof(double));
+CONTROL(speedPerRound != NULL);
 BMK_timedFnState_t* const benchState =
@@
 char* csvFileName = malloc(strlen(exeName) + 5);
+CONTROL(csvFileName != NULL);
 strcpy(csvFileName, exeName);
@@
 fclose(csvFile);
 free(csvFileName);
+free(speedPerRound);
 freeDecompressInstructions(di);
```

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```c
double *const speedPerRound = (double *)malloc(nbRounds * sizeof(double));
CONTROL(speedPerRound != NULL);

BMK_timedFnState_t* const benchState =
        BMK_createTimedFnState(total_time_ms, ms_per_round);

decompressInstructions di = createDecompressInstructions(ddictionaries);
compressInstructions ci =
        createCompressInstructions(cdictionaries, cctxParams);
void* payload = benchCompression ? (void*)&ci : (void*)&di;
BMK_benchParams_t const bp = {
    .benchFn = benchCompression ? compress : decompress,
    .benchPayload = payload,
    .initFn = NULL,
    .initPayload = NULL,
    .errorFn = ZSTD_isError,
    .blockCount = dstBlocks.nbSlices,
    .srcBuffers = (const void* const*) srcBlocks.slicePtrs,
    .srcSizes = srcBlocks.capacities,
    .dstBuffers = dstBlocks.slicePtrs,
    .dstCapacities = dstBlocks.capacities,
    .blockResults = NULL
};

size_t roundNb = 0;
for (;;) {
    BMK_runOutcome_t const outcome = BMK_benchTimedFn(benchState, bp);
    CONTROL(BMK_isSuccessful_runOutcome(outcome));

    BMK_runTime_t const result = BMK_extract_runTime(outcome);
    double const dTime_ns = result.nanoSecPerRun;
    double const dTime_sec = (double)dTime_ns / 1000000000;
    size_t const srcSize = result.sumOfReturn;
    double const speed_MBps = (double)srcSize / dTime_sec / (1 MB);
    speedPerRound[roundNb] = speed_MBps;
    if (benchCompression)
        DISPLAY("Compression Speed : %.1f MB/s \r", speed_MBps);
    else
        DISPLAY("Decompression Speed : %.1f MB/s \r", speed_MBps);

    fflush(stdout);
    if (BMK_isCompleted_TimedFn(benchState)) break;
    roundNb++;
}
DISPLAY("\n");
/* BMK_benchTimedFn may not run exactly nbRounds iterations */
double speedAggregated =
        aggregateData(speedPerRound, roundNb + 1, metricAggregatePref);
if (metricAggregatePref == fastest)
    DISPLAY("Fastest Speed : %.1f MB/s \n", speedAggregated);
else
    DISPLAY("Median Speed : %.1f MB/s \n", speedAggregated);

char* csvFileName = malloc(strlen(exeName) + 5);
CONTROL(csvFileName != NULL);
strcpy(csvFileName, exeName);
strcat(csvFileName, ".csv");
FILE* csvFile = fopen(csvFileName, "r");
if (!csvFile) {
    csvFile = fopen(csvFileName, "wt");
    assert(csvFile);
    fprintf(csvFile, "%s\n", exeName);
    /* Print table headers */
    fprintf(
        csvFile,
        "Compression/Decompression,Level,nbDicts,dictAttachPref,metricAggregatePref,Speed\n");
} else {
    fclose(csvFile);
    csvFile = fopen(csvFileName, "at");
    assert(csvFile);
}

int cLevel = -1;
int dictAttachPref = -1;
if (benchCompression) {
    ZSTD_CCtxParams_getParameter(cctxParams, ZSTD_c_compressionLevel,
                                 &cLevel);
    ZSTD_CCtxParams_getParameter(cctxParams, ZSTD_c_forceAttachDict,
                                 &dictAttachPref);
}
fprintf(csvFile, "%s,%d,%ld,%d,%d,%.1f\n",
        benchCompression ? "Compression" : "Decompression", cLevel,
        benchCompression ? ci.nbDicts : di.nbDicts, dictAttachPref,
        metricAggregatePref, speedAggregated);
fclose(csvFile);
free(csvFileName);
free(speedPerRound);

freeDecompressInstructions(di);
freeCompressInstructions(ci);
BMK_freeTimedFnState(benchState);
```
🤖 Prompt for AI Agents
contrib/largeNbDicts/largeNbDicts.c lines 695-783: speedPerRound and csvFileName
allocations are unchecked and speedPerRound is never freed; add CONTROL()
NULL-checks immediately after allocating speedPerRound and csvFileName and
handle OOM (e.g. return/cleanup) before any use (indexing/strcpy/strcat), and
free(speedPerRound) during normal cleanup (before freeing other resources /
before BMK_freeTimedFnState) and in any early-exit error paths to avoid the
leak.
```c
size_t ZBUFF_compressInit_advanced(ZBUFF_CCtx* zbc,
                                   const void* dict, size_t dictSize,
                                   ZSTD_parameters params, unsigned long long pledgedSrcSize)
{
    if (pledgedSrcSize==0) pledgedSrcSize = ZSTD_CONTENTSIZE_UNKNOWN;  /* preserve "0 == unknown" behavior */
    FORWARD_IF_ERROR(ZSTD_CCtx_reset(zbc, ZSTD_reset_session_only), "");
    FORWARD_IF_ERROR(ZSTD_CCtx_setPledgedSrcSize(zbc, pledgedSrcSize), "");

    FORWARD_IF_ERROR(ZSTD_checkCParams(params.cParams), "");
    FORWARD_IF_ERROR(ZSTD_CCtx_setParameter(zbc, ZSTD_c_windowLog, params.cParams.windowLog), "");
    FORWARD_IF_ERROR(ZSTD_CCtx_setParameter(zbc, ZSTD_c_hashLog, params.cParams.hashLog), "");
    FORWARD_IF_ERROR(ZSTD_CCtx_setParameter(zbc, ZSTD_c_chainLog, params.cParams.chainLog), "");
    FORWARD_IF_ERROR(ZSTD_CCtx_setParameter(zbc, ZSTD_c_searchLog, params.cParams.searchLog), "");
    FORWARD_IF_ERROR(ZSTD_CCtx_setParameter(zbc, ZSTD_c_minMatch, params.cParams.minMatch), "");
    FORWARD_IF_ERROR(ZSTD_CCtx_setParameter(zbc, ZSTD_c_targetLength, params.cParams.targetLength), "");
    FORWARD_IF_ERROR(ZSTD_CCtx_setParameter(zbc, ZSTD_c_strategy, params.cParams.strategy), "");

    FORWARD_IF_ERROR(ZSTD_CCtx_setParameter(zbc, ZSTD_c_contentSizeFlag, params.fParams.contentSizeFlag), "");
    FORWARD_IF_ERROR(ZSTD_CCtx_setParameter(zbc, ZSTD_c_checksumFlag, params.fParams.checksumFlag), "");
    FORWARD_IF_ERROR(ZSTD_CCtx_setParameter(zbc, ZSTD_c_dictIDFlag, params.fParams.noDictIDFlag), "");

    FORWARD_IF_ERROR(ZSTD_CCtx_loadDictionary(zbc, dict, dictSize), "");
    return 0;
}
```
🧩 Analysis chain

Verify potential logic inversion with noDictIDFlag.

Line 91 passes `params.fParams.noDictIDFlag` directly to `ZSTD_c_dictIDFlag`. The parameter names suggest opposite meanings:

- `noDictIDFlag`: implies 1 = "do NOT include dictionary ID"
- `ZSTD_c_dictIDFlag`: likely means 1 = "include dictionary ID"

If this interpretation is correct, the value should be inverted: `!params.fParams.noDictIDFlag`.

🏁 Script executed:

```shell
#!/bin/bash
# Verify the semantics of noDictIDFlag and ZSTD_c_dictIDFlag parameters
# Search for the definition and usage of noDictIDFlag in frame parameters
rg -nP -C3 '\bnoDictIDFlag\b' --type=c --type=h
# Search for ZSTD_c_dictIDFlag documentation and usage
rg -nP -C3 'ZSTD_c_dictIDFlag' --type=c --type=h
# Look for ZSTD_parameters and ZSTD_frameParameters structure definitions
rg -nP -C5 'typedef.*ZSTD_frameParameters|struct.*ZSTD_frameParameters' --type=c --type=h
```

Length of output: 22284
Fix logic inversion at line 91: invert noDictIDFlag when passing to ZSTD_c_dictIDFlag.

The parameter semantics are inverted:

- `noDictIDFlag = 1` means "do NOT include dictionary ID in frame header"
- `ZSTD_c_dictIDFlag = 1` means "include dictionary ID in frame header"

Line 91 passes the value directly without inversion, contradicting the semantic difference. The canonical implementation in lib/compress/zstd_compress.c inverts with `CCtxParams->fParams.noDictIDFlag = !value`, and this pattern is used consistently throughout the codebase. Correct line 91 to:

```c
FORWARD_IF_ERROR(ZSTD_CCtx_setParameter(zbc, ZSTD_c_dictIDFlag, !params.fParams.noDictIDFlag), "");
```

🤖 Prompt for AI Agents
In lib/deprecated/zbuff_compress.c around lines 72 to 95, the call that maps
params.fParams.noDictIDFlag to ZSTD_c_dictIDFlag is inverted: noDictIDFlag==1
means "do not include dict ID" while ZSTD_c_dictIDFlag==1 means "include dict
ID". Fix by passing the logical inverse of params.fParams.noDictIDFlag when
calling ZSTD_CCtx_setParameter for ZSTD_c_dictIDFlag so the semantics align with
the rest of the codebase.
```c
construct_BWT_indexes(const unsigned char *T, int *SA,
                      int *bucket_A, int *bucket_B,
                      int n, int m,
                      unsigned char * num_indexes, int * indexes) {
  int *i, *j, *k, *orig;
  int s;
  int c0, c1, c2;

  int mod = n / 8;
  {
      mod |= mod >> 1;  mod |= mod >> 2;
      mod |= mod >> 4;  mod |= mod >> 8;
      mod |= mod >> 16; mod >>= 1;

      *num_indexes = (unsigned char)((n - 1) / (mod + 1));
  }
```
🧩 Analysis chain
Document/validate indexes capacity contract in construct_BWT_indexes().
construct_BWT_indexes() computes *num_indexes from n and then writes into indexes[s / (mod + 1) - 1] at several points. The function assumes the caller has allocated indexes with at least *num_indexes elements, but this isn’t enforced here.
Since divbwt() exposes this as a public API, it would be safer to:

- Clearly document that `indexes` must have capacity for `*num_indexes` entries, or
- Add a defensive check in `divbwt()` after `construct_BWT_indexes()` returns to ensure `*num_indexes` stays within the caller's buffer size.
This avoids latent out‑of‑bounds writes if a caller mis-sizes indexes.
Also applies to: 1817-1829
---
🏁 Script executed:
```shell
#!/bin/bash
# Find construct_BWT_indexes function definition and surrounding code
rg -n "construct_BWT_indexes" lib/dictBuilder/divsufsort.c -A 100 -B 2 | head -150
```

Length of output: 3738
🏁 Script executed:
```shell
#!/bin/bash
# Find divbwt function and how it calls construct_BWT_indexes
rg -n "divbwt" lib/dictBuilder/divsufsort.c -A 50 -B 2
```

Length of output: 1413
🏁 Script executed:
```shell
#!/bin/bash
# Look at the specific lines mentioned in the review (1817-1829)
sed -n '1810,1835p' lib/dictBuilder/divsufsort.c
```

Length of output: 622
🏁 Script executed:
```shell
#!/bin/bash
# Search for how indexes array is allocated or sized by callers
rg -n "indexes" lib/dictBuilder/divsufsort.c -C 3 | grep -E "(malloc|alloc|sizeof.*indexes|num_indexes)" -A 2 -B 2
```

Length of output: 864
🏁 Script executed:
```shell
#!/bin/bash
# Check function signatures and headers for documentation
sed -n '1,100p' lib/dictBuilder/divsufsort.c | grep -E "(divbwt|construct_BWT)" -A 10
```

Length of output: 0
Add runtime bounds validation or explicit documentation for indexes buffer capacity in divbwt().
The construct_BWT_indexes() function writes to the indexes array at lines 1779, 1805, 1817, and 1827 using indices computed as s / (mod + 1) - 1 or (n - 1) / (mod + 1) - 1, then outputs the required capacity via *num_indexes at line 1762. However, divbwt() receives both indexes and num_indexes from the caller without validating that the caller allocated sufficient capacity.
The function should either:

- Document the precondition that `indexes` must be allocated with capacity for at least `*num_indexes` entries before calling `divbwt()`, or
- Add a bounds validation in `divbwt()` after `construct_BWT_indexes()` returns (e.g., at line 1897) to ensure the buffer was not overwritten.
Without this, a caller that under-allocates indexes will cause an out-of-bounds write.
🤖 Prompt for AI Agents
In lib/dictBuilder/divsufsort.c around lines 1748-1763 and specifically in
divbwt() around line 1897, add a runtime bounds check to ensure the caller
allocated enough space for indexes: change divbwt() to accept an additional
parameter (e.g. int indexes_capacity) or otherwise obtain the intended capacity,
call construct_BWT_indexes() as before, then immediately verify that
*num_indexes <= indexes_capacity; if the check fails, return an error or abort
with a clear diagnostic to avoid out-of-bounds writes. Alternatively (lighter
change), document in the divbwt() header comment the exact precondition that
callers must allocate at least (n-1)/(mod+1) entries for indexes (the same
formula set into *num_indexes) before calling.
```c
/* Initialize frequency array of size 2^f */
ctx->freqs = (U32*)calloc(((U64)1 << f), sizeof(U32));
if (ctx->freqs == NULL) {
  DISPLAYLEVEL(1, "Failed to allocate frequency table \n");
  FASTCOVER_ctx_destroy(ctx);
  return ERROR(memory_allocation);
}
```
🧩 Analysis chain
Guard against size overflow when allocating frequency/segment tables (32‑bit builds).
FASTCOVER_ctx_init() and FASTCOVER_tryParameters() allocate arrays sized as ((U64)1 << f) elements. On 32‑bit platforms, when f is large (up to 31 per FASTCOVER_MAX_F), calloc()’s internal nmemb * size multiplication can overflow size_t, leading to a too‑small allocation and later out‑of‑bounds writes.
Consider computing the table size in size_t, checking for overflow, and clamping f accordingly before calling calloc(). For example:

```diff
-    ctx->freqs = (U32*)calloc(((U64)1 << f), sizeof(U32));
+    {
+        size_t const tableSize = ((size_t)1 << f);
+        if (tableSize > SIZE_MAX / sizeof(U32)) {
+            FASTCOVER_ctx_destroy(ctx);
+            return ERROR(memory_allocation);
+        }
+        ctx->freqs = (U32*)calloc(tableSize, sizeof(U32));
+    }
```

Apply a similar pattern for the freqs/segmentFreqs allocations in FASTCOVER_tryParameters() so the code is safe on 32‑bit targets as well.
Also applies to: 480-485
🏁 Script executed:
```shell
#!/bin/bash
# Examine lines 376-383 and surrounding context in fastcover.c
echo "=== Lines 376-383 context ==="
sed -n '370,390p' lib/dictBuilder/fastcover.c
echo -e "\n=== Lines 480-485 context ==="
sed -n '475,495p' lib/dictBuilder/fastcover.c
echo -e "\n=== Search for FASTCOVER_MAX_F definition ==="
grep -n "FASTCOVER_MAX_F" lib/dictBuilder/fastcover.c | head -10
echo -e "\n=== Search for segmentFreqs allocations and checks ==="
grep -n "segmentFreqs" lib/dictBuilder/fastcover.c | head -20
```

Length of output: 2883
🏁 Script executed:
```shell
#!/bin/bash
# Get function context for line 597 and check for null-checks
echo "=== Function context around line 597 ==="
sed -n '585,620p' lib/dictBuilder/fastcover.c
echo -e "\n=== Search for null-check pattern near line 597 ==="
sed -n '597,605p' lib/dictBuilder/fastcover.c
echo -e "\n=== Full view of function containing line 597 ==="
sed -n '570,625p' lib/dictBuilder/fastcover.c
```

Length of output: 4918
🏁 Script executed:
```shell
#!/bin/bash
# Check FASTCOVER_buildDictionary function signature and NULL handling
echo "=== FASTCOVER_buildDictionary signature ==="
grep -n "FASTCOVER_buildDictionary" lib/dictBuilder/fastcover.c | head -5
echo -e "\n=== FASTCOVER_buildDictionary implementation start ==="
sed -n '399,425p' lib/dictBuilder/fastcover.c
echo -e "\n=== Check if segmentFreqs is dereferenced without NULL guard ==="
sed -n '152,200p' lib/dictBuilder/fastcover.c
```

Length of output: 3494
Add null-check for segmentFreqs allocation at line 597 before use; address size_t overflow risk in allocations.
The review correctly identifies a real null-check gap, but the analysis is misdirected. Lines 376–383 and 480–485 already have proper null-checks; the actual critical issue is at line 597 in ZDICT_trainFromBuffer_fastCover():
```c
U16* segmentFreqs = (U16 *)calloc(((U64)1 << parameters.f), sizeof(U16));
const size_t tail = FASTCOVER_buildDictionary(&ctx, ctx.freqs, dictBuffer,
                                              dictBufferCapacity, coverParams, segmentFreqs);
```

`segmentFreqs` is passed directly to FASTCOVER_buildDictionary() without a null-check. Inside FASTCOVER_selectSegment(), it is dereferenced immediately (lines 177, 182, 187, 189, 205: `segmentFreqs[idx]`), leading to a null-pointer dereference if allocation fails.
Add null-check before line 599:
```c
if (!segmentFreqs) {
    DISPLAYLEVEL(1, "Failed to allocate segmentFreqs\n");
    FASTCOVER_ctx_destroy(&ctx);
    return ERROR(memory_allocation);
}
```

Regarding the size_t overflow risk: it applies uniformly to all three allocation sites (lines 377, 480, 597). On 32-bit systems with f near FASTCOVER_MAX_F (31), the product (1 << f) * sizeof(U16/U32) overflows, causing calloc() to receive a truncated size. While this is a valid concern, the practical impact is limited by memory constraints. If addressing this is desired, compute allocation sizes in size_t with overflow checks before calling calloc().
🤖 Prompt for AI Agents
In lib/dictBuilder/fastcover.c around lines 376-383 and especially 597, add a
null-check immediately after allocating segmentFreqs (the U16* calloc at line
~597) and before passing it to FASTCOVER_buildDictionary: if allocation fails,
log "Failed to allocate segmentFreqs", destroy/cleanup ctx with
FASTCOVER_ctx_destroy(&ctx) and return ERROR(memory_allocation). Also add
defensive size_t overflow checks for the calloc size computations used at the
three allocation sites (lines ~377, ~480 and ~597): compute element_count and
element_size in size_t, verify element_count <= SIZE_MAX / element_size (or
similar) before calling calloc, and handle overflow by logging, cleaning up, and
returning an allocation error.
```c
U16* segmentFreqs = (U16 *)calloc(((U64)1 << parameters.f), sizeof(U16));
const size_t tail = FASTCOVER_buildDictionary(&ctx, ctx.freqs, dictBuffer,
                                              dictBufferCapacity, coverParams, segmentFreqs);
const unsigned nbFinalizeSamples = (unsigned)(ctx.nbTrainSamples * ctx.accelParams.finalize / 100);
const size_t dictionarySize = ZDICT_finalizeDictionary(
    dict, dictBufferCapacity, dict + tail, dictBufferCapacity - tail,
    samplesBuffer, samplesSizes, nbFinalizeSamples, coverParams.zParams);
if (!ZSTD_isError(dictionarySize)) {
    DISPLAYLEVEL(2, "Constructed dictionary of size %u\n",
                 (unsigned)dictionarySize);
}
FASTCOVER_ctx_destroy(&ctx);
free(segmentFreqs);
return dictionarySize;
```
🧩 Analysis chain
Check segmentFreqs for allocation failure before use.
In ZDICT_trainFromBuffer_fastCover(), segmentFreqs is allocated with calloc() and then passed into FASTCOVER_buildDictionary() without a null check. On OOM this will dereference a null pointer inside FASTCOVER_buildDictionary().
Add a check and return a proper error code, mirroring the pattern used in FASTCOVER_tryParameters():
```diff
-    U16* segmentFreqs = (U16 *)calloc(((U64)1 << parameters.f), sizeof(U16));
-    const size_t tail = FASTCOVER_buildDictionary(&ctx, ctx.freqs, dictBuffer,
-                                                  dictBufferCapacity, coverParams, segmentFreqs);
+    U16* segmentFreqs = (U16*)calloc(((U64)1 << parameters.f), sizeof(U16));
+    if (segmentFreqs == NULL) {
+        FASTCOVER_ctx_destroy(&ctx);
+        return ERROR(memory_allocation);
+    }
+    const size_t tail = FASTCOVER_buildDictionary(&ctx, ctx.freqs, dictBuffer,
+                                                  dictBufferCapacity, coverParams, segmentFreqs);
```

🏁 Script executed:
```shell
#!/bin/bash
# Locate fastcover.c and examine lines 597-610
fd -t f "fastcover.c" | head -5
```

Length of output: 27
🏁 Script executed:
```shell
#!/bin/bash
# Extract lines 597-610 to verify the current state
sed -n '597,610p' lib/dictBuilder/fastcover.c
```

Length of output: 835
🏁 Script executed:
```shell
#!/bin/bash
# Search for FASTCOVER_buildDictionary function definition and examine its NULL handling
rg -n "FASTCOVER_buildDictionary" lib/dictBuilder/fastcover.c -A 30 -B 2
```

Length of output: 4674
🏁 Script executed:
```shell
#!/bin/bash
# Search for FASTCOVER_tryParameters to see the pattern referenced in the review
rg -n "FASTCOVER_tryParameters" lib/dictBuilder/fastcover.c -A 40 -B 2 | head -80
```

Length of output: 3502
🏁 Script executed:
```shell
#!/bin/bash
# Check if ctx allocation is checked in the same function
rg -n "ZDICT_trainFromBuffer_fastCover" lib/dictBuilder/fastcover.c -A 60 | head -100
```

Length of output: 3278
🏁 Script executed:
```shell
#!/bin/bash
# Look for ERROR macro usage patterns in this file
rg -n "ERROR\(" lib/dictBuilder/fastcover.c -B 2 -A 1 | head -40
```

Length of output: 1708
Add NULL check for segmentFreqs allocation before use.
In ZDICT_trainFromBuffer_fastCover(), segmentFreqs is allocated with calloc() at line 597 and immediately passed to FASTCOVER_buildDictionary() without checking for allocation failure. If calloc() returns NULL due to OOM, FASTCOVER_buildDictionary() will dereference the null pointer at line 422 when calling FASTCOVER_selectSegment().
Add a NULL check following the pattern used in FASTCOVER_tryParameters() (line 486):
```diff
-    U16* segmentFreqs = (U16 *)calloc(((U64)1 << parameters.f), sizeof(U16));
-    const size_t tail = FASTCOVER_buildDictionary(&ctx, ctx.freqs, dictBuffer,
-                                                  dictBufferCapacity, coverParams, segmentFreqs);
+    U16* segmentFreqs = (U16*)calloc(((U64)1 << parameters.f), sizeof(U16));
+    if (segmentFreqs == NULL) {
+        FASTCOVER_ctx_destroy(&ctx);
+        return ERROR(memory_allocation);
+    }
+    const size_t tail = FASTCOVER_buildDictionary(&ctx, ctx.freqs, dictBuffer,
+                                                  dictBufferCapacity, coverParams, segmentFreqs);
```

🤖 Prompt for AI Agents
In lib/dictBuilder/fastcover.c around lines 597 to 610, segmentFreqs is calloc'd
but not checked for NULL; add a NULL check immediately after calloc that handles
OOM the same way FASTCOVER_tryParameters() does: if calloc returns NULL, call
FASTCOVER_ctx_destroy(&ctx), optionally log an error, and return 0 (or the
function's error sentinel) without calling FASTCOVER_buildDictionary or using
segmentFreqs; ensure no further dereference occurs and resources are cleaned up.
```c
static size_t ZDICT_addEntropyTablesFromBuffer_advanced(
        void* dictBuffer, size_t dictContentSize, size_t dictBufferCapacity,
        const void* samplesBuffer, const size_t* samplesSizes, unsigned nbSamples,
        ZDICT_params_t params)
{
    int const compressionLevel = (params.compressionLevel == 0) ? ZSTD_CLEVEL_DEFAULT : params.compressionLevel;
    U32 const notificationLevel = params.notificationLevel;
    size_t hSize = 8;

    /* calculate entropy tables */
    DISPLAYLEVEL(2, "\r%70s\r", "");   /* clean display line */
    DISPLAYLEVEL(2, "statistics ... \n");
    {   size_t const eSize = ZDICT_analyzeEntropy((char*)dictBuffer+hSize, dictBufferCapacity-hSize,
                                  compressionLevel,
                                  samplesBuffer, samplesSizes, nbSamples,
                                  (char*)dictBuffer + dictBufferCapacity - dictContentSize, dictContentSize,
                                  notificationLevel);
        if (ZDICT_isError(eSize)) return eSize;
        hSize += eSize;
    }

    /* add dictionary header (after entropy tables) */
    MEM_writeLE32(dictBuffer, ZSTD_MAGIC_DICTIONARY);
    {   U64 const randomID = XXH64((char*)dictBuffer + dictBufferCapacity - dictContentSize, dictContentSize, 0);
        U32 const compliantID = (randomID % ((1U<<31)-32768)) + 32768;
        U32 const dictID = params.dictID ? params.dictID : compliantID;
        MEM_writeLE32((char*)dictBuffer+4, dictID);
    }

    if (hSize + dictContentSize < dictBufferCapacity)
        memmove((char*)dictBuffer + hSize, (char*)dictBuffer + dictBufferCapacity - dictContentSize, dictContentSize);
    return MIN(dictBufferCapacity, hSize+dictContentSize);
}
```
🧩 Analysis chain
Enforce dictContentSize <= dictBufferCapacity in ZDICT_addEntropyTablesFromBuffer[_advanced].
ZDICT_addEntropyTablesFromBuffer_advanced() computes `(char*)dictBuffer + dictBufferCapacity - dictContentSize` as the start of the existing content and later may memmove() from that region. If a caller passes dictContentSize > dictBufferCapacity via ZDICT_addEntropyTablesFromBuffer(), this pointer underflows and all subsequent accesses become out-of-bounds.
Since ZDICT_addEntropyTablesFromBuffer() is a public entry point, consider adding an explicit check:
```diff
 size_t ZDICT_addEntropyTablesFromBuffer(...)
 {
     ZDICT_params_t params;
     memset(&params, 0, sizeof(params));
+    if (dictContentSize > dictBufferCapacity) {
+        return ERROR(dstSize_tooSmall);
+    }
     return ZDICT_addEntropyTablesFromBuffer_advanced(...);
 }
```

so misuse is caught early instead of causing UB.
Also applies to: 1129-1136
I'll verify the code to confirm the vulnerability and check the current state of the implementation.
---
🏁 Script executed:
```shell
#!/bin/bash
# Extract code at specified line ranges
echo "=== Lines 944-976 ==="
sed -n '944,976p' lib/dictBuilder/zdict.c
echo ""
echo "=== Lines 1129-1136 ==="
sed -n '1129,1136p' lib/dictBuilder/zdict.c
echo ""
echo "=== Search for ZDICT_addEntropyTablesFromBuffer function (non-advanced version) ==="
grep -n "ZDICT_addEntropyTablesFromBuffer(" lib/dictBuilder/zdict.c | head -20
```

Length of output: 2504
🏁 Script executed:
```shell
#!/bin/bash
# Find the full ZDICT_addEntropyTablesFromBuffer function (public entry point)
rg -n -A 10 "size_t ZDICT_addEntropyTablesFromBuffer\(" lib/dictBuilder/zdict.c --type c
```

Length of output: 590
I need to check if there's any validation happening in called functions or if the public entry point is indeed public.
---
🏁 Script executed:
```shell
#!/bin/bash
# Check if ZDICT_addEntropyTablesFromBuffer is exported (public)
echo "=== Check if function is public (not static) ==="
grep -B 5 "^size_t ZDICT_addEntropyTablesFromBuffer" lib/dictBuilder/zdict.c | head -10
echo ""
echo "=== Check ZDICT_analyzeEntropy for any parameter validation ==="
rg -n -B 2 -A 15 "^static size_t ZDICT_analyzeEntropy" lib/dictBuilder/zdict.c | head -40
echo ""
echo "=== Check for any existing dictContentSize validation in this file ==="
rg -n "dictContentSize.*dictBufferCapacity|dictBufferCapacity.*dictContentSize" lib/dictBuilder/zdict.c
```

Length of output: 370
I need to re-run without the skip_cloning tag to access repository files.
---
🏁 Script executed:
```shell
#!/bin/bash
# Check if ZDICT_addEntropyTablesFromBuffer is public (not static)
echo "=== Check if function is public (not static) ==="
sed -n '1129p' lib/dictBuilder/zdict.c
echo ""
echo "=== Check beginning of ZDICT_analyzeEntropy for validation ==="
rg -n -A 20 "^static size_t ZDICT_analyzeEntropy" lib/dictBuilder/zdict.c | head -30
echo ""
echo "=== Check for any size comparisons in ZDICT_analyzeEntropy ==="
rg -n "if.*dictContentSize|if.*dictBufferCapacity" lib/dictBuilder/zdict.c | head -15
```

Length of output: 2350
I need to identify which function contains the validation at line 878 and whether it protects the vulnerable code.
---
🏁 Script executed:
```shell
#!/bin/bash
# Find the function containing line 878
echo "=== Context around line 878 - find function name ==="
sed -n '850,900p' lib/dictBuilder/zdict.c | head -60
echo ""
echo "=== Get function signature of the function at line 878 ==="
sed -n '800,880p' lib/dictBuilder/zdict.c | grep -E "^(static )?size_t|^(static )?ZDICT_" | tail -5
```

Length of output: 2352
Add validation in ZDICT_addEntropyTablesFromBuffer() to prevent pointer underflow.
The vulnerability is confirmed. The public entry point ZDICT_addEntropyTablesFromBuffer() at lines 1129–1136 has no validation for dictContentSize > dictBufferCapacity and directly forwards arguments to ZDICT_addEntropyTablesFromBuffer_advanced().
The advanced function at lines 944–976 performs unsafe pointer arithmetic:
- Line 960: `(char*)dictBuffer + dictBufferCapacity - dictContentSize` passed to `ZDICT_analyzeEntropy()`
- Line 965: same calculation in the `XXH64()` call
- Line 971: same calculation in the `memmove()` call
If dictContentSize > dictBufferCapacity, the pointer underflows, causing out-of-bounds access. The suggested check should be added at the public entry point (lines 1129–1136) to catch misuse early, mirroring the pattern seen in ZDICT_finalizeDictionary() (line 878).
Note
Introduce zstd CLI and legacy decoder sources with file I/O (incl. async), benchmarking, and dictionary training utilities.

- `zstdcli` main program with options, progress, and multi-format support.
- File I/O (`fileio.c`, `fileio_asyncio.c`), and common utils (`util.c`).
- Benchmarking (`benchzstd.c`) and dictionary builder/trainer (`dibio.c`).
- Lorem-ipsum test-data generator (`lorem.c`).
- Legacy decoders for v0.6 (`zstd_v06.c`) and v0.7 (`zstd_v07.c`).

Written by Cursor Bugbot for commit bb206ca.
Summary by CodeRabbit
New Features
Documentation
Chores