refactor: refcounted CRDT lifecycle, drop arena#17
Merged
Conversation
- hashtable_create(NULL) allocates via host_malloc; hashtable_create(arena) unchanged. - hashtable_destroy frees nodes + key copies + table struct in host_malloc mode; no-op for arena-backed (arena handles it). - hashtable_remove / hashtable_clear free their nodes in host_malloc mode (otherwise leak). - 6 new tests for the host_malloc path; existing arena tests untouched. - Foundation step for migrating primitives off arena one at a time.
- scalar_clone(NULL, value) allocates string bytes via host_malloc. - scalar_clone(arena, value) unchanged. - Empty-string clone is a value passthrough — no allocation, no leak. - scalar_free releases host_malloc'd string bytes; no-op for non-string kinds and empty strings. - OOM message now reflects which allocator failed. - 8 new tests for the host_malloc path including empty-string + embedded NUL + survives-source-free.
Drop arena backing. counter_create allocates via host_malloc with refcount=1; counter_acquire/release manage lifetime, freeing per-client entries, hashtable, and struct at refcount 0. Add counter_displace / counter_is_displaced — a per-instance signal (independent of refcount) for the Map slot path to mark an LWW-evicted handle. counter_clone returns a fresh refcount=1, never-displaced copy.
Mirror the Counter conversion. register_create allocates via host_malloc with refcount=1; acquire/release free the value's string bytes (scalar_free) and struct at refcount 0. Values deep-copied via scalar_clone(NULL, ...); set and winning merge now scalar_free the old value before storing the new one, closing the per-overwrite string leak the arena model masked. Add register_displace / register_is_displaced. Refresh header ownership/lifetime docs; drop the stale arena.h include.
Convert Element and Map off arena ownership onto the refcount/displace model (completing the lifecycle refactor alongside Counter and Register). Element forwards lifecycle to its composite: element_acquire / _release / _displace / _is_displaced dispatch by kind (SCALAR is a no-op for acquire/displace and scalar_free on release); element_clone drops the arena param and produces refcount=1 children. Map slots hold refcounted composites under Share semantics: an accepted map_set acquires the slot's own ref, so callers always retain and release their own handle regardless of the LWW outcome. Eviction (winning set, delete, merge LWW-replace) displaces then releases the loser. map_get and the helper install path return borrows; the helper detached path (losing LWW) returns an owned, born-displaced handle. map_clone deep-copies with refcount=1; map_release recursively drops slot refs. Tests rewritten for the new contract (test_element +35, test_map +76), including displacement, borrow/retain, and detached-handle coverage. Verified green under ASan (no use-after-free / double-free); note macOS ASan ships no LeakSanitizer, so the leak class needs Linux CI.
The CRDT core moved fully to refcount/host_malloc ownership, leaving the arena allocator and its dual-mode seams with no production consumer. Strip the arena branch from hashtable and scalar: hashtable_create() and scalar_clone(value) lose the Arena* param and allocate via host_malloc only. Drop the arena.h includes, the HashTable.arena field, and the arena-mode test groups (now redundant with the host_malloc path). Delete arena.c / arena.h / test_arena.c and the test-arena make target. Full suite green; hashtable / scalar / map clean under ASan.
map_set released entry->value before it was initialized for brand-new
entries (and tombstones), and map_clone left a cloned tombstone's value
uninitialized. With glibc's chunk recycling a fresh Entry can carry bytes
that look like a live ELEMENT_COUNTER whose refcount reads 0, so the
spurious release aborted ("counter_release: refcount already zero") on
Linux while staying silent on macOS.
Release a displaced value only when overwriting an existing live slot
(present && !is_tombstone), and seed every tombstone value with a
null-scalar sentinel so no path ever releases an uninitialized handle.
3393f47 to
22f26aa
Compare
element_release was a no-op for SCALAR, so every scalar string stored in a Map slot (via scalar_clone) leaked when the slot was overwritten or the Map freed. Make it scalar_free the value (valid because slots only ever hold owned copies). In map_merge, the mismatched-id LWW path overwrote dst's composite without releasing the loser, and the clone+set fallthrough never dropped the element_clone create-ref that map_set's own acquire duplicates. Both now release. Suite is leak-free under macOS `leaks`.
…k-free The early counter tests and the hashtable tests created handles via the fresh() helpers without tearing them down. Add counter_release / hashtable_destroy at the end of each so the whole suite reports zero leaks (gating-ready for a Linux ASan+LSan CI job).
Second job builds the suite with -fsanitize=address,undefined and runs it; LeakSanitizer (bundled into ASan on Linux) gates memory leaks, which macOS ASan cannot catch. -fno-sanitize-recover=all makes UBSan findings fail the build.
- map_release aborts on refcount-zero before decrementing (mirrors counter_release/register_release) instead of wrapping size_t to SIZE_MAX - map_is_displaced takes const Map * for consistency with the other *_is_displaced and to allow const callers - register.h: drop stale scalar_clone(NULL, ...) wording (now scalar_clone)
map_set unconditionally displaced+released the prior slot value on an accepted overwrite. When the incoming value is the exact composite already installed, that flagged a still-installed handle as displaced (and churned its refcount), which would make the Doc layer treat a live handle as orphaned. Skip acquire/displace/release for the same-pointer case and just advance the stamp. Also document in element.h that element_release on a SCALAR frees the string bytes, so it is valid only on owned scalars — never on a borrowed scalar or a SCALAR returned by map_get.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 25 out of 26 changed files in this pull request and generated no new comments.
Comments suppressed due to low confidence (1)
hashtable.c:62
- _hashtable_insert treats host_malloc(key_len) returning NULL as OOM. For key_len==0, malloc(0) is implementation-defined and may return NULL even when allocation succeeds logically, which would make zero-length keys impossible on some platforms. scalar_clone already special-cases len==0 for this exact portability reason; hashtable should do the same (e.g., allocate 1 byte when key_len==0 and skip memcpy).
node->key = host_malloc(key_len);
if (!node->key) {
host_free(node);
return HASHTABLE_ERR_OOM;
}
memcpy(node->key, key, key_len);
node->value = value;
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Convert the CRDT core from arena-lifetime ownership to refcount + displacement, then remove the arena allocator.
*_acquire/*_release, freed at refcount 0*_displace/*_is_displaced— marks a handle evicted from its slot (the Doc layer skips op emission on orphaned handles)scalar_freeon release)map_setacquires its own ref, so the caller always retains and releases its own handle; eviction displaces+releases the loser;map_getand helper install return borrows; helper detached path returns an owned, born-displaced handlehashtable_create()/scalar_clone(value)lose theArena*param (host_malloc only); deletearena.c/arena.h/test_arena.c+ the dual-mode test groupsmacOS ASan ships no LeakSanitizer, so the leak class needs Linux ASan+LSan in CI.
Summary by cubic
Refactored the CRDT core from arena-backed lifetime to refcounted
host_mallocwith acquire/release/displace, and switchedMapto Share semantics. Fixed tombstone and scalar-string leaks, and avoided displacing on same-pointer re-sets; CI now runs Linux ASan+UBSan+LSan to gate leaks.Refactors
Counter,Register,Map: host-allocated with refcount=1 on*_create,*_acquire/*_release, and*_displace/*_is_displaced; free internals at refcount 0; displacement is independent of refcount.Element: forwards acquire/release/displace by kind;element_clonereturns refcount=1 children; SCALAR release frees string bytes.Map(Share semantics): acceptedmap_setacquires a slot ref;map_getreturns borrows; eviction displaces+releases the loser; detached path returns an owned, born-displaced handle; deep clones on LWW replace;map_releaserecursively drops slot refs.HashTable:hashtable_create()now useshost_malloc; addedhashtable_destroy(); nodes and key copies are freed on destroy/clear/remove.Scalar:scalar_clone(value)allocates bytes viahost_malloc(empty strings pass through);scalar_freereleases them.arena.c/.hand tests; removedarena.hincludes; tests/Makefile updated; added Linux CI job running ASan+UBSan+LSan.Bug Fixes
map_setno longer displaces or releases when writing the exact composite already installed; only advances the stamp.map_mergerelease LWW losers and drop the extraelement_cloneref; tests now release/destroy handles so the suite is leak-free.map_releaseaborts on refcount-zero before decrementing;map_is_displacedtakesconst Map *;element.hdocuments SCALAR release is valid only for owned scalars;register.hdocs updated.Written for commit ec20fe5. Summary will update on new commits.