fix(trie): generational GC to end the prune/commit missing-node race by satyakwok · Pull Request #798 · sentrix-labs/sentrix

satyakwok · 2026-06-06T08:02:41Z

Problem

The periodic trie prune (maybe_prune_trie, every 5000 blocks) runs on a background thread whose live-set is a snapshot frozen at spawn time. Blocks committed during the multi-minute GC walk write new nodes/values the snapshot never saw → the old gc_table deleted them as "orphans". #791 split the node/value passes + re-augmented the live-set to narrow the window, but (per its own comment) the complete fix is "one RW txn around walk+delete".

The window persisted: testnet val3 hit trie: missing node in create_block pre-apply, so every time it was BFT proposer the round ate a 20s propose timeout → chain crawled to 0.13 blk/s until a restart reloaded the trie from MDBX. Recurring across the fleet.

Fix — generational (deferred) GC

Instead of deleting an orphan immediately, the prune tombstones it (TABLE_TRIE_TOMBSTONES, keyed disc‖hash → version) and only deletes it on a LATER prune if it's still orphan then.

A hash committed during this prune is orphan vs the frozen snapshot → tombstoned this cycle — but next cycle it's a recent live node → its tombstone is dropped instead of deleting it. Race closed. Worst-case failure mode is benign: under-deletion (storage grows one extra cycle), never deletion of a live entry. Deletion lags one prune interval (~5000 blocks).

Safety

No consensus/state change — trie content + state_root identical; only dead-node deletion timing moves. No fork gate needed.
New tombstone table auto-creates on env open (ALL_TABLES) → existing chain.dbs need no migration.
Not the eventual single-RW-txn fix, but race-free without a 10–20 min write lock (which would stall the chain).

Tests

test_generational_gc_defers_then_reaps — orphan survives cycle 1, reaped cycle 2.
test_generational_gc_spares_node_committed_during_prune — orphan-this-cycle / live-next-cycle is never deleted (the race repro; fails against the old gc_table).
Updated 3 tree prune tests to the deferred contract (tombstone → advance version → reap).

cargo test -p sentrix-trie: 82 passed. cargo check --workspace -D warnings clean.

Summary by CodeRabbit

Chores
- Version bumped to 2.2.38
Bug Fixes
- Enhanced storage cleanup mechanisms with improved reliability to prevent race conditions during garbage collection cycles, ensuring safer data persistence across multiple maintenance operations.

The periodic trie prune runs on a background thread whose live-set is a snapshot frozen at spawn time. Blocks committed during the multi-minute GC walk write new nodes/values the snapshot never saw; the old gc deleted them as "orphans". #791 split the node/value passes and re-augmented the live-set to narrow the window, but its own comment notes the complete fix is still "one RW txn around walk+delete". The window persisted: testnet val3 hit a "missing node" mid-traversal in create_block, so every time it was proposer the round ate a 20s timeout and the chain crawled to 0.13 blk/s until a restart reloaded the trie. Generational GC: instead of deleting an orphan immediately, the prune TOMBSTONES it (TABLE_TRIE_TOMBSTONES, keyed `disc||hash` -> version) and only deletes it on a LATER prune if it is STILL orphan then. A hash committed during this prune is orphan vs the snapshot, so it's tombstoned this cycle — but next cycle it's a recent live node, so its tombstone is dropped instead of deleting it. The race is closed; the worst-case failure mode is benign (under-deletion = storage grows one extra cycle, never deletion of a live entry). Deletion lags one prune interval (~5000 blocks). The new tombstone table auto-creates on env open (ALL_TABLES), so existing chain.dbs need no migration. No consensus/state change: trie content and state_root are identical; only dead-node deletion TIMING moves. Not the complete single-RW-txn fix, but race-free without holding a 10-20 min write lock (which would stall the chain). Tests: gc_table_generational defers-then-reaps; a node orphan-this-cycle / live-next-cycle is never deleted (the race repro, fails against the old gc_table). Updated three tree prune tests to the deferred contract (tombstone, advance version, reap).

codecov · 2026-06-06T08:04:00Z

Codecov Report

❌ Patch coverage is 99.28058% with 1 line in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
crates/sentrix-trie/src/storage.rs	99.14%	1 Missing ⚠️

📢 Thoughts on this report? Let us know!

coderabbitai · 2026-06-06T08:09:10Z

Wondering what really moved? Review this PR in Change Stack to inspect semantic changes, definitions, and references.

📝 Walkthrough

Walkthrough

This PR implements a generational garbage collection mechanism for the Sentrix trie to eliminate race conditions during node and value deletion. Instead of immediately removing orphaned entries, the system now defers deletion across two prune cycles using a tombstone table. The first prune marks orphans as tombstones at the current version; the second prune deletes stale tombstones if the entries remain orphaned. The prune() method now calls gc_nodes_generational, augments the live set to the latest on-disk state, and calls gc_values_generational. Existing tests are updated to account for the one-cycle deferral by running prune() twice with an intervening commit() to advance the version counter.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

sentrix-labs/sentrix#791: The main PR builds on the node-then-value race fix introduced in this PR by implementing the generational tombstone mechanism that complements the augmented live-set approach.
sentrix-labs/sentrix#584: Background pruning dispatches to the updated SentrixTrie::prune() method, so concurrent commits now interact with the new generational GC logic.
sentrix-labs/sentrix#711: Both PRs modify the prune race-avoidance strategy; this PR uses generational tombstones while the retrieved PR augments live roots from the latest on-disk version.

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title accurately summarizes the main change—implementing generational GC to fix the prune/commit race condition affecting trie node deletion.
Description check	✅ Passed	The description is comprehensive and addresses the template's key sections: Problem, Fix, Safety, and Tests; however, the Scope and Checks sections of the template are not filled out with checkboxes.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch fix/trie-generational-gc

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@crates/sentrix-trie/src/storage.rs`:
- Around line 404-436: The current reap loop (using tombstoned/live_hashes ->
reap -> later deletes via self.mdbx.delete on tables::TABLE_TRIE_TOMBSTONES and
data_table) can delete hashes that were reintroduced by a concurrent writer; fix
by rechecking liveness under a writer-stable view immediately before performing
deletes: just before iterating clear/reap to call self.mdbx.delete, obtain a
writer-locked/stable snapshot (or re-run the equivalent live_hashes check under
transactional/lock semantics provided by mdbx) and filter the reap list against
that fresh snapshot (i.e., recompute whether each h in reap is still absent from
live_hashes), only then perform the deletes on TABLE_TRIE_TOMBSTONES and
data_table; alternatively, defer deletes of prior-cycle tombstones until they
can be executed inside the same writer-synchronized transaction that confirms
absence.
- Around line 413-415: The current logic maps any non-8-byte tombstone payload
to tv = 0 via v.try_into().map(u64::from_be_bytes).unwrap_or(0), which silently
makes malformed tombstones immediately reapable; instead, detect malformed
tombstone payloads and surface a storage-corruption error to stop GC. Replace
the unwrap_or(0) path with error handling that returns or propagates a
StorageCorruption (or appropriate Result::Err) from the enclosing function (the
prune/scan routine that owns variables tv, v, version, reap, and h), include
contextual info (e.g., key/header h and payload length) in the error, and only
push h into reap when the tombstone value successfully parses and tv < version.
Ensure callers handle the Result so GC aborts on corrupt tombstone payloads.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro Plus

Run ID: d3a6f22c-215a-4c22-9bd1-96cb1eb4461a

📥 Commits

Reviewing files that changed from the base of the PR and between 10b4874 and 71025c6.

⛔ Files ignored due to path filters (1)

Cargo.lock is excluded by !**/*.lock, !**/*.lock

📒 Files selected for processing (4)

Cargo.toml
crates/sentrix-storage/src/tables.rs
crates/sentrix-trie/src/storage.rs
crates/sentrix-trie/src/tree.rs

coderabbitai · 2026-06-06T08:09:13Z

+        self.mdbx
+            .iter_from(tables::TABLE_TRIE_TOMBSTONES, &[], |k, v| {
+                if k.len() == 33 && k[0] == disc {
+                    let mut h = [0u8; 32];
+                    h.copy_from_slice(&k[1..]);
+                    tombstoned.insert(h);
+                    if live_hashes.contains(&h) {
+                        clear.push(h);
+                    } else {
+                        let tv = v.try_into().map(u64::from_be_bytes).unwrap_or(0);
+                        if tv < version {
+                            reap.push(h);
+                        }
+                    }
+                }
+                true
+            })
+            .map_err(|e| SentrixError::StorageError(e.to_string()))?;
+
+        let mut tomb_key = [0u8; 33];
+        tomb_key[0] = disc;
+        for h in clear.iter().chain(reap.iter()) {
+            tomb_key[1..].copy_from_slice(h);
+            self.mdbx
+                .delete(tables::TABLE_TRIE_TOMBSTONES, &tomb_key)
+                .map_err(|e| SentrixError::StorageError(e.to_string()))?;
+        }
+        let deleted = reap.len();
+        for h in &reap {
+            self.mdbx
+                .delete(data_table, h)
+                .map_err(|e| SentrixError::StorageError(e.to_string()))?;
+        }


⚠️ Potential issue | 🔴 Critical | 🏗️ Heavy lift

Reaping still races with a same-hash commit.

reap is decided from the frozen live_hashes snapshot and then deleted later in separate write transactions. If a concurrent commit reintroduces the same content-addressed node/value hash after Lines 410-415 but before Line 434 runs, this cycle still deletes it and the newly committed root now points at a missing node/value. The one-cycle tombstone deferral only protects hashes that become live before the next prune starts; it does not protect resurrection during the reaping cycle itself. Please re-check liveness against a writer-stable view immediately before delete, or keep prior-cycle tombstones deferred until reap is coupled to writer synchronization. As per coding guidelines, crates/sentrix-trie/**: CONSENSUS-CRITICAL — suggestions only, no destructive rewrites.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/sentrix-trie/src/storage.rs` around lines 404 - 436, The current reap loop (using tombstoned/live_hashes -> reap -> later deletes via self.mdbx.delete on tables::TABLE_TRIE_TOMBSTONES and data_table) can delete hashes that were reintroduced by a concurrent writer; fix by rechecking liveness under a writer-stable view immediately before performing deletes: just before iterating clear/reap to call self.mdbx.delete, obtain a writer-locked/stable snapshot (or re-run the equivalent live_hashes check under transactional/lock semantics provided by mdbx) and filter the reap list against that fresh snapshot (i.e., recompute whether each h in reap is still absent from live_hashes), only then perform the deletes on TABLE_TRIE_TOMBSTONES and data_table; alternatively, defer deletes of prior-cycle tombstones until they can be executed inside the same writer-synchronized transaction that confirms absence.

Source: Coding guidelines

coderabbitai · 2026-06-06T08:09:13Z

+                        let tv = v.try_into().map(u64::from_be_bytes).unwrap_or(0);
+                        if tv < version {
+                            reap.push(h);


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Fail closed on malformed tombstone payloads.

Line 413 turns any non-8-byte tombstone value into tv = 0, which makes that entry immediately reapable on the next prune. In this path, corrupt tombstone state should stop GC, not silently authorize deletion of trie data. Please surface a storage-corruption error instead of defaulting. As per coding guidelines, crates/sentrix-trie/**: CONSENSUS-CRITICAL — suggestions only, no destructive rewrites.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/sentrix-trie/src/storage.rs` around lines 413 - 415, The current logic maps any non-8-byte tombstone payload to tv = 0 via v.try_into().map(u64::from_be_bytes).unwrap_or(0), which silently makes malformed tombstones immediately reapable; instead, detect malformed tombstone payloads and surface a storage-corruption error to stop GC. Replace the unwrap_or(0) path with error handling that returns or propagates a StorageCorruption (or appropriate Result::Err) from the enclosing function (the prune/scan routine that owns variables tv, v, version, reap, and h), include contextual info (e.g., key/header h and payload length) in the error, and only push h into reap when the tombstone value successfully parses and tv < version. Ensure callers handle the Result so GC aborts on corrupt tombstone payloads.

Source: Coding guidelines

…residual (#799) Follow-up to #798 addressing two CodeRabbit findings on the generational GC: - Major: a non-8-byte tombstone payload mapped to tv=0, making a corrupt entry immediately reapable (could delete live trie data). Now fail closed — skip reaping malformed tombstones (keep the entry) and warn, instead of defaulting to 0. - Critical (documentation): the reap still has a narrow content-addressed resurrection window inside its own scan+delete. Documented the caller contract (tree.rs re-augments `live` to the latest committed root immediately before each gc pass, collapsing the window from the old multi-minute walk to this method's ms-scale scan+delete) and that the complete elimination needs writer-coupling (walk+delete in one RW txn) — the tracked fix this PR-series deliberately defers to avoid a chain-blocking write lock. This series strictly narrows the race; verify_integrity remains the backstop. Regression test: a malformed tombstone must not authorise deletion (data survives). sentrix-trie: 83 passed.

…800) * fix(trie): race-free offline prune; background prune off by default The durable fix for the recurring trie "missing node" stalls. Root cause (confirmed): the BACKGROUND prune runs on a thread cloning the trie at a frozen version while block apply keeps committing; a node committed or content-addressed-resurfaced during the multi-minute live-set walk is absent from the frozen snapshot and gets deleted as an orphan. Five partial fixes (#711 reload-before-gc, #714 collect_reachable depth, #791 split passes, #798 generational defer) each narrowed but never closed the window — #798 deleted live nodes again in production (val3, h=6300000 prune → "missing node" → propose stall). A truly race-free background prune needs walk+delete in one MDBX RW txn (a chain-blocking write lock for the 10-20 min walk) or refcounting — both deferred. This takes the mechanism-agnostic safe route: eliminate the concurrency. - `SentrixTrie::prune_offline(keep)` — same walk + keep-window as `prune`, minus the racy augment and the generational deferral, using the combined immediate `gc_orphaned_nodes`. Correct only with no concurrent commits, which is guaranteed by running it on a STOPPED node. - `sentrix chain prune [--keep N]` — operator runs it during a maintenance halt (same model as `chain reset-trie` / `verify-deep`). Safe on a single peer: deleting unreachable nodes does not change the state_root (which only commits reachable nodes), so no fork risk — unlike reset-trie. - Background prune is now OFF by default. It only runs with SENTRIX_ENABLE_BACKGROUND_TRIE_PRUNE=1 (and the legacy SENTRIX_DISABLE_TRIE_PRUNE=1 still force-disables). maybe_prune_trie early-returns by default, so the apply path no longer even spawns the racy thread. Trade-off: storage grows between maintenance prunes. Acceptable; correctness over convenience after five automatic-prune failures. Tests: prune_offline reclaims orphans AND the current value survives the prune (the regression guard #798 lacked — a live-node deletion would make the post-prune `get` fail with "missing node"); background_prune_enabled gate is off by default + opt-in semantics. cargo test -p sentrix-trie: 84 passed; -p sentrix-core: 263 passed; cargo check --workspace -D warnings clean. * fix(trie): clippy nonminimal-bool + chain prune fail-closed guards (CodeRabbit) - Clippy: background_prune_enabled uses is_none_or instead of !..is_some_and (nonminimal_bool, -D warnings in CI; cargo check didn't catch it). - chain prune Major guards (CodeRabbit on #800): - Refuse if no persisted trie root at the current height — never let the maintenance prune trigger an init_trie backfill rebuild (different node shape = reset-trie fork class). New Storage::has_persisted_trie_root. - Enforce the offline precondition instead of only printing it: detect a running node via height-stability (sample across >5s poll-persist interval) and fail closed if the chain advanced; override with SENTRIX_ALLOW_ONLINE_PRUNE=1 for rare recovery.

…802) test_c03_pass2_failure_rolls_back_state credits v1 to u64::MAX - reward and relies on the coinbase credit overflowing in Pass 2. It read get_block_reward() (which depends on the reward-fork env vars: VOYAGER_REWARD_V2_HEIGHT / TOKENOMICS_V2_HEIGHT / halving) WITHOUT holding env_test_lock, while sibling tests set those vars under that lock. Under the parallel `cargo test` run (e.g. the report-only coverage job) a concurrent mutation changed `reward` mid-test, the overflow didn't trigger, add_block returned Ok, and unwrap_err panicked — the recurring CI flake that needed re-runs on #795/#798/#800/#801. Acquire env_test_lock at the top of the test so it serializes against the env-mutating tests. Test-only change. sentrix-core suite + clippy --all-targets -D warnings clean.

github-actions Bot enabled auto-merge (squash) June 6, 2026 08:02

coderabbitai Bot reviewed Jun 6, 2026

View reviewed changes

github-actions Bot merged commit d93bbfe into main Jun 6, 2026
20 of 21 checks passed

satyakwok mentioned this pull request Jun 6, 2026

fix(trie): fail-closed on malformed tombstone (CodeRabbit follow-up to #798) #799

Merged

satyakwok deleted the fix/trie-generational-gc branch June 6, 2026 09:50

satyakwok mentioned this pull request Jun 6, 2026

fix(trie): race-free offline prune + background prune off by default #800

Merged

satyakwok mentioned this pull request Jun 6, 2026

test(core): serialize c03 rollback test against reward-env race (CI flake fix) #802

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(trie): generational GC to end the prune/commit missing-node race#798

fix(trie): generational GC to end the prune/commit missing-node race#798
github-actions[bot] merged 1 commit into
mainfrom
fix/trie-generational-gc

satyakwok commented Jun 6, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

codecov Bot commented Jun 6, 2026

Uh oh!

coderabbitai Bot commented Jun 6, 2026

Walkthrough

Estimated code review effort

Possibly related PRs

Uh oh!

coderabbitai Bot left a comment

Uh oh!

coderabbitai Bot Jun 6, 2026

Uh oh!

coderabbitai Bot Jun 6, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

satyakwok commented Jun 6, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Problem

Fix — generational (deferred) GC

Safety

Tests

Summary by CodeRabbit

Uh oh!

codecov Bot commented Jun 6, 2026

Codecov Report

Uh oh!

coderabbitai Bot commented Jun 6, 2026

Walkthrough

Estimated code review effort

Possibly related PRs

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 6, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 6, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

satyakwok commented Jun 6, 2026 •

edited by coderabbitai Bot

Loading