With FlatDb.Enabled=true on master, Nethermind intermittently computes a wrong state root and rejects a canonical block with InvalidStateRoot. A non-flat (patricia/HalfPath) control node accepts the same block. This is a consensus divergence specific to the flat-state backend.
Intermittent. An offline full FlatTrieVerifier cross-check on a synced chiado DB (head 21576781, which had processed past the bad block cleanly on a different run) came back clean: 539,247 accounts / 35,390,325 slots, 0 mismatched, 0 missing. A 24h forward-processing soak on two fresh chiado nodes did not re-trigger it. So a synced DB can be divergence-free; the fault must be caught on a fresh sync that hits the right window.
Where it likely is
Flat-backend reads are not hash-verified the way trie-node reads are, so a wrong flat value surfaces only as a silent wrong state root. Static analysis points at the SnapshotCompactor self-destruct / storage-clear merge when folding per-block snapshots into a compacted snapshot:
the storage-slot clear keys on Address, while the storage-trie-node clear keys on the account-path hash and is gated on !isNewAccount, with asymmetric self-destruct-marker semantics (TryAdd(addr, true) for new-account vs [addr] = false for non-new).
the under-tested case is self-destruct-then-recreate within a single compaction window (e.g. a CREATE2 redeploy), for which there is no regression test — SnapshotCompactorTests only covers destroy-with-storage and destroy-new-account.
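The missing regression case can be stated as an invariant: applying the compacted snapshot must give the same state as applying the per-block snapshots one by one. The Python sketch below is a toy model of that invariant for a destroy-then-recreate sequence. `BlockSnapshot`, `compact`, and `apply_snapshot` are hypothetical names, and the model makes no claim about how SnapshotCompactor is actually implemented.

```python
from dataclasses import dataclass, field


@dataclass
class BlockSnapshot:
    """Toy per-block change set (hypothetical, not a Nethermind type)."""
    destroyed: set = field(default_factory=set)  # addresses self-destructed in this block
    slots: dict = field(default_factory=dict)    # (address, slot) -> value, written after any destroy


def apply_snapshot(state, snap):
    """Return the state after one snapshot: clear destroyed accounts, then apply writes."""
    new_state = {k: v for k, v in state.items() if k[0] not in snap.destroyed}
    new_state.update(snap.slots)
    return new_state


def compact(snapshots):
    """Fold per-block snapshots (oldest first) into a single snapshot."""
    merged = BlockSnapshot()
    for snap in snapshots:
        for addr in snap.destroyed:
            # A destroy must also wipe slot writes accumulated earlier in the window.
            for key in [k for k in merged.slots if k[0] == addr]:
                del merged.slots[key]
            merged.destroyed.add(addr)
        # Writes from this block land after its destroy, so they survive.
        merged.slots.update(snap.slots)
    return merged


# Destroy-then-recreate inside one window: write, self-destruct, redeploy and write again.
base = {("C", 1): 7, ("D", 1): 1}
window = [
    BlockSnapshot(slots={("C", 2): 9}),
    BlockSnapshot(destroyed={"C"}),
    BlockSnapshot(slots={("C", 1): 5}),
]

sequential = base
for block in window:
    sequential = apply_snapshot(sequential, block)

assert apply_snapshot(base, compact(window)) == sequential
```

A regression test of this shape, written against the real compactor, would pin down the self-destruct-then-recreate case that the existing tests do not cover.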
The per-instance persisted compaction offset (#11756) is likely a trigger, not the cause: it makes compaction-window boundaries node-specific, so the buggy merge path is exercised at block alignments that a deterministic schedule / the patricia control never hit — consistent with "one flat node diverged, the control didn't."
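To see why a per-node offset can change which merge path runs, consider a simplified model in which compaction windows have a fixed size and the offset shifts their boundaries. Both the formula and the window size of 32 are assumptions made for illustration; the report does not describe the actual scheduling.

```python
def window_index(block_number: int, window_size: int, offset: int) -> int:
    """Index of the compaction window a block falls into (assumed model)."""
    return (block_number + offset) // window_size


def same_window(first_block: int, second_block: int, window_size: int, offset: int) -> bool:
    """True if both blocks would be folded into the same compacted snapshot."""
    return window_index(first_block, window_size, offset) == window_index(
        second_block, window_size, offset
    )


# Hypothetical example: a contract destroyed at block 1000 and recreated at block 1003.
for offset in (0, 22):
    print(offset, same_window(1000, 1003, window_size=32, offset=offset))
```

Under this model the two blocks share a window at offset 0 but are split across a boundary at offset 22, so two nodes with different offsets would exercise different merge paths on the same chain.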
How to reproduce / confirm
Run a chiado forward sync with --FlatDb.Enabled true --FlatDb.VerifyWithTrie true to convert the silent InvalidStateRoot into a precise per-account/per-slot TrieException naming the diverging address/slot. Look for FlatStorageTree "Get slot got wrong value … Self destruct it {idx}" and FlatWorldStateScope "Incorrect account …".
Pin the schedule with --FlatDb.CompactionOffset 0 and run --FlatDb.InlineCompaction true to remove compactor/persist timing as a variable.
Signature: a slot-level throw naming a recently self-destructed/recreated contract near the diverging block.
Note: --FlatDb.VerifyWithTrie true cannot be used for a post-snap live soak — the live trie-comparison store can't reload the snap-sync state root once evicted and the node wedges on a TrieNodeException (unrelated to this bug). Use it on a forward sync from genesis/early, or rely on the natural InvalidStateRoot on a snap+forward soak.
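When a verification run produces a large log, a small scanner can pull out the lines that match the signature above. The sketch below searches only for the two message fragments quoted in this report; the exact wording in real logs may differ, and the function name and log-file handling are illustrative assumptions.

```python
import re
import sys

# Message fragments quoted in the report; treat the exact wording as an assumption.
SIGNATURES = {
    "slot": re.compile(r"Get slot got wrong value"),
    "account": re.compile(r"Incorrect account"),
}


def find_divergences(lines):
    """Return (line_number, kind, text) for every line matching a signature."""
    hits = []
    for number, line in enumerate(lines, start=1):
        for kind, pattern in SIGNATURES.items():
            if pattern.search(line):
                hits.append((number, kind, line.rstrip("\n")))
    return hits


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python scan_log.py <path-to-log-file>
    with open(sys.argv[1], encoding="utf-8", errors="replace") as log:
        for number, kind, text in find_divergences(log):
            print(f"{number}: [{kind}] {text}")
```

The first "slot" hit near the diverging block is the one to inspect, since the expected signature is a slot-level throw naming a recently self-destructed or recreated contract.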
Related (NOT duplicates)
#11353 (closed), "FlatDB sync ends up on invalid chain": dismissed as testing the wrong schema (HalfPath, not FlatDb); a whole-block import divergence, not this steady-state single-slot merge defect.
[…] InvalidStateRoot: an AuRa archive-sync finalization bug (PR #11306, "fix: skip AuRa finalization startup walk on post-merge chains"); does not enable FlatDb.
#11418 ("FlatInTrie, --FlatDb.ImportFromPruningTrieState=true triggers on partway synced DB"), #11442 ("With FlatInTrie, a restart during Snap State Ranges greatly increases RAM usage and OOMs"), #11447 ("During FlatDB sync, Nethermind logs Unable to find beacon header at height xxx. This is unexpected, forcing a new beacon sync."), and #11979 ("testing_commitBlockV1 does not persist committed block state — every commit after the first fails"): distinct FlatDb restart/durability/import/test defects.
Evidence
A flat node (FlatDb.Enabled=true) rejected canonical block 21574961 (0x0ba5905c2295b4b12ed035f8eace55f4ae9c288f55afcdcc7eaf7d6c6b9b9394, 0 txs, 8 non-zero withdrawals): computed root 0x78374a9a… vs canonical 0x12173c61….
Do not allowlist
Real consensus divergence on the flat backend; must be fixed before FlatDb ships. The smoke SyncCNWSF/Sync*Flat tests correctly catch it.