
perf: fix consensus stability, CPU saturation, and memory growth on idle chains#75

Merged
0xZunia merged 10 commits into main from perf/fix-cpu-memory-hotpaths on Mar 29, 2026
Conversation

Contributor

0xZunia commented Mar 28, 2026

Summary

Fixes critical consensus, performance, and memory issues discovered through validator log analysis on a 4-node Docker devnet. The chain previously halted permanently after ~1.5 hours due to validator set divergence, and CPU saturated at 100% on idle chains.

Before: chain halts at ~3,300 blocks, 100% CPU, ~50 MB/hour RAM growth
After: 36+ hours stable, healthy CPU, bounded RAM

Root Causes & Fixes

Consensus Stability

  • COMMIT bitmap race condition — When COMMIT quorum (3/4) was reached, the aggregate was built immediately; the 4th validator's vote (on a concurrent TCP thread) was excluded from the bitmap. Over thousands of blocks, the same validators consistently lost the race, got slashed below the inactivity threshold, and were permanently deactivated. Different leaders had different placeholder BLS key states, causing each node to slash different validators — producing divergent validator sets that could never reconverge, permanently halting the chain.
    • Re-aggregate COMMIT QC when late votes arrive after quorum
    • Accept updated COMMIT aggregates on non-leader nodes
    • Wait up to 15s for all validator identities (BLS keys) to resolve via P2P handshake before starting consensus
    • Log BLS key mismatches that were previously completely silent
    • Re-activate validators when stake recovers above minimum after slashing
    • Guard block proposals with validator set membership check
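Taken together, the bitmap fix amounts to rebuilding the aggregate whenever a new vote arrives, even after quorum has already been reached. Below is a minimal Python model of that logic only; the real implementation is C# with BLS signature aggregation and concurrent TCP threads, and all names here are illustrative:

```python
class CommitRound:
    """Hypothetical sketch of the late-vote re-aggregation fix."""
    def __init__(self, validator_count: int):
        self.n = validator_count
        self.votes: set[int] = set()
        self.quorum_reached = False
        self.bitmap: list[bool] = [False] * validator_count

    def quorum(self) -> int:
        return (2 * self.n) // 3 + 1  # e.g. 3 of 4

    def add_vote(self, idx: int) -> bool:
        """Returns True when the aggregate is (re)built: on first quorum, or
        when a late vote lands after quorum. The original bug froze the
        bitmap at quorum, permanently excluding the late voter."""
        if idx in self.votes:
            return False  # duplicate vote
        self.votes.add(idx)
        if len(self.votes) < self.quorum() and not self.quorum_reached:
            return False
        self.quorum_reached = True
        # Rebuild the bitmap from all votes seen so far instead of freezing it.
        self.bitmap = [i in self.votes for i in range(self.n)]
        return True  # caller re-broadcasts the updated COMMIT QC
```

In this model, the 4th vote arriving after the 3/4 quorum triggers a second aggregate build instead of being silently dropped, which is the behavior the PR describes.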

CPU Saturation

  • Unnecessary sync on 1-block gaps — Every block announce from 3 peers triggered TrySyncFromPeers(), each calling Fork() + ComputeStateRoot() (expensive Merkle trie rehashing). 12+ wasted state root computations per block.
  • Double message serialization — BroadcastConsensusMessage serialized for the dedup hash, then BroadcastToAll serialized again for sending.
  • Fork for already-applied blocks — Sync ApplyBatch called Fork() even when all blocks were already finalized by consensus.
  • TrieStateDb dirty tracking never cleared — _dirtyStorageKeys and _dirtyAccounts on the TrieStateDb layer grew unboundedly (the FlatStateDb cleanup never forwarded to the inner trie), causing escalating GC pressure → 100% CPU after hours.
  • Error handler spin risks — ReconnectLoop and TCP AcceptLoop had catch blocks that continued without delay on persistent errors.
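The spin-guard fix in the last bullet reduces to a retry loop that sleeps in its catch path instead of immediately re-entering the failing call. A hedged Python sketch of that shape, with illustrative names rather than the project's actual ReconnectLoop/AcceptLoop code:

```python
import time

def accept_loop(accept_once, stop_flag, error_delay=1.0):
    """Sketch of the spin-guard fix: the original loop's catch block
    continued immediately on a persistent socket error, burning a core.
    Sleeping in the error path bounds the retry rate."""
    errors = 0
    while not stop_flag():
        try:
            accept_once()
            errors = 0  # healthy accept resets the error streak
        except OSError:
            errors += 1
            time.sleep(error_delay)  # the fix: no tight spin on repeated failures
    return errors
```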

Memory Growth

  • RocksDB unbounded block cache — No block cache limit was configured; native memory grew ~50 MB/hour as SST index/filter blocks scaled with on-disk data. Added 64 MB shared LRU cache with CacheIndexAndFilterBlocks.
  • FlatStateDb storage cache unbounded — _storageCache grew monotonically with no eviction. Added 500K entry hard cap with full eviction.
  • Sync path missing dirty tracking cleanup — ApplyBatch never called ClearDirtyTracking()/CompactDeletedSets() on the forked state before swap.
  • PruneInactivePeers never called — Disconnected peers accumulated forever.
  • ProcessUnbonding never called — Unbonding queue grew with every unstake.
  • SlashingEngine history unbounded — Capped at 10K entries.
  • EpisubService timer crash risk — Unhandled exception in cleanup callback would terminate the process.
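The FlatStateDb cap described above uses a hard entry limit with full eviction, which is simpler than LRU bookkeeping on a hot path. A minimal Python model of that policy; the 500K figure mirrors the PR, everything else (names, types) is illustrative:

```python
class BoundedStorageCache:
    """Sketch of a hard-capped cache with full eviction at the cap."""
    def __init__(self, max_entries=500_000):
        self.max_entries = max_entries
        self._cache: dict = {}

    def put(self, key, value):
        # Full eviction: once the cap is hit, drop everything rather than
        # tracking per-entry recency. Bounded worst-case memory, O(1) puts.
        if len(self._cache) >= self.max_entries and key not in self._cache:
            self._cache.clear()
        self._cache[key] = value

    def get(self, key):
        return self._cache.get(key)
```

The trade-off is a periodic cold cache instead of unbounded growth, which is the right direction when the failure mode is monotonic RAM creep.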

Test plan

  • dotnet build — 0 warnings, 0 errors
  • dotnet test — all tests pass across 16 projects
  • 4-validator Docker devnet: 36+ hours uptime, stable consensus
  • First epoch transition (block 100): all 4 validators survive, quorum 3/3
  • CPU stable (no 100% saturation)
  • RAM bounded (no 50 MB/hour growth)
  • No "Vote from non-validator" or "Invalid aggregate signature" warnings
  • No "BLS key mismatch" warnings (identity wait works)

0xZunia added 9 commits March 23, 2026 12:50
ComputeStateRoot was unconditionally rewriting all 9 system contract
accounts into the world trie on every call — even with zero state changes.
Each rewrite triggered BLAKE3 rehashing of the entire trie path. Combined
with MerklePatriciaTrie.Put never short-circuiting on identical values,
this caused ~50+ unnecessary hash computations per block across all
validators, explaining the 100% CPU with zero transactions.

CPU fixes:
- Skip unchanged StorageRoot in ComputeStateRoot (TrieStateDb)
- Short-circuit identical leaf values in trie Put (MerklePatriciaTrie)
- Cache BLS public key at startup in both consensus modes
- Skip TWAP carry-forward and limit order matching when 0 DEX pools exist
- Add 1s cooldown to fire-and-forget sync in pipelined consensus loop
- Cache ConnectedPeers/ConnectedCount in PeerManager (O(1) vs O(n) LINQ)
- Cap RocksDB background threads to prevent CPU saturation on small VMs
- Add SpinWait to CAS loops in PeerInfo and ResourceLimiter

Memory fixes:
- Clear BlockData + signature lists in CleanupFinalizedRounds
- Remove .ToArray() from _proposalsByView cleanup (O(n) alloc per block)
- Reduce Episub seen-message cache 200K→50K, TTL 120s→60s
- Add proactive 30s cleanup timer to Episub (was only reactive on overflow)
- FlatStateDb: add ClearDirtyTracking() and CompactDeletedSets() to
  prevent _dirtyStorageKeys, _dirtyAccounts, _deletedStorage, and
  _deletedAccounts from growing unboundedly on the canonical instance.
  Called after each block finalization in BlockApplier.

- TrieStateDb: cache ComputeStateRoot() result and skip recomputation
  when no writes have occurred since last call. Clear _storageTries
  after flushing storage roots to release accumulated trie node refs.

- BlockBuilder: early return in ApplyDexSettlement when pool count is 0,
  avoiding TWAP/order operations and allocations on every empty block.

- Mempool: early return with static empty list when no pending txs or
  DEX intents, avoiding container allocations on every block proposal.

- WebSocketHandler: skip broadcast serialization when no clients are
  connected.

- RocksDbStore: reduce write buffer sizes (~1.3GB → ~320MB total) and
  scale compaction threads to core count instead of hardcoded 2+1.
…scade

Fix COMMIT aggregate bitmap race condition that caused systematic
inactivity slashing of validators. When COMMIT quorum (3/4) was reached,
the aggregate was built immediately — the 4th validator's vote, being
processed on a concurrent TCP thread, was excluded from the bitmap.
Over thousands of blocks, the same validators consistently lost the
race due to thread scheduling, dropping below the 50% inactivity
threshold and getting slashed every epoch until deactivated.

- Re-aggregate COMMIT QC when late votes arrive after quorum
- Accept updated COMMIT aggregates on non-leader nodes
- Handle bitmap-only updates for already-finalized blocks
- Re-activate validators when stake recovers above minimum
- Guard block proposals with validator set membership check
- Add storage cache eviction at 500K entries (was unbounded)
- Clean dirty/deleted tracking on sync path (was consensus-only)
- Call PruneInactivePeers in ReconnectLoop (was never called)
- Drain unbonding queue per finalized block (was never called)
- Cap slashing history at 10K entries (was unbounded)
- Throttle RPC sync polling with 500ms minimum delay
The root cause of validator set divergence (and permanent chain halt) is
that consensus starts before all P2P handshakes complete. Leaders whose
validator set still has placeholder BLS keys silently drop votes from
unresolved peers — no log, no warning. Different leaders have different
placeholder states (depending on Docker container startup order), so
different blocks end up with different commit bitmaps. At epoch boundary,
each node computes different inactivity slashing, producing divergent
validator sets that can never reconverge.

- Wait up to 15s for all validator identities to resolve via handshake
  before entering the consensus loop
- Add warning log when votes are dropped due to BLS key mismatch
  (previously completely silent — made diagnosis impossible)
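The identity wait reduces to polling with a deadline. A minimal Python sketch of that shape; the real code waits on P2P handshake resolution in C#, and all names here are assumptions:

```python
import time

def wait_for_validator_identities(resolved_count, expected,
                                  timeout_s=15.0, poll_s=0.1):
    """Sketch of the 15s identity wait: poll until every validator's BLS key
    has resolved via handshake, or give up at the deadline so consensus can
    start anyway (with the gap logged by the caller)."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if resolved_count() >= expected:
            return True
        time.sleep(poll_s)
    return False
```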
…hains

Block announce messages from peers trigger TrySyncFromPeers even for
1-block gaps. With 3 peers broadcasting every finalized block, this
causes ~3 Fork+ComputeStateRoot cycles per block per node — each
involving expensive Merkle trie rehashing, TCP round-trips, and fork
state creation. By the time sync fetches the block, consensus has
already finalized it, so the work is entirely wasted.

On a 4-validator devnet, this wastes ~12 unnecessary state root
computations per block, saturating CPU at 100% on idle chains.

- Only trigger sync from block announces when gap > 1 block
- Only trigger sync from OnBehindDetected when gap > 1 block
- 1-block gaps resolve naturally via consensus (~1s latency)
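The gating rule itself is a one-liner: sync only when the announced height is more than one block ahead. A Python sketch with illustrative names:

```python
def should_sync(local_height: int, announced_height: int) -> bool:
    """Sketch of the announce gating: a 1-block gap is left to resolve
    through normal consensus (~1s), so only larger gaps trigger the
    expensive TrySyncFromPeers path."""
    return announced_height - local_height > 1
```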
BroadcastConsensusMessage serialized the message once for dedup hashing,
then BroadcastToAll serialized it again for sending. With 3 aggregate QCs
per block × 4 validators, this wasted 12 serializations per block — each
involving MessageCodec encoding and byte[] allocation.

- Add BroadcastRawToAll that takes pre-serialized bytes
- Reuse the dedup-serialized bytes for the actual broadcast
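The serialize-once pattern is: encode, hash the encoded bytes for dedup, then broadcast those same bytes. A Python model with illustrative names; the real code uses the project's MessageCodec in C#, and the hash function here (blake2b) is just a stand-in:

```python
import hashlib

def broadcast_consensus_message(encode, msg, seen: set, send_raw):
    """Sketch of the single-serialization fix: one encode feeds both the
    dedup hash and the raw broadcast (the BroadcastRawToAll analogue)."""
    raw = encode(msg)                       # serialize exactly once
    digest = hashlib.blake2b(raw).digest()  # dedup hash over the same bytes
    if digest in seen:
        return False
    seen.add(digest)
    send_raw(raw)                           # no second serialization
    return True
```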
…pplied

When consensus finalizes a block before the sync response arrives,
ApplyBatch still called stateDbRef.Fork() — which triggers
TrieStateDb.ComputeStateRoot() with full Merkle trie rehashing — only
to discover all blocks are already applied and discard the fork.

On a 4-validator idle chain, this wasted ~3 state root computations per
block per node (one per sync attempt from each peer).

- Add fast path in ApplyBatch: check if all blocks are already in the
  chain before forking. If so, just record epoch data and return.
- Add early-exit in Mempool.PruneStale when mempool is empty, avoiding
  lock acquisition, list allocation, and trie reads on every block.
TrieStateDb._dirtyStorageKeys and _dirtyAccounts were NEVER cleared.
The interface default ClearDirtyTracking() was a no-op, and FlatStateDb
only cleared its own sets without forwarding to the inner TrieStateDb.

Every block added entries: SetStorage → _dirtyStorageKeys.Add(),
SetAccount → _dirtyAccounts.Add(). Over hours, these HashSets grew
unboundedly (3,600+ entries/hour on idle chains), creating GC pressure
that escalated to 100% CPU as Gen 2 collections became frequent.

- Add ClearDirtyTracking() override to TrieStateDb
- Forward FlatStateDb.ClearDirtyTracking() to inner TrieStateDb
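The forwarding fix is the essential point: the outer layer's cleanup must delegate to the inner layer it wraps. A Python model of the two-layer shape, with names loosely mirroring the PR but otherwise illustrative:

```python
class TrieStateDbModel:
    """Inner layer: tracks dirty keys so state-root computation knows what changed."""
    def __init__(self):
        self.dirty_storage_keys: set = set()
        self.dirty_accounts: set = set()

    def set_storage(self, key, value):
        self.dirty_storage_keys.add(key)

    def clear_dirty_tracking(self):
        self.dirty_storage_keys.clear()
        self.dirty_accounts.clear()

class FlatStateDbModel:
    """Outer layer. The bug: cleanup cleared only its own sets and never
    forwarded to the wrapped inner trie, so the inner sets grew forever."""
    def __init__(self, inner: TrieStateDbModel):
        self.inner = inner
        self.dirty: set = set()

    def clear_dirty_tracking(self):
        self.dirty.clear()
        self.inner.clear_dirty_tracking()  # the fix: forward to the inner layer
```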
RocksDB native memory grew ~50 MB/hour because no block cache limit was
set. Each column family created its own unbounded cache, and as SST files
accumulated (trie nodes are never pruned), index/filter blocks consumed
increasing amounts of native memory outside the .NET heap.

- Add shared 64 MB LRU block cache across all 8 column families
- Enable CacheIndexAndFilterBlocks to keep index/filter in the bounded
  cache instead of unbounded native allocations
- Add delay in ReconnectLoop and AcceptLoop error handlers to prevent
  tight spin on persistent socket errors
- Add catch in EpisubService timer callback to prevent unhandled
  ThreadPool exception from terminating the process

Copilot AI left a comment


Pull request overview

This PR addresses multiple validator-reported issues causing consensus divergence/halts, idle-chain CPU saturation, and unbounded memory growth by improving COMMIT QC aggregation behavior, reducing unnecessary expensive state work, and adding cache/collection bounds plus cleanup paths across node/network/storage components.

Changes:

  • Fixes consensus stability around COMMIT bitmap races (re-aggregating/accepting updated COMMIT QCs, waiting for validator identities before consensus start, and guarding proposal eligibility).
  • Reduces idle-chain CPU load by avoiding redundant sync/fork/state-root work and cutting repeated allocations/serialization in gossip, peer management, and block production paths.
  • Bounds memory growth via RocksDB cache configuration, state DB dirty/tombstone compaction, gossip cache caps + cleanup, and other long-lived collection limits.

Reviewed changes

Copilot reviewed 23 out of 23 changed files in this pull request and generated 6 comments.

Show a summary per file
File Description
tests/Basalt.Consensus.Tests/BasaltBftTests.cs Updates consensus test expectations to allow updated COMMIT aggregates.
src/storage/Basalt.Storage/TrieStateDb.cs Adds cached state root + clears storage trie cache; introduces dirty tracking cleanup.
src/storage/Basalt.Storage/Trie/MerklePatriciaTrie.cs Skips rehash/write when leaf value is unchanged.
src/storage/Basalt.Storage/StateDbRef.cs Forwards new state-db maintenance APIs (dirty tracking / deletion compaction).
src/storage/Basalt.Storage/RocksDb/RocksDbStore.cs Tunes RocksDB options and introduces shared LRU block cache.
src/storage/Basalt.Storage/IStateDatabase.cs Adds default no-op APIs for clearing dirty tracking and compacting deletion sets.
src/storage/Basalt.Storage/FlatStateDb.cs Adds eviction/cleanup paths to bound caches and tracking sets.
src/node/Basalt.Node/NodeCoordinator.cs Adds identity-wait before consensus, sync cooldowns, >1-block gap sync gating, proposal membership guard, proposal pruning iteration optimizations, peer pruning, and updated-bitmap handling.
src/node/Basalt.Node/BlockSyncService.cs Adds minimum polling delay to reduce sync pressure.
src/node/Basalt.Node/BlockApplier.cs Adds canonical/fork cleanup hooks, sync fast-path for already-applied blocks, and unbonding processing.
src/network/Basalt.Network/Transport/TcpTransport.cs Adds delay on accept-loop socket errors to avoid tight spins.
src/network/Basalt.Network/PeerManager.cs Adds cached connected-peer snapshot + count to reduce repeated LINQ/allocation overhead.
src/network/Basalt.Network/PeerInfo.cs Adds SpinWait in CAS loop to reduce CPU burn under contention.
src/network/Basalt.Network/GossipService.cs Avoids double-serialization by broadcasting pre-serialized bytes.
src/network/Basalt.Network/Gossip/EpisubService.cs Tightens caps/TTLs and adds proactive timer-based cleanup plus tier broadcast optimizations.
src/execution/Basalt.Execution/VM/Sandbox/ResourceLimiter.cs Adds SpinWait in CAS loop to reduce CPU burn under contention.
src/execution/Basalt.Execution/Mempool.cs Adds empty fast paths for pending retrieval and pruning.
src/execution/Basalt.Execution/BlockBuilder.cs Skips DEX work when no pools exist and reuses DEX state where possible.
src/consensus/Basalt.Consensus/Staking/StakingState.cs Re-activates validators when stake recovers above minimum.
src/consensus/Basalt.Consensus/Staking/SlashingEngine.cs Caps in-memory slashing history.
src/consensus/Basalt.Consensus/PipelinedConsensus.cs Caches BLS public key and cleans up finalized rounds to release memory sooner.
src/consensus/Basalt.Consensus/BasaltBft.cs Adds COMMIT bitmap update logic, caches BLS public key, and logs identity/key mismatches.
src/api/Basalt.Api.Rest/WebSocketEndpoint.cs Skips per-block WS broadcast allocations when no clients are connected.


Comment on lines +571 to +580
private static readonly List<TransactionReceipt> s_emptyReceipts = [];

public List<TransactionReceipt> ApplyDexSettlement(IStateDatabase stateDb, BlockHeader blockHeader)
{
    var allReceipts = new List<TransactionReceipt>();
    // Fast path: skip all DEX operations when no pools exist.
    // On an idle chain this avoids allocating DexState objects, reading pool state,
    // and running TWAP/order logic on every block.
    var poolCheck = new DexState(stateDb);
    if (poolCheck.GetPoolCount() == 0)
        return s_emptyReceipts;

Copilot AI Mar 28, 2026


Returning a shared mutable list (s_emptyReceipts) is risky for the same reason as the mempool empty-list fast path: if any caller mutates the returned List, it will affect all other callers. Consider returning a new empty List each time, or change the return type to IReadOnlyList and return Array.Empty()/[] for empty.

Comment on lines +55 to +66
// Shared LRU block cache across all column families.
// Without this, RocksDB creates an unbounded per-CF cache whose native memory
// grows proportionally to on-disk data (SST index/filter blocks, data blocks).
var blockCache = RocksDbSharp.Native.Instance.rocksdb_cache_create_lru(new UIntPtr(64 * 1024 * 1024)); // 64 MB shared

var defaultTableOptions = new BlockBasedTableOptions()
    .SetBlockCache(blockCache)
    .SetCacheIndexAndFilterBlocks(true);

var defaultOptions = new ColumnFamilyOptions()
    .SetBlockBasedTableFactory(defaultTableOptions);


Copilot AI Mar 28, 2026


The LRU block cache is created via rocksdb_cache_create_lru but never destroyed in Dispose(). Since this is native memory, repeated RocksDbStore instances (tests, multiple nodes in-process, etc.) can leak until process exit. Consider storing the cache handle in a field and calling rocksdb_cache_destroy in Dispose(), or using a RocksDbSharp-managed cache wrapper if available.

Comment on lines +69 to +83
// Proactive TTL cleanup every 30 seconds — prevents unbounded cache growth
// even when message rate stays below the reactive cap threshold.
_cleanupTimer = new Timer(_ =>
{
    if (Interlocked.CompareExchange(ref _cleanupRunning, 1, 0) != 0)
        return;
    try { CleanupSeenMessages(); }
    catch (Exception ex)
    {
        // Swallow to prevent process termination from unhandled ThreadPool exception.
        // Timer will retry in 30 seconds.
        _logger.LogWarning(ex, "Error during Episub message cache cleanup");
    }
    finally { Interlocked.Exchange(ref _cleanupRunning, 0); }
}, null, TimeSpan.FromSeconds(30), TimeSpan.FromSeconds(30));

Copilot AI Mar 28, 2026


_cleanupTimer is created but never disposed/stopped. Since NodeCoordinator.StopAsync doesn’t dispose EpisubService, this timer will keep the service (and its dictionaries/logger) alive and continue running after shutdown, which can cause background work after stop and resource leaks. Consider implementing IDisposable/IAsyncDisposable on EpisubService to dispose the timer, and have NodeCoordinator.StopAsync dispose EpisubService (or expose a Stop method).

// the commit bitmap. Different leaders may have different placeholder states, leading
// to divergent inactivity slashing calculations at epoch boundaries — a fatal
// consistency violation that permanently splits the network.
var expectedPeers = _config.Peers.Length;

Copilot AI Mar 28, 2026


expectedPeers is declared but never used, which should trigger a compiler warning and adds noise to the startup logic. Remove it, or use it in the identity-resolution logging/threshold if that was the intent.

Suggested change
var expectedPeers = _config.Peers.Length;

Comment on lines +177 to +186
private static readonly List<Transaction> s_emptyTxList = [];

public List<Transaction> GetPending(int maxCount, IStateDatabase? stateDb = null)
{
    lock (_lock)
    {
        // Fast path: empty mempool — avoid all allocations
        if (_orderedEntries.Count == 0)
            return s_emptyTxList;


Copilot AI Mar 28, 2026


Returning a shared mutable List (s_emptyTxList) is unsafe: callers can modify it (Add/Clear/etc.) and that mutation will be observed by all future callers, potentially corrupting mempool behavior across threads. Prefer returning a new empty List each time, or switch the API to return IReadOnlyList/IReadOnlyCollection and return Array.Empty()/[] for the empty case.

Comment on lines +388 to +391
// Fast path: skip when mempool is empty (common on idle chains).
// Avoids lock acquisition, list allocation, and trie reads.
if (_transactions.Count == 0 && _dexIntentTransactions.Count == 0)
return 0;

Copilot AI Mar 28, 2026


This "fast path" reads _transactions.Count / _dexIntentTransactions.Count outside the _lock, but those dictionaries are mutated under the same lock elsewhere. Concurrent reads/writes on Dictionary are not thread-safe and can lead to data races or undefined behavior. Move the emptiness check inside the existing lock (or track counts via atomic fields) to keep the mempool’s synchronization consistent.

…e state

- Remove shared mutable static empty lists in BlockBuilder and Mempool
  that could be corrupted if any caller mutated the returned List
- Store RocksDB LRU block cache handle and destroy in Dispose() to
  prevent native memory leak across store instances
- Make EpisubService IDisposable to stop cleanup timer on shutdown,
  dispose from NodeCoordinator.StopAsync
- Move Mempool.PruneStale emptiness check inside lock to fix data race
  on Dictionary.Count reads concurrent with mutations
- Remove unused expectedPeers variable in RunConsensusLoop
@0xZunia 0xZunia merged commit 3b06236 into main Mar 29, 2026
5 checks passed
