trie/archiver: streaming archival + pathdb journal consistency #567

Merged
gballet merged 4 commits into gballet:archival-command from tellabg:archival-streaming-fix on Mar 11, 2026

tellabg commented Feb 22, 2026

Summary

Two fixes for archive generate:

1. Streaming subtree archival (fix OOM)

The previous recursive approach loaded the entire trie into memory before archiving any subtree. For large storage tries (contracts with millions of slots), this caused OOM kills.

Fix: NodeIterator-based streaming approach:

  • probeHeight(): bounded raw DB reads, discards decoded nodes
  • collectSubtree(): materializes only the bounded height-3 subtree
  • Memory: O(iterator_stack + current_subtree) vs O(entire_trie)
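The bounded probe can be sketched as follows. This is a minimal illustration, not the code in trie/archiver.go: the `node` and `store` types, the string keys, and the convention of returning `limit+1` for "too tall" are all assumptions made for the example. The point it shows is that the probe stops descending once the bound is reached and keeps decoded nodes only on the call stack.

```go
package main

import "fmt"

// node is a hypothetical stand-in for a decoded trie node.
// A node without children is treated as a leaf.
type node struct {
	children []string // keys of the child nodes in the raw store
}

// store is a hypothetical raw key-value database.
type store map[string]node

const maxHeight = 3

// probeHeight returns min(height, limit+1) for the subtree rooted at key.
// Nodes are read, inspected, and dropped; nothing is retained after return.
func probeHeight(db store, key string, limit int) int {
	n, ok := db[key]
	if !ok || len(n.children) == 0 {
		return 0 // leaf (or missing node): height 0
	}
	if limit == 0 {
		return 1 // children exist below the bound: report "too tall"
	}
	h := 0
	for _, c := range n.children {
		if ch := probeHeight(db, c, limit-1); ch > h {
			h = ch
		}
		if h >= limit {
			break // already at the maximum reportable height
		}
	}
	return h + 1
}

func main() {
	// A chain a -> b -> c -> d -> e, so "a" has height 4 and "b" height 3.
	db := store{
		"a": {children: []string{"b"}},
		"b": {children: []string{"c"}},
		"c": {children: []string{"d"}},
		"d": {children: []string{"e"}},
		"e": {},
	}
	fmt.Println(probeHeight(db, "a", maxHeight)) // exceeds the bound
	fmt.Println(probeHeight(db, "b", maxHeight)) // exactly at the bound
	fmt.Println(probeHeight(db, "e", maxHeight)) // leaf
}
```

Because the recursion depth is capped by `limit`, the work per probe is bounded by the size of a height-3 subtree regardless of how large the full trie is.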

Result: RSS stays stable at 2 MB, where the previous approach was OOM-killed on a 22 GB machine. Generation takes ~3h on Hoodi.

2. Flush diff layers + re-journal (fix block import)

After archiving, geth import failed with:

  • unknown ancestor: journal deleted, chain head lost
  • out-order append: state history freezer gaps
  • triedb layer missing: re-journal used stale root

Fix:

  • Flush diff layers via Commit() before archiving
  • Re-open fresh triedb after archiving for clean journal
  • No journal deletion, no DisableStateHistory()
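One way to picture why the flush has to come first is the toy model below. It is not pathdb: `layeredDB`, `Update`, `Commit`, and `rawScan` are hypothetical names for this sketch, and it assumes the archiver only sees what is in the persistent layer. Under that assumption, anything still sitting in an in-memory diff layer is invisible to a raw scan until it is flattened to disk.

```go
package main

import "fmt"

// layeredDB is a hypothetical toy model: one persistent disk layer
// plus in-memory diff layers stacked on top of it, oldest first.
type layeredDB struct {
	disk  map[string]string
	diffs []map[string]string
}

// Update stacks a new in-memory diff layer; the disk layer is untouched.
func (db *layeredDB) Update(changes map[string]string) {
	db.diffs = append(db.diffs, changes)
}

// Commit flattens every diff layer into the disk layer.
// Layers are applied oldest first so that newer writes win.
func (db *layeredDB) Commit() {
	for _, d := range db.diffs {
		for k, v := range d {
			db.disk[k] = v
		}
	}
	db.diffs = nil
}

// rawScan stands in for a pass that reads only the disk layer
// and returns how many entries it can see.
func rawScan(db *layeredDB) int {
	return len(db.disk)
}

func main() {
	db := &layeredDB{disk: map[string]string{"a": "1"}}
	db.Update(map[string]string{"b": "2"})
	db.Update(map[string]string{"a": "3", "c": "4"})

	fmt.Println("visible before Commit:", rawScan(db))
	db.Commit()
	fmt.Println("visible after Commit:", rawScan(db), "pending diffs:", len(db.diffs))
}
```

In this model the disk layer is the only source of truth once `Commit` has run, which is the state the second bullet relies on when a fresh database handle is opened afterwards.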

Result: 1020 blocks imported successfully after archiving (~265 Mgas/s).

Test results (Hoodi, 22 GB RAM)

  • Generate: 105,292 subtrees, 221,455 leaves, 15 MB archive, 3h16m, RSS 2 MB
  • Import: 1020 blocks in 2m04s, expired nodes resolved from archive

Replace the recursive approach that loaded the entire trie into memory
with a streaming NodeIterator-based approach:

- processTrie now uses NodeIterator to walk the trie node-by-node
- probeHeight reads nodes from raw DB, computes height bounded at 3,
  and discards decoded nodes immediately (no in-memory trie buildup)
- collectSubtree only materializes the bounded height-3 subtree being
  archived (at most ~4096 nodes)
- Memory usage: O(iterator_stack) + O(current_subtree) instead of
  O(entire_trie)
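The walk described above can be sketched roughly like this. It is an illustration only: it uses an explicit stack instead of geth's NodeIterator, and the `node` and `store` types, the `archive` callback, and the choice to leave subtrees shorter than the bound untouched are assumptions for the example, not taken from the patch.

```go
package main

import "fmt"

// Hypothetical stand-ins for decoded nodes and the raw node database.
type node struct{ children []string }
type store map[string]node

// boundedHeight returns min(height, limit+1) of the subtree at key.
func boundedHeight(db store, key string, limit int) int {
	kids := db[key].children
	if len(kids) == 0 {
		return 0
	}
	if limit == 0 {
		return 1
	}
	h := 0
	for _, c := range kids {
		if ch := boundedHeight(db, c, limit-1); ch > h {
			h = ch
		}
	}
	return h + 1
}

// collectSubtree materializes every key under root. It is only called
// on subtrees whose height equals the bound, so the result stays small.
func collectSubtree(db store, root string) []string {
	keys := []string{root}
	for _, c := range db[root].children {
		keys = append(keys, collectSubtree(db, c)...)
	}
	return keys
}

// processTrie walks from root and archives each height-limit subtree as
// soon as it is found. Live memory is the stack plus one subtree.
func processTrie(db store, root string, limit int, archive func(root string, keys []string)) {
	stack := []string{root}
	for len(stack) > 0 {
		key := stack[len(stack)-1]
		stack = stack[:len(stack)-1]
		switch h := boundedHeight(db, key, limit); {
		case h == limit:
			archive(key, collectSubtree(db, key)) // archive now, do not descend
		case h > limit:
			stack = append(stack, db[key].children...) // too tall: descend
		}
		// h < limit: left untouched in this sketch (assumption).
	}
}

func main() {
	db := store{
		"r":  {children: []string{"x", "y"}},
		"x":  {children: []string{"x1"}},
		"x1": {children: []string{"x2"}},
		"x2": {children: []string{"x3"}},
		"x3": {},
		"y":  {},
	}
	processTrie(db, "r", 3, func(root string, keys []string) {
		fmt.Println("archive", root, "with", len(keys), "nodes")
	})
}
```

Handing each subtree to `archive` before moving on is what keeps the working set at one subtree rather than the list of all subtrees.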

This fixes OOM kills on large storage tries (e.g. contracts with
millions of storage slots) where the previous approach would load
all nodes and subtreeInfo into memory before archiving any of them.

Instead of deleting the pathdb journal after archive generation (which
breaks the chain head and prevents block imports), properly integrate
with pathdb:

1. Open triedb with pathdb support (not just raw KV)
2. Disable state history freezer (avoid append gaps)
3. Flush all diff layers to disk via Commit() before archiving
4. After archiving, re-journal the pathdb state (disk layer only)

This ensures geth can restart cleanly after archiving and continue
importing blocks without 'unknown ancestor' errors.
tellabg requested a review from gballet as a code owner February 22, 2026 13:22
tellabg force-pushed the archival-streaming-fix branch from 2e4cbf9 to 2dce6ab February 26, 2026 20:29
tellabg force-pushed the archival-streaming-fix branch from 2dce6ab to 8447505 February 26, 2026 20:34
Comment thread trie/archiver.go
    // height == 3: collect and archive this subtree immediately.
    info := a.collectSubtree(owner, path, hash)
    if info == nil {
        continue
Owner

kinda weird that info is nil and we just silently skip it

gballet merged commit a37814d into gballet:archival-command Mar 11, 2026

2 participants