
perf(l1): move SST files into temp DB during snap sync ingest#6532

Open
iovoid wants to merge 2 commits into main from perf/snap-sync-move-files-ingest

Conversation

@iovoid
Contributor

@iovoid iovoid commented Apr 27, 2026

Motivation

During storage insertion in snap sync, we use substantially more disk than necessary: SST files are copied into the temporary ingestion database, so the leaf data exists on disk twice at peak.

Description

Uses RocksDB move-based ingestion (set_move_files(true)) so SST files are renamed into the temp DB instead of copied, reducing peak disk usage.

@iovoid iovoid requested a review from a team as a code owner April 27, 2026 12:28
@github-actions github-actions Bot added L1 Ethereum client performance Block execution throughput and performance in general labels Apr 27, 2026
@github-actions

🤖 Kimi Code Review

The changes look correct and represent a sensible performance optimization for snap sync operations.

Summary

  • File: crates/networking/p2p/sync/snap_sync.rs
  • Change: Replacing SST file copying with atomic moves during account and storage ingestion

Review Points

  1. Performance: Using set_move_files(true) eliminates redundant disk I/O by avoiding copies of large SST files during ingestion. This is significant for snap sync where datasets can be hundreds of GB.

  2. Safety: The approach is safe because:

    • The comment correctly documents the prerequisite (same datadir/filesystem) for atomic rename to work
    • Both insert_accounts (line 1199) and insert_storages (line 1309) operate on temporary ingestion databases where the SST files are ephemeral
    • If ingestion fails, RocksDB maintains consistency and the files remain in their original location
  3. Error Handling: Properly preserved - the map_err conversion to SyncError::RocksDBError remains intact in both locations.

  4. Code Quality: The comments explaining why this is safe (same datadir assumption) are valuable for future maintainers.

Minor Suggestion
Consider adding a defensive debug assertion or log warning if the ingest operation fails with a cross-device error, which would indicate the same-datadir assumption was violated:

db.ingest_external_file_opts(&ingest_opts, file_paths)
    .map_err(|err| {
        // Log filesystem boundary violations for debugging
        if err.to_string().contains("cross-device") {
            warn!("SST ingest failed: snapshot and temp DB not on same filesystem");
        }
        SyncError::RocksDBError(err.into_string())
    })?;

Otherwise, LGTM - this is a straightforward disk space and I/O optimization.


Automated review by Kimi (Moonshot AI) · kimi-k2.5 · custom prompt

@github-actions

🤖 Claude Code Review

Here is my review of PR #6532:


Review: perf(l1): move SST files into temp DB during snap sync ingest

Summary

A minimal, focused change: swaps ingest_external_file for ingest_external_file_opts with set_move_files(true) in both insert_accounts and insert_storages. The intent is to use a rename(2) syscall instead of a full file copy, halving peak disk usage during snap sync ingestion.


Correctness & Safety Concerns

Filesystem assumption is load-bearing but unenforced

The comment states: "the snapshot dir and the temp DB live under the same datadir, so rename succeeds". This is a runtime assumption with no enforcement. If a user configures the snapshot directory on a separate mount point or filesystem from the data directory, rename(2) will fail with EXDEV. RocksDB's behavior on a cross-device rename depends on the version — some fall back to copy+delete, others surface the error.

This should either be validated at startup (assert same device via stat(2) / std::fs::metadata(...).st_dev()) or at minimum the assumption should be guarded somewhere visible, not just in a comment.
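A startup check along these lines could enforce the assumption. This is a minimal sketch, not the actual ethrex code: on Unix, two paths allow an atomic rename(2) between them exactly when they report the same device ID, which `std::fs::metadata` exposes via `MetadataExt::dev()`.

```rust
// Sketch of a same-filesystem startup check (Unix-only).
// The caller would pass the real snapshot dir and temp DB dir.
use std::io;
use std::os::unix::fs::MetadataExt;
use std::path::Path;

/// Returns Ok(true) if both paths live on the same device, i.e. a
/// rename(2) between them cannot fail with EXDEV.
fn same_device(a: &Path, b: &Path) -> io::Result<bool> {
    Ok(std::fs::metadata(a)?.dev() == std::fs::metadata(b)?.dev())
}

fn main() -> io::Result<()> {
    let tmp = std::env::temp_dir();
    // A directory always shares a device with itself.
    println!("{}", same_device(&tmp, &tmp)?); // prints "true"
    Ok(())
}
```

Running this once at startup (and refusing move-based ingest if it returns false) would turn a hard-to-diagnose mid-sync failure into an immediate, explicit error.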

Idempotency / retry safety regression

With the original copy semantics, a partial failure (e.g., after some files are ingested but the process crashes) leaves all source SST files intact — the sync can be retried cleanly.

With move semantics, once RocksDB renames a file into the temp DB, it no longer exists at the source path. If ingestion then fails (RocksDB error mid-batch, disk full, etc.) and the temp DB is discarded during cleanup, those SST files are gone. The sync would need to re-request that data from peers from scratch. This is likely acceptable (snap sync already handles full restarts), but the error path should be confirmed to handle this gracefully — particularly that partial temp-DB state is cleaned up and the sync restarts from a consistent point.
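The retry-safety difference above can be illustrated with plain filesystem operations (this is an analogy, not the actual RocksDB ingest path): after a copy the source survives and the operation can be repeated, after a rename the source is consumed.

```rust
// Illustration of copy vs move semantics for retry safety.
use std::fs;

fn main() -> std::io::Result<()> {
    let dir = std::env::temp_dir().join("ingest-demo");
    fs::create_dir_all(&dir)?;
    let src = dir.join("data.sst");
    fs::write(&src, b"leaf data")?;

    // Copy semantics: source remains, so a failed ingest can be retried.
    fs::copy(&src, dir.join("copied.sst"))?;
    println!("after copy, source exists: {}", src.exists()); // true

    // Move semantics: source is consumed; a retry finds nothing to ingest.
    fs::rename(&src, dir.join("moved.sst"))?;
    println!("after move, source exists: {}", src.exists()); // false

    fs::remove_dir_all(&dir)?;
    Ok(())
}
```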

let mut is correct — set_move_files takes &mut self, so the binding needs to be mutable.


Minor Points

Duplicated option construction — Both insert_accounts and insert_storages contain identical boilerplate for constructing IngestExternalFileOptions. A one-liner helper (fn move_ingest_opts() -> rocksdb::IngestExternalFileOptions) would avoid drift if options ever need to change again. This is a minor style note, not a blocker.

Error message quality — SyncError::RocksDBError(err.into_string()) loses the call-site context. Already the case before this PR, but worth noting for future observability work: if the move fails for the EXDEV reason above, the error message will be opaque.


Verdict

The optimization is sound under the stated assumption, and the performance benefit (no double-copy of potentially large leaf data) is real. The main concern is the unvalidated same-filesystem assumption — if that can break in practice, it will produce a hard-to-diagnose sync failure. Before merging, confirm:

  • That the snapshot path and DB path are guaranteed to resolve to the same device in all supported configurations.
  • That the error/cleanup path after a failed ingest properly discards any partial temp DB state so sync can restart cleanly.

If both are confirmed, the change is good to merge.


Automated review by Claude (Anthropic) · sonnet · custom prompt

@greptile-apps

greptile-apps Bot commented Apr 27, 2026

Greptile Summary

This PR switches RocksDB SST file ingestion in insert_accounts and insert_storages from copy-based (ingest_external_file) to move-based (ingest_external_file_opts with set_move_files(true)) semantics, reducing peak disk usage during snap sync by avoiding a second on-disk copy of leaf data.

  • The move is destructive: once ingest_external_file_opts succeeds, the source SST files no longer exist in the snapshot directory. If the process crashes or errors after ingest but before trie generation completes, any retry will find an empty snapshot directory and silently compute an incorrect trie. If snap sync always rebuilds the snapshot directory from scratch on restart this is benign, but that assumption is not enforced or documented in these functions.

Confidence Score: 3/5

Safe to merge only if snap sync is guaranteed to fully rebuild the snapshot directory on every restart; otherwise risks silent trie corruption on crash.

Two P1 findings (one per ingest site) sharing the same root cause: move semantics make the operation non-idempotent, so a crash between ingest and trie completion could leave snap sync in an unrecoverable state. Score is below the P1 ceiling because both affected code paths are on the critical sync route.

crates/networking/p2p/sync/snap_sync.rs — both insert_accounts and insert_storages ingest sites

Important Files Changed

Filename Overview
crates/networking/p2p/sync/snap_sync.rs Switches RocksDB SST ingest from copy to move semantics in both insert_accounts and insert_storages, reducing peak disk usage; introduces a crash-safety concern where the snapshot source files are irreversibly consumed before trie generation completes.

Sequence Diagram

sequenceDiagram
    participant SD as Snapshot Dir
    participant RDB as Temp RocksDB
    participant Trie as Trie Builder

    Note over SD,Trie: insert_accounts / insert_storages
    SD->>RDB: ingest_external_file_opts (move_files=true)
    Note over SD: SST files now GONE from snapshot dir
    RDB->>Trie: full_iterator / snapshot.raw_iterator
    Trie->>Trie: trie_from_sorted_accounts_wrap
    Note over SD,RDB: If crash here → snapshot dir empty on retry
    RDB-->>SD: drop(db)
    SD->>SD: remove_dir_all(snapshot_dir)
    RDB->>RDB: remove_dir_all(temp_db_dir)

Reviews (1): Last reviewed commit: "docs: add changelog entry for #6532"

Comment on lines +1202 to 1205
let mut ingest_opts = rocksdb::IngestExternalFileOptions::default();
ingest_opts.set_move_files(true);
db.ingest_external_file_opts(&ingest_opts, file_paths)
.map_err(|err| SyncError::RocksDBError(err.into_string()))?;

P1 Move loses idempotency on crash/retry

With set_move_files(true), the SST files are destructively consumed from the snapshot directory during ingest. If the process crashes or returns an error anywhere after the ingest call but before the function completes (e.g., a RocksDB iterator failure on line 1208, a trie-generation error on line 1217, or even an OOM), the snapshot directory will be empty on the next attempt and ingest_external_file_opts will ingest nothing — silently producing an empty/incorrect trie. With the previous copy-based ingest, the source SST files remained intact and any restart could re-ingest correctly.

If snap sync is designed to always restart from scratch (i.e., rebuilding the snapshot directory before this function is called), this is not a problem. But if partial restarts are possible, this move breaks crash safety.


Contributor Author


We do always restart from scratch, so not a problem.

Comment on lines +1312 to 1315
let mut ingest_opts = rocksdb::IngestExternalFileOptions::default();
ingest_opts.set_move_files(true);
db.ingest_external_file_opts(&ingest_opts, file_paths)
.map_err(|err| SyncError::RocksDBError(err.into_string()))?;

P1 Same issue for storage ingest crash safety

Same crash-safety concern as in insert_accounts: once ingest_external_file_opts with set_move_files(true) succeeds, the SST files are gone from account_storages_snapshots_dir. Any failure between ingest and successful completion means the data is unrecoverable from the snapshot directory on retry.


Contributor Author


We do always restart from scratch, so not a problem.

@github-actions

Lines of code report

Total lines added: 4
Total lines removed: 0
Total lines changed: 4

Detailed view
+------------------------------------------------+-------+------+
| File                                           | Lines | Diff |
+------------------------------------------------+-------+------+
| ethrex/crates/networking/p2p/sync/snap_sync.rs | 1153  | +4   |
+------------------------------------------------+-------+------+

@github-actions

🤖 Codex Code Review

No findings.

The change in snap_sync.rs looks sound. set_move_files(true) only changes how temporary SST files are ingested into the ephemeral RocksDB, and the assumption behind it is supported by the path helpers in utils.rs: both snapshot dirs and temp DB dirs are siblings under the same datadir. I don't see any EVM/consensus/gas/state-validation impact from this diff; it should just reduce peak disk usage during snap sync.
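The sibling-directory layout described above can be sketched as follows. The helper names here are illustrative placeholders, not the actual utils.rs API: the point is that both directories resolve under one datadir, so a rename between them never crosses a filesystem boundary.

```rust
// Hypothetical sketch of the datadir layout (helper names are invented).
use std::path::{Path, PathBuf};

fn account_snapshots_dir(datadir: &Path) -> PathBuf {
    datadir.join("snap_accounts")
}

fn temp_ingest_db_dir(datadir: &Path) -> PathBuf {
    datadir.join("snap_temp_db")
}

fn main() {
    let datadir = Path::new("/var/lib/ethrex");
    let snap = account_snapshots_dir(datadir);
    let temp = temp_ingest_db_dir(datadir);
    // Both are siblings under datadir: the precondition for move ingest.
    assert_eq!(snap.parent(), temp.parent());
    println!("{} and {} share a parent", snap.display(), temp.display());
}
```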

Testing gap: I couldn’t run a targeted cargo check here because rustup tried to write under /home/runner/.rustup and hit the sandbox’s read-only restriction. If CI hasn’t already, a RocksDB-enabled snap-sync path is the one thing still worth validating.


Automated review by OpenAI Codex · gpt-5.4 · custom prompt

@github-actions

Benchmark Block Execution Results Comparison Against Main

Command  Mean [s]         Min [s]  Max [s]  Relative
base     65.749 ± 0.452   65.355   66.805   1.00
head     66.257 ± 0.361   65.577   66.648   1.01 ± 0.01

