Skip to content

(WIP) feat: backfill sync v2#9238

Draft
jeffoodchain wants to merge 10 commits intoChainSafe:unstablefrom
jeffoodchain:feat/backfill-v2
Draft

(WIP) feat: backfill sync v2#9238
jeffoodchain wants to merge 10 commits intoChainSafe:unstablefrom
jeffoodchain:feat/backfill-v2

Conversation

@jeffoodchain
Copy link
Copy Markdown

@jeffoodchain jeffoodchain commented Apr 20, 2026

Motivation

reconstructing backfill sync feature in lodestar.

Description

Block

DB schema related changes

  • a new BackfillStateRepository: per-epoch state with fields
    • hasBlock
    • hasBlob
    • columnIndices
  • turned BackfilledRanges from repository -> singleton since it's always a single value, never a keyed collection.

backfillV2.ts

  • Walks backward from the anchor block by parentRoot, one block per request. (This might seems not efficient but IMO backfill sync has the low priority so making the thread light is important.)
  • Buffers blocks per-epoch and flushes on epoch boundaries (handles skipped
    slots at boundaries correctly)
  • Stops when anchorRoot === ZERO_HASH (i.e. we've walked past the genesis
    block's parent) (TODO: changing to MIN_EPOCHS_FOR_BLOCK_REQUESTS as default for block, MIN_EPOCHS_FOR_BLOB_SIDECARS_REQUESTS for blobs)
  • Tracks peer failures locally; disconnects peers after repeated failures
  • Resumes from a previous session via db.backfilledRange

DB schema refactor:

  • New BackfillStateRepository (backfill_state bucket): per-epoch state
    (hasBlock, hasBlobs, columnIndices) so future work can fill blobs/
    columns independently of blocks.
  • backfilledRange moved from Repositorysingle/ (singleton pattern)
    since it's always a single value, never a keyed collection.
  • Added mock types in mockedBeaconDb for the new/changed entries.

Blobs

haven't done anything related to backfill blobs yet.

tests

Unit tests in test/unit/sync/backfill/backfillV2.test.ts cover:

  • Walking a mainnet fixture (slot 4 -> genesis; ) by root
  • Completing cleanly at genesis
  • Epoch-boundary flush ordering
  • Skipped-slot handling at epoch boundaries

AI Assistance Disclosure

This PR was written primarily by Claude Code. I consulted Claude Code to understand the codebase, but the solution was fully authored manually by myself.

TODOs

  • adding blobs syncing
  • adding flag for genesis block sync

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a new backfill synchronization implementation (BackfillSyncV2) and updates the database schema to track backfill progress using epochs. Key changes include the addition of BackfillStateRepository and the BackfilledRange singleton, while the legacy BackfillSync and BackfilledRanges are being deprecated or disabled. Feedback highlights a breaking database schema change due to bucket repurposing, the presence of non-functional legacy code, and performance concerns regarding block-by-block network fetching and sequential database lookups in the new implementation. Additionally, starting the sync process within the constructor is noted as a testing and control issue.

// index_lightClientInitProof = 36, // DEPRECATED on v0.32.0

backfilled_ranges = 42, // Backfilled From to To, inclusive of both From, To
backfill_state = 42, // Epoch -> EpochBackfillState
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

Repurposing bucket ID 42 from backfilled_ranges (which stored Slot -> Slot) to backfill_state (which stores Epoch -> BackfillStateWrapper) is a breaking change for the database schema. Existing databases will contain data in the old format, which will cause deserialization errors or logic bugs when accessed by the new code. A database migration should be provided, or a new bucket ID should be allocated for the new backfill state.

Comment on lines +245 to +246
const backfillRangeWrittenSlot = null as number | null;
const previousBackfilledRanges = [] as {key: number; value: number}[];
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The legacy BackfillSync implementation is being broken here by hardcoding backfillRangeWrittenSlot to null and previousBackfilledRanges to an empty array. If this class is intended to be deprecated and replaced by BackfillSyncV2, it should be formally deprecated or removed. Leaving it in a non-functional state with TODO comments can lead to confusion and bugs if it's still reachable in certain configurations.

Comment on lines +20 to +43
// try {
// const finalizedBlockFC = chain.forkChoice.getBlockHexDefaultStatus(finalized.rootHex);
// if (finalizedBlockFC && finalizedBlockFC.slot > chain.anchorStateLatestBlockSlot) {
// await db.backfilledRanges.put(finalizedBlockFC.slot, chain.anchorStateLatestBlockSlot);
//
// const filteredSeqs = await db.backfilledRanges.entries({
// gt: chain.anchorStateLatestBlockSlot,
// lt: finalizedBlockFC.slot,
// });
// logger.debug("updated backfilledRanges", {
// key: finalizedBlockFC.slot,
// value: chain.anchorStateLatestBlockSlot,
// });
// if (filteredSeqs.length > 0) {
// await db.backfilledRanges.batchDelete(filteredSeqs.map((entry) => entry.key));
// logger.debug(
// `Forward Sync - cleaned up backfilledRanges between ${finalizedBlockFC.slot},${chain.anchorStateLatestBlockSlot}`,
// {seqs: JSON.stringify(filteredSeqs)}
// );
// }
// }
// } catch (e) {
// logger.error("Error updating backfilledRanges on finalization", {epoch: finalized.epoch}, e as Error);
// }
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The logic for updating the backfill range during finalization has been commented out. This effectively disables progress tracking for backfill sync during normal node operation. If this logic is being replaced by the new backfillState and backfilledRange singleton, it should be updated to use them instead of being left as commented-out code.

Comment on lines +108 to +123
this.sync()
.then((oldestSlotSynced) => {
if (this.status !== BackfillSyncStatus.completed) {
throw new ErrorAborted(`Invalid BackfillSyncStatus at completion: status = ${this.status}`);
}
this.emit(BackfillSyncEvent.completed, oldestSlotSynced);
this.logger.info("BackfillSync completed", {oldestSlotSynced});
this.close();
})
.catch((e) => {
if (!(e instanceof ErrorAborted)) {
this.logger.error("BackfillSync processor error", e);
}
this.status = BackfillSyncStatus.aborted;
this.close();
});
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Starting a long-running asynchronous process like sync() directly in the constructor is generally considered a bad practice. It makes the object difficult to test, as the sync process starts immediately upon instantiation, and it doesn't allow the caller to control when the synchronization begins. Consider moving the call to this.sync() to an explicit start() method.

}
try {
// how could they fetch a block via root?
const [block] = await this.network.sendBeaconBlocksByRoot(peer, [this.anchorRoot]);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Fetching blocks one by one using sendBeaconBlocksByRoot is highly inefficient due to network round-trip times (RTT). Even if backfill sync is considered low priority, fetching a single block per request will make the process extremely slow for a long chain. Consider fetching blocks in batches using sendBeaconBlocksByRange or requesting multiple roots at once if the protocol supports it.

Comment on lines +330 to +333
while (epoch > 0) {
const state = await this.db.backfillState.get(epoch);
if (!state?.hasBlock) break;
epoch--;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

This while loop performs sequential database lookups for every epoch. If a node has already backfilled a large number of epochs, this loop could perform thousands of asynchronous DB calls, potentially blocking the event loop and causing performance issues. It would be more efficient to use a reverse stream or iterator (e.g., this.db.backfillState.keysStream({lt: startEpoch, reverse: true})) to find the first gap in the backfilled epochs.

@jeffoodchain jeffoodchain changed the title Feat/backfill v2 feat: backfill sync v2 Apr 20, 2026
- Introduced comprehensive tests for the backfillV2 synchronization process, covering scenarios such as chain walking, flushing blocks at epoch boundaries, and handling skipped slots.
- Added a check for the aborted signal in the BackfillSync class to ensure proper termination of the sync process when the signal is triggered.
@jeffoodchain jeffoodchain changed the title feat: backfill sync v2 (WIP) feat: backfill sync v2 Apr 20, 2026
I found the peer score should be not that harsh because some behaviors are legitimate according to the spec.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant