Skip to content

Implement beacon_blocks_by_head#9237

Open
dapplion wants to merge 11 commits intosigp:unstablefrom
dapplion:claude-lh-bb-by-head
Open

Implement beacon_blocks_by_head#9237
dapplion wants to merge 11 commits intosigp:unstablefrom
dapplion:claude-lh-bb-by-head

Conversation

@dapplion
Copy link
Copy Markdown
Collaborator

@dapplion dapplion commented Apr 30, 2026

Implements consensus-specs PR 5181: a new Fulu-only req/resp route /eth2/beacon_chain/req/beacon_blocks_by_head/1/. The request is (beacon_root: Hash256, count: u64); the responder walks the parent chain of beacon_root (inclusive) and emits up to min(count, MAX_REQUEST_BLOCKS_DENEB) blocks in descending slot order, one block per response_chunk (same shape as BeaconBlocksByRange v2).

Doesn't touch production paths, and we don't need coordinated deployment

AI disclosure

Collborated with opus 4.7 to write the code with heavy oversight. I have personally reviewed every single line of code multiple times, including tests

Implements consensus-specs PR 5181: a new Fulu-only req/resp route
`/eth2/beacon_chain/req/beacon_blocks_by_head/1/`. The request is
`(beacon_root: Hash256, count: u64)`; the responder walks the parent
chain of `beacon_root` (inclusive) and emits up to
`min(count, MAX_REQUEST_BLOCKS_DENEB)` blocks in descending slot
order, one block per `response_chunk` (same shape as
`BeaconBlocksByRange v2`).

Walk stops when `count` blocks have been emitted or when the parent
chain becomes locally unavailable. Returns `ResourceUnavailable` if
`beacon_root` itself is unknown.

Wired through:
- `Protocol::BlocksByHead` / `SupportedProtocol::BlocksByHeadV1`,
  registered in `currently_supported()` only when Fulu is scheduled.
- `BlocksByHeadRequest` SSZ container (40 bytes, fixed).
- Codec encode/decode with fork-context dispatch (Fulu, Gloas).
- Per-protocol rate limit (`DEFAULT_BLOCKS_BY_HEAD_QUOTA = 128/10s`).
- New `Work::BlocksByHeadRequest` async work item with its own queue,
  scheduled alongside existing block requests.
- Inbound handler `handle_blocks_by_head_request` in
  `network_beacon_processor`, wired through `router.rs`. The parent
  walk runs on a blocking thread via the dedicated
  `get_block_roots_ancestor_of_head` helper; streaming, error
  handling and `Ok(None)` semantics mirror `BlocksByRange`.

Outbound (sync-side) consumption is intentionally out of scope.

Codec round-trip and protocol-registration tests cover the new
variant.
Previously `get_block_roots_ancestor_of_head` walked the parent chain
by calling `get_blinded_block` per ancestor — N store reads just to
extract `parent_root`/`slot`. The streamer then re-fetched each block,
yielding 2× store reads per request.

Switch to:
- `fork_choice_read_lock().proto_array().iter_block_roots(&head_root)`
  for the in-memory parent walk (zero DB reads).
- `chain.forwards_iter_block_roots(start_slot)` (the freezer's
  slot→root index, no block bodies loaded) for spillover below the
  proto-array boundary; parents of finalized blocks are canonical so
  the canonical freezer iter is correct.

`chain.get_blocks(roots)` then performs the only DB load per block,
batched + pipelined via `BeaconBlockStreamer`'s
`getPayloadBodiesByRangeV1` path. Skip-slot duplicates from the
forwards iter are deduped before truncating to `remaining`.
The previous spillover computed `start_slot = oldest_slot - remaining`
and asked the freezer for that range — wrong with skip slots: a
sparsely-filled window yields fewer unique blocks than `remaining`,
so we silently returned a short response.

Switch to walking the head state's `block_roots` field (the 8192-slot
in-memory circular buffer carried in every `BeaconState`) backward
slot-by-slot, deduping adjacent duplicates produced by skip slots, and
stopping exactly when `count` blocks are collected. Zero DB reads for
the spillover — `block_roots` is already in memory on the head
snapshot.

For pathological requests whose ancestors fall outside the 8192-slot
window we simply stop walking; BlocksByHead's `count <= 128` cap means
this can't happen in practice.
Previously `get_block_roots_ancestor_of_head` returned `ResourceUnavailable`
whenever `head_root` was not in fork-choice's proto-array — including any
canonical block at or below the latest finalized checkpoint. The spec
requires us to serve at least one block if we have the block at
`beacon_root`, so this was non-compliant for any finalized root.

Now three cases are handled:
1. All ancestors in fork-choice → proto-array iter (existing path).
2. Mixed → proto-array yields the above-finalized portion, head state's
   `block_roots` bucket fills the rest (existing path).
3. `head_root` below finalized → fall back to one `get_blinded_block`
   to fetch its slot, verify it is canonical at that slot via
   `state.block_roots`, then walk the bucket for ancestors. If
   verification fails (non-canonical or outside the 8192-slot window)
   we still return the single block we found, satisfying the spec MUST.
`get_block_roots_ancestor_of_head` previously walked ancestors below the
proto-array boundary by indexing into the head state's `block_roots`
circular buffer. That bucket only spans ~8192 slots back from head, so
deeper walks were silently truncated, and using head-state lookups to
verify the canonicity of a sub-finalized `head_root` is the wrong source
of truth: it requires snapshotting and cloning the head state, and a
non-canonical hot-DB block at the same slot as a finalized canonical
block can shadow the freezer's canonical root.

Switch the spillover (and the case-2 canonicity check) to
`store.get_cold_block_root(slot)`, which reads the freezer DB's
`BeaconBlockRoots` column — the canonical slot→root index for finalized
blocks, populated for `[oldest_block_slot, split.slot)` with skip slots
reusing the prior block's root (same semantics as `state.block_roots`).

This collapses the prior three regimes into two: above-finalization is
served by proto-array (in-memory, no DB reads); at-or-below-finalization
is served by the freezer index. The head state is no longer consulted,
the walk-back window now extends all the way to `oldest_block_slot`, and
freezer holes (e.g. below `oldest_block_slot` on a checkpoint-synced
node) terminate cleanly instead of erroring.
The previous name forced the case-2 branches to read `if !from_proto_array`
— a negated check on a negated boolean. Inverting the variable lets the
branches read positively.
Drop the `roots_with_slots: Vec<(Hash256, Slot)>` accumulator and the
`head_below_finalization` boolean. Now the result is built directly as
`Vec<Hash256>`, with a `current_slot: Slot` scalar tracking where the
freezer walk picks up. The case-2 fallback (head_root at/below
finalization) does its canonicity check inline against the freezer index
before falling through to the spillover loop, which removes the second
`if` on the boolean.

No behavior change; just less collection-then-discard and a clearer flow.
Three end-to-end tests in the `TestRig` harness, exercising the
`get_block_roots_ancestor_of_head` paths the previous agents kept
getting wrong:

- `test_blocks_by_head_spillover_into_freezer`: 4-epoch chain so
  finalization migrates state to the freezer; request walks all the way
  back to slot 1, crossing the proto-array → freezer boundary.
- `test_blocks_by_head_finalized_root`: uses the finalized checkpoint's
  block root as `beacon_root` (case-2 fallback), verifying the
  `get_blinded_block` + freezer canonicity check + freezer walk path.
- `test_blocks_by_head_unknown_root`: a random root yields
  `ResourceUnavailable`.

A new `enqueue_blocks_by_head_request` helper mirrors the existing
`enqueue_blobs_by_*` helpers, and a small `drain_blocks_by_head_response`
utility reads the response stream up to its `None` terminator.
@dapplion dapplion requested a review from jxs as a code owner April 30, 2026 07:30
@dapplion dapplion added ready-for-review The code is ready for review Networking labels Apr 30, 2026
@dapplion dapplion changed the title Claude lh bb by head Implement beacon_blocks_by_head Apr 30, 2026
Comment thread beacon_node/lighthouse_network/src/rpc/protocol.rs Outdated
Copy link
Copy Markdown
Member

@jimmygchen jimmygchen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me!

Just a minor note on the server behaviour: when the request spans across fulu fork epoch (which is unlikely on mainnet), the server's response includes pre-fulu blocks - which is a small deviation from the spec, it might be worth adding a check there.

  1. The requester sends a request with a post fulu beacon root in the first fulu epoch, with count = 128 (not implemented in this PR)
  2. The server sends all blocks including ones before fulu fork
  3. The requester rejects non fulu blocks, and RpcError::InvalidRequest would penalise the peer with LowToleranceError

@jimmygchen jimmygchen added waiting-on-author The reviewer has suggested changes and awaits thier implementation. and removed ready-for-review The code is ready for review labels Apr 30, 2026
dapplion and others added 2 commits May 1, 2026 10:37
Co-authored-by: Jimmy Chen <jchen.tc@gmail.com>
`BlocksByHeadV1` previously rejected pre-Fulu blocks at the response
codec with `InvalidRequest`. The protocol is new in Fulu but the wire
shape is just `SignedBeaconBlock`, and a Fulu peer's parent walk
naturally crosses the Fulu fork boundary — the server has the older
canonical blocks and should be able to serve them, mirroring how
`BlocksByRangeV2` and `BlocksByRootV2` accept all eight fork variants.

Replace the Fulu-only arm with the same Base→Gloas fan-out used by
`BlocksByRootV2`. The server-side handler is already fork-agnostic
(`chain.get_blocks(...)` streams whichever variant the block is), so
relaxing the wire codec is the only change needed.

Adds a small `test_blocks_by_head_decodes_all_forks` round-trip test to
guard the new arms against regressions.
},
SupportedProtocol::BlocksByHeadV1 => match fork_name {
Some(ForkName::Base) => Ok(Some(RpcSuccessResponse::BlocksByHead(Arc::new(
SignedBeaconBlock::Base(SignedBeaconBlockBase::from_ssz_bytes(decoded_buffer)?),
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we could just use SignedBeaconBlock::from_ssz_bytes_by_fork here

Per Jimmy's review: collapse the eight-arm fork match to a single
`from_ssz_bytes_by_fork(bytes, fork_name)` call. Net −30/+3 lines, same
behavior, easier to keep in sync if a future fork is added.
@dapplion dapplion added ready-for-review The code is ready for review and removed waiting-on-author The reviewer has suggested changes and awaits thier implementation. labels May 3, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Networking ready-for-review The code is ready for review

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants