Gloas lookup sync#9155
Conversation
| .contains_block(root) | ||
| } | ||
|
|
||
| // TODO(gloas): implement this once issue #8956 is resolved |
There was a problem hiding this comment.
This is resolved now. We can use canonical head snapshot to check this
| "Processing payload attestation message" | ||
| ); | ||
|
|
||
| // Trigger lookup sync by beacon block root. Treat payload attestations as unknown block |
There was a problem hiding this comment.
would this also act as a trigger for the payload itself depending on the value of payload_present?
can be done in a future PR too
| let parent_in_fork_choice = cx | ||
| .chain | ||
| .canonical_head | ||
| .fork_choice_read_lock() |
There was a problem hiding this comment.
I think checking here in fork choice might lead to weird situations when the block is partially verified and in the da checker, but not yet in fork choice.
There was a problem hiding this comment.
I am a little uncomfortable with a direct check on fork choice here. Might lead to weird edge cases because we check for a block's existence in different places. For e.g. we check the chain directly here
future refactors would need to be aware of this. Is handling ParentUnknown failures and making them trigger parent requests like before safer?
| state.make_request(|| cx.block_lookup_request(id, peers, block_root))?; | ||
|
|
||
| if state.is_completed() { | ||
| // Block already imported (e.g. execution validated). Nothing further |
There was a problem hiding this comment.
I don't think this is accurate. A block being execution validated doesn't mean we have the data blobs/column/payload. We may need to pull the block from get_block_process_status and then just continue the request for the other components.
| state: SingleLookupRequestState<Arc<SignedExecutionPayloadEnvelope<E>>>, | ||
| }, | ||
| Downloaded { | ||
| peer_group: PeerGroup, |
There was a problem hiding this comment.
Shouldn't this have the actual downloaded data too if we want gloas to work too?
| }, | ||
| /// Block processing complete | ||
| Complete { | ||
| #[educe(Debug(ignore))] |
There was a problem hiding this comment.
Is it worth keeping the peer_id in the Complete state for later penalization? is this because we are assuming block peer is always right if the processing passed?
| self.data_request = DataRequest::Complete; | ||
| self.continue_requests(cx) | ||
| } else { | ||
| // Data processing failed — retry data download only |
There was a problem hiding this comment.
could we get stuck because of not incrementing failed_processing here?
| self.payload_request = PayloadRequest::Complete; | ||
| self.continue_requests(cx) | ||
| } else { | ||
| self.payload_request = PayloadRequest::Downloading { |
|
|
||
| // Peer sets. | ||
| // | ||
| // `Arc<RwLock<..>>` is used so the same peer set can be shared between the lookup and |
There was a problem hiding this comment.
This is only true for columns right? so the data_peers and payload peers are similar just for consistency?
eb40ff0 to
3ff8491
Compare
Rewrites the single block lookup state machine for Gloas, where block, data (blobs/columns), and execution payload envelope are independent components that can arrive and import out of order. - Three additive-only sub-state-machines for block / data / payload streams. Peer sets start empty for data/payload and grow as children arrive — the parent lookup's completion requirement can widen over time without mutating any state machine. - `AwaitingParent` becomes a struct carrying the child's `parent_block_hash` so the parent can be classified empty/full from the child's bid reference. - Wires `PayloadEnvelopesByRoot` RPC end-to-end through `SyncNetworkContext`: request sending, response routing (`SingleLookupReqId::SinglePayloadEnvelope`), and integration into `PayloadRequest`. Envelope *processing* is still a TODO; only the download path is wired. - Test rig: serves envelopes from a `network_envelopes_by_root` cache populated from the external harness; bumps test validator count to 8 so `proposer_lookahead` can populate at the Fulu → Gloas upgrade. - Enables gloas in `TEST_NETWORK_FORKS`. - Fixes: genesis parent check, infinite retry loop on repeated download failure, no-op in `on_completed_request`, and peer sets not being cleared on disconnect.
3ff8491 to
ebe9fe2
Compare
What this PR does
Extends the single-block lookup to work with Gloas, where a block, its blob/column data, and its execution payload envelope are separate things that can arrive independently and out of order. The existing trait-based request system was starting to creak under that, so it's replaced with three small state machines — one per component — driven by a single
continue_requestsloop. No component ever mutates another; streams are only added to.A parent's "full" vs "empty" classification needs a child's bid to tell them apart, so
AwaitingParentnow carries the child'sparent_block_hashalongside the parent root. When a child classifies its parent as full, it contributes its peers to the parent's payload/data streams.The
PayloadEnvelopesByRootRPC is wired throughSyncNetworkContext— request dispatch, response routing, and hooking into the newPayloadRequeststate machine. Envelope processing (beacon-processor wiring, import path) is intentionally left as a TODO and should land in a follow-up PR. Envelopes imported from gossip are untouched by this PR.Scope notes
chain.get_block_process_status()if the block hasn't yet been downloaded by the lookup but is in the availability checker — matching the existing sigp/unstable behaviour where block and blob/column requests proceed in parallel.network_envelopes_by_rootcache populated from the external harness, and the test harness uses 8 validators (Gloas's proposer lookahead doesn't populate with 1).on_completed_requestno-op, peer sets not cleared on disconnect.TEST_NETWORK_FORKSunchanged — enabling Gloas in that matrix is blocked on a separate clean-up pass (several existing Gloas tests still panic against this rewrite).Review notes
Pre-upstream review on dapplion#68 (@pawanjay176 — thanks!). The first-round comments are addressed in this push; the remaining ones from 2026-04-22 I'll work through inline. Payload-processing wiring is explicitly deferred per that review thread.
Test plan
make lintcleanFORK_NAME=phase0 cargo nextest run --features fork_from_env,fake_crypto -p network sync::tests::— 79/79FORK_NAME=electra cargo nextest run --features fork_from_env,fake_crypto -p network sync::tests::lookups::— 59/59FORK_NAME=fulu cargo nextest run --features fork_from_env,fake_crypto -p network sync::tests::lookups::— 59/59