fix(eth): cap outgoing block bodies under the devp2p message size limit#12059
Conversation
…e corruption ZeroPacketSplitter.Encode read MaxFrameSize several times per message while DisableFraming() flips it from another thread (Session.EnableSnappy runs off the channel event loop). A concurrent flip yields a torn framing configuration and a malformed frame header, which ZeroFrameEncoder rejects with an EncoderException, tearing down the peer. Snapshot the value once per Encode and back it with a volatile field.
|
Claude finished @AnkushinDaniil's task in 2m 56s —— View job Code Review
SummaryThe fix is correct and minimal. The root cause — a TOCTOU race on Correctness AnalysisThe race (old code):
This produces a frame with a fabricated size in the header, which The fix (new code): private volatile int _maxFrameSize = Frame.DefaultMaxFrameSize;
// ...
int maxFrameSize = _maxFrameSize; // single volatile read; local thereafter
Overflow safety when Test Analysis
Padding equivalence in
Low-severity observations (non-blocking)1. Test race window relies on timing rather than a mid-encode barrier — The test doesn't force 2. Pre-existing: VerdictThe fix is correct, well-scoped, and adequately tested. Ready to merge after WIP label is removed. |
MessageSizeEstimator.EstimateSize(tx) counted only "100 + calldata length", ignoring access lists, authorization lists and other fields. Block bodies were therefore under-counted and the 2 MB soft batching limit let through responses far larger than the 16 MB devp2p / 24-bit RLPx frame limit. Such a frame's size field wraps, the frame encoder desyncs on the oversized buffer and emits a corrupt frame, dropping the peer (seen serving large bloated-state block bodies). - Estimate transactions from their actual encoded length so all fields count. - Add a hard ~15 MB cap to FulfillBlockBodiesRequest; a single body that alone exceeds the limit is omitted (responses are already sparse) since it cannot be served within the protocol message size.
FulfillBlockBodiesRequest skipped a leading body that alone exceeded the hard size cap and continued serving later bodies. A requester matches bodies to the requested hashes positionally (BlockDownloader.HandleResponse: bodies[i] vs BodiesRequests[i], guarded by ValidateBodyAgainstHeader), so a non-trailing omission misaligns every following body and is rejected as an invalid block, triggering a breach-of-protocol disconnect of the serving peer — the opposite of what the cap intends. Break at the cap instead, returning the prefix accumulated so far (matching the receipts path). A single body above the cap cannot fit a 16 MB devp2p frame on any client, so an empty prefix is the only correct response.
Estimate_tx_with_data_size only asserted EstimateSize == TxDecoder.GetLength, which restates the implementation and would pass even if the estimator were wrong. Add an implementation-independent oracle: 7 extra calldata bytes must increase the estimate by exactly 7, guarding against a regression to a constant or a heuristic that drops a field.
| // Use the actual encoded length so large non-calldata fields (access lists, | ||
| // authorization lists, etc.) are accounted for and not under-counted. | ||
| return (ulong)TxDecoder.Instance.GetLength(tx, RlpBehaviors.None); |
There was a problem hiding this comment.
Potential issue here is that estimations should have been very fast and now it is potentially slow, as we are measuring not estimating
There was a problem hiding this comment.
It's only called from FulfillBlockBodiesRequest, where we've already loaded the block from disk - so the GetLength walk is noise next to that. It's also the same call the mempool already uses to size txs for broadcast and caches in LightTransaction, and it just sums field lengths rather than serializing, so it's not really a new cost
There was a problem hiding this comment.
Maybe then remove that estimate method completely, and just do that length measurement in FulfillBlockBodiesRequest. Estimate name now is misleading.
Move the body-cap positional-matching rationale and the estimate-vs-measure note out of code comments and into the PR description; keep one-line comments at each site.
Background
When serving
GetBlockBodies, Nethermind could emit aBlockBodiesmessage larger than the devp2p 16 MB message limit (SnappyParameters.MaxSnappyLength). An RLPx frame encodes its size in a 24-bit field (max 16 MB); once snappy is negotiated framing is disabled, so an oversized message becomes a single oversized frame whose size wraps.ZeroFrameEncoderthen desyncs on the oversized buffer and emits a corrupt frame, and the peer is dropped:Root cause:
MessageSizeEstimator.EstimateSize(Transaction)returned100 + tx.Data.Length— it counted only calldata and ignored access lists, authorization lists and other fields. Transactions with large non-calldata fields were under-counted, so the 2 MB soft batching limit inFulfillBlockBodiesRequestadmitted far more than estimated. Observed serving a heavily-bloated state, a singleBlockBodiesresponse reached 26–47 MB, overflowing the frame and tearing down the connection on every reconnect.Changes
MessageSizeEstimator.EstimateSize(Transaction)now returns the actual encoded length (TxDecoder.GetLength) instead of100 + tx.Data.Length, so access lists, authorization lists and every other field are counted.FulfillBlockBodiesRequestenforces a hard ~15 MB cap (below the 16 MB protocol limit, mirroring the existing receipts hard cap). When the next body would cross the cap, serving stops and returns the bodies gathered so far; the requesting peer re-requests the remainder.Why return a prefix rather than skip the oversized body and continue
A requester matches returned bodies to the requested hashes positionally (
BlockDownloader.HandleResponse), so omitting a non-trailing body would misalign every following one and the response would be rejected as invalid. Returning the prefix collected so far keeps the response well-aligned and lets the peer make progress and re-request from where serving stopped. A single body that alone exceeds the cap cannot be served within a devp2p frame at all, so an empty prefix is the only correct response in that case.On
EstimateSizeprecision vs. costEstimateSize(Transaction)now measures rather than estimates. This is intentional and confined to the body-serving path: its only caller isEstimateSize(Block)insideFulfillBlockBodiesRequest, which has already loaded each block from the DB (SyncServer.Find) — theGetLengthwalk is negligible next to that I/O. Transaction broadcast and other latency-sensitive paths use a different type (Nethermind.Synchronization.FastBlocks.MemorySizeEstimator) and are untouched, so no fast path is slowed. The cheap approximate estimate was the original behaviour and is exactly what under-counted bloated bodies and caused this bug; a correct cap requires the true size.Types of changes
Testing
Requires testing: Yes — wrote tests: Yes.
MessageSizeEstimatorTestsupdated to assert the estimate equals the encoded length, plus a newEstimate_tx_counts_access_listregression test (a tx with a large access list is no longer under-counted).Nethermind.Network.Testestimator tests pass locally.EncoderExceptions and made almost no progress; after,EncoderExceptions and drops drop to zero and body/state serving runs continuously (the link went from idle/stalled to fully utilised).Remarks
This branch keeps its original name (
fix/snap-serving-outbound-framing-race) from an earlier, incorrect hypothesis (an RLPx framing race); that change has been reverted and the actual fix is the block-bodies size cap above.