Summary
Discovery v5 handler classes log routine peer-interaction failures (bad packets, failed handshakes, rejected peers) at DEBUG. On a live Ethereum network these messages fire hundreds of times per minute, making --logging=DEBUG output difficult to read when investigating unrelated issues.
This issue proposes downgrading 16 LOG.debug() calls across 9 files in org.ethereum.beacon.discovery.pipeline.* and ...schema.NodeSession to LOG.trace(). The messages remain available for deep protocol debugging; they are simply no longer part of the default DEBUG stream.
Downstream discussion: besu-eth/besu#9691 "Tune Discovery v5 Logging Levels".
Measured Impact
We ran Besu 25.3.0 (pre-built release) against Ethereum mainnet for 5 minutes with --logging=DEBUG and captured the following:
| Level | Lines | Per minute |
|---|---|---|
| Total | 18,023 | ~3,605 |
| DEBUG | 8,029 | ~1,606 (~27/sec) |
| INFO | 67 | ~13 |
| WARN | 3 | <1 |
| ERROR | 1 | <1 |
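Counts of this kind can be reproduced from a captured log with a short tally. The sketch below is illustrative only and is not the script used for the capture: it assumes each log line is pipe-delimited as `timestamp | thread | LEVEL | class | message`, and the class name `LogTally` and the sample lines are hypothetical.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class LogTally {

  /**
   * Counts lines grouped by one pipe-delimited field.
   * ASSUMED layout: "timestamp | thread | LEVEL | class | message",
   * so field 2 is the level and field 3 is the logging class.
   */
  static Map<String, Integer> tally(List<String> lines, int field) {
    Map<String, Integer> counts = new TreeMap<>();
    for (String line : lines) {
      // Limit the split to 5 parts so pipes inside the message are kept intact.
      String[] fields = line.split("\\|", 5);
      if (fields.length < 5) {
        continue; // continuation lines such as stack traces carry no level
      }
      counts.merge(fields[field].trim(), 1, Integer::sum);
    }
    return counts;
  }

  /** Converts a raw count into a rounded per-minute rate. */
  static long perMinute(int count, int minutes) {
    return Math.round((double) count / minutes);
  }

  public static void main(String[] args) {
    // Hypothetical sample lines in the assumed layout.
    List<String> sample = List.of(
        "2025-01-01 00:00:00.000 | nioEventLoop-1 | DEBUG | RlpxAgent | Disconnecting peer",
        "2025-01-01 00:00:00.100 | vert.x-1 | DEBUG | PeerDiscoveryController | Handshake timed out",
        "2025-01-01 00:00:00.200 | main | INFO | EthPeers | Peer connected",
        "    at some.stack.Frame(Frame.java:1)");
    System.out.println(tally(sample, 2)); // grouped by level
    System.out.println(tally(sample, 3)); // grouped by class
    System.out.println(perMinute(8029, 5)); // DEBUG lines over the 5-minute capture
  }
}
```

Grouping on field 3 instead of field 2 gives a per-class breakdown of the same capture.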
Top DEBUG sources
| Class | Count | Notes |
|---|---|---|
| RlpxAgent | 2,054 | (out of scope for this repo) |
| EthPeers | 1,254 | (out of scope) |
| DeFramer | 1,022 | (out of scope) |
| RecursivePeerRefreshState | 548 | Discovery-related |
| PeerDiscoveryController | 479 | Discovery-related |
| VertxPeerDiscoveryAgent | 85 | Discovery-related |
Discovery-specific noise (originating in this repo's handlers)
| Message | Count | Per minute |
|---|---|---|
| "Handshake timed out" | 469 | ~94 |
| "Peer is unreachable" | 60 | ~12 |
| "Discarding invalid packet" | 25 | ~5 |
| Total discovery DEBUG | ~570 | ~114 |
Note: This capture was taken on a home machine with limited peer connectivity. A production mainnet node maintaining 50+ active peers will produce substantially higher volumes, so these numbers are a lower bound.
Affected Files
All 16 statements describe peer-caused events that our node handled correctly: bad bytes, invalid ENRs, policy-rejected addresses, protocol violations, or sessions that timed out because a peer went silent. None indicate a local bug.
| File | Count | Nature of messages |
|---|---|---|
| pipeline/handler/HandshakeMessagePacketHandler.java | 6 | Missing WhoAreYou challenge, invalid ENR, wrong node ID, disallowed ENR, bad ID signature, failed message read |
| pipeline/handler/WhoAreYouPacketHandler.java | 2 | Nonce verification failed, failed to read message |
| pipeline/info/FindNodeResponseHandler.java | 2 | Rejecting invalid record, wrong distance in response |
| schema/NodeSession.java | 1 | Cancelling requests after peer timeout |
| pipeline/handler/BadPacketHandler.java | 1 | Bad packet received |
| pipeline/handler/OutgoingParcelHandler.java | 1 | Dropping packet to disallowed destination |
| pipeline/handler/UnauthorizedMessagePacketHandler.java | 1 | Failed to read unauthorized message |
| pipeline/handler/MessagePacketHandler.java | 1 | Failed to read message |
| pipeline/handler/PacketSourceFilter.java | 1 | Ignoring disallowed source |
| Total | 16 | |
Proposed Fix
Change each of the 16 LOG.debug(...) calls above to LOG.trace(...). This change:
- Removes ~114+ messages per minute from the default DEBUG stream, making Besu's DEBUG logs usable again for troubleshooting unrelated issues.
- Preserves every message for developers doing protocol-level debugging; --logging=TRACE still shows everything.
- Does not modify any LOG.warn(), LOG.error(), or LOG.info() calls.
- Leaves all existing tests passing.
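The effect of moving a call one level down can be sketched with the JDK's own java.util.logging, used here only so the example runs without dependencies. This is an analogy, not the repo's code: Level.FINE stands in for DEBUG and Level.FINER for TRACE, and the class name `LevelDowngradeSketch`, the logger names, and the message strings are illustrative assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

final class LevelDowngradeSketch {

  /** Collects every message that gets past the logger's level threshold. */
  static final class CapturingHandler extends Handler {
    final List<String> messages = new ArrayList<>();

    @Override
    public void publish(LogRecord record) {
      if (isLoggable(record)) {
        messages.add(record.getMessage());
      }
    }

    @Override
    public void flush() {}

    @Override
    public void close() {}
  }

  /** Emits one downgraded and one untouched message at the given threshold. */
  static List<String> emit(Level threshold) {
    Logger log = Logger.getLogger("sketch." + threshold.getName());
    log.setUseParentHandlers(false); // keep the demo off the console handler
    log.setLevel(threshold);
    CapturingHandler handler = new CapturingHandler();
    handler.setLevel(Level.ALL);
    log.addHandler(handler);

    // Routine peer-caused event: after the change it is logged one level
    // below the debug stand-in (FINER plays the role of TRACE here).
    log.log(Level.FINER, "Bad packet received");
    // Possible local issue: stays at the debug stand-in (FINE).
    log.log(Level.FINE, "Exception while cancelling request");

    log.removeHandler(handler);
    return handler.messages;
  }

  public static void main(String[] args) {
    System.out.println("debug-like threshold: " + emit(Level.FINE));
    System.out.println("trace-like threshold: " + emit(Level.FINER));
  }
}
```

At the debug-like threshold only the second message is captured; at the trace-like threshold both are, which mirrors the intent that nothing is deleted, only moved out of the default DEBUG view.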
Deliberately Out of Scope
Two additional LOG.debug() calls in schema/NodeSession.java were considered but intentionally left at DEBUG:
- NodeSession#cancelAllRequests logging an exception thrown while completing a cancelled request's promise.
- NodeSession#cancelAllRequests logging when clearRequestInfo returns null for a requestId we just iterated from the same map.
Both signal potential local code issues (an unexpected exception in our own cleanup path, or a possible race on requestIdStatuses) rather than routine peer behavior, so they stay visible at DEBUG for operators troubleshooting their own node.
Status
A PR with the 16-statement change is ready and will be opened once this issue is filed for discussion.