refactor(network): declarative P2P capability advertisement via IP2PCapabilityResolver#12093
refactor(network): declarative P2P capability advertisement via IP2PCapabilityResolver#12093asdacap wants to merge 14 commits into
Conversation
Register IProtocolsManager in NetworkModule and inject all of InitializeNetwork's dependencies via the constructor (Lazy<> for the conditionally-used ones), removing every _api access from the step. IApiWithNetwork.ProtocolsManager becomes read-only and is resolved from DI, so the FallbackToFieldFromApi redundancy guard requires dropping its setter. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Address review: register SnapCapabilitySwitcher in NetworkModule instead of constructing it in InitializeNetwork. It is IDisposable so Autofac owns its disposal (replacing the manual DisposeStack.Push), letting the step drop its Lazy<ISyncModeSelector> and Lazy<IDisposableStack> dependencies. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The InitializeNetwork decoupling left several usings orphaned in NetworkModuleTest and Build (Core.Specs, State, TxPool, Stats, Network). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…apabilityResolver Replace the scattered imperative IProtocolsManager.Add/RemoveSupportedCapability calls (MergePlugin, InitializeNetwork snap-serving, SnapCapabilitySwitcher, XdcPlugin) with an IP2PCapabilityResolver interface. ProtocolsManager caches the advertised capability set and rebuilds it only when a resolver raises Changed, keeping the per-session Register path allocation-free. - MergeP2PCapabilityResolver: eth/69-71 once PoSSwitcher.TransitionFinished (registered in MergePluginModule + AuRaMergeModule only; OP/Taiko unchanged) - SnapP2PCapabilityResolver: snap/1 while serving or snap-syncing (replaces SnapCapabilitySwitcher and its sync-mode event wiring) - XdcP2PCapabilityResolver: eth/62/63/100 instead of the default eth/68 Trim AddSupportedCapability/RemoveSupportedCapability from IProtocolsManager. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Address review: move the hardcoded DefaultCapabilities seed out of ProtocolsManager into a DefaultP2PCapabilityResolver (registered AddFirst so chain-specific resolvers still compose over it). GetAdvertisedCapabilities now runs all resolvers over an empty set. Also drop a stray comment in PseudoNetworkModule. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…geplugin-protocol-toggle # Conflicts: # src/Nethermind/Nethermind.Init/Modules/NetworkModule.cs # src/Nethermind/Nethermind.Init/Steps/InitializeNetwork.cs # src/Nethermind/Nethermind.Runner.Test/Module/NetworkModuleTest.cs
Collapse DefaultP2PCapabilityResolver.Resolve to a single Add and remove the now-unused static DefaultCapabilities list; P2PProtocolInfoProvider derives the default set by running the resolver over an empty set. Log the resolved advertised capability set at debug in ProtocolsManager. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
Claude finished @asdacap's task in 5m 56s —— View job PR Review: declarative P2P capability advertisement via IP2PCapabilityResolver
SummaryClean refactor overall. The declarative resolver model is a clear improvement over the previous imperative FindingsMedium
The old private void OnSyncModeChanged(object? sender, SyncModeChangedEventArgs e)
{
bool wasSnap = _syncConfig.SnapSync && (e.Previous & SyncMode.Full) == 0;
bool isSnap = _syncConfig.SnapSync && (e.Current & SyncMode.Full) == 0;
if (wasSnap != isSnap)
Changed?.Invoke();
}LowNullability gap in
Inconsistent null guarding for array constructor parameters (inline comment)
Observability regression: snap capability transitions are no longer logged (inline comment)
Correctness / design notes (no action required)
|
|
Claude finished @asdacap's task in 7m 1s —— View job PR Review: declarative P2P capability advertisement via IP2PCapabilityResolver
SummaryThe refactor is clean and well-reasoned. The declarative Correctness verified:
No critical or high findings. Three low-severity issues noted as inline comments:
Additional note (not in diff, no inline possible): The comment at Overall, the implementation is solid. Tests are comprehensive (unit tests per resolver + caching/invalidation test in |
| } | ||
| } | ||
|
|
||
| private void OnSyncModeChanged(object? sender, SyncModeChangedEventArgs e) => Changed?.Invoke(); |
There was a problem hiding this comment.
Medium – Fires Changed on every sync-mode transition, not just snap-relevant ones
private void OnSyncModeChanged(object? sender, SyncModeChangedEventArgs e) => Changed?.Invoke();This unconditionally fires Changed on every ISyncModeSelector.Changed event. During initial sync, SyncMode transitions frequently (e.g. Headers → Bodies → Receipts → StateNodes). Each transition invalidates ProtocolsManager's cache and triggers a full rebuild, even when neither SnapServingEnabled nor the Full-mode check would yield a different result.
The PR description states capabilities "change ~twice per node lifetime," but this resolver alone fires during every sync-mode step. SnapCapabilitySwitcher only triggered on Full — the single moment snap should be removed.
Consider restricting invalidation to the transitions that actually change the snap capability:
private void OnSyncModeChanged(object? sender, SyncModeChangedEventArgs e)
{
bool wasSnap = _syncConfig.SnapSync && (e.Previous & SyncMode.Full) == 0;
bool isSnap = _syncConfig.SnapSync && (e.Current & SyncMode.Full) == 0;
if (wasSnap != isSnap)
Changed?.Invoke();
}| Volatile.Write(ref _cachedCapabilities, resolved); | ||
| } | ||
|
|
||
| return _cachedCapabilities; |
There was a problem hiding this comment.
Low – Nullability gap: returning Capability[]? as Capability[]
_cachedCapabilities is declared Capability[]?, but this method returns Capability[]. The return here is logically non-null (the lock prevents InvalidateCapabilities from racing), but the compiler doesn't see that flow and should emit CS8603. If the project has <Nullable>enable</Nullable>, this is a suppressed or missed warning.
Either restructure to use a local:
lock (_capabilitiesLock)
{
if (_cachedCapabilities is null)
{
...
Volatile.Write(ref _cachedCapabilities, resolved);
return resolved;
}
return _cachedCapabilities;
}Or suppress explicitly with return _cachedCapabilities!;.
|
|
||
| // Order is already set by OrderedComponents<T> (AddFirst/AddLast) | ||
| _factories = factories; | ||
| _capabilityResolvers = capabilityResolvers; |
There was a problem hiding this comment.
Low – Inconsistent null guarding for array parameters
factories and capabilityResolvers are not null-checked, unlike the other constructor parameters. If either is null, the foreach on line 70 throws NullReferenceException rather than the ArgumentNullException that other parameters produce.
Since Autofac injects [] for empty array registrations (not null), this is not a practical issue, but the inconsistency is worth aligning:
ArgumentNullException.ThrowIfNull(factories);
ArgumentNullException.ThrowIfNull(capabilityResolvers);|
|
||
| private void OnSyncModeChanged(object? sender, SyncModeChangedEventArgs e) => Changed?.Invoke(); | ||
|
|
||
| public void Dispose() => _syncModeSelector.Changed -= OnSyncModeChanged; |
There was a problem hiding this comment.
Low – Observability regression: snap transition is no longer logged
SnapCapabilitySwitcher logged at info/debug when snap was enabled and when sync completed and snap was disabled:
"Enabled snap capability"
"State sync finished. Disabled snap capability."
The new resolver is silent. Adding a debug log in Resolve() (or a conditional log in OnSyncModeChanged) when the snap state transitions would preserve that visibility for node operators and devs debugging sync issues.
|
|
||
| public void Resolve(ISet<Capability> capabilities) | ||
| { | ||
| bool serving = _syncConfig.SnapServingEnabled == true; |
There was a problem hiding this comment.
Nit – SnapServingEnabled null-check is reversed from intent
_syncConfig.SnapServingEnabled == true is correct (handles null safely), but it is slightly inconsistent with how SnapSync is checked on the next line (_syncConfig.SnapSync — a plain bool). Fine as-is; just noting the asymmetry for reviewers.
| Volatile.Write(ref _cachedCapabilities, resolved); | ||
| } | ||
|
|
||
| return _cachedCapabilities; |
There was a problem hiding this comment.
Low: _cachedCapabilities is declared Capability[]? and the project uses <Nullable>annotations</Nullable> so no warning fires, but the null-forgiving operator would document the invariant that this field cannot be null here (it was either already non-null when we entered the lock, or we just wrote it via Volatile.Write).
| return _cachedCapabilities; | |
| return _cachedCapabilities!; |
| .AddModule(new TestMergeModule(configProvider.GetConfig<ITxPoolConfig>())) | ||
| .AddSingleton<ManualTimestamper>(timestamper) // Used by test code | ||
| .AddDecorator<ITestEnv, PostMergeTestEnv>() | ||
| .AddLast<IP2PCapabilityResolver>(_ => new PostMergeCapabilitiesResolver()) |
There was a problem hiding this comment.
Low: Per the DI rules in .agents/rules/di-patterns.md, the lambda factory form is an anti-pattern. Since PostMergeCapabilitiesResolver has no dependencies, prefer the type-based overload — even for a private nested type:
| .AddLast<IP2PCapabilityResolver>(_ => new PostMergeCapabilitiesResolver()) | |
| .AddLast<IP2PCapabilityResolver, PostMergeCapabilitiesResolver>() |
| } | ||
| } | ||
|
|
||
| private void OnSyncModeChanged(object? sender, SyncModeChangedEventArgs e) => Changed?.Invoke(); |
There was a problem hiding this comment.
Low: Changed fires on every sync-mode transition, not only when the snap contribution actually flips. If the mode goes from StateNodes → FastBlocks both sides are "not Full", so the cache is rebuilt unnecessarily. In practice, mode changes are rare and rebuilds are cheap (one HashSet allocation per resolver), so this is harmless. But if you want to be precise:
private void OnSyncModeChanged(object? sender, SyncModeChangedEventArgs e)
{
bool wasFull = (e.Previous & SyncMode.Full) != 0;
bool isFull = (e.Current & SyncMode.Full) != 0;
if (wasFull != isFull)
Changed?.Invoke();
}|
|
||
| public void Resolve(ISet<Capability> capabilities) | ||
| { | ||
| capabilities.Remove(new Capability(Protocol.Eth, 68)); |
There was a problem hiding this comment.
Can this be done by deregistering DefaultCapabilityResolver?
There was a problem hiding this comment.
I guess, the Add* method can be modified to remove previous registration.
There was a problem hiding this comment.
Done in 6c28605 — added a RemoveOrderedComponents<T, TImpl>() DSL (next to ClearOrderedComponents, backed by OrderedComponents.RemoveAll), and XdcModule now calls .RemoveOrderedComponents<IP2PCapabilityResolver, DefaultP2PCapabilityResolver>() right after registering the XDC resolver. So the default eth/68 resolver is deregistered instead of XdcP2PCapabilityResolver removing eth/68 at resolve time — the XDC resolver now only contributes eth/62, eth/63 and eth/100. Covered by OrderedComponentsTests.TestRemove and the updated Xdc resolver test.
Posted by Claude (Opus 4.8) on behalf of @asdacap.
- SnapP2PCapabilityResolver fires Changed only when snap/1's contribution actually flips (Full-mode bit) instead of on every sync-mode transition, and logs the transition (restores the visibility SnapCapabilitySwitcher had). - ProtocolsManager: null-forgiving return for the cached array and ArgumentNullException guards for the factories/capabilityResolvers arrays, consistent with the other ctor params. - E2ESyncTests: register PostMergeCapabilitiesResolver by type instead of a factory lambda, per the DI rules. - Reword the now-stale 'capabilities must be finalized' comment in InitializeNetwork. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…g at resolve Per review (LukaszRozmej): rather than XdcP2PCapabilityResolver imperatively removing eth/68 from the resolved set, deregister the DefaultP2PCapabilityResolver. Adds a RemoveOrderedComponents<T, TImpl>() DSL (alongside ClearOrderedComponents) backed by OrderedComponents.RemoveAll. XdcModule now drops the default resolver after registering the XDC one, so XdcP2PCapabilityResolver only contributes eth/62, eth/63 and eth/100. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…geplugin-protocol-toggle # Conflicts: # src/Nethermind/Nethermind.Init/Steps/InitializeNetwork.cs
| public class XdcP2PCapabilityResolverTests | ||
| { | ||
| [Test] | ||
| public void Resolve_advertises_xdc_capabilities() |
There was a problem hiding this comment.
Only tests the resolver in isolation - nothing asserts the DI composition actually drops eth/68 from XDC's set. Worth a container-level check?
There was a problem hiding this comment.
Added in 37dbf86 — a container-level test (Di_composition_drops_default_eth68_resolver) that mirrors NetworkModule (AddFirst default) + XdcModule (AddLast xdc, then RemoveOrderedComponents default), resolves IP2PCapabilityResolver[], and asserts the composed capability set excludes eth/68 (and that no DefaultP2PCapabilityResolver remains in the array) — not just the resolver in isolation.
Posted by Claude (Opus 4.8) on behalf of @asdacap.
|
|
||
| public void Resolve(ISet<Capability> capabilities) | ||
| { | ||
| if (!_poSSwitcher.TransitionFinished) return; |
There was a problem hiding this comment.
Gates on TransitionFinished but only subscribes to TerminalBlockReached, which fires while TransitionFinished is still false. it flips true later in ForkchoiceUpdated, no event. On a live TTD merge, eth/69-71 never get advertised. Should this also check _poSSwitcher.HasEverReachedTerminalBlock()?
There was a problem hiding this comment.
Good catch — confirmed regression for the live TTD path. The pre-resolver code added eth/69-71 unconditionally when TerminalBlockReached fired; my resolver re-gated Resolve on TransitionFinished, which is still false at that point (it only flips true later in PoSSwitcher.ForkchoiceUpdated, which raises no event), so the cache rebuild dropped them and they were never advertised until a restart.
Fixed in 5ab5f31: Resolve now gates on TransitionFinished || HasEverReachedTerminalBlock(). HasEverReachedTerminalBlock() is set right after TerminalBlockReached fires, and ProtocolsManager's cache invalidation is lazy (nulls the cache, rebuilds on next read), so the rebuild sees it true. Added a parameterized case for the live-merge sequence (terminal reached before transition finished).
Posted by Claude (Opus 4.8) on behalf of @asdacap.
…sition-finished On a live TTD transition, PoSSwitcher fires TerminalBlockReached while TransitionFinished is still false; the flag only flips true later in ForkchoiceUpdated, which raises no event. The resolver invalidated its cache on TerminalBlockReached but Resolve re-gated on TransitionFinished, so the rebuild still excluded eth/69-71 and they were never advertised until a restart. Gate on TransitionFinished || HasEverReachedTerminalBlock(), mirroring the pre-resolver behaviour (which added the capabilities unconditionally on TerminalBlockReached). Reported by AnkushinDaniil in review of #12093. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Adds a container-level test mirroring NetworkModule (AddFirst default) + XdcModule (AddLast xdc, then RemoveOrderedComponents default), verifying the resolved capability set excludes eth/68 — not just the resolver in isolation. Reported by AnkushinDaniil in review of #12093. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
IDE0005: 'using Nethermind.Network' became dead when the imperative ProtocolsManager snap-capability registration was removed; NetworkDiagTracer resolves via the enclosing Nethermind.Core namespace and Network.Metrics via the Nethermind root. Fixes the 'Check code lint' CI failure. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Changes
Replaces the scattered imperative
IProtocolsManager.Add/RemoveSupportedCapabilitycalls with a declarativeIP2PCapabilityResolver(void Resolve(ISet<Capability>)+event Action? Changed).ProtocolsManagerruns all registered resolvers over an empty set and caches a frozenCapability[], invalidating only when a resolver raisesChanged. The advertised set changes ~twice per node lifetime (merge transition, snap-sync completion) whileRegisterruns ~100×/session-create, so the per-session path is a single cached-array read — zero allocation.DefaultP2PCapabilityResolver— eth/68 base, registeredAddFirstso it seeds before chain resolvers.MergeP2PCapabilityResolver— eth/69–71 onIPoSSwitcher.TransitionFinished; registered inMergePluginModuleandAuRaMergeModuleonly (notBaseMergePluginModule, so Optimism/Taiko are unchanged).SnapP2PCapabilityResolver— snap/1 while serving or still snap-syncing; fully replaces the formerSnapCapabilitySwitcher(recomputed per session instead of add-on-start / remove-on-Full).XdcP2PCapabilityResolver— removes eth/68, adds 62/63/100.P2PProtocolInfoProvider/EthStats derive the default advertised set by runningDefaultP2PCapabilityResolverrather than reading a static list.Merged latest
master(incl. #12069 decouplingInitializeNetworkfromINethermindApi); the conflictingSnapCapabilitySwitcherwiring is dropped in favour ofSnapP2PCapabilityResolver, which preserves both the snap-serving and snap-sync-until-synced behaviours.Types of changes
Testing
Requires testing
If yes, did you write tests?
Notes on testing
New unit tests for each resolver and for
ProtocolsManagercapability caching/invalidation;E2ESyncTestsexercises post-merge capability negotiation. Full solution build is green (0 warnings, 0 errors).Documentation
Requires documentation update
Requires explanation in Release Notes