[Draft] Add antreanodeconfig crd agent #119
Force-pushed from a5bbaef to 95720de
Force-pushed from 95720de to db6dcaa
Force-pushed from db6dcaa to cd65b32
Force-pushed from cd65b32 to 178d747
Force-pushed from 178d747 to c642a08
Force-pushed from c642a08 to ab802dd
Force-pushed from 862ac4f to 9fe8413
Force-pushed from 9fe8413 to 5aaaba7
Force-pushed from bc8ae5c to b868b1a
Force-pushed from d054735 to 1e626ed
    // wait for the first ANC snapshot before Initialize so the effective bridge is known.
    if features.DefaultFeatureGate.Enabled(features.SecondaryNetwork) {
        if err := secondaryNetworkController.WaitForInitialANCSnapshotAndEnsureBridge(stopCh); err != nil {
            return fmt.Errorf("failed to wait for AntreaNodeConfig snapshot for secondary network: %w", err)
Potential deadlock with AntreaNodeConfig initialization order
High Severity
When both SecondaryNetwork and AntreaNodeConfig features are enabled, WaitForInitialANCSnapshotAndEnsureBridge is called at line 995 to wait for the first AntreaNodeConfig snapshot. However, this blocking wait occurs during the agent's initialization sequence, before many controllers have started their work. The antreaNodeConfigController.Run is started at line 882, but the subscription callback that closes ancFirstSnapshotCh (set up at lines 159-172 in secondarynetwork/init.go) requires the controller to process events and publish snapshots. If the controller's event processing is delayed or depends on other initialization steps that happen after line 995, this could cause a deadlock or extended startup delay, especially since the wait is blocking the main initialization flow.
Reviewed by Cursor Bugbot for commit 1e626ed.
Force-pushed from 1e626ed to 0f4b2b2
 fi

-trap "quit" INT EXIT
+# trap "quit" INT EXIT
Commented out trap disables cleanup on exit
High Severity
The trap "quit" INT EXIT statement is commented out, preventing cleanup from running when the script exits normally or is interrupted. This means the quit function won't execute, leaving stale Kind clusters and Docker networks that should be cleaned up. The cleanup logic is defined but never triggered.
Reviewed by Cursor Bugbot for commit ec6c54f.
-docker network connect --driver-opt=com.docker.network.endpoint.ifname=$ifname $network $node
+docker network connect "${extra_gw_priority[@]}" --driver-opt=com.docker.network.endpoint.ifname=$ifname "$network" "$node"
 echo "connected worker $node to network $network"
 i=$((i+1))
Nested loop counter placement causes incorrect interface assignment
High Severity
The counter i is initialized to 1 before the inner loop but incremented inside it, causing the interface number to continue incrementing across all networks for each node instead of resetting. For example, if there are 2 networks, node1 gets eth1 and eth2, but node2 would incorrectly get eth3 and eth4 instead of eth1 and eth2. This breaks network connectivity between nodes.
Reviewed by Cursor Bugbot for commit ec6c54f.
Force-pushed from ec6c54f to a8f0913
Cursor Bugbot has reviewed your changes and found 1 potential issue.
There are 4 total unresolved issues (including 3 from previous reviews).
Reviewed by Cursor Bugbot for commit a8f0913.
 {{- end }}
 nodeSelector:
   kubernetes.io/os: linux
-  kubernetes.io/arch: amd64
Flow aggregator removed architecture-specific node selector
Low Severity
The deployment removes the kubernetes.io/arch: amd64 node selector while keeping kubernetes.io/os: linux, which allows the flow-aggregator to be scheduled on non-amd64 architectures (arm64, arm). The workflow changes suggest multi-arch images are being built, but unless all dependencies and features are verified on non-amd64 platforms, this could lead to runtime failures.
Reviewed by Cursor Bugbot for commit a8f0913.
Force-pushed from 28505be to ee65d01
Signed-off-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com> Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
Force-pushed from 32cb1e4 to 7a1a8bc
Add the Multi-cluster tests to GitHub Actions and update related scripts.

Signed-off-by: Shuyang Xin <xin_shuyang@hotmail.com>
Introduce pkg/agent/antreanodeconfig:

- Match AntreaNodeConfig objects to a Node via nodeSelector; pick the oldest match (creationTimestamp, name as tiebreaker) and apply the first SecondaryNetwork winner only (no field-level merge).
- Resolve the effective secondary-network OVS bridge from CRD list + static agent config (EffectiveSecondaryOVSBridge, EffectiveSnapshot).
- Add a controller that watches AntreaNodeConfig and the local Node, recomputes the snapshot when labels or CRDs change, and notifies channel.Notifier subscribers (with periodic ANC resync).

Add agent-facing SecondaryNetwork types under pkg/agent/types. Set the AntreaNodeConfig feature gate to Beta (default on). Refresh the agent chart, bundled install YAMLs, and feature-gate tests.

Signed-off-by: Lan Luo <lan.luo@broadcom.com>
When the AntreaNodeConfig feature gate is enabled, antrea-agent starts the antreanodeconfig controller plus a SubscribableChannel and passes an effective-bridge callback and channel subscriber into the secondary network controller. The secondary network controller creates the initial OVS bridge from that callback, subscribes to ANC snapshot notifications to enqueue rate-limited bridge reconciliation work, and replaces the podwatch OVS client when the effective bridge changes.

- Reconcile bridge name, physical interfaces, and trunk AllowedVLANs on Linux (including clearing stale trunks and tearing down stale host-connection port pairs when moving from single- to multi-interface uplink configs).
- Add OVS client support for trunk ports (CreateTrunkPort, SetPortTrunks) and trunk parsing in port listings; extend mocks and tests.
- Make podwatch PodController bridge access concurrency-safe and add UpdateOVSBridge for dynamic bridge swaps.
- Add OVSBridgeConfig helpers in pkg/agent/types; log uplink restore errors in agent_linux.

Signed-off-by: Lan Luo <lan.luo@broadcom.com>
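The concurrency-safe bridge swap mentioned in this commit message could look roughly like the sketch below. It is an assumption-laden illustration, not the PR's implementation: `bridgeClient`, `podController`, `getBridge`, `updateBridge`, and the bridge names are all hypothetical, and a read-write mutex is just one of several ways to guard the shared client.

```go
package main

import (
	"fmt"
	"sync"
)

// bridgeClient is a hypothetical stand-in for an OVS bridge client.
type bridgeClient struct{ Name string }

// podController is a simplified, hypothetical model of a controller whose
// bridge client may be swapped at runtime while other goroutines read it.
type podController struct {
	mu     sync.RWMutex
	bridge *bridgeClient
}

// nameOf returns the bridge name, or "" for a nil client.
func nameOf(b *bridgeClient) string {
	if b == nil {
		return ""
	}
	return b.Name
}

// getBridge returns the current client under a read lock.
func (c *podController) getBridge() *bridgeClient {
	c.mu.RLock()
	defer c.mu.RUnlock()
	return c.bridge
}

// updateBridge swaps the client under the write lock and reports whether the
// bridge name actually changed; an unchanged name leaves the client in place.
func (c *podController) updateBridge(next *bridgeClient) bool {
	c.mu.Lock()
	defer c.mu.Unlock()
	if nameOf(c.bridge) == nameOf(next) {
		return false
	}
	c.bridge = next
	return true
}

func main() {
	c := &podController{bridge: &bridgeClient{Name: "br-secondary"}}

	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			_ = c.getBridge() // concurrent readers see either the old or the new client
		}()
	}

	fmt.Println(c.updateBridge(&bridgeClient{Name: "br-anc"})) // prints: true
	wg.Wait()
	fmt.Println(c.getBridge().Name) // prints: br-anc
}
```

Returning whether the name changed lets a caller skip reconciliation work when a new snapshot resolves to the same bridge.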
Force-pushed from 7a1a8bc to 73db071
Force-pushed from 73db071 to 94815d4


Note
High Risk
Introduces a new beta feature with CRD changes, controller logic, and integration with secondary networks. The module path change to v2 affects the entire codebase. Changes to core agent initialization and secondary network bridge configuration could impact existing deployments.
Overview
Adds the AntreaNodeConfig CRD (beta feature, enabled by default) to enable per-Node configuration of Antrea agent settings via nodeSelector-based policies. This allows cluster administrators to apply different secondary network OVS bridge configurations to different Node pools.
Key changes:
- AntreaNodeConfig CRD with nodeSelector matching and secondary network bridge configuration (bridge name, physical interfaces, VLAN filtering, multicast snooping).
- A controller in pkg/agent/antreanodeconfig/ that watches AntreaNodeConfig resources and publishes immutable snapshots to subscribers (e.g., secondary network controller) when the effective configuration for a Node changes.
- Module path change to v2 (antrea.io/antrea/v2) across all imports for the major version release.

Reviewed by Cursor Bugbot for commit a8f0913. Bugbot is set up for automated code reviews on this repo.