Negotiating ECE #54

Closed
chenBright wants to merge 1 commit into master from ece

chenBright (Owner) commented Jun 14, 2026

What problem does this PR solve?

Issue Number: resolve

Problem Summary:

The RDMA endpoint already declares the intent to do ECE (Enhanced Connection Establishment) negotiation (BringUpQp is gated by FLAGS_rdma_ece), but the existing implementation never exchanges ECE capabilities between the two peers.

What is changed and the side effects?

Changed:

This PR implements an end-to-end ECE negotiation in the v3 handshake.

  • Client (RdmaHandshakeClientV3::SendLocalHello): query local ECE, store it in _outgoing_ece, advertise it in the v3 hello.
  • Server (BringUpQp): apply the client's ECE during INIT->RTR (ibv_set_ece). After the QP reaches RTS, call ibv_query_ece to obtain the reduced/negotiated ECE (the subset both peers support) and store it in _outgoing_ece.
  • Server (SendLocalHello / FillLocalRdmaHello): advertise the negotiated ECE in the reply hello.
  • Client (BringUpQp): apply the server's reduced ECE during INIT->RTR.
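
The four steps above can be sketched as a small hardware-free simulation. This is only an illustration, not the PR's code: `Ece` mirrors the three fields of libibverbs' `struct ibv_ece`, while `FakeQp`, `NegotiateEce`, and the reduction rule (same vendor: bitwise AND of the option bits; different vendor: no options) are assumptions made for the example. On real hardware the reduction is performed by the provider behind `ibv_set_ece` / `ibv_query_ece`.

```cpp
#include <cassert>
#include <cstdint>

// Mirrors the fields of libibverbs' struct ibv_ece.
struct Ece {
    uint32_t vendor_id;
    uint32_t options;
    uint32_t comp_mask;
};

// Hypothetical stand-in for a queue pair; no RDMA hardware is involved.
class FakeQp {
public:
    explicit FakeQp(Ece local) : _local(local), _current(local) {}

    // Stand-in for ibv_query_ece: report the ECE currently in effect.
    Ece QueryEce() const { return _current; }

    // Stand-in for ibv_set_ece during INIT->RTR.
    // ASSUMPTION: the provider keeps only options both sides support,
    // and drops all options when the vendors differ.
    void SetEce(const Ece& peer) {
        if (peer.vendor_id != _local.vendor_id) {
            _current.options = 0;
        } else {
            _current.options = _local.options & peer.options;
        }
    }

private:
    Ece _local;    // what this side supports
    Ece _current;  // what is in effect after negotiation
};

struct Negotiated {
    Ece client;
    Ece server;
};

// Hypothetical driver that follows the order of the v3 handshake.
Negotiated NegotiateEce(FakeQp& client_qp, FakeQp& server_qp) {
    // Client: query local ECE and advertise it in the hello.
    Ece client_hello = client_qp.QueryEce();

    // Server: apply the client's ECE, then query the reduced ECE
    // and send it back in the reply hello.
    server_qp.SetEce(client_hello);
    Ece reply_hello = server_qp.QueryEce();

    // Client: apply the server's reduced ECE.
    client_qp.SetEce(reply_hello);

    Negotiated result = {client_qp.QueryEce(), server_qp.QueryEce()};
    // Under the assumed reduction rule both sides converge on one option set.
    assert(result.client.options == result.server.options);
    return result;
}
```

With a client supporting options `0x7` and a server supporting `0x5` under the same vendor ID, both sides end up with `0x5`; no extra round trip is needed because the ECE values ride on the existing hello and reply.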

Side effects:

  • Performance effects:

  • Breaking backward compatibility:


Check List:

Copilot AI left a comment

Pull request overview

This PR adds end-to-end Enhanced Connection Establishment (ECE) negotiation support to the RDMA v3 (“RDM3”) handshake by extending the protobuf hello message, soft-loading the necessary libibverbs APIs, and adding unit tests that validate the degrade-safe wire behavior under UT/no-hardware mode.

Changes:

  • Extend the v3 handshake protobuf (RdmaHello) to optionally carry an ECE capability block.
  • Soft-load ibv_query_ece / ibv_set_ece so RDMA init can succeed on older libibverbs (ECE disabled when unavailable).
  • Add UTs that ensure v3 hello round-trips with/without ECE and that server replies omit ECE in degrade cases.
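
Soft-loading the two ECE entry points might look roughly like the sketch below. It is not the code in `rdma_helper.cpp`: `EceSymbols`, `SymbolResolver`, and `LoadEceSymbols` are hypothetical names, and the resolver is injected so the example runs without libibverbs. In a real build the resolver would presumably wrap `dlsym` on the already-opened libibverbs handle. The point illustrated is that a missing symbol disables ECE instead of failing RDMA initialization.

```cpp
#include <cassert>
#include <functional>
#include <string>

// Opaque forward declarations; the real types come from <infiniband/verbs.h>.
struct ibv_qp;
struct ibv_ece;

using QueryEceFn = int (*)(ibv_qp*, ibv_ece*);
using SetEceFn = int (*)(ibv_qp*, ibv_ece*);

// Hypothetical holder for the optional symbols.
struct EceSymbols {
    QueryEceFn query = nullptr;
    SetEceFn set = nullptr;
    bool available() const { return query != nullptr && set != nullptr; }
};

// Hypothetical resolver: returns the symbol address, or nullptr if absent.
using SymbolResolver = std::function<void*(const char*)>;

EceSymbols LoadEceSymbols(const SymbolResolver& resolve) {
    EceSymbols syms;
    syms.query = reinterpret_cast<QueryEceFn>(resolve("ibv_query_ece"));
    syms.set = reinterpret_cast<SetEceFn>(resolve("ibv_set_ece"));
    if (!syms.available()) {
        // Older libibverbs: treat ECE as unsupported, but do not fail init.
        syms = EceSymbols{};
    }
    return syms;
}
```

Callers would check `available()` before attempting any ECE query, and skip the ECE field in the hello when it returns false.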

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| test/brpc_rdma_unittest.cpp | Adds UT coverage for v3 hello ECE field presence/absence and degrade-safe server replies. |
| src/brpc/rdma/rdma_helper.cpp | Makes ECE-related ibverbs symbols optional at runtime via soft-loading. |
| src/brpc/rdma/rdma_handshake.proto | Adds optional RdmaEce message to the v3 hello protobuf schema. |
| src/brpc/rdma/rdma_handshake.h | Extends parsed handshake state to carry optional ibv_ece for v3 peers. |
| src/brpc/rdma/rdma_handshake.cpp | Implements client-side ECE capability query + hello advertisement and v3 hello encode/decode for ECE. |
| src/brpc/rdma/rdma_endpoint.h | Adds per-endpoint storage for the next outgoing ECE payload to advertise. |
| src/brpc/rdma/rdma_endpoint.cpp | Updates QP bring-up signature and adds server-side negotiated ECE query for reply hello. |
| .github/workflows/ci-linux.yml | Runs the full Bazel test suite for the RDMA-configured job (removes prior filter). |
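
The optional ECE block in the hello and the "omit ECE when degraded" behavior of the server reply can be pictured with the following sketch. It uses `std::optional` in place of the generated protobuf classes, and `HelloV3`, `BuildReplyHello`, and the exact set of degrade conditions (ECE unavailable locally, client did not advertise ECE, or no negotiated value could be obtained) are assumptions for illustration rather than the PR's actual logic.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Mirrors the fields of libibverbs' struct ibv_ece.
struct Ece {
    uint32_t vendor_id;
    uint32_t options;
    uint32_t comp_mask;
};

// Hypothetical in-memory view of the v3 hello; the ECE block is optional,
// so peers that do not know about it simply never set or read it.
struct HelloV3 {
    std::optional<Ece> ece;
};

// Hypothetical helper: build the server's reply hello.
// ASSUMPTION: ECE is advertised only when every precondition holds;
// otherwise the field is left unset and the connection proceeds without ECE.
HelloV3 BuildReplyHello(bool local_ece_available,
                        const HelloV3& client_hello,
                        const std::optional<Ece>& negotiated) {
    HelloV3 reply;
    if (local_ece_available && client_hello.ece.has_value() &&
        negotiated.has_value()) {
        reply.ece = negotiated;
    }
    return reply;
}
```

Keeping the field optional means a hello without ECE is still a valid v3 hello, which is what lets the unit tests exercise both paths without RDMA hardware.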

Comment thread src/brpc/rdma/rdma_endpoint.cpp
Comment thread src/brpc/rdma/rdma_helper.cpp
Comment thread test/brpc_rdma_unittest.cpp
Comment thread test/brpc_rdma_unittest.cpp Outdated
Previously BringUpQp only did a local ibv_query_ece + ibv_set_ece
roundtrip and never exchanged ECE capabilities with the peer.
This patch wires up the standard requestor/responder ECE negotiation
flow on top of the existing v3 handshake without adding any extra
round trip:

1. Client queries local ECE, advertises it in its v3 hello.

2. Server applies the client's ECE in INIT->RTR (set_ece), then
   after RTS queries the reduced/negotiated ECE and sends it back
   in the reply hello.

3. Client applies the server's reduced ECE in INIT->RTR.