Fix gRPC broadcast latency and tune connection pool #3478
Conversation
👋 Jules, reporting for duty! I'm here to lend a hand with this pull request. When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down. I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job! For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me. New to Jules? Learn more at jules.google/docs. For security, I will only act on instructions from the user who triggered this task.
Deploying vald with Cloudflare Pages

Latest commit: 326a4a2
Status: ✅ Deploy successful!
Preview URL: https://8c7b037b.vald.pages.dev
Branch Preview URL: https://fix-grpc-broadcast-latency-1.vald.pages.dev
[CHATOPS:HELP] ChatOps commands.
Note: Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior in your CodeRabbit settings. Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 Walkthrough

Adds per-address singleflight synchronization to the gRPC client, refactors connection-pool health/refresh into async flows using gRPC connectivity states with optional metrics, increases default pool sizing and keepalive, updates gateway broadcast concurrency, adds benchmarks, expands CI triggers for singleflight, and bumps several Rust deps.

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Client as gRPC Client
    participant SF as singleflight (per-addr)
    participant Pool as Connection Pool
    participant Server as Remote gRPC Server
    Client->>SF: Connect(addr) [group.Do]
    SF->>Pool: ensure/create/upgrade pool
    Pool->>Pool: inspect connectivity state (Ready/Idle/Connecting/TransientFailure/Shutdown)
    alt Idle
        Pool->>Pool: trigger async refresh/dial (errgroup)
        Pool-->>SF: supply connection when ready
    else Ready or Connecting
        Pool-->>SF: return connection
    else TransientFailure
        Pool-->>SF: return last-resort conn and schedule refresh
    else Shutdown
        Pool->>Pool: async dial new conn
    end
    SF-->>Client: connection returned
    Client->>Server: RPCs via chosen connection
    Client->>SF: Disconnect(addr) [group.Do]
    SF->>Pool: serialized teardown/replace
    Pool->>Pool: close/replace slots
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~75 minutes

Possibly related PRs
Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches

🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.
Codecov Report

❌ Patch coverage is

Additional details and impacted files

```diff
@@            Coverage Diff             @@
##             main    #3478      +/-   ##
==========================================
+ Coverage   25.02%   26.31%   +1.28%
==========================================
  Files         544      571      +27
  Lines       43800    50472    +6672
==========================================
+ Hits        10963    13283    +2320
- Misses      32108    36209    +4101
- Partials      729      980     +251
```

☔ View full report in Codecov by Sentry.

🚀 New features to boost your workflow:
Actionable comments posted: 7
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@internal/net/grpc/benchmark_test.go`:
- Around line 49-86: Move the setup of the gRPCClient, population of
client.conns (mockPool entries) and atomic.StoreUint64(&client.clientCount, ...)
so it runs before starting the benchmark timers; then call b.ReportAllocs() and
b.ResetTimer() once before the sub-benchmarks in BenchmarkBroadcast. Inside each
sub-benchmark replace the old for i := 0; i < b.N; i++ loops with the Go 1.24
style for b.Loop() loop, and keep the call to client.RangeConcurrent(...) and
its simulated latency inside that loop. Ensure you reference the
BenchmarkBroadcast function, the gRPCClient instance (client),
client.conns/mockPool population, and the RangeConcurrent call when making these
changes.
- Around line 36-47: The mockPool currently hands out zero-value grpc.ClientConn
structs (see mockPool.Do, mockPool.Get, mockPool.Reconnect) which violates gRPC
construction rules; replace those returns with real, properly-initialized
*grpc.ClientConn instances created by dialing a test server (use net.Listen +
grpc.NewServer or an in-process bufconn as done in pool_bench_test.go), update
the benchmark setup to start and tear down the test gRPC server and dial a
client connection once, and have mockPool methods return that *grpc.ClientConn
instead of &grpc.ClientConn{} so any later method calls on the connection are
safe.
In `@internal/net/grpc/option.go`:
- Line 43: The change increasing the connection pool to
WithConnectionPoolSize(10) can multiply persistent gRPC connections (e.g., 100
agents → 1,000 connections) and risk hitting OS file descriptor and memory
limits; update the configuration and docs to make the pool size safe and
configurable, validate/adjust WithIdleTimeout to reclaim idle connections, and
add a checklist/test to verify target nodes' ulimit -n and pod memory limits for
the expected scale. Specifically, ensure WithConnectionPoolSize is
documented/configurable (or reduced), tune WithIdleTimeout to a shorter reclaim
window, add runtime validation or warnings if computed total FDs may exceed
ulimit, and include guidance in deployment manifests about required pod memory
and ulimit values.
- Around line 43-44: The hardcoded client keepalive in WithKeepaliveParams
currently sets PermitWithoutStream=true which will cause servers (with default
permit_without_stream=false and min_time=10m) to GOAWAY; update the
WithKeepaliveParams usage in internal/net/grpc/option.go to use
PermitWithoutStream=false (or remove/disable the aggressive 30s keepalive) so
client pings won’t be sent without active streams, or alternatively coordinate
server-side config changes before landing; change the WithKeepaliveParams call
accordingly to prevent immediate GOAWAYs from Vald servers.
In `@internal/net/grpc/pool/pool_bench_test.go`:
- Around line 237-246: The benchmark currently runs measurement without
bracketing timers or reporting allocations; update the benchmark surrounding the
b.RunParallel(...) loop to match the other benchmarks by calling b.StopTimer(),
b.ReportAllocs(), b.ResetTimer(), and b.StartTimer() before entering the
b.RunParallel block (so pool setup and connection time are excluded) and call
b.StopTimer() immediately after the parallel loop; modify the same benchmark
that invokes p.Do(...) on *ClientConn to include these calls.
- Around line 212-216: Remove the stale development-commentary block that spans
the scratch notes referencing "pool_bench_test.go" and the
ListenAndServe/pool.New ideas (the comment lines between the existing test setup
and helper usage); simply delete those lines so the test file contains only
meaningful comments and code around the test setup (where ListenAndServe and
pool.New are used) to keep the test tidy.
In `@pkg/gateway/lb/service/gateway.go`:
- Line 91: The tracing span was started on fctx but GetAgentCount was called
with the stale ctx; change the call to use the traced context (call
g.GetAgentCount(fctx)) so the address lookup is attributed to the BroadCast
trace span, and ensure any other calls inside the same return expression use
fctx consistently (e.g., client.RangeConcurrent(fctx, g.GetAgentCount(fctx),
...)).
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
internal/net/grpc/pool/pool.go (2)
348-350: 🧹 Nitpick | 🔵 Trivial

Dead code: `count == 0` is unreachable after the `sz == 0` guard

The early-return at lines 328–330 guarantees `sz > 0` before the loop. Since every iteration either increments `count` (in the `default` branch) or returns early via `ctx.Done()`, `count` is always ≥ 1 when execution reaches line 348. The block can never trigger.

🧹 Proposed cleanup

```diff
 	}
-	if count == 0 {
-		return errors.ErrGRPCPoolConnectionNotFound
-	}
 	return nil
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/net/grpc/pool/pool.go` around lines 348 - 350, Remove the unreachable dead branch that checks `if count == 0 { return errors.ErrGRPCPoolConnectionNotFound }` in the function that iterates over the pool (the loop guarded by `sz == 0` and using `count` and `ctx.Done()`). Because `sz > 0` is ensured before the loop and every loop path either increments `count` or returns on `ctx.Done()`, delete this `count == 0` check to clean up dead code and leave the remaining return paths intact.
287-300: ⚠️ Potential issue | 🟡 Minor

Remove `load()` method — confirmed unused and contains bounds-checking bug

The method is dead code: no active calls exist in the codebase, and its test is commented out. More critically, the bounds check at line 294 is inverted: `if sz < idx` should be `if idx >= sz`. If that condition evaluates true, line 295 executes `slots[sz].Load()` where `sz == len(slots)`, causing an out-of-bounds panic.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/net/grpc/pool/pool.go` around lines 287 - 300, Remove the dead and unsafe method pool.load: it is unused and contains an inverted bounds check that can index slots[len(slots)] causing an OOB panic; delete the load(idx uint64) (ridx uint64, pc *poolConn) implementation from the pool type and any commented tests referencing it, and run tests to confirm nothing else depends on the function name load in type pool (search for method name "load" and type "pool" to ensure no remaining callers).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@internal/net/grpc/pool/pool_bench_test.go`:
- Around line 211-213: Replace the b.Fatal(err) call after p.Connect(ctx) with
b.Error(err) so the benchmark follows the file-wide convention used in
Benchmark_ConnPool, BenchmarkParallel_ConnPool and ListenAndServe; keep the
p.Connect(ctx) error path but use b.Error(err) (and ensure any subsequent code
remains safe if p is nil).
---
Outside diff comments:
In `@internal/net/grpc/pool/pool.go`:
- Around line 348-350: Remove the unreachable dead branch that checks `if count
== 0 { return errors.ErrGRPCPoolConnectionNotFound }` in the function that
iterates over the pool (the loop guarded by `sz == 0` and using `count` and
`ctx.Done()`). Because `sz > 0` is ensured before the loop and every loop path
either increments `count` or returns on `ctx.Done()`, delete this `count == 0`
check to clean up dead code and leave the remaining return paths intact.
- Around line 287-300: Remove the dead and unsafe method pool.load: it is unused
and contains an inverted bounds check that can index slots[len(slots)] causing
an OOB panic; delete the load(idx uint64) (ridx uint64, pc *poolConn)
implementation from the pool type and any commented tests referencing it, and
run tests to confirm nothing else depends on the function name load in type pool
(search for method name "load" and type "pool" to ensure no remaining callers).
---
Duplicate comments:
In `@internal/net/grpc/pool/pool_bench_test.go`:
- Around line 216-226: The benchmark BenchmarkPool_HighContention is missing the
standard measurement bracketing and is including setup latency and allocations;
before entering the parallel loop around p.Do(...) call stop the timer and call
b.ReportAllocs() then call b.StartTimer(), and after the parallel RunParallel
returns call b.StopTimer(); ensure you place these calls in the same pattern
used by the other benchmarks in this file so the allocation and setup overhead
are excluded from the measured loop.
Actionable comments posted: 5
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@internal/net/grpc/e2e_bench_test.go`:
- Around line 144-146: The code currently logs a non-fatal warning when
connectivity is partial (the time.Since(start) > 5*time.Second branch) which
allows the benchmark to continue with fewer than numAgents; change this to fail
the benchmark immediately by replacing the warning log with a fatal failure (use
b.Fatalf or equivalent) so the benchmark halts when connected < numAgents after
the timeout; update the branch that references start, connected, and numAgents
in e2e_bench_test.go to call b.Fatalf with a clear message including connected
and numAgents.
- Around line 88-101: startBenchServer currently calls b.Fatalf on error which
panics when invoked from goroutines; change startBenchServer to return (addr
string, td func(), err error) instead of calling b.Fatalf, update the goroutine
to capture the returned err into a shared errors slice (protected by setupMu)
along with addrs and teardowns, and after setupWg.Wait() check the collected
errors and call b.Fatalf (or b.Fatalf with combined error info) from the
benchmark goroutine if any error occurred; update any callers of
startBenchServer to handle the new error return.
- Line 178: The test currently calls client.Close(ctx) at the end (client.Close)
which leaks connections if a sub-benchmark panics; after creating the client
(the variable named client in e2e_bench_test.go) immediately call defer
client.Close(ctx) so the connection is always cleaned up on panic or early
return, and remove the standalone client.Close(ctx) call at line 178.
- Around line 59-72: The error channel ech created for srv.ListenAndServe is
never read so startup failures are ignored; change the logic to drain/check ech
to detect bind/startup errors by launching a background goroutine that reads
from ech and records any error (or use a non-blocking select/read after the
sleep) and fail the test/benchmark if an error is received; also replace the
fragile time.Sleep with a readiness check or by waiting for either a successful
"started" signal or an error from ech (use the existing ctx/cancel and wg to
coordinate shutdown).
- Around line 83-84: The benchmark silently swallows port conflicts because it
uses hardcoded ports and discards srv.ListenAndServe() errors; fix by adding a
WithListener(server.New option) to accept a pre-bound net.Listener and a method
like BoundAddr()/Addr() on the server to retrieve the actual bound address, then
in the test do l, _ := net.Listen("tcp", ":0"), pass l via WithListener to
server.New, obtain the bound address via BoundAddr()/Addr() and use that for
agents instead of fixed ports; additionally capture srv.ListenAndServe() errors
by sending them on a startup error channel and fail fast (or log and abort) if
no successful start within the poll timeout instead of ignoring the error, or
alternatively use IsConnected() polling against the server address to ensure all
agents are reachable before proceeding.
7e7dd4a to 6fd320d (Compare)
Actionable comments posted: 7
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@internal/net/grpc/benchmark_test.go`:
- Around line 66-77: startBroadcastServer currently launches srv.ListenAndServe
in a goroutine and returns immediately, causing the benchmark to busy-spin
retrying connections; change startBroadcastServer to provide a readiness signal
and have the caller wait on it (e.g., reuse the existing ech channel: have
ListenAndServe send a ready/started value or close a dedicated ready channel
once listening) or use a sync.WaitGroup to block until the server is accepting
connections before returning; alternatively, add a small sleep/backoff in the
retry loop that uses addr to avoid tight CPU spinning—update references to
ListenAndServe, ech, startBroadcastServer, and the caller's retry loop to use
the chosen readiness mechanism.
- Around line 44-77: The code has a TOCTOU race: you call net.Listen to get a
free port (l), close it, then create a server that re-binds later (risking
collisions). Modify startBroadcastServer to avoid closing the pre-bound
listener: either (1) change server.New / server.ListenAndServe to accept a
net.Listener (pass l through a new option like server.WithListener or a
constructor overload) and register gRPC on that listener inside
WithGRPCRegisterar, or (2) use an in-memory bufconn listener for the test
(bufconn.Listen) and pass that into the server, or if server supports it use
server.WithHost("localhost") + server.WithPort(0) so the server itself binds an
ephemeral port; if none are possible add a short retry loop when binding. Update
code references: net.Listen (l), server.New, server.WithGRPCRegisterar,
srv.ListenAndServe, and srv.Shutdown to operate on the pre-bound listener
instead of closing l then rebinding.
- Around line 114-127: The retry loop that calls client.Connect in the goroutine
spins without any backoff and uses the standard sync.WaitGroup; change the
goroutine in the benchmark to (1) replace the stdlib WaitGroup with Vald's
internal/sync WaitGroup type (from package "internal/sync") used throughout the
repo, and (2) add a small backoff/sleep between Connect attempts inside the for
i := 0; i < 10; i++ loop (e.g., time.Sleep with a short duration or an
exponential/backoff strategy) so Connect is not hammered; keep the
atomic.AddInt64(&connectedCount, 1) and return behavior unchanged and ensure the
new WaitGroup variable (wg) is the internal/sync version so wg.Done() and
wg.Wait() continue to work.
In `@internal/net/grpc/pool/pool.go`:
- Around line 587-599: Remove the development scratch comments inside the dial
logic: delete the stray internal reasoning lines between the healthy check and
the state handling (the comments mentioning "But NewClient returns Idle
usually..." and "isHealthy returns true for Idle..."). Keep the conditional
behavior intact (use p.isHealthy(idx, conn), inspect conn.GetState() and the
connectivity.TransientFailure branch) but tidy the block so only concise,
production-ready comments remain or no comment at all.
- Around line 655-658: The Debugf call in the connectivity.TransientFailure case
of pool.go has four format verbs but only three arguments (pc.addr, idx+1,
state.String()); add the missing pool-size argument to match the third %d (e.g.,
use the pool size field or computed length such as pc.poolSize or len(pc.conns))
and ensure arguments are ordered as (pc.addr, idx+1, <pool-size>,
state.String()) so the format string and arguments align.
- Around line 665-681: The second-pass loop currently calls p.refreshConn
synchronously (which calls p.dial / grpc.NewClient) and can block request
latency; change the logic in the loop that iterates over slots (where load, pc,
pc.conn and connectivity states are inspected) to not synchronously call
refreshConn but instead schedule the refresh asynchronously via
p.errGroup.Go(func(){ _ = p.refreshConn(ctx, idx, pc, p.addr) }) and immediately
continue/return an error to the caller when no usable connection was found on
the first pass; keep the existing behavior of returning a TransientFailure pc as
a last resort, but ensure refresh attempts are backgrounded on p.errGroup and
the caller is not blocked on network dial timeouts.
- Around line 897-901: The review notes that treating
connectivity.TransientFailure as "healthy" is intentional but non-obvious;
update the code to document this design decision: add a clear comment near the
IsHealthy implementation (and the getHealthyConn function) explaining that
TransientFailure is considered healthy to allow gRPC's exponential backoff and
internal retries to recover, that getHealthyConn performs a two-pass selection
(preferring Ready/Idle/Connecting and only returning TransientFailure as a last
resort), and that Reconnect/Connect logic intentionally skips hard reconnects
for TransientFailure—reference the IsHealthy, getHealthyConn, Reconnect, and
Connect symbols so future readers understand the semantic shift from
"immediately usable" to "not permanently dead."
---
Duplicate comments:
In `@internal/net/grpc/benchmark_test.go`:
- Around line 132-141: The benchmark currently swallows errors and uses a manual
for-loop; update the loop to use b.Loop() (as in pool_bench_test.go) and surface
failures instead of ignoring them: inside the RangeConcurrent call handler (and
after it) replace the commented-out b.Fatalf with either b.Fatalf to fail loudly
on error or b.Error to record the failure while continuing, and ensure the outer
loop uses for b.Loop() rather than for i := 0; i < b.N; i++ so the benchmark
reports meaningful results; reference RangeConcurrent, b.Fatalf/b.Error, and
b.Loop when making the changes.
In `@internal/net/grpc/option.go`:
- Around line 43-44: The client keepalive settings used with WithKeepaliveParams
(ping interval "30s" and PermitWithoutStream true) violate the Vald server
enforcement (min_time: 10m, permit_without_stream: false) causing GOAWAYs;
update the grpc client options used in option.go (e.g., the call site
constructing WithKeepaliveParams and WithConnectionPoolSize) to set
PermitWithoutStream to false and increase the ping interval to at least the
server min_time (e.g., >=10m), or alternatively coordinate and document that
server enforcement vars (min_time, permit_without_stream) will be changed across
Vald before landing this PR.
In `@internal/net/grpc/pool/pool_bench_test.go`:
- Around line 211-246: Remove the stale scratch comments at the start of
BenchmarkPool_HighContention (they add no value), and add benchmark hygiene:
call b.ReportAllocs() before starting measurements, wrap setup in b.StopTimer()
and b.StartTimer() so pool creation/Connect (New, Connect, ListenAndServe) are
excluded from timing, then ensure you call b.StopTimer() after the b.RunParallel
measurement completes; keep the test logic the same.
In `@pkg/gateway/lb/service/gateway.go`:
- Line 91: Call GetAgentCount using the tracing context fctx (not the original
ctx) so the span started on fctx is preserved; change the invocation in
gateway.go from g.GetAgentCount(ctx) to g.GetAgentCount(fctx) so RangeConcurrent
and GetAgentCount share the same trace context and maintain consistent trace
attribution.
4a0c245 to 9a7f31a (Compare)
@coderabbitai full review

✅ Actions performed: Full review triggered.
9a7f31a to c1f0a9f (Compare)
Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@internal/net/grpc/benchmark_test.go`:
- Around line 132-140: The benchmark currently ignores errors returned by
client.RangeConcurrent which can hide RPC failures; update the benchmark loop in
benchmark_test.go to check the err from client.RangeConcurrent and call b.Fatalf
(or b.Error/b.FailNow) with a descriptive message including err to fail the
benchmark on RPC errors; also ensure the anonymous function passed to
RangeConcurrent returns any inner RPC error (from c.Nodes) so the outer err
reflects failures from the inner call.
- Line 142: Add the required unit-test delimiter by inserting the comment line
"// NOT IMPLEMENTED BELOW" immediately after the b.StopTimer() call in
benchmark_test.go (the block containing the b.StopTimer() statement); this will
separate implemented tests from unimplemented ones per repo guidelines.
In `@internal/net/grpc/client.go`:
- Around line 1047-1098: When swapping in a newly created pool in the code that
calls pool.New/conn.Connect, capture any existing connection from
g.conns.Load(addr) before replacing it; after the new conn is connected and
health-checked, if an oldConn existed call g.Disconnect(ctx, addr) (join any
disconnect error into the returned error or log it) and do not increment
g.clientCount; only atomic.AddUint64(&g.clientCount, 1) when there was no
previous connection for that addr. Ensure g.conns.Store(addr, conn) happens
after the successful swap so the new conn replaces the old one and avoid leaking
the old pool.
In `@internal/net/grpc/pool/pool_bench_test.go`:
- Line 247: Add the required placeholder by inserting a single line containing
"// NOT IMPLEMENTED BELOW" on its own line in pool_bench_test.go immediately
before the file end (i.e., after the last implemented test / before the final
closing brace or trailing newline) so the test file includes the delimiter
separating implemented tests from unimplemented ones.
---
Duplicate comments:
In `@internal/net/grpc/benchmark_test.go`:
- Around line 43-50: The test suffers a TOCTOU race by listening on
"localhost:0", closing the listener, then re-binding later in
startBroadcastServer; change startBroadcastServer to accept a pre-bound
net.Listener (or create and use a bufconn listener) so the server uses the same
listener instead of re-binding, or modify startBroadcastServer to let the server
bind to port 0 itself and retrieve the actual port from the server's listener;
update call sites that currently expect a string port to instead provide the
listener (or adapt to the server-returned address) and keep references to
startBroadcastServer and any errgroup usage consistent.
- Around line 110-126: The retry loop for connecting to agents currently spins
tightly and uses the stdlib sync.WaitGroup; replace this with Vald's internal
sync utilities (e.g., internal/sync or internal/sync/errgroup) and add a short
backoff between retries to avoid busy looping: in the goroutine that calls
client.Connect(ctx, a) for each addr in addrs, use the internal errgroup or
waitgroup equivalent instead of wg, and on each failed attempt sleep for a small
duration (e.g., a few milliseconds) before retrying; ensure connectedCount
(atomic.AddInt64) is still incremented on success and that the new sync helper
is used to wait for all goroutines to finish.
In `@internal/net/grpc/option.go`:
- Around line 42-44: The default keepalive options currently set in
defaultOptions use WithKeepaliveParams("30s", "10s", true) which enables
PermitWithoutStream and can trigger server GOAWAY when servers enforce
min_time/permit_without_stream; change the default to disable
PermitWithoutStream (set the third argument to false) or replace the default
WithKeepaliveParams call with a safer value (e.g.,
WithKeepaliveParams("30s","10s", false)) so PermitWithoutStream is not enabled
by default; update any related comments and ensure WithConnectionPoolSize(10)
remains unchanged.
In `@internal/net/grpc/pool/pool_bench_test.go`:
- Around line 211-216: Remove the development scratch comments inside the
BenchmarkPool_HighContention function: delete the multi-line notes about mock
server, dialer, and ListenAndServe so the benchmark contains only the actual
test setup and benchmark code; keep the function signature
BenchmarkPool_HighContention and any real setup/helper calls, but remove all
commented-out planning text to clean up the test.
- Around line 237-246: The benchmark in pool_bench_test.go is measuring setup
time/allocs because timers aren't bracketed: wrap the setup work with
b.StopTimer()/b.StartTimer() like the other benchmarks so only the p.Do loop is
measured; specifically call b.StopTimer() before any setup done prior to
b.RunParallel (if any), perform setup, then call b.StartTimer() immediately
before b.RunParallel that invokes p.Do and ClientConn, ensuring the
b.ResetTimer() remains where appropriate to clear prior measurements.
In `@internal/net/grpc/pool/pool.go`:
- Around line 587-599: Remove the informal "development scratch" comments inside
the dial logic around p.isHealthy, NewClient and the state handling (the
multi-line notes about Idle, TransientFailure, Shutdown, and "we might want to
keep it"); replace them with a short, descriptive comment (or none) that clearly
states intent—for example: "If connection is in TransientFailure, retain and let
it retry; otherwise close." Ensure references to conn.GetState(),
connectivity.TransientFailure and p.isHealthy(idx, conn) remain untouched
functionally, only the explanatory comments are cleaned up.
/format

[FORMAT] Updating license headers and formatting go codes triggered by kmrmt.
b1a485a to 59b91f6 (Compare)
@copilot Can you fix like below?

Great suggestion! Adding a reproducible CI container (custom runner image) ensures your build environment is consistent and includes all required tools (like gzip/pigz), reducing platform-related errors. Here's how you can do it for your workflow:

1. Create a Dockerfile for your CI container

Place this file in your repository:

```dockerfile
FROM ubuntu:22.04

# Install required build tools
RUN apt-get update && \
    apt-get install -y \
    build-essential \
    git \
    pigz \
    curl \
    ca-certificates \
    docker.io \
    # add any additional tools you need
    # Clean up
    && apt-get clean && rm -rf /var/lib/apt/lists/*
```

Adapt the list based on what your build needs (e.g., other language runtimes/tools).

2. Update your GitHub Actions workflow

```yaml
jobs:
  build-agent-ngt:
    runs-on: ubuntu-latest
    container:
      image: ghcr.io/vdaas/vald/ci-agent:latest
      options: --privileged # If you need nested Docker
    steps:
      - uses: actions/checkout@v4
      - name: Prune Docker Build Cache
        run: docker builder prune --force
      - name: Build Agent NGT Image
        run: docker build --no-cache -t agent-ngt .
```

3. Build and push your CI container to GitHub Container Registry

4. Fixing the build errors

With this dedicated CI container, you ensure tools like pigz, gzip, docker, etc. are always available and up-to-date. If you need specific versions or additional tools, just add them to your Dockerfile.

Summary:

Let me know if you want a PR to add these files and workflow updates!
6582777 to a2a25af (Compare)
a2a25af to f5bc302 (Compare)
/format

[FORMAT] Updating license headers and formatting go codes triggered by kpango.
Pull request overview
This PR updates CI/dependency/tooling versions and introduces several gRPC connection-pool and server shutdown changes aimed at reducing broadcast latency, improving pool behavior, and adding optional pool metrics/config knobs.
Changes:
- Bump pinned GitHub Actions/tool versions and environment version references (Go/Rust/Docker/K8s/Helm/etc.).
- Update gRPC client/pool behavior (pool sizing defaults, keepalive defaults, async refresh, metrics toggle) and improve server shutdown to close listeners.
- Add/extend tests and benchmarks (gRPC pool/getHealthyConn, broadcast benchmarks, cgroup stats determinism) and introduce Rust metaconfig loading/CLI args.
Reviewed changes
Copilot reviewed 140 out of 148 changed files in this pull request and generated no comments.
Show a summary per file
| File | Description |
|---|---|
| versions/actions/SOFTPROPS_ACTION_GH_RELEASE | Bump pinned GitHub Action version. |
| versions/actions/SHOGO82148_ACTIONS_UPLOAD_RELEASE_ASSET | Bump pinned GitHub Action version. |
| versions/actions/GITHUB_ISSUE_METRICS | Bump pinned GitHub Action version. |
| versions/actions/GITHUB_CODEQL_ACTION_UPLOAD_SARIF | Bump pinned GitHub Action version. |
| versions/actions/GITHUB_CODEQL_ACTION_INIT | Bump pinned GitHub Action version. |
| versions/actions/GITHUB_CODEQL_ACTION_AUTOBUILD | Bump pinned GitHub Action version. |
| versions/actions/GITHUB_CODEQL_ACTION_ANALYZE | Bump pinned GitHub Action version. |
| versions/actions/DOCKER_SETUP_BUILDX_ACTION | Bump pinned GitHub Action version. |
| versions/actions/DOCKER_LOGIN_ACTION | Bump pinned GitHub Action version. |
| versions/actions/CRAZY_MAX_GHACTION_IMPORT_GPG | Bump pinned GitHub Action version. |
| versions/actions/CODECOV_CODECOV_ACTION | Bump pinned GitHub Action version. |
| versions/actions/AZURE_SETUP_HELM | Bump pinned GitHub Action version. |
| versions/actions/ACTIONS_UPLOAD_ARTIFACT | Bump pinned GitHub Action version. |
| versions/actions/ACTIONS_SETUP_GO | Bump pinned GitHub Action version. |
| versions/actions/ACTIONS_DOWNLOAD_ARTIFACT | Bump pinned GitHub Action version. |
| versions/actions/ACTIONS_CACHE | Bump pinned GitHub Action version. |
| versions/ZLIB_VERSION | Update bundled tool/library version pin. |
| versions/YQ_VERSION | Update tool version pin. |
| versions/TELEPRESENCE_VERSION | Update tool version pin. |
| versions/RUST_VERSION | Update Rust version pin. |
| versions/PROTOBUF_VERSION | Update tool version pin. |
| versions/PROMETHEUS_STACK_VERSION | Update Helm chart version pin. |
| versions/NGT_VERSION | Update NGT version pin. |
| versions/KUBELINTER_VERSION | Update tool version pin. |
| versions/KUBECTL_VERSION | Update tool version pin. |
| versions/K3S_VERSION | Update tool version pin. |
| versions/HELM_VERSION | Update tool version pin. |
| versions/HDF5_VERSION | Update library version pin. |
| versions/GO_VERSION | Update Go version pin. |
| versions/GOLANGCILINT_VERSION | Update linter version pin. |
| versions/FAISS_VERSION | Update Faiss version pin. |
| versions/DOCKER_VERSION | Update Docker version pin. |
| versions/CMAKE_VERSION | Update CMake version pin. |
| versions/CHAOS_MESH_VERSION | Update Chaos Mesh version pin. |
| versions/BUF_VERSION | Update Buf version pin. |
| rust/rust-toolchain.toml | Bump Rust toolchain channel. |
| rust/libs/vqueue/Cargo.toml | Bump Rust dependency versions. |
| rust/libs/observability/Cargo.toml | Bump Rust dependency versions. |
| rust/libs/kvs/Cargo.toml | Bump Rust dependency versions. |
| rust/libs/algorithms/qbg/build.rs | Add ThinLTO flag to C++ build step. |
| rust/libs/algorithms/ngt/build.rs | Add ThinLTO flag to C++ build step. |
| rust/bin/meta/src/main.rs | Add CLI args + config loading/validation for meta binary. |
| rust/bin/meta/src/config.rs | New YAML config + validation + observability conversion. |
| rust/bin/meta/Cargo.toml | Add config/CLI deps and dev-deps for tests. |
| rust/bin/agent/Cargo.toml | Bump Rust dependency versions. |
| rust/Cargo.toml | Enable ThinLTO + codegen-units tuning for release builds. |
| pkg/agent/core/ngt/handler/grpc/insert.go | Prefer Vector.Timestamp over Config.Timestamp for inserts. |
| pkg/agent/core/ngt/handler/grpc/index_test.go | Update tests to account for timestamp migration. |
| k8s/operator/helm/crds/valdrelease.yaml | Add enable_metrics schema field for connection pool. |
| k8s/manager/index/deployment.yaml | Update config checksum annotation. |
| k8s/manager/index/configmap.yaml | Add enable_metrics to connection_pool config. |
| k8s/index/job/save/configmap.yaml | Add enable_metrics to connection_pool config. |
| k8s/index/job/deletion/configmap.yaml | Add enable_metrics to connection_pool config. |
| k8s/index/job/creation/configmap.yaml | Add enable_metrics to connection_pool config. |
| k8s/index/job/correction/configmap.yaml | Add enable_metrics to connection_pool config. |
| k8s/gateway/gateway/mirror/deployment.yaml | Update config checksum annotation. |
| k8s/gateway/gateway/mirror/configmap.yaml | Add enable_metrics to connection_pool config. |
| k8s/gateway/gateway/lb/deployment.yaml | Update config checksum annotation. |
| k8s/gateway/gateway/lb/configmap.yaml | Add enable_metrics to connection_pool config. |
| internal/servers/server/server.go | Track listener and close it during Shutdown. |
| internal/net/grpc/stats/stats_test.go | Make cgroup tests deterministic via temp dirs; avoid parallel. |
| internal/net/grpc/stats/stats.go | Add proc path override + better cgroup v2 path detection. |
| internal/net/grpc/server_test.go | Adjust comparison options for unexported fields. |
| internal/net/grpc/pool/pool_bench_test.go | Add high-contention pool benchmark. |
| internal/net/grpc/pool/pool_bench_gethealthy_test.go | Add getHealthyConn benchmarks (serial + parallel). |
| internal/net/grpc/pool/pool.go | Rework pool selection/refresh/health + add metrics toggle. |
| internal/net/grpc/pool/option.go | Add pool option to enable/disable metrics. |
| internal/net/grpc/option.go | Change default pool size and add default keepalive params + metrics option. |
| internal/net/grpc/errdetails/errdetails_test.go | Add comprehensive tests for detail decoding/serialization. |
| internal/net/grpc/client.go | Add singleflight around connect/disconnect + pool metrics wiring. |
| internal/net/grpc/benchmark_test.go | New benchmarks for broadcast and RPC execution paths. |
| internal/core/algorithm/ngt/doc.go | Update NGT upstream link. |
| internal/core/algorithm/faiss/Capi.cpp | Add missing include (sstream). |
| internal/config/grpc_test.go | Update expected options due to new connection-pool metrics option. |
| internal/config/grpc.go | Add connection_pool.enable_metrics config and wire to grpc options. |
| hack/helm/schema/crd/main.go | Optimize indent function string building. |
| hack/go.mod.default | Update Go directive version. |
| hack/docker/gen/main.go | Add clang+lld LTO path and adjust deps/packages. |
| hack/benchmark/internal/core/algorithm/ngt/ngt.go | Update NGT search call signature. |
| example/client/go.sum | Update dependency checksums. |
| example/client/go.mod.default | Update Go directive version. |
| example/client/go.mod | Update Go directive + dependency versions. |
| docs/user-guides/configuration.md | Update NGT upstream links. |
| docs/usecase/usage-example.md | Update NGT upstream links. |
| docs/tutorial/vald-agent-standalone-on-k8s.md | Update NGT upstream links. |
| docs/performance/tuning-search-performance.md | Update NGT upstream links in tables. |
| docs/overview/component/agent.md | Update NGT upstream links. |
| docs/overview/component/README.md | Update NGT upstream links. |
| docs/overview/architecture.md | Update NGT upstream links. |
| docs/overview/about-vald.md | Update NGT upstream links. |
| dockers/dev/Dockerfile | Swap pkg-config->pkgconf; add pigz. |
| dockers/agent/core/agent/Dockerfile | Add clang+lld, RUSTFLAGS, and ThinLTO build path. |
| cmd/tools/cli/benchmark/core/doc.go | Update NGT upstream link. |
| charts/vald/values.yaml | Add enable_metrics field to connection_pool defaults. |
| charts/vald/values.schema.json | Add schema for enable_metrics and update NGT links. |
| charts/vald-with-tikv/values.yaml | Update NGT links and adjust filename_suffix quoting. |
| charts/vald-helm-operator/crds/valdrelease.yaml | Add enable_metrics schema field for connection pool. |
| README.md | Update NGT upstream link. |
| Makefile.d/tools.mk | Fix rustfmt install dependency ordering. |
| Makefile.d/docker.mk | Build docker images with --network host. |
| Makefile.d/dependencies.mk | Update NGT release lookup repo URL. |
| Makefile | Update NGT repo and add NGT_EXTRA_CMAKE_FLAGS; run go fix in format target. |
| .gitignore | Ignore generated /gen directory. |
| .github/workflows/update-pull-request-and-issue-template.yaml | Bump action versions. |
| .github/workflows/update-deps.yaml | Bump action versions. |
| .github/workflows/reviewdog-markdown.yaml | Exclude an additional README from reviewdog patterns. |
| .github/workflows/release.yaml | Bump action versions. |
| .github/workflows/issue-metrics.yaml | Bump action versions. |
| .github/workflows/helm.yaml | Bump action versions. |
| .github/workflows/github-actions-workflow-linter.yaml | Bump setup-go action version. |
| .github/workflows/format.yaml | Bump action versions. |
| .github/workflows/e2e-profiling.yaml | Bump artifact/cache actions. |
| .github/workflows/e2e-code-bench-agent.yaml | Bump artifact action version. |
| .github/workflows/dockers-readreplica-rotate-image.yaml | Add singleflight to docker build trigger paths. |
| .github/workflows/dockers-manager-index-image.yaml | Add singleflight to docker build trigger paths. |
| .github/workflows/dockers-index-save-image.yaml | Add singleflight to docker build trigger paths. |
| .github/workflows/dockers-index-operator-image.yaml | Add singleflight to docker build trigger paths. |
| .github/workflows/dockers-index-exportation-image.yaml | Add singleflight to docker build trigger paths. |
| .github/workflows/dockers-index-deletion-image.yaml | Add singleflight to docker build trigger paths. |
| .github/workflows/dockers-index-creation-image.yaml | Add singleflight to docker build trigger paths. |
| .github/workflows/dockers-index-correction-image.yaml | Add singleflight to docker build trigger paths. |
| .github/workflows/dockers-gateway-mirror-image.yaml | Add singleflight to docker build trigger paths. |
| .github/workflows/dockers-gateway-lb-image.yaml | Add singleflight to docker build trigger paths. |
| .github/workflows/dockers-gateway-filter-image.yaml | Add singleflight to docker build trigger paths. |
| .github/workflows/dockers-benchmark-operator-image.yaml | Add singleflight to docker build trigger paths. |
| .github/workflows/dockers-benchmark-job-image.yaml | Add singleflight to docker build trigger paths. |
| .github/workflows/dockers-agent-sidecar-image.yaml | Add singleflight to docker build trigger paths. |
| .github/workflows/dockers-agent-faiss-image.yaml | Add singleflight to docker build trigger paths. |
| .github/workflows/coverage.yaml | Bump codecov action version. |
| .github/workflows/codeql-analysis.yml | Bump codeql actions. |
| .github/workflows/chatops.yaml | Bump gpg import action version. |
| .github/workflows/build-binaries.yaml | Bump artifact/download actions and upload-release-asset action. |
| .github/workflows/backport.yaml | Bump gpg import action version. |
| .github/workflows/_release-pr.yaml | Bump gpg import action version. |
| .github/workflows/_docker-image.yaml | Bump docker login & buildx actions. |
| .github/actions/setup-helm/action.yaml | Bump azure/setup-helm action version. |
| .github/actions/setup-go/action.yaml | Bump actions/setup-go action version. |
| .github/actions/scan-docker-image/action.yaml | Bump codeql upload-sarif action version. |
| .github/actions/e2e-profile/action.yaml | Bump artifact/cache actions. |
| .github/PULL_REQUEST_TEMPLATE.md | Update environment version hints. |
| .github/ISSUE_TEMPLATE/security_issue_report.md | Update environment version hints. |
| .github/ISSUE_TEMPLATE/bug_report.md | Update environment version hints. |
| .gitfiles | Add new benchmark/config paths to file list. |
Comments suppressed due to low confidence (9)
internal/servers/server/server.go:1
- `Shutdown` collects listener close errors into `rerr` but always returns `nil`, which drops real shutdown failures. Return `rerr` (and keep it `nil` when no errors) so callers can detect shutdown problems.
internal/net/grpc/pool/pool.go:1 - `refreshConn` now runs dials asynchronously and explicitly suppresses dial errors (`return nil // do not propagate`), which can make `Connect()` succeed even when all connections failed to dial. Consider tracking dial failures (e.g., atomic failure counter / first error capture) and have `connect()`/`singleTargetConnect()` return an error when no slot becomes usable within the connect call context.
internal/net/grpc/pool/pool.go:1 - `TransientFailure` is treated as healthy, which can cause `IsHealthy()` (and callers relying on it) to report a pool as healthy even when RPCs are likely to fail or be backed off. If you want to keep the conn for retry/backoff, keep it open, but consider reporting `TransientFailure` as unhealthy (or split 'usable' vs 'healthy' semantics).
internal/net/grpc/pool/pool.go:1 - This loop iterates up to `p.poolSize.Load()` but indexes into `slots` with special-casing when `idx >= slen` (falling back to the last slot). If `poolSize` can ever exceed `len(slots)` during resize/update windows, this will repeatedly operate on the last slot and skew health/refresh behavior. Safer approach: compute `limit := min(p.poolSize.Load(), uint64(len(slots)))` (or ensure slots length is always == pool size) and iterate only `limit` elements.
internal/net/grpc/pool/pool_bench_gethealthy_test.go:1 - These benchmarks create and connect a pool but never disconnect/close it. Add a cleanup (e.g., `defer p.Disconnect(ctx)` or equivalent) so repeated benchmark runs don't leak goroutines/connections and skew results.
internal/net/grpc/pool/pool_bench_test.go:1 - BenchmarkPool_HighContention connects a pool but does not disconnect it. Please add cleanup to close connections after the benchmark to avoid leaking resources and contaminating subsequent benchmarks.
Makefile.d/docker.mk:1 - Adding `--network host` to all local Docker builds increases risk (build steps can access host network services) and can reduce build reproducibility. Consider gating this behind an opt-in flag (e.g., `DOCKER_BUILD_NETWORK ?= default` and only add `--network host` when explicitly requested).
hack/helm/schema/crd/main.go:1 - This returns `result.String()` twice and slices using a length computed from a separate `result.String()` call. It's inefficient and can panic when the accumulated string is empty (len == 0). Store the string once in a local variable and only slice when length > 0 (or avoid slicing by writing conditionally).
internal/net/grpc/pool/pool.go:1 - The PR description emphasizes dependency/workflow updates, but this PR also includes substantial behavioral changes to gRPC pooling/client connection logic (async refresh, health semantics, defaults, metrics). Consider updating the PR description to explicitly call out these runtime behavior changes (and any rollout/compatibility notes) so reviewers/operators don’t miss them.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
/format |
|
[FORMAT] Updating license headers and formatting go codes triggered by kpango. |
|
/format |
|
[FORMAT] Updating license headers and formatting go codes triggered by kpango. |
Pull request overview
Copilot reviewed 143 out of 166 changed files in this pull request and generated no new comments.
Comments suppressed due to low confidence (11)
internal/servers/server/server.go:1
`Shutdown` accumulates errors into `rerr` but always returns `nil`. This will hide listener-close failures (and any joined errors). Return `rerr` instead of `nil` so callers can observe shutdown errors.
internal/net/grpc/pool/pool.go:1 - `refreshConn` swallows dial errors by returning `nil` from the goroutine even when `p.dial` fails and there is no healthy existing connection. This can cause `Connect()` to report success while the pool has no usable connections. Consider propagating dial errors (or at least returning an error when all refreshes fail / no healthy conns exist) so callers can react appropriately.
internal/net/grpc/pool/pool.go:1 - `TransientFailure` is treated as healthy here. In gRPC, `TransientFailure` generally indicates the connection is not ready for RPCs and will often immediately fail calls, which can increase observed latency. If the intent is to keep a failing conn around for backoff retries, consider distinguishing 'keep for retry' vs 'healthy for use': return `false` for `TransientFailure` so `Get`/`IsHealthy` can avoid selecting it while still optionally triggering refresh/reconnect logic.
internal/net/grpc/stats/stats.go:1 - `subPath` parsed from `/proc/self/cgroup` typically starts with `/` (e.g. `/system.slice/...`). If `file.Join` follows `filepath.Join` semantics, joining `cgroupBasePath` with an absolute `subPath` will discard `cgroupBasePath` and produce the wrong candidate path. Trim the leading slash (e.g. `strings.TrimPrefix(subPath, "/")`) before joining so the candidate resolves under `cgroupBasePath`.
hack/helm/schema/crd/main.go:1 - This implementation repeatedly calls `indent.String()` and also calls `result.String()` twice on the return line, which creates avoidable allocations. Also, `result.String()[:len(...)-1]` is fragile and harder to read. Prefer computing `indentStr := indent.String()` once, and compute `out := result.String()` once before slicing/returning.
hack/docker/gen/main.go:1 - The `RUSTFLAGS` value here includes literal quotes (`"..."`). When exported into the container environment, those quotes may become part of the variable value and break rustc flag parsing. Store the raw flags without wrapping quotes so the environment variable is exactly `-Clinker=clang -Clink-arg=-fuse-ld=lld`.
Makefile.d/docker.mk:1 - Adding `--network host` reduces build isolation (the Dockerfile build steps can access host services and networks) and can behave differently across environments/runtimes. Consider making this conditional (e.g., behind a Make variable defaulting to off) so users/CI can opt in only when necessary.
Makefile:1 - `go fix` can perform code rewrites (not just formatting) and may introduce unexpected diffs when developers run `make format/go`. Consider moving `go fix` to a separate explicit target (or gating it behind a variable) to keep formatting deterministic and narrowly scoped.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
/format |
|
[FORMAT] Updating license headers and formatting go codes triggered by kpango. |
Signed-off-by: kpango <kpango@vdaas.org>
| | description | how does it affect? | Memo |
| :-- | :-- | :-- | :-- |
| agent.ngt.creation_edge_size | Number of nodes connected to one node<BR>[ref: no_of_edges](https://github.com/NGT-labs/NGT/tree/main/bin/ngt#create) | It helps reduce unreachable edges.<BR>The larger it is, the denser the graph structure will be, but the memory usage, search speed, and index construction time will increase accordingly. | default value: `20` |
| agent.ngt.search_edge_size | Number of nodes to search from the origin node when searching<BR>[ref: no_of_edges_at_search_at](https://github.com/NGT-labs/NGT/tree/main/bin/ngt#create) | The number of nodes to search will increase.<BR>Accuracy will be higher, but speed will be lower.<BR>Adjust if adjusting the radius and epsilon does not improve the situation. | default value: `10` |
[LanguageTool] reported by reviewdog 🐶
Did you mean “lowed”, “lowered”? (MUST_BE_DO[2])
Suggestions: lowed, lowered
Rule: https://community.languagetool.org/rule/show/MUST_BE_DO?lang=en-US&subId=2
Category: GRAMMAR
This pull request primarily updates dependencies and environment versions used in CI/CD workflows, issue templates, and adds new benchmark and configuration files to the codebase. The changes are focused on keeping the project up-to-date with the latest tools, improving test coverage, and ensuring accurate environment information.
Dependency and workflow updates:
Updated `actions/upload-artifact`, `actions/download-artifact`, `shogo82148/actions-upload-release-asset`, `crazy-max/ghaction-import-gpg`, and `github/codeql-action` across multiple workflow files for improved reliability and security. [1] [2] [3] [4] [5] [6] [7] [8] [9] [10] [11]
Environment version updates:
Test and benchmark coverage improvements:
Added `internal/net/grpc/benchmark_test.go` and `internal/net/grpc/pool/pool_bench_gethealthy_test.go` to enhance performance testing of gRPC components. [1] [2]
File and configuration additions:
Added `rust/bin/meta/src/config.rs` to the Rust meta binary, likely for improved configuration management.
Build trigger enhancements:
Included `internal/sync/singleflight/*.go` in the build triggers for various Docker image workflows, ensuring changes to this package are properly tracked and tested. [1] [2] [3] [4] [5] [6] [7] [8] [9] [10] [11] [12] [13] [14] [15] [16]
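Since the PR's headline change is per-address singleflight around connect/disconnect, here is a minimal sketch of the pattern: concurrent callers asking to connect to the same address share one dial instead of racing. The real client uses vald's `internal/sync/singleflight` package; this self-contained `group` only illustrates the mechanism.

```go
package main

import (
	"fmt"
	"sync"
)

// call holds the in-flight result for one key.
type call struct {
	wg  sync.WaitGroup
	val any
	err error
}

// group is a minimal singleflight: concurrent Do calls with the same key
// share a single execution of fn.
type group struct {
	mu sync.Mutex
	m  map[string]*call
}

// Do runs fn at most once per key at a time; duplicate callers block and
// receive the shared result instead of dialing the address again.
func (g *group) Do(key string, fn func() (any, error)) (any, error) {
	g.mu.Lock()
	if g.m == nil {
		g.m = make(map[string]*call)
	}
	if c, ok := g.m[key]; ok { // another goroutine is already connecting to key
		g.mu.Unlock()
		c.wg.Wait()
		return c.val, c.err
	}
	c := new(call)
	c.wg.Add(1)
	g.m[key] = c
	g.mu.Unlock()

	c.val, c.err = fn() // only the first caller actually runs fn
	c.wg.Done()

	g.mu.Lock()
	delete(g.m, key) // allow a fresh attempt after this flight completes
	g.mu.Unlock()
	return c.val, c.err
}

func main() {
	var g group
	var wg sync.WaitGroup
	dials := 0
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			g.Do("10.0.0.1:8081", func() (any, error) {
				dials++ // increments never overlap: flights for a key are serialized
				return "conn", nil
			})
		}()
	}
	wg.Wait()
	fmt.Println(dials >= 1 && dials <= 8) // deduplicated, never more dials than callers
}
```

Under broadcast-heavy load this is exactly what cuts latency: N concurrent broadcasts to the same agent address collapse into one dial rather than N.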