
Add load-testing CLI flags to transaction-bench#76

Draft
stablebits wants to merge 4 commits into anza-xyz:master from stablebits:tx-bench-load-test-flags

Adds four independent enhancements to transaction-bench, aimed at driving and measuring high connection-count load tests against a single (typically pinned) leader. Each is a separate commit and works on its own.

--num-transactions

Optional cap on the total number of transactions scheduled, after which the generator stops. Can be combined with --duration; whichever limit is hit first ends the run. The generator tracks a running count and clamps the final batch so it never overshoots the budget.
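The clamping described above can be sketched as follows. This is a minimal illustration, not the code in this PR: `clamp_batch`, `schedule_all`, and the `max_batches` stand-in for --duration are hypothetical names, and only the running `txs_scheduled` count comes from the commit message.

```rust
/// Hypothetical helper: size of the next batch given an optional total budget.
fn clamp_batch(batch_size: usize, txs_scheduled: u64, num_transactions: Option<u64>) -> usize {
    match num_transactions {
        // No cap configured: schedule the full batch.
        None => batch_size,
        Some(limit) => {
            // Remaining budget; saturating_sub avoids underflow once the cap is reached.
            let remaining = limit.saturating_sub(txs_scheduled);
            remaining.min(batch_size as u64) as usize
        }
    }
}

/// Hypothetical generator loop. `max_batches` stands in for the --duration limit.
/// Returns the size of every batch that was scheduled.
fn schedule_all(batch_size: usize, num_transactions: Option<u64>, max_batches: usize) -> Vec<usize> {
    let mut txs_scheduled: u64 = 0;
    let mut batches = Vec::new();
    for _ in 0..max_batches {
        let n = clamp_batch(batch_size, txs_scheduled, num_transactions);
        if n == 0 {
            // Budget exhausted before the duration limit.
            break;
        }
        txs_scheduled += n as u64;
        batches.push(n);
    }
    batches
}

fn main() {
    // A budget of 150 with batches of 64: the last batch is clamped to 22.
    println!("{:?}", schedule_all(64, Some(150), 10));
}
```

Whichever limit is reached first ends the loop: the budget running out (`n == 0`) or the stand-in duration limit (`max_batches`).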

--initial-congestion-window

Overrides the QUIC initial congestion window (in bytes) passed to tpu-client-next as override_initial_congestion_window. A larger window lets a new connection send more data before slow-start has ramped up, so transactions can go out at close to full rate immediately. Defaults to tpu-client-next's built-in value (128 * PACKET_DATA_SIZE) when unset.
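For a sense of scale, the default works out to 157,696 bytes, given that Solana's PACKET_DATA_SIZE is 1232 bytes (1280 - 40 - 8). The sketch below only illustrates the fallback arithmetic. The constant is redefined locally so the example is self-contained, and `effective_initial_cwnd` and the `u64` type are assumptions rather than the tpu-client-next API.

```rust
/// Solana's PACKET_DATA_SIZE (IPv6 minimum MTU minus IPv6 and UDP headers),
/// redefined here so the sketch does not depend on any crate.
const PACKET_DATA_SIZE: u64 = 1280 - 40 - 8;

/// Built-in default named in the description: 128 * PACKET_DATA_SIZE.
const DEFAULT_INITIAL_CWND_BYTES: u64 = 128 * PACKET_DATA_SIZE;

/// Hypothetical helper: the window in effect for an optional CLI override.
fn effective_initial_cwnd(override_initial_congestion_window: Option<u64>) -> u64 {
    override_initial_congestion_window.unwrap_or(DEFAULT_INITIAL_CWND_BYTES)
}

/// How many full-size packets fit in a window of `bytes`.
fn packets_in_window(bytes: u64) -> u64 {
    bytes / PACKET_DATA_SIZE
}

fn main() {
    let default = effective_initial_cwnd(None);
    println!("default: {} bytes = {} packets", default, packets_in_window(default));

    // Example override of 1 MiB.
    let big = effective_initial_cwnd(Some(1 << 20));
    println!("override: {} bytes = {} packets", big, packets_in_window(big));
}
```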

--clients-per-identity

Each tpu-client-next instance opens exactly one QUIC connection per leader (WorkersCache is keyed by leader SocketAddr, one worker = one connection). So the number of connections to a pinned leader equals the number of instances, and previously the only way to get N connections under one identity was to repeat --staked-identity-file N times — impractical for large N.

This flag (default 1) sets a per-identity multiplier:

total instances = max(1, num staked identities) * clients-per-identity

Identities are assigned round-robin across instances (get(i % num_identities)), which yields None (unstaked) when no identity file is given. For example, to open 1000 connections under one identity:

solana-transaction-bench -u $CLUSTER read-accounts-run \
    --staked-identity-file $STAKED_ID --clients-per-identity 1000 \
    ... pinned-leader-tracker $TARGET
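The instance count and round-robin assignment can be sketched as below. This is an illustration under assumptions: identities are modeled as string labels rather than keypairs, and `total_instances` and `assign_identities` are hypothetical names. The sketch guards the empty case explicitly, because `i % 0` panics in Rust.

```rust
/// total instances = max(1, num staked identities) * clients-per-identity
fn total_instances(num_identities: usize, clients_per_identity: usize) -> usize {
    num_identities.max(1) * clients_per_identity
}

/// Hypothetical helper: one entry per client instance.
/// `Some(label)` is a staked identity, `None` is an unstaked instance.
fn assign_identities<'a>(identities: &[&'a str], clients_per_identity: usize) -> Vec<Option<&'a str>> {
    let total = total_instances(identities.len(), clients_per_identity);
    (0..total)
        .map(|i| {
            if identities.is_empty() {
                // No identity file given: every instance is unstaked.
                None
            } else {
                // Round-robin over the available identities.
                identities.get(i % identities.len()).copied()
            }
        })
        .collect()
}

fn main() {
    // Two identities, two clients each: A, B, A, B.
    println!("{:?}", assign_identities(&["A", "B"], 2));
    // No identities, three clients: three unstaked instances.
    println!("{:?}", assign_identities(&[], 3));
    // One identity, 1000 clients: 1000 connections to a pinned leader.
    println!("{}", total_instances(1, 1000));
}
```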

--drain-seconds

Post-generation drain phase + detailed stats (--drain-seconds)

After the generator finishes, transactions can still be queued in tpu-client-next's worker channels and quinn send buffers. Tearing the schedulers down immediately drops them, which is most visible when --num-transactions is set. The drain phase (default 10s, 0 to disable) waits until the aggregated successfully_sent count is stable across reporter ticks, then waits a short fixed tail to let quinn flush. The whole phase is capped by the timeout.

To support this, report_aggregated_stats now also accumulates non-resetting totals (successfully_sent / congestion_events / write_error) that the drain phase observes, logs per-tick and final totals, and breaks down connection_error / write_error by cause.
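The stability check at the heart of the drain phase might look roughly like this. It is a simplified, deterministic sketch and not the PR's implementation: time is modeled as a sequence of per-tick `successfully_sent` totals, the timeout as `max_ticks`, and the number of unchanged ticks required (`stable_ticks_required`) is an assumption. The fixed tail that lets quinn flush is omitted.

```rust
#[derive(Debug, PartialEq)]
enum DrainOutcome {
    /// Draining disabled (the equivalent of --drain-seconds 0).
    Skipped,
    /// The total stopped changing after `ticks` reporter ticks.
    Stable { ticks: usize, total: u64 },
    /// The cap was hit (or the samples ran out) before the total settled.
    TimedOut,
}

/// Hypothetical drain check over non-resetting successfully_sent totals,
/// one sample per reporter tick.
fn drain<I: IntoIterator<Item = u64>>(
    totals_per_tick: I,
    stable_ticks_required: usize,
    max_ticks: usize,
) -> DrainOutcome {
    if max_ticks == 0 {
        return DrainOutcome::Skipped;
    }
    let mut last: Option<u64> = None;
    let mut stable = 0usize;
    for (tick, total) in totals_per_tick.into_iter().take(max_ticks).enumerate() {
        if last == Some(total) {
            // Nothing new was sent since the previous tick.
            stable += 1;
        } else {
            // Progress was made: restart the stability count.
            stable = 0;
            last = Some(total);
        }
        if stable >= stable_ticks_required {
            return DrainOutcome::Stable { ticks: tick + 1, total };
        }
    }
    DrainOutcome::TimedOut
}

fn main() {
    // The total settles at 100 and stays there for two further ticks.
    println!("{:?}", drain(vec![90, 97, 100, 100, 100], 2, 10));
    // Same samples, but the cap is hit first.
    println!("{:?}", drain(vec![90, 97, 100, 100, 100], 2, 4));
}
```

Observing a non-resetting total matters here: a per-tick counter that resets to zero would look "stable" at zero even while transactions are still in flight between ticks.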

stablebits and others added 4 commits June 2, 2026 18:31
Adds an optional `--num-transactions` limit that stops the benchmark
after scheduling the requested number of transactions. May be combined
with `--duration`; whichever limit is reached first stops the run.

The generator tracks a running `txs_scheduled` count and clamps the
final batch so it never overshoots the budget.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Adds an optional `--initial-congestion-window` (bytes) that is forwarded
to tpu-client-next as `override_initial_congestion_window`. A larger
window skips TCP-style slow-start at connection startup so transactions
can be sent at full rate immediately. Defaults to tpu-client-next's
built-in value when unset.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Each tpu-client-next instance opens one QUIC connection per leader, so
the number of connections to a pinned leader equals the number of
instances. Previously the only way to open N connections under one
identity was to repeat --staked-identity-file N times, which is
impractical for large N.

Add --clients-per-identity (default 1): total instances become
max(1, num staked identities) * clients-per-identity, and identities are
round-robined across instances via `get(i % num_identities)` (yielding
None / unstaked when no identity file is given).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

After the generator finishes there can still be transactions queued in
tpu-client-next's worker channels and quinn send buffers. Tearing the
schedulers down immediately drops them, which is especially visible when
--num-transactions is set. Add a drain phase (--drain-seconds, default
10, 0 to disable) that waits until the aggregated successfully_sent
count is stable across reporter ticks, then a short fixed tail to let
quinn flush, capped by the timeout.

To support this, report_aggregated_stats now also accumulates
non-resetting totals (successfully_sent / congestion_events /
write_error) that the drain phase observes, logs per-tick and final
totals, and breaks down connection_error / write_error by cause.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>