feat: wave-based batch uploads with configurable concurrency by mickvandijke · Pull Request #6 · WithAutonomi/ant-client

mickvandijke · 2026-03-25T20:40:09Z

Summary

Wave-based batch uploads: chunks are grouped into waves of 64, each paid in a single EVM transaction. Storage from wave N is pipelined with quote collection for wave N+1
Concurrency throttling: quote collection, chunk storage, and merkle upload paths are all bounded by chunk_concurrency (defaults to half available CPU threads). Prevents overwhelming peer connections on small networks
Majority-only replication: chunks are sent to CLOSE_GROUP_MAJORITY (3) peers instead of all CLOSE_GROUP_SIZE (5), reducing per-chunk network load
Configurable via CLI: --chunk-concurrency <N> flag to override the auto-detected default

Test plan

cargo clippy --all-targets --all-features -- -D warnings passes
cargo fmt --all -- --check passes
All unit tests pass (97/97)
E2E chunk tests pass (9/9)
E2E data tests pass (6/6)
E2E file tests pass (4/4)
E2E merkle tests pass (2/2)
Manual 53MB file upload to local devnet succeeds

🤖 Generated with Claude Code

Replace per-chunk sequential EVM payments with batched wave-based uploads. Chunks are grouped into waves (up to 64), quotes collected concurrently, then paid in a single wallet.pay_for_quotes() call. Stores from wave N pipeline with quote collection for wave N+1. For a 3-chunk (4MB) upload this reduces payment from 3 separate EVM transactions (~1s with nonce contention) to 1 transaction (~10ms). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Uploads of large files to small networks (e.g. local devnets) were failing because the client blasted all chunks to all peers simultaneously, overwhelming QUIC connections and causing cascading disconnects. Three changes: - Only replicate chunks to CLOSE_GROUP_MAJORITY peers instead of all CLOSE_GROUP_SIZE, reducing per-chunk network load - Throttle both quote collection and chunk storage with buffer_unordered(chunk_concurrency) instead of unbounded parallelism - Make chunk_concurrency configurable via ClientConfig and --chunk-concurrency CLI flag, defaulting to half the available CPU threads Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

jacderida

Review of wave-based batch uploads

Nice work overall — the wave-based pipelining design is clean and the code reads well. A few concerns before merging:

1. `chunk_put_to_close_group` now has zero fault tolerance (chunk.rs:73-74)

This is the main concern. Previously, the function sent PUTs to all peers (typically CLOSE_GROUP_SIZE = 5) and returned success as soon as CLOSE_GROUP_MAJORITY (3) responded. This gave tolerance for up to 2 peer failures.

The PR truncates the target list to exactly CLOSE_GROUP_MAJORITY peers:

let target_peers = &peers[..peers.len().min(CLOSE_GROUP_MAJORITY)];

Now if any single peer is slow or unavailable, the entire chunk store fails with InsufficientPeers. On a small testnet or under network churn, this could make uploads significantly more fragile.

This also affects the existing chunk_put code path (not just the new batch path), since chunk_put_to_close_group is shared.

Suggestion: Keep sending to all peers but return early once CLOSE_GROUP_MAJORITY confirm — i.e., revert this change. The network cost of 2 extra concurrent PUTs is marginal compared to the resilience gain.

2. Duplicate chunks are assumed stored without verification (batch.rs:222-226)

In batch_upload_chunks, when a content-address collision is found locally:

if seen_addresses.insert(address) {
    unique_chunks.push(chunk);
} else {
    debug!("Skipping duplicate chunk {}", hex::encode(address));
    all_addresses.push(address);
}

The duplicate's address is added to the result immediately, but the first copy of that chunk hasn't been uploaded yet — it might fail later. The address gets returned regardless. This is probably fine in practice since the caller just counts addresses.len(), but it could be misleading if the return value is ever used to verify what was actually stored.

3. Wave error semantics: one chunk failure aborts the whole upload

If any single chunk in a wave fails during store_paid_chunks, the ? in the collect propagates the error and discards all successfully stored chunks from that wave. The previous per-chunk buffer_unordered approach had the same fail-fast behavior, so this isn't a regression, but worth noting — partial progress from the current wave is lost.

For a future iteration, it might be worth returning a partial result (stored addresses + errors) so callers can retry only failed chunks.

4. Minor: `payment_mode_used` is always `PaymentMode::Single` for batch path

The batch path reports PaymentMode::Single even though it's doing batched EVM transactions. This is technically accurate (not using merkle proofs) but could be confusing when debugging. Consider a PaymentMode::Batch variant, or at least a comment explaining the naming.

5. Minor: doc comment on `chunk_put_to_close_group` is now stale

The existing doc comment (line 55-58 on main) says "Sends the PUT concurrently to all peers". The PR updates it to say "first CLOSE_GROUP_MAJORITY peers", but per point 1 above, if you revert the truncation change, the original doc should be restored.

Summary: The batch upload design and pipelining are solid. The main blocker is point 1 — reducing fault tolerance in chunk_put_to_close_group affects all upload paths and could cause reliability issues. I'd suggest reverting that change and keeping the rest.

mickvandijke and others added 4 commits March 25, 2026 21:27

docs: fix rustdoc links to private items in batch module

ac9a4ab

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

fix: log chunk concurrency at upload start

09ddc36

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

mickvandijke changed the title ~~feat: wave-based batch uploads with single EVM payment per wave~~ feat: wave-based batch uploads with configurable concurrency Mar 25, 2026

jacderida reviewed Mar 26, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: wave-based batch uploads with configurable concurrency#6

feat: wave-based batch uploads with configurable concurrency#6
mickvandijke wants to merge 4 commits intomainfrom
feat-wave-batch-uploads

mickvandijke commented Mar 25, 2026 •

edited

Loading

Uh oh!

jacderida left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

mickvandijke commented Mar 25, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Test plan

Uh oh!

jacderida left a comment

Choose a reason for hiding this comment

Review of wave-based batch uploads

1. chunk_put_to_close_group now has zero fault tolerance (chunk.rs:73-74)

2. Duplicate chunks are assumed stored without verification (batch.rs:222-226)

3. Wave error semantics: one chunk failure aborts the whole upload

4. Minor: payment_mode_used is always PaymentMode::Single for batch path

5. Minor: doc comment on chunk_put_to_close_group is now stale

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

mickvandijke commented Mar 25, 2026 •

edited

Loading

1. `chunk_put_to_close_group` now has zero fault tolerance (chunk.rs:73-74)

4. Minor: `payment_mode_used` is always `PaymentMode::Single` for batch path

5. Minor: doc comment on `chunk_put_to_close_group` is now stale