
Commit 3e599f2

therealaleph and claude committed
chore(release): v1.9.18 — perf: zero-copy mux + base64 off mux thread (#881)
Bumps Cargo.toml v1.9.17 → v1.9.18 and ships the changelog for the zero-copy mux refactor merged in 54552bb.

No user-visible behavior change; perf-focused release.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent 54552bb commit 3e599f2

3 files changed

Lines changed: 20 additions & 2 deletions


Cargo.lock

Lines changed: 1 addition & 1 deletion

Cargo.toml

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 [package]
 name = "mhrv-rs"
-version = "1.9.17"
+version = "1.9.18"
 edition = "2021"
 description = "Rust port of MasterHttpRelayVPN -- DPI bypass via Google Apps Script relay with domain fronting"
 license = "MIT"

docs/changelog/v1.9.18.md

Lines changed: 18 additions & 0 deletions
@@ -0,0 +1,18 @@
<!-- see docs/changelog/v1.1.0.md for the file format: Persian, then `---`, then English. -->
• Performance refactor of the full-tunnel mux hot path ([#881](https://github.com/therealaleph/MasterHttpRelayVPN-RUST/pull/881) by @dazzling-no-more): zero-copy reads via `Bytes`/`BytesMut`, and base64 encoding was taken off the single mux thread. There is no wire-protocol change, only internal data flow. (1) `tunnel_loop` and the SOCKS5 UDP receive loop no longer make a per-iteration `Vec::to_vec()` copy. `MuxMsg::{ConnectData,Data,UdpOpen,UdpData}` now carry `Bytes` (Arc-backed) instead of `Vec<u8>`/`Arc<Vec<u8>>`. The TCP path is threshold-based: ≥32 KB → `BytesMut::split().freeze()` (saves a 64 KB memcpy on hot downloads); <32 KB → `Bytes::copy_from_slice` + `buf.clear()` (payload-sized retention). UDP path: fixed `Vec<u8>` recv buffer + size-guarded copy. (2) base64 encoding (up to ~3 MB per batch) moved from the mux thread to a spawned task in `fire_batch`, after the per-deployment semaphore, so the single mux task is no longer serialized. (3) Code quality: `BatchAccum::push_or_fire` (4 match arms collapsed into 1), a `should_fire()` predicate with `saturating_add`, and an `encode_pending()` free function. 200 → **208 lib tests** (+8 regression: encode_pending × 4, should_fire × 3, batch_accum_reindexes_after_flush). API change: `TunnelMux::udp_open`/`udp_data` now take `impl Into<Bytes>`; existing callers with Vec<u8>/Bytes/BytesMut work unchanged.
---
• Performance refactor of the full-tunnel mux hot data path ([#881](https://github.com/therealaleph/MasterHttpRelayVPN-RUST/pull/881) by @dazzling-no-more). No wire-protocol changes; internal data flow only.

**1. Zero-copy reads via `Bytes`/`BytesMut`.** `tunnel_loop` and the SOCKS5 UDP receive loop drop per-iteration `Vec::to_vec()` copies. `MuxMsg::{ConnectData,Data,UdpOpen,UdpData}` now carry `Bytes` (Arc-backed internally) instead of `Vec<u8>`/`Arc<Vec<u8>>`; the `Arc::try_unwrap` dance for `pending_client_data` is gone. The TCP path is threshold-based to avoid memory regressions:
- **n ≥ 32 KB**: `BytesMut::split().freeze()`, which saves the 64 KB memcpy on hot downloads.
- **n < 32 KB**: `Bytes::copy_from_slice` + `buf.clear()`, which retains only payload-sized memory. Without this split, the whole-allocation refcount in `bytes` 1.x would pin a full 64 KB for each queued tiny read while the semaphore is stalled (worst case ~96 MB on a backpressured tunnel).
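
The threshold rule above can be sketched without the `bytes` crate by modelling only the decision and the memory each queued read keeps alive. This is an illustrative model, not the project's code: `Handoff`, `choose_handoff`, `retained_bytes`, `total_retained`, and both constants are hypothetical names, and the 64 KB buffer size is taken from the figures quoted in the changelog.

```rust
/// Assumed read-buffer size (the changelog mentions a 64 KB memcpy).
const READ_BUF_BYTES: usize = 64 * 1024;
/// Threshold quoted in the changelog: reads of 32 KB or more are handed off zero-copy.
const ZERO_COPY_THRESHOLD: usize = 32 * 1024;

/// How a read of `n` bytes leaves the read loop (hypothetical enum for this sketch).
#[derive(Debug, PartialEq)]
enum Handoff {
    /// Corresponds to `BytesMut::split().freeze()`: no copy, whole allocation stays alive.
    SplitFreeze,
    /// Corresponds to `Bytes::copy_from_slice` + `buf.clear()`: copy, buffer is reused.
    CopyFromSlice,
}

fn choose_handoff(n: usize) -> Handoff {
    if n >= ZERO_COPY_THRESHOLD {
        Handoff::SplitFreeze
    } else {
        Handoff::CopyFromSlice
    }
}

/// Memory kept alive by one queued read under the threshold rule.
fn retained_bytes(n: usize) -> usize {
    match choose_handoff(n) {
        // The frozen handle shares the original allocation, so all of it is pinned.
        Handoff::SplitFreeze => READ_BUF_BYTES,
        // A fresh, payload-sized allocation is the only thing queued.
        Handoff::CopyFromSlice => n,
    }
}

/// Total memory pinned by a queue of reads of the given sizes.
fn total_retained(reads: &[usize]) -> usize {
    reads.iter().map(|&n| retained_bytes(n)).sum()
}
```

Under this model, 1,536 queued 100-byte reads (a hypothetical queue depth, chosen because 1,536 × 64 KB = 96 MB) retain about 150 KB in total, whereas taking the split path for every read would pin the full 96 MB.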

UDP path: fixed `Vec<u8>` recv buffer + `Bytes::copy_from_slice` after the 9 KB `MAX_UDP_PAYLOAD_BYTES` guard. `parse_socks5_udp_packet` is split into an `_offsets` variant plus a `&[u8]` wrapper so callers stay on the reusable buffer.
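
The offsets-plus-wrapper split might look roughly like the following. This is a minimal sketch based on the SOCKS5 UDP request header from RFC 1928 (RSV, FRAG, ATYP, DST.ADDR, DST.PORT, DATA); the names `UdpOffsets`, `parse_offsets`, and `parse_packet` and their signatures are hypothetical, since the changelog does not show the real ones.

```rust
use std::ops::Range;

/// Positions inside the caller's buffer; holds no borrowed data (hypothetical type).
#[derive(Debug, PartialEq)]
struct UdpOffsets {
    atyp: u8,
    addr: Range<usize>,
    port: u16,
    payload_start: usize,
}

/// Parses only offsets, so the caller can keep reusing its receive buffer.
fn parse_offsets(buf: &[u8]) -> Option<UdpOffsets> {
    // RSV (2 bytes) + FRAG (1 byte) + ATYP (1 byte).
    if buf.len() < 4 || buf[2] != 0 {
        return None; // too short, or fragmented (not handled in this sketch)
    }
    let atyp = buf[3];
    let (addr_start, addr_len) = match atyp {
        0x01 => (4, 4),                      // IPv4
        0x04 => (4, 16),                     // IPv6
        0x03 => (5, *buf.get(4)? as usize),  // domain: one length byte, then the name
        _ => return None,
    };
    let addr_end = addr_start + addr_len;
    // DST.PORT is two bytes, big-endian; `get` returns None if the packet is truncated.
    let port_bytes = buf.get(addr_end..addr_end + 2)?;
    let port = u16::from_be_bytes([port_bytes[0], port_bytes[1]]);
    Some(UdpOffsets { atyp, addr: addr_start..addr_end, port, payload_start: addr_end + 2 })
}

/// Thin wrapper returning borrowed slices (address bytes, port, payload).
fn parse_packet(buf: &[u8]) -> Option<(&[u8], u16, &[u8])> {
    let o = parse_offsets(buf)?;
    Some((&buf[o.addr.clone()], o.port, &buf[o.payload_start..]))
}
```

A receive loop can call `parse_offsets` on its fixed buffer, copy only the payload range out, and leave the buffer in place for the next datagram.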

**2. Base64 encoding moved off the single mux thread.** A new internal `PendingOp { data: Option<Bytes>, encode_empty: bool }` flows through `mux_loop` with raw bytes. The actual `B64.encode(...)` call runs in the task spawned by `fire_batch`, after the per-deployment semaphore. Up to ~3 MB of encoding per batch (50 ops × 64 KB) no longer serializes the single mux task.
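
The "~3 MB per batch" figure can be checked with a few lines of arithmetic. The sketch below assumes standard padded base64 (4 output bytes for every 3 input bytes, rounded up); whether the project's `B64` engine pads is not stated in the changelog, and the function and constant names are hypothetical.

```rust
/// Output length of padded base64 for `n` input bytes (RFC 4648 alphabet with `=` padding).
const fn base64_padded_len(n: usize) -> usize {
    (n + 2) / 3 * 4
}

/// Figures quoted in the changelog: up to 50 ops of 64 KB each per batch.
const OPS_PER_BATCH: usize = 50;
const OP_BYTES: usize = 64 * 1024;

/// Raw bytes that must be encoded for one full batch.
const fn raw_batch_bytes() -> usize {
    OPS_PER_BATCH * OP_BYTES
}

/// Encoded size of one full batch when each op is encoded separately.
const fn encoded_batch_bytes() -> usize {
    OPS_PER_BATCH * base64_padded_len(OP_BYTES)
}
```

With these assumptions a full batch is 3,276,800 raw bytes (about 3.1 MiB) and roughly 4.4 MB once encoded, which is the work that now runs in the spawned task rather than on the mux task.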

**3. Code quality (drive-bys).** `BatchAccum::push_or_fire` collapses 4× ~25-line match arms into ~10 lines each. `should_fire(pending_len, payload_bytes, op_bytes)` predicate extracted with `saturating_add`. `encode_pending(p) -> BatchOp` extracted as a free function for direct test coverage.
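
A predicate with the `should_fire` signature could be written along these lines. This is a sketch under stated assumptions: the two limits and the exact firing condition are hypothetical (the changelog only mentions 50 ops × 64 KB), and only the parameter names and the use of `saturating_add` come from the text.

```rust
/// Hypothetical batch limits for this sketch; the real values are not given in the changelog.
const MAX_BATCH_OPS: usize = 50;
const MAX_BATCH_BYTES: usize = 50 * 64 * 1024;

/// Returns true when the pending batch should be flushed before adding an op of `op_bytes`.
///
/// `saturating_add` clamps at `usize::MAX` instead of overflowing, so an oversized
/// `payload_bytes + op_bytes` cannot panic in debug builds or wrap around in release builds.
fn should_fire(pending_len: usize, payload_bytes: usize, op_bytes: usize) -> bool {
    pending_len >= MAX_BATCH_OPS
        || payload_bytes.saturating_add(op_bytes) > MAX_BATCH_BYTES
}
```

Keeping the check as a pure function of three integers makes it straightforward to cover with unit tests, including the overflow edge case.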

**Public API change**: `TunnelMux::udp_open` and `udp_data` now take `data: impl Into<Bytes>` instead of `Vec<u8>`; existing in-tree callers passing `Vec<u8>`, `&'static [u8]`, `Bytes`, or `BytesMut` all keep compiling.

200 → **208 lib tests** (+8 regression: `encode_pending_*` × 4, `should_fire_*` × 3, `batch_accum_reindexes_after_flush`).
