feat(vanilla-epoll): add json-tls profile (HTTP/1.1 over TLS 1.3) #952

Open
enghitalo wants to merge 5 commits into MDA2AV:main from enghitalo:feat/vanilla-epoll-json-tls

@enghitalo

Contributor

What

Wires the json-tls profile for vanilla-epoll: the existing /json/{count}?m={m} endpoint served over HTTP/1.1 + TLS 1.3 on :8081. This was the only gap keeping vanilla off the json-tls board — the JSON path already existed and is correct; this PR adds the TLS transport and declares the test.

The JSON serializer is reused verbatim, so json-tls responses are byte-for-byte identical to the plaintext /json (verified).

How

main.v

  • Extract a transport-agnostic write_json_into(ro, mut out, count, m); the plaintext WorkerCtx method now delegates (no behavior change). Content-Length is precomputed from the same values the body emits → the framed length can never desync (no response-splitting surface).
  • Add a stateless TLS request_handler that serves only /json (404 elsewhere), keeping the TLS port's surface minimal. It captures the read-only ro; no make_state (sidesteps the TLS worker's stateful path).
  • Start a second http_server on TLS_PORT (env, default 8081) with tls_config from /certs/server.{crt,key} (TLS_CERT/TLS_KEY overridable). Fails loud if the cert is present but the key is missing; self-signs only when no cert is mounted.
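The Content-Length point in the first bullet can be illustrated with a minimal Python sketch. It is not the V code from main.v: the two-field body shape and the names `body_length` and `frame_response` are hypothetical, chosen only to show a length computed from the same values the body is later written from.

```python
def body_length(count: int, total: int) -> int:
    """Body length derived from the values alone, before any body bytes exist.

    Assumes non-negative integers and the hypothetical body shape
    {"count":<count>,"total":<total>}.
    """
    fixed = len('{"count":,"total":}')  # bytes that never vary
    return fixed + len(str(count)) + len(str(total))


def frame_response(count: int, total: int) -> bytes:
    """Emit the header first, then the body, both from the same two values."""
    head = (
        "HTTP/1.1 200 OK\r\n"
        "Content-Type: application/json\r\n"
        f"Content-Length: {body_length(count, total)}\r\n"
        "\r\n"
    )
    body = f'{{"count":{count},"total":{total}}}'
    return (head + body).encode("ascii")
```

Because the header and the body read the same inputs, there is no second source of truth that could drift from the framed length.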

Dockerfile

  • Build Mbed TLS 4.1.0 from the pinned release tarball; compile with -d vanilla_tls. Runtime carries the shared libs (incl. libtfpsacrypto, the 4.x TF-PSA-Crypto split) + ldconfig. EXPOSE 8081.
  • Thread-safety (load-bearing): enable MBEDTLS_THREADING_C + MBEDTLS_THREADING_PTHREAD. vanilla runs N TLS worker threads, all driving TLS 1.3 handshakes through Mbed TLS's global PSA key store; without the threading mutex the concurrent handshakes race on key slots → a heap-use-after-free under load (ASan: psa_wipe_key_slot frees a slot another thread is mid-memcpy in psa_hmac_setup; crashes ~c1024 with double free or corruption). With it enabled, the load that 100%-crashed now survives clean.

meta.json: add "json-tls" to tests.

Validation

Built the image and ran it with the repo cert bind-mounted at /certs:

  • ✅ All 3 validate.sh json-tls assertions: protocol negotiation (HTTP/1.1, ALPN http/1.1 over TLS 1.3), response schema + total == price*quantity*m on 7:2 / 23:11 / 50:1, Content-Type: application/json.
  • ✅ Byte-for-byte parity: TLS /json output identical to plaintext /json.
  • ✅ Keep-alive (curl num_connects=1, connection reused) — the profile's defining property.
  • ✅ No heap corruption under concurrent TLS load after the threading fix (in-container c1024 ~137k rps, c1536 ok; clean log).
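For readers unfamiliar with the total check in the first bullet, here is a rough Python sketch of the invariant. The field names `price`, `quantity` and `total` come from the assertion text above; the flat list-of-dicts layout and integer values are assumptions, and this is not the code in validate.sh.

```python
def mismatched_totals(items, m):
    """Return the items whose total is not price * quantity * m.

    Assumes integer fields; a float price would need a tolerance instead
    of exact equality.
    """
    return [
        item for item in items
        if item["total"] != item["price"] * item["quantity"] * m
    ]


# Hypothetical sample data, with m=2 as in the 7:2 pair.
sample = [
    {"price": 3, "quantity": 2, "total": 12},
    {"price": 5, "quantity": 1, "total": 10},
]
bad = mismatched_totals(sample, 2)
```

An empty result means every item satisfies the invariant for that `m`.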

Note: local numbers are from a contended dev box (loopback) and are not leaderboard-comparable — they're stability/correctness signals. A leaderboard run belongs on the dedicated bench host.

🤖 Generated with Claude Code

Wires the json-tls profile: the existing /json/{count}?m={m} endpoint
served over TLS on :8081, the only gap that kept vanilla off the json-tls
board. The JSON serializer is reused verbatim, so responses are byte-for-byte
identical to the plaintext /json (validated).

main.v
- Extract a transport-agnostic `write_json_into(ro, mut out, count, m)` from
  the WorkerCtx serializer; the plaintext method now delegates (no behavior
  change). Content-Length is precomputed from the same values the body emits,
  so the framed length can never desync (no response-splitting surface).
- Add a stateless TLS request_handler that serves ONLY /json (404 elsewhere),
  keeping the TLS port's attack surface minimal. It captures the read-only
  `ro`; no make_state (avoids the TLS worker's stateful path).
- Start a second http_server on TLS_PORT (env, default 8081) with
  tls_config from /certs/server.{crt,key} (TLS_CERT/TLS_KEY overridable);
  fails loud if the cert is present but the key is missing, self-signs only
  when no cert is mounted. Spawned before the blocking plaintext run().

Dockerfile
- Build Mbed TLS 4.1.0 from the pinned release tarball and compile with
  `-d vanilla_tls`; runtime carries the shared libs (incl. libtfpsacrypto,
  the 4.x TF-PSA-Crypto split) + ldconfig. EXPOSE 8081.
- THREAD-SAFETY (load-bearing): enable MBEDTLS_THREADING_C +
  MBEDTLS_THREADING_PTHREAD. vanilla runs N TLS worker threads, all driving
  TLS 1.3 handshakes through Mbed TLS's GLOBAL PSA key store; without the
  threading mutex the concurrent handshakes race on key slots -> a
  heap-use-after-free under load (ASan: psa_wipe_key_slot frees a slot
  another thread is mid-memcpy in psa_hmac_setup; crashes ~c1024 with
  "double free or corruption"). With it, the load that 100%-crashed now
  survives clean (c1024 137k rps in-container, c1536 ok).

meta.json: add "json-tls" to tests.

Validated in-container with the real mounted cert: all 3 validate.sh
json-tls assertions pass (HTTP/1.1 + ALPN http/1.1, item schema +
total==price*quantity*m on 7:2/23:11/50:1, Content-Type application/json),
byte-for-byte parity with plaintext /json, keep-alive, and no heap
corruption under concurrent TLS load.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@enghitalo

Contributor Author

/benchmark -f vanilla-epoll

@github-actions

Contributor

👋 /benchmark request received. A collaborator will review and approve the run.

@github-actions

Contributor

Benchmark Results

Framework: vanilla-epoll | Test: all tests

| Test | Conn | RPS | CPU | Mem | Δ RPS | Δ Mem |
|---|---:|---:|---:|---:|---:|---:|
| baseline | 512 | 3,678,687 | 6490.3% | 76MiB | +0.5% | +10.1% |
| baseline | 4096 | 4,069,660 | 6305.8% | 155MiB | +0.9% | +2.0% |
| pipelined | 512 | 39,509,568 | 6720.7% | 74MiB | +0.1% | +13.8% |
| pipelined | 4096 | 39,798,604 | 6528.6% | 165MiB | +0.7% | +5.1% |
| limited-conn | 512 | 989,401 | 3128.7% | 67MiB | -0.2% | +9.8% |
| limited-conn | 4096 | 1,016,587 | 3047.2% | 102MiB | +0.4% | +2.0% |
| json | 4096 | 2,145,461 | 6381.6% | 188MiB | +0.8% | +10.6% |
| json-comp | 512 | 2,115,378 | 6301.5% | 92MiB | +0.3% | +7.0% |
| json-comp | 4096 | 2,311,838 | 6417.1% | 126MiB | +0.1% | +18.9% |
| json-comp | 16384 | 2,325,998 | 6405.2% | 210MiB | +0.5% | +96.3% |
| json-tls | 4096 | 512,243 | 6473.2% | 361MiB | NEW | NEW |
| upload | 32 | 2,959 | 1998.3% | 133MiB | -0.7% | +3.9% |
| upload | 256 | 2,978 | 3629.4% | 321MiB | -0.7% | +0.6% |
| api-4 | 256 | 64,233 | 359.8% | 112MiB | -1.9% | +12.0% |
| api-16 | 1024 | 251,729 | 1481.6% | 177MiB | +3.6% | +7.3% |
| static | 1024 | 1,083,852 | 5537.1% | 102MiB | -2.2% | +6.2% |
| static | 4096 | 1,138,873 | 5756.5% | 256MiB | -0.9% | +1.2% |
| static | 6800 | 1,173,478 | 5584.2% | 375MiB | -1.3% | +0.3% |
| async-db | 1024 | 290,488 | 2994.8% | 138MiB | +2.8% | -2.8% |
| crud | 4096 | 262,358 | 710.5% | 218MiB | -14.6% | +5.3% |
| fortunes | 1024 | 165,616 | 6073.8% | 254MiB | -1.0% | +2.4% |
Full log
  Templates: 20
  Expected:  200
  Duration:  15s


  Thread Stats   Avg      p50      p90      p99    p99.9
    Latency   13.90ms   7.54ms   38.30ms   75.70ms   110.50ms

  3988800 requests in 15.00s, 3988672 responses
  Throughput: 265.87K req/s
  Bandwidth:  80.30MB/s
  Status codes: 2xx=3935377, 3xx=0, 4xx=0, 5xx=53295
  Latency samples: 3988671 / 3988672 responses (100.0%)
  Reconnects: 18255
  Per-template: 93725,115411,144147,171216,200507,230203,250969,261470,270429,271569,268691,268133,271168,271107,273036,275435,114350,63058,82062,91985
  Per-template-ok: 91616,113951,142297,169064,197649,227385,247891,258058,267155,268118,265044,264753,267647,267752,269337,272070,114350,61595,80031,89613

  WARNING: 53295/3988672 responses (1.3%) had unexpected status (expected 2xx)
[info] CPU 710.5% | Mem 218MiB

[run 2/3]
gcannon v0.5.3
  Target:    localhost:8080/
  Threads:   64
  Conns:     4096 (64/thread)
  Pipeline:  1
  Req/conn:  200
  Templates: 20
  Expected:  200
  Duration:  15s


  Thread Stats   Avg      p50      p90      p99    p99.9
    Latency   14.82ms   8.44ms   40.00ms   74.40ms   106.00ms

  3807767 requests in 15.00s, 3807767 responses
  Throughput: 253.81K req/s
  Bandwidth:  78.40MB/s
  Status codes: 2xx=3750334, 3xx=0, 4xx=0, 5xx=57433
  Latency samples: 3807764 / 3807767 responses (100.0%)
  Reconnects: 17292
  Per-template: 90080,106386,134856,163809,188948,214440,236016,251572,257041,256362,254532,256457,256123,253999,254565,254437,132621,74979,82638,87903
  Per-template-ok: 87932,104948,132873,161340,186273,211319,232505,247705,253264,252647,250756,252649,252034,250258,250955,250555,132621,73066,80543,86088

  WARNING: 57433/3807767 responses (1.5%) had unexpected status (expected 2xx)
[info] CPU 697.0% | Mem 226MiB

[run 3/3]
gcannon v0.5.3
  Target:    localhost:8080/
  Threads:   64
  Conns:     4096 (64/thread)
  Pipeline:  1
  Req/conn:  200
  Templates: 20
  Expected:  200
  Duration:  15s


  Thread Stats   Avg      p50      p90      p99    p99.9
    Latency   15.51ms   10.70ms   39.50ms   70.90ms   98.90ms

  3672869 requests in 15.00s, 3672869 responses
  Throughput: 244.81K req/s
  Bandwidth:  75.94MB/s
  Status codes: 2xx=3623980, 3xx=0, 4xx=0, 5xx=48889
  Latency samples: 3672860 / 3672869 responses (100.0%)
  Reconnects: 16597
  Per-template: 93612,110412,134200,161623,186085,209180,226020,233905,238552,242464,242185,239532,239848,241319,237328,238118,134533,85468,88965,89511
  Per-template-ok: 91382,109086,132376,159596,183754,206331,223020,230873,235513,239316,239177,236342,236621,238102,234322,235067,134533,83723,87037,87800

  WARNING: 48889/3672869 responses (1.3%) had unexpected status (expected 2xx)
[info] CPU 689.1% | Mem 231MiB

=== Best: 262358 req/s (CPU: 710.5%, Mem: 218MiB) ===
[info] input BW: 22.52MB/s (avg template: 90 bytes)
[info] saved results/crud/4096/vanilla-epoll.json
httparena-bench-vanilla-epoll
httparena-bench-vanilla-epoll

==============================================
=== vanilla-epoll / fortunes / 1024c (tool=gcannon) ===
==============================================
[info] resetting postgres for a clean per-profile baseline
[info] starting postgres sidecar
httparena-postgres
[info] postgres ready (seeded)
[info] waiting for server...
[info] server ready

[run 1/3]
gcannon v0.5.3
  Target:    localhost:8080/fortunes
  Threads:   64
  Conns:     1024 (16/thread)
  Pipeline:  1
  Req/conn:  unlimited (keep-alive)
  Expected:  200
  Duration:  5s


  Thread Stats   Avg      p50      p90      p99    p99.9
    Latency   4.80ms   3.98ms   8.63ms   15.10ms   34.00ms

  823383 requests in 5.00s, 823383 responses
  Throughput: 164.61K req/s
  Bandwidth:  3.81GB/s
  Status codes: 2xx=823383, 3xx=0, 4xx=0, 5xx=0
  Latency samples: 823383 / 823383 responses (100.0%)
[info] CPU 5826.6% | Mem 256MiB

[run 2/3]
gcannon v0.5.3
  Target:    localhost:8080/fortunes
  Threads:   64
  Conns:     1024 (16/thread)
  Pipeline:  1
  Req/conn:  unlimited (keep-alive)
  Expected:  200
  Duration:  5s


  Thread Stats   Avg      p50      p90      p99    p99.9
    Latency   4.76ms   4.20ms   8.61ms   13.10ms   17.00ms

  828083 requests in 5.00s, 828083 responses
  Throughput: 165.55K req/s
  Bandwidth:  3.84GB/s
  Status codes: 2xx=828083, 3xx=0, 4xx=0, 5xx=0
  Latency samples: 828081 / 828083 responses (100.0%)
[info] CPU 6073.8% | Mem 254MiB

[run 3/3]
gcannon v0.5.3
  Target:    localhost:8080/fortunes
  Threads:   64
  Conns:     1024 (16/thread)
  Pipeline:  1
  Req/conn:  unlimited (keep-alive)
  Expected:  200
  Duration:  5s


  Thread Stats   Avg      p50      p90      p99    p99.9
    Latency   4.74ms   4.04ms   8.62ms   13.50ms   18.10ms

  820833 requests in 5.00s, 820833 responses
  Throughput: 164.11K req/s
  Bandwidth:  3.80GB/s
  Status codes: 2xx=820833, 3xx=0, 4xx=0, 5xx=0
  Latency samples: 820831 / 820833 responses (100.0%)
[info] CPU 5992.2% | Mem 256MiB

=== Best: 165616 req/s (CPU: 6073.8%, Mem: 254MiB) ===
[info] saved results/fortunes/1024/vanilla-epoll.json
httparena-bench-vanilla-epoll
httparena-bench-vanilla-epoll
[info] skip: vanilla-epoll does not subscribe to baseline-h2
[info] skip: vanilla-epoll does not subscribe to static-h2
[info] skip: vanilla-epoll does not subscribe to baseline-h2c
[info] skip: vanilla-epoll does not subscribe to json-h2c
[info] skip: vanilla-epoll does not subscribe to baseline-h3
[info] skip: vanilla-epoll does not subscribe to static-h3
[info] skip: vanilla-epoll does not subscribe to gateway-64
[info] skip: vanilla-epoll does not subscribe to gateway-h3
[info] skip: vanilla-epoll does not subscribe to production-stack
[info] skip: vanilla-epoll does not subscribe to unary-grpc
[info] skip: vanilla-epoll does not subscribe to unary-grpc-tls
[info] skip: vanilla-epoll does not subscribe to stream-grpc
[info] skip: vanilla-epoll does not subscribe to stream-grpc-tls
[info] skip: vanilla-epoll does not subscribe to echo-ws
[info] skip: vanilla-epoll does not subscribe to echo-ws-pipeline
[info] rebuilding site/data/*.json
[updated] /home/diogo/actions-runner/_work/HttpArena/HttpArena/site/data/frameworks.json
[updated] /home/diogo/actions-runner/_work/HttpArena/HttpArena/site/data/api-16-1024.json
[updated] /home/diogo/actions-runner/_work/HttpArena/HttpArena/site/data/api-4-256.json
[updated] /home/diogo/actions-runner/_work/HttpArena/HttpArena/site/data/async-db-1024.json
[updated] /home/diogo/actions-runner/_work/HttpArena/HttpArena/site/data/baseline-4096.json
[updated] /home/diogo/actions-runner/_work/HttpArena/HttpArena/site/data/baseline-512.json
[updated] /home/diogo/actions-runner/_work/HttpArena/HttpArena/site/data/crud-4096.json
[updated] /home/diogo/actions-runner/_work/HttpArena/HttpArena/site/data/fortunes-1024.json
[updated] /home/diogo/actions-runner/_work/HttpArena/HttpArena/site/data/json-4096.json
[updated] /home/diogo/actions-runner/_work/HttpArena/HttpArena/site/data/json-comp-16384.json
[updated] /home/diogo/actions-runner/_work/HttpArena/HttpArena/site/data/json-comp-4096.json
[updated] /home/diogo/actions-runner/_work/HttpArena/HttpArena/site/data/json-comp-512.json
[updated] /home/diogo/actions-runner/_work/HttpArena/HttpArena/site/data/json-tls-4096.json
[updated] /home/diogo/actions-runner/_work/HttpArena/HttpArena/site/data/limited-conn-4096.json
[updated] /home/diogo/actions-runner/_work/HttpArena/HttpArena/site/data/limited-conn-512.json
[updated] /home/diogo/actions-runner/_work/HttpArena/HttpArena/site/data/pipelined-4096.json
[updated] /home/diogo/actions-runner/_work/HttpArena/HttpArena/site/data/pipelined-512.json
[updated] /home/diogo/actions-runner/_work/HttpArena/HttpArena/site/data/static-1024.json
[updated] /home/diogo/actions-runner/_work/HttpArena/HttpArena/site/data/static-4096.json
[updated] /home/diogo/actions-runner/_work/HttpArena/HttpArena/site/data/static-6800.json
[updated] /home/diogo/actions-runner/_work/HttpArena/HttpArena/site/data/upload-256.json
[updated] /home/diogo/actions-runner/_work/HttpArena/HttpArena/site/data/upload-32.json
[updated] /home/diogo/actions-runner/_work/HttpArena/HttpArena/site/data/current.json
[info] done
httparena-postgres
httparena-redis
[info] restoring loopback MTU to 65536

Bump the vanilla lib pin 15bd57e -> 7ca36f6 (perf/ktls-record-offload), which
adds kTLS TX+RX after the Mbed TLS handshake: the kernel does TLS 1.3 record
AES-128-GCM, so the epoll TLS worker's steady-state read/write become plain
recv()/send(). This removes ALL per-record userspace crypto and the per-record
PSA key-store mutex that capped the userspace path. The handshake still runs in
Mbed TLS (amortized by keep-alive); the suite is pinned to TLS_AES_128_GCM_SHA256
and TLS 1.3 tickets are disabled so the kernel record sequence starts at 0.

The 5 changed TLS files are identical at 15bd57e and lib HEAD, so this pin is
exactly "15bd57e + kTLS" — a clean A/B vs the userspace json-tls already in this
PR. Key derivation was cross-checked byte-identical against OpenSSL HKDF; the
crypto_info layout/split/rec_seq verified against linux/tls.h + RFC 8446 §7. On a
host without the `tls` kernel module it falls back cleanly to userspace TLS (the
one-time stderr log prints `[ktls] engaged` or `[ktls] fallback`).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@enghitalo

Contributor Author

Update: json-tls now runs over kTLS (kernel record offload)

Pushed a follow-up that bumps the vanilla lib pin 15bd57e → 7ca36f6 (perf/ktls-record-offload). After the Mbed TLS handshake the connection hands record crypto to the kernel (setsockopt TCP_ULP "tls" + TLS_TX/TLS_RX, AES-128-GCM), so the epoll TLS worker's steady-state read/write become plain recv()/send().

Why: the prior userspace run landed at 512k rps (~76% TLS tax relative to plaintext json at 2.15M rps). Root cause: Mbed TLS 4's PSA crypto uses a process-global key store, and MBEDTLS_THREADING_C (required for vanilla's N TLS worker threads) wraps it in a mutex that serializes every record across workers. kTLS removes all per-record userspace crypto, and that mutex, from the hot path; the handshake still uses Mbed TLS (amortized by keep-alive). The suite is pinned to TLS_AES_128_GCM_SHA256 and TLS 1.3 tickets are disabled so the kernel record sequence starts at 0.

Clean A/B: the 5 changed TLS files are byte-identical at 15bd57e and lib HEAD, so this pin is exactly 15bd57e + kTLS.

Local validation (this dev box has no tls kernel module + no sudo, so kTLS can't engage here — only the bench host can):

  • HKDF-Expand-Label key derivation cross-checked byte-identical against openssl kdf HKDF; the tls12_crypto_info_aes_gcm_128 layout / IV-salt split / direction (server→TX, client→RX) / rec_seq=0 verified against linux/tls.h + RFC 8446 §7.
  • Integration + clean fallback: handshake → try kTLS → (TCP_ULP ENOENT here) → userspace Mbed TLS → correct responses (3 validate pairs pass, ALPN http/1.1, cipher pinned to AES-128-GCM, c512 stable). So json-tls never breaks on a host without the module.
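The derivation and layout checks in the first bullet can be sketched in Python as follows. This is an illustration, not the vanilla lib code: the function names are hypothetical, and the constants and field order are what I understand linux/tls.h and RFC 8446 section 7.1 to specify. The input would be the server application traffic secret for TX and the client one for RX.

```python
import hashlib
import hmac
import struct

TLS_1_3_VERSION = 0x0304      # value as defined in linux/tls.h
TLS_CIPHER_AES_GCM_128 = 51   # value as defined in linux/tls.h


def hkdf_expand(prk: bytes, info: bytes, length: int) -> bytes:
    """HKDF-Expand (RFC 5869) with SHA-256."""
    out, block, counter = b"", b"", 1
    while len(out) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        out += block
        counter += 1
    return out[:length]


def hkdf_expand_label(secret: bytes, label: str, context: bytes, length: int) -> bytes:
    """HKDF-Expand-Label: info is length || "tls13 " + label || context, each length-prefixed."""
    full_label = b"tls13 " + label.encode("ascii")
    info = (
        struct.pack(">H", length)
        + bytes([len(full_label)]) + full_label
        + bytes([len(context)]) + context
    )
    return hkdf_expand(secret, info, length)


def crypto_info_aes_gcm_128(traffic_secret: bytes, rec_seq: int = 0) -> bytes:
    """Pack the 40-byte tls12_crypto_info_aes_gcm_128 for one direction."""
    key = hkdf_expand_label(traffic_secret, "key", b"", 16)
    iv12 = hkdf_expand_label(traffic_secret, "iv", b"", 12)
    salt, iv = iv12[:4], iv12[4:]  # IV-salt split: first 4 bytes are the salt
    # Field order: info{version, cipher_type}, iv[8], key[16], salt[4], rec_seq[8]
    return struct.pack(
        "=HH8s16s4s8s",
        TLS_1_3_VERSION, TLS_CIPHER_AES_GCM_128,
        iv, key, salt, rec_seq.to_bytes(8, "big"),
    )
```

The resulting bytes are what would be passed to setsockopt with TLS_TX or TLS_RX after TCP_ULP "tls" succeeds; rec_seq stays 0 because no post-handshake records (tickets) are sent before the handoff.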

A one-time stderr line logs [ktls] engaged: kernel TLS TX+RX (or [ktls] fallback: ...) per process — on the bench host it should read engaged. The benchmark there is what validates the perf (target: top the json-tls board, currently zix @ 1.90M).

🤖 Generated with Claude Code

@enghitalo

Contributor Author

/benchmark -f vanilla-epoll

@github-actions

Contributor

👋 /benchmark request received. A collaborator will review and approve the run.

@github-actions

Contributor

Benchmark Results

Framework: vanilla-epoll | Test: all tests

| Test | Conn | RPS | CPU | Mem | Δ RPS | Δ Mem |
|---|---:|---:|---:|---:|---:|---:|
| baseline | 512 | 3,717,266 | 6390.5% | 78MiB | +1.5% | +13.0% |
| baseline | 4096 | 4,104,461 | 6271.3% | 154MiB | +1.8% | +1.3% |
| pipelined | 512 | 39,474,505 | 6751.9% | 73MiB | ~0% | +12.3% |
| pipelined | 4096 | 39,995,514 | 6507.9% | 165MiB | +1.2% | +5.1% |
| limited-conn | 512 | 990,708 | 3167.0% | 69MiB | ~0% | +13.1% |
| limited-conn | 4096 | 1,011,447 | 2935.5% | 72MiB | -0.1% | -28.0% |
| json | 4096 | 2,151,445 | 6250.7% | 203MiB | +1.1% | +19.4% |
| json-comp | 512 | 2,081,771 | 6172.5% | 79MiB | -1.3% | -8.1% |
| json-comp | 4096 | 2,334,135 | 6252.9% | 110MiB | +1.1% | +3.8% |
| json-comp | 16384 | 2,300,120 | 6404.8% | 180MiB | -0.6% | +68.2% |
| json-tls | 4096 | 1,502,783 | 6163.6% | 398MiB | NEW | NEW |
| upload | 32 | 2,956 | 2069.9% | 134MiB | -0.8% | +4.7% |
| upload | 256 | 3,013 | 3808.1% | 308MiB | +0.5% | -3.4% |
| api-4 | 256 | 66,079 | 373.2% | 113MiB | +0.9% | +13.0% |
| api-16 | 1024 | 238,595 | 1324.4% | 167MiB | -1.8% | +1.2% |
| static | 1024 | 1,100,728 | 5560.9% | 106MiB | -0.7% | +10.4% |
| static | 4096 | 1,150,443 | 5659.8% | 260MiB | ~0% | +2.8% |
| static | 6800 | 1,178,578 | 5588.2% | 378MiB | -0.9% | +1.1% |
| async-db | 1024 | 306,191 | 3198.2% | 151MiB | +8.4% | +6.3% |
| crud | 4096 | 323,510 | 806.0% | 217MiB | +5.3% | +4.8% |
| fortunes | 1024 | 167,008 | 6100.4% | 255MiB | -0.2% | +2.8% |
Full log
  Templates: 20
  Expected:  200
  Duration:  15s


  Thread Stats   Avg      p50      p90      p99    p99.9
    Latency   10.85ms    708us   34.00ms   75.50ms   115.60ms

  4926914 requests in 15.00s, 4926722 responses
  Throughput: 328.39K req/s
  Bandwidth:  98.78MB/s
  Status codes: 2xx=4852650, 3xx=0, 4xx=0, 5xx=74072
  Latency samples: 4926721 / 4926722 responses (100.0%)
  Reconnects: 23129
  Per-template: 96792,120284,155060,189487,221429,253234,283507,311852,335286,352369,363437,368840,370501,374426,379488,378921,131018,61955,82592,96243
  Per-template-ok: 93021,118751,152874,186865,218479,249696,279861,307524,330583,347365,358363,363722,365192,369225,374250,373669,131018,59722,79706,92763

  WARNING: 74072/4926722 responses (1.5%) had unexpected status (expected 2xx)
[info] CPU 806.0% | Mem 217MiB

[run 2/3]
gcannon v0.5.3
  Target:    localhost:8080/
  Threads:   64
  Conns:     4096 (64/thread)
  Pipeline:  1
  Req/conn:  200
  Templates: 20
  Expected:  200
  Duration:  15s


  Thread Stats   Avg      p50      p90      p99    p99.9
    Latency   11.95ms   4.55ms   34.60ms   66.90ms   97.00ms

  4568160 requests in 15.00s, 4567904 responses
  Throughput: 304.48K req/s
  Bandwidth:  93.22MB/s
  Status codes: 2xx=4490647, 3xx=0, 4xx=0, 5xx=77257
  Latency samples: 4567898 / 4567904 responses (100.0%)
  Reconnects: 21245
  Per-template: 102421,120799,153065,181737,210038,239176,264726,289061,309412,319991,324092,327267,327178,327148,327395,330408,147672,78297,88421,99594
  Per-template-ok: 99063,118980,150664,178948,206787,235299,260436,284227,304475,314718,318938,321821,321700,321660,322173,324799,147672,75813,86031,96437

  WARNING: 77257/4567904 responses (1.7%) had unexpected status (expected 2xx)
[info] CPU 795.3% | Mem 227MiB

[run 3/3]
gcannon v0.5.3
  Target:    localhost:8080/
  Threads:   64
  Conns:     4096 (64/thread)
  Pipeline:  1
  Req/conn:  200
  Templates: 20
  Expected:  200
  Duration:  15s


  Thread Stats   Avg      p50      p90      p99    p99.9
    Latency   12.42ms   7.01ms   33.80ms   62.60ms   88.50ms

  4440156 requests in 15.00s, 4440156 responses
  Throughput: 295.96K req/s
  Bandwidth:  91.58MB/s
  Status codes: 2xx=4377703, 3xx=0, 4xx=0, 5xx=62453
  Latency samples: 4440150 / 4440156 responses (100.0%)
  Reconnects: 20510
  Per-template: 105983,124506,153646,179375,207168,233566,259154,282222,298714,305611,307019,306578,309294,311297,310362,309852,155547,85251,92382,102622
  Per-template-ok: 103123,123071,151677,176967,204496,230426,255683,278368,294511,301349,302648,302342,304895,306809,306038,305373,155547,83433,90404,100540

  WARNING: 62453/4440156 responses (1.4%) had unexpected status (expected 2xx)
[info] CPU 789.1% | Mem 231MiB

=== Best: 323510 req/s (CPU: 806.0%, Mem: 217MiB) ===
[info] input BW: 27.77MB/s (avg template: 90 bytes)
[info] saved results/crud/4096/vanilla-epoll.json
httparena-bench-vanilla-epoll
httparena-bench-vanilla-epoll

==============================================
=== vanilla-epoll / fortunes / 1024c (tool=gcannon) ===
==============================================
[info] resetting postgres for a clean per-profile baseline
[info] starting postgres sidecar
httparena-postgres
[info] postgres ready (seeded)
[info] waiting for server...
[info] server ready

[run 1/3]
gcannon v0.5.3
  Target:    localhost:8080/fortunes
  Threads:   64
  Conns:     1024 (16/thread)
  Pipeline:  1
  Req/conn:  unlimited (keep-alive)
  Expected:  200
  Duration:  5s


  Thread Stats   Avg      p50      p90      p99    p99.9
    Latency   4.86ms   4.07ms   8.81ms   14.50ms   29.80ms

  824662 requests in 5.00s, 824662 responses
  Throughput: 164.87K req/s
  Bandwidth:  3.82GB/s
  Status codes: 2xx=824662, 3xx=0, 4xx=0, 5xx=0
  Latency samples: 824660 / 824662 responses (100.0%)
[info] CPU 5774.5% | Mem 257MiB

[run 2/3]
gcannon v0.5.3
  Target:    localhost:8080/fortunes
  Threads:   64
  Conns:     1024 (16/thread)
  Pipeline:  1
  Req/conn:  unlimited (keep-alive)
  Expected:  200
  Duration:  5s


  Thread Stats   Avg      p50      p90      p99    p99.9
    Latency   4.71ms   4.09ms   8.56ms   13.10ms   17.10ms

  835043 requests in 5.00s, 835043 responses
  Throughput: 166.95K req/s
  Bandwidth:  3.87GB/s
  Status codes: 2xx=835043, 3xx=0, 4xx=0, 5xx=0
  Latency samples: 835039 / 835043 responses (100.0%)
[info] CPU 6100.4% | Mem 255MiB

[run 3/3]
gcannon v0.5.3
  Target:    localhost:8080/fortunes
  Threads:   64
  Conns:     1024 (16/thread)
  Pipeline:  1
  Req/conn:  unlimited (keep-alive)
  Expected:  200
  Duration:  5s


  Thread Stats   Avg      p50      p90      p99    p99.9
    Latency   4.66ms   4.07ms   8.36ms   12.60ms   16.10ms

  828026 requests in 5.00s, 828026 responses
  Throughput: 165.54K req/s
  Bandwidth:  3.84GB/s
  Status codes: 2xx=828026, 3xx=0, 4xx=0, 5xx=0
  Latency samples: 828022 / 828026 responses (100.0%)
[info] CPU 6018.0% | Mem 256MiB

=== Best: 167008 req/s (CPU: 6100.4%, Mem: 255MiB) ===
[info] saved results/fortunes/1024/vanilla-epoll.json
httparena-bench-vanilla-epoll
httparena-bench-vanilla-epoll
[info] skip: vanilla-epoll does not subscribe to baseline-h2
[info] skip: vanilla-epoll does not subscribe to static-h2
[info] skip: vanilla-epoll does not subscribe to baseline-h2c
[info] skip: vanilla-epoll does not subscribe to json-h2c
[info] skip: vanilla-epoll does not subscribe to baseline-h3
[info] skip: vanilla-epoll does not subscribe to static-h3
[info] skip: vanilla-epoll does not subscribe to gateway-64
[info] skip: vanilla-epoll does not subscribe to gateway-h3
[info] skip: vanilla-epoll does not subscribe to production-stack
[info] skip: vanilla-epoll does not subscribe to unary-grpc
[info] skip: vanilla-epoll does not subscribe to unary-grpc-tls
[info] skip: vanilla-epoll does not subscribe to stream-grpc
[info] skip: vanilla-epoll does not subscribe to stream-grpc-tls
[info] skip: vanilla-epoll does not subscribe to echo-ws
[info] skip: vanilla-epoll does not subscribe to echo-ws-pipeline
[info] rebuilding site/data/*.json
[updated] /home/diogo/actions-runner/_work/HttpArena/HttpArena/site/data/frameworks.json
[updated] /home/diogo/actions-runner/_work/HttpArena/HttpArena/site/data/api-16-1024.json
[updated] /home/diogo/actions-runner/_work/HttpArena/HttpArena/site/data/api-4-256.json
[updated] /home/diogo/actions-runner/_work/HttpArena/HttpArena/site/data/async-db-1024.json
[updated] /home/diogo/actions-runner/_work/HttpArena/HttpArena/site/data/baseline-4096.json
[updated] /home/diogo/actions-runner/_work/HttpArena/HttpArena/site/data/baseline-512.json
[updated] /home/diogo/actions-runner/_work/HttpArena/HttpArena/site/data/crud-4096.json
[updated] /home/diogo/actions-runner/_work/HttpArena/HttpArena/site/data/fortunes-1024.json
[updated] /home/diogo/actions-runner/_work/HttpArena/HttpArena/site/data/json-4096.json
[updated] /home/diogo/actions-runner/_work/HttpArena/HttpArena/site/data/json-comp-16384.json
[updated] /home/diogo/actions-runner/_work/HttpArena/HttpArena/site/data/json-comp-4096.json
[updated] /home/diogo/actions-runner/_work/HttpArena/HttpArena/site/data/json-comp-512.json
[updated] /home/diogo/actions-runner/_work/HttpArena/HttpArena/site/data/json-tls-4096.json
[updated] /home/diogo/actions-runner/_work/HttpArena/HttpArena/site/data/limited-conn-4096.json
[updated] /home/diogo/actions-runner/_work/HttpArena/HttpArena/site/data/limited-conn-512.json
[updated] /home/diogo/actions-runner/_work/HttpArena/HttpArena/site/data/pipelined-4096.json
[updated] /home/diogo/actions-runner/_work/HttpArena/HttpArena/site/data/pipelined-512.json
[updated] /home/diogo/actions-runner/_work/HttpArena/HttpArena/site/data/static-1024.json
[updated] /home/diogo/actions-runner/_work/HttpArena/HttpArena/site/data/static-4096.json
[updated] /home/diogo/actions-runner/_work/HttpArena/HttpArena/site/data/static-6800.json
[updated] /home/diogo/actions-runner/_work/HttpArena/HttpArena/site/data/upload-256.json
[updated] /home/diogo/actions-runner/_work/HttpArena/HttpArena/site/data/upload-32.json
[updated] /home/diogo/actions-runner/_work/HttpArena/HttpArena/site/data/current.json
[info] done
httparena-postgres
httparena-redis
[info] restoring loopback MTU to 65536

@enghitalo

Contributor Author

kTLS result: json-tls 512k → 1,502,783 rps (2.93×)

The kTLS pin engaged on the bench host — json-tls jumped from the userspace 512,243 to 1,502,783 rps, cutting the TLS tax from ~76% to ~30% (plain json is 2.15M) at slightly lower CPU (6163% vs 6473%). That lands vanilla-epoll in the json-tls top tier (just under swerver's 1.52M), up from mid-pack.
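The percentages above follow directly from the two benchmark tables. A small Python check, using only RPS figures reported in this thread (the helper name `tls_tax` is mine):

```python
def tls_tax(tls_rps: int, plain_rps: int) -> float:
    """Fraction of plaintext throughput lost when the same endpoint runs over TLS."""
    return 1 - tls_rps / plain_rps


# (json-tls RPS, plaintext json RPS) from each benchmark run.
userspace = (512_243, 2_145_461)   # Mbed TLS records in userspace
ktls = (1_502_783, 2_151_445)      # kernel record offload

userspace_tax = tls_tax(*userspace)
ktls_tax = tls_tax(*ktls)
speedup = ktls[0] / userspace[0]

print(f"userspace tax: {userspace_tax:.0%}")
print(f"kTLS tax:      {ktls_tax:.0%}")
print(f"speedup:       {speedup:.2f}x")
```

Each tax is computed against the plaintext json figure from the same run, so run-to-run drift in the plaintext number does not leak into the comparison.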

Also note async-db +8.4% / crud +5.3% this run — the crud -14.6% in the userspace run was just variance (the TLS listener is idle during non-TLS profiles).

This PR is now a solid, mergeable result: correct json-tls (validate passes), fast via kTLS where the tls module is present, and a clean userspace fallback where it isn't (so it never breaks).

Still ~21% behind the io_uring leaders (zix 1.90M / sark 1.83M / ioxide 1.79M). The remaining gap is structural, not crypto: epoll's single-acceptor + per-edge syscalls vs io_uring multishot/batched, plus the 4096-handshake startup storm still going through Mbed TLS's global PSA mutex. Closing it (SO_REUSEPORT per-worker accept to parallelize the handshake storm + accept) is a separate follow-up.
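As a rough illustration of the SO_REUSEPORT idea for that follow-up (a sketch only, in Python rather than V, with a hypothetical helper name): each worker owns its own listening socket on the same port, and the Linux kernel spreads incoming connections across them instead of funnelling every accept through one acceptor.

```python
import socket


def reuseport_listeners(host: str, port: int, workers: int) -> list:
    """Open one listening socket per worker, all bound to the same port.

    Requires a platform with SO_REUSEPORT (Linux 3.9+). Passing port=0 lets
    the kernel pick a free port for the first socket; the rest reuse it.
    """
    listeners = []
    for _ in range(workers):
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        # Must be set on every socket before bind(), or the later binds fail.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
        sock.bind((host, port))
        sock.listen(1024)
        port = sock.getsockname()[1]
        listeners.append(sock)
    return listeners
```

In an epoll server each worker would register its own listener in its own epoll set, so both accept() and the TLS handshakes that follow run in parallel per worker.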

🤖 Generated with Claude Code

enghitalo and others added 2 commits June 29, 2026 09:20
Behavior-neutral cleanup of the kTLS lib commit: extract a ktls_recv helper
(symmetry with ktls_send) and use a single goto-cleanup in vtls_enable_ktls
that scrubs all userspace key material on every path. Same kTLS behavior that
benchmarked at 1.50M rps json-tls; re-pinned so the benchmarked code == the
final reviewed code.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…2AV#79)

kTLS record offload landed on the vanilla lib main via enghitalo/vanilla#79,
so re-point the pin from the (now-merged) feature commit to main @5137a9a.
Same code that benchmarked json-tls at 1.50M rps (2.93x over userspace).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@enghitalo

Contributor Author

/benchmark -f vanilla-epoll --save

@github-actions

Contributor

👋 /benchmark request received. A collaborator will review and approve the run.

@github-actions

Contributor

Benchmark Results

Framework: vanilla-epoll | Test: all tests

| Test | Conn | RPS | CPU | Mem | Δ RPS | Δ Mem |
|---|---:|---:|---:|---:|---:|---:|
| baseline | 512 | 3,669,090 | 6375.2% | 77MiB | +0.2% | +11.6% |
| baseline | 4096 | 4,063,080 | 6302.1% | 154MiB | +0.7% | +1.3% |
| pipelined | 512 | 39,770,883 | 6684.2% | 74MiB | +0.8% | +13.8% |
| pipelined | 4096 | 39,526,902 | 6707.0% | 163MiB | ~0% | +3.8% |
| limited-conn | 512 | 1,004,965 | 3206.9% | 67MiB | +1.4% | +9.8% |
| limited-conn | 4096 | 990,370 | 3029.1% | 108MiB | -2.2% | +8.0% |
| json | 4096 | 2,142,316 | 6408.6% | 178MiB | +0.6% | +4.7% |
| json-comp | 512 | 2,111,855 | 6181.6% | 79MiB | +0.2% | -8.1% |
| json-comp | 4096 | 2,334,382 | 6209.6% | 144MiB | +1.1% | +35.8% |
| json-comp | 16384 | 2,318,394 | 6059.6% | 101MiB | +0.2% | -5.6% |
| json-tls | 4096 | 1,506,349 | 6121.2% | 398MiB | NEW | NEW |
| upload | 32 | 2,968 | 2046.9% | 132MiB | -0.4% | +3.1% |
| upload | 256 | 3,004 | 3662.1% | 332MiB | +0.2% | +4.1% |
| api-4 | 256 | 68,078 | 374.1% | 114MiB | +3.9% | +14.0% |
| api-16 | 1024 | 245,137 | 1406.9% | 180MiB | +0.9% | +9.1% |
| static | 1024 | 1,099,156 | 5539.4% | 106MiB | -0.8% | +10.4% |
| static | 4096 | 1,140,750 | 5635.8% | 259MiB | -0.8% | +2.4% |
| static | 6800 | 1,175,210 | 5607.6% | 375MiB | -1.2% | +0.3% |
| async-db | 1024 | 291,066 | 3114.4% | 151MiB | +3.0% | +6.3% |
| crud | 4096 | 314,523 | 790.0% | 217MiB | +2.4% | +4.8% |
| fortunes | 1024 | 165,704 | 6150.7% | 255MiB | -1.0% | +2.8% |
Full log
  Templates: 20
  Expected:  200
  Duration:  15s


  Thread Stats   Avg      p50      p90      p99    p99.9
    Latency   11.34ms    865us   34.50ms   74.30ms   114.30ms

  4784488 requests in 15.00s, 4784488 responses
  Throughput: 318.92K req/s
  Bandwidth:  96.32MB/s
  Status codes: 2xx=4717853, 3xx=0, 4xx=0, 5xx=66635
  Latency samples: 4784488 / 4784488 responses (100.0%)
  Reconnects: 22374
  Per-template: 96025,120640,154396,187299,218286,249281,279136,305466,328277,340082,348815,353898,356127,356493,358632,359860,131405,61828,82449,96093
  Per-template-ok: 92517,119099,152437,184798,215553,246199,275511,301591,324093,335675,344342,349561,351611,351808,354028,355377,131405,59744,80001,92503

  WARNING: 66635/4784488 responses (1.4%) had unexpected status (expected 2xx)
[info] CPU 790.0% | Mem 217MiB

[run 2/3]
gcannon v0.5.3
  Target:    localhost:8080/
  Threads:   64
  Conns:     4096 (64/thread)
  Pipeline:  1
  Req/conn:  200
  Templates: 20
  Expected:  200
  Duration:  15s


  Thread Stats   Avg      p50      p90      p99    p99.9
    Latency   12.15ms   5.67ms   34.70ms   66.00ms   94.80ms

  4513676 requests in 15.00s, 4513676 responses
  Throughput: 300.86K req/s
  Bandwidth:  92.11MB/s
  Status codes: 2xx=4438004, 3xx=0, 4xx=0, 5xx=75672
  Latency samples: 4513667 / 4513676 responses (100.0%)
  Reconnects: 20915
  Per-template: 103148,121862,154060,181845,210778,239444,265108,286087,300364,310963,318123,319590,319049,319483,323460,325890,146861,78159,89903,99490
  Per-template-ok: 99818,120210,151547,178778,207364,235658,260799,281430,295541,305759,312954,314303,313734,314559,318113,320673,146861,76066,87432,96395

  WARNING: 75672/4513676 responses (1.7%) had unexpected status (expected 2xx)
[info] CPU 773.3% | Mem 225MiB

[run 3/3]
gcannon v0.5.3
  Target:    localhost:8080/
  Threads:   64
  Conns:     4096 (64/thread)
  Pipeline:  1
  Req/conn:  200
  Templates: 20
  Expected:  200
  Duration:  15s


  Thread Stats   Avg      p50      p90      p99    p99.9
    Latency   12.97ms   7.85ms   34.20ms   62.30ms   87.30ms

  4268484 requests in 15.00s, 4268484 responses
  Throughput: 284.51K req/s
  Bandwidth:  87.95MB/s
  Status codes: 2xx=4213451, 3xx=0, 4xx=0, 5xx=55033
  Latency samples: 4268481 / 4268484 responses (100.0%)
  Reconnects: 19619
  Per-template: 105184,123132,149228,175610,203815,228656,252937,274007,284328,290459,289093,292301,291896,292401,291983,289069,150160,88851,92979,102392
  Per-template-ok: 103102,121666,147279,173215,201315,225831,249592,270572,280777,286826,285412,288466,288450,288739,288245,285329,150160,87108,91065,100299

  WARNING: 55033/4268484 responses (1.3%) had unexpected status (expected 2xx)
[info] CPU 762.0% | Mem 232MiB

=== Best: 314523 req/s (CPU: 790.0%, Mem: 217MiB) ===
[info] input BW: 27.00MB/s (avg template: 90 bytes)
[info] saved results/crud/4096/vanilla-epoll.json
httparena-bench-vanilla-epoll
httparena-bench-vanilla-epoll

==============================================
=== vanilla-epoll / fortunes / 1024c (tool=gcannon) ===
==============================================
[info] resetting postgres for a clean per-profile baseline
[info] starting postgres sidecar
httparena-postgres
[info] postgres ready (seeded)
[info] waiting for server...
[info] server ready

[run 1/3]
gcannon v0.5.3
  Target:    localhost:8080/fortunes
  Threads:   64
  Conns:     1024 (16/thread)
  Pipeline:  1
  Req/conn:  unlimited (keep-alive)
  Expected:  200
  Duration:  5s


  Thread Stats   Avg      p50      p90      p99    p99.9
    Latency   4.84ms   4.04ms   8.67ms   15.80ms   34.90ms

  824128 requests in 5.00s, 824128 responses
  Throughput: 164.76K req/s
  Bandwidth:  3.82GB/s
  Status codes: 2xx=824128, 3xx=0, 4xx=0, 5xx=0
  Latency samples: 824128 / 824128 responses (100.0%)
[info] CPU 5880.8% | Mem 258MiB

[run 2/3]
gcannon v0.5.3
  Target:    localhost:8080/fortunes
  Threads:   64
  Conns:     1024 (16/thread)
  Pipeline:  1
  Req/conn:  unlimited (keep-alive)
  Expected:  200
  Duration:  5s


  Thread Stats   Avg      p50      p90      p99    p99.9
    Latency   4.68ms   3.99ms   8.66ms   13.10ms   17.40ms

  828520 requests in 5.00s, 828520 responses
  Throughput: 165.64K req/s
  Bandwidth:  3.84GB/s
  Status codes: 2xx=828520, 3xx=0, 4xx=0, 5xx=0
  Latency samples: 828514 / 828520 responses (100.0%)
[info] CPU 6150.7% | Mem 255MiB

[run 3/3]
gcannon v0.5.3
  Target:    localhost:8080/fortunes
  Threads:   64
  Conns:     1024 (16/thread)
  Pipeline:  1
  Req/conn:  unlimited (keep-alive)
  Expected:  200
  Duration:  5s


  Thread Stats   Avg      p50      p90      p99    p99.9
    Latency   4.74ms   4.18ms   8.62ms   13.20ms   17.50ms

  822357 requests in 5.00s, 822357 responses
  Throughput: 164.41K req/s
  Bandwidth:  3.81GB/s
  Status codes: 2xx=822357, 3xx=0, 4xx=0, 5xx=0
  Latency samples: 822352 / 822357 responses (100.0%)
[info] CPU 6024.3% | Mem 256MiB

=== Best: 165704 req/s (CPU: 6150.7%, Mem: 255MiB) ===
[info] saved results/fortunes/1024/vanilla-epoll.json
httparena-bench-vanilla-epoll
httparena-bench-vanilla-epoll
[info] skip: vanilla-epoll does not subscribe to baseline-h2
[info] skip: vanilla-epoll does not subscribe to static-h2
[info] skip: vanilla-epoll does not subscribe to baseline-h2c
[info] skip: vanilla-epoll does not subscribe to json-h2c
[info] skip: vanilla-epoll does not subscribe to baseline-h3
[info] skip: vanilla-epoll does not subscribe to static-h3
[info] skip: vanilla-epoll does not subscribe to gateway-64
[info] skip: vanilla-epoll does not subscribe to gateway-h3
[info] skip: vanilla-epoll does not subscribe to production-stack
[info] skip: vanilla-epoll does not subscribe to unary-grpc
[info] skip: vanilla-epoll does not subscribe to unary-grpc-tls
[info] skip: vanilla-epoll does not subscribe to stream-grpc
[info] skip: vanilla-epoll does not subscribe to stream-grpc-tls
[info] skip: vanilla-epoll does not subscribe to echo-ws
[info] skip: vanilla-epoll does not subscribe to echo-ws-pipeline
[info] rebuilding site/data/*.json
[updated] /home/diogo/actions-runner/_work/HttpArena/HttpArena/site/data/frameworks.json
[updated] /home/diogo/actions-runner/_work/HttpArena/HttpArena/site/data/api-16-1024.json
[updated] /home/diogo/actions-runner/_work/HttpArena/HttpArena/site/data/api-4-256.json
[updated] /home/diogo/actions-runner/_work/HttpArena/HttpArena/site/data/async-db-1024.json
[updated] /home/diogo/actions-runner/_work/HttpArena/HttpArena/site/data/baseline-4096.json
[updated] /home/diogo/actions-runner/_work/HttpArena/HttpArena/site/data/baseline-512.json
[updated] /home/diogo/actions-runner/_work/HttpArena/HttpArena/site/data/crud-4096.json
[updated] /home/diogo/actions-runner/_work/HttpArena/HttpArena/site/data/fortunes-1024.json
[updated] /home/diogo/actions-runner/_work/HttpArena/HttpArena/site/data/json-4096.json
[updated] /home/diogo/actions-runner/_work/HttpArena/HttpArena/site/data/json-comp-16384.json
[updated] /home/diogo/actions-runner/_work/HttpArena/HttpArena/site/data/json-comp-4096.json
[updated] /home/diogo/actions-runner/_work/HttpArena/HttpArena/site/data/json-comp-512.json
[updated] /home/diogo/actions-runner/_work/HttpArena/HttpArena/site/data/json-tls-4096.json
[updated] /home/diogo/actions-runner/_work/HttpArena/HttpArena/site/data/limited-conn-4096.json
[updated] /home/diogo/actions-runner/_work/HttpArena/HttpArena/site/data/limited-conn-512.json
[updated] /home/diogo/actions-runner/_work/HttpArena/HttpArena/site/data/pipelined-4096.json
[updated] /home/diogo/actions-runner/_work/HttpArena/HttpArena/site/data/pipelined-512.json
[updated] /home/diogo/actions-runner/_work/HttpArena/HttpArena/site/data/static-1024.json
[updated] /home/diogo/actions-runner/_work/HttpArena/HttpArena/site/data/static-4096.json
[updated] /home/diogo/actions-runner/_work/HttpArena/HttpArena/site/data/static-6800.json
[updated] /home/diogo/actions-runner/_work/HttpArena/HttpArena/site/data/upload-256.json
[updated] /home/diogo/actions-runner/_work/HttpArena/HttpArena/site/data/upload-32.json
[updated] /home/diogo/actions-runner/_work/HttpArena/HttpArena/site/data/current.json
[info] done
httparena-postgres
httparena-redis
[info] restoring loopback MTU to 65536
