
Add RTX 6000 speed#107

Open
d3y4n wants to merge 1 commit into antirez:main from d3y4n:feature/speed-cuda
Open

Add RTX 6000 speed#107
d3y4n wants to merge 1 commit into
antirez:mainfrom
d3y4n:feature/speed-cuda

Conversation

@d3y4n

d3y4n commented May 12, 2026

Based on the output of the default benchmark command.

ctx_tokens,prefill_tokens,prefill_tps,gen_tokens,gen_tps,kvcache_bytes
2048,2048,313.21,128,35.66,52184460
4096,2048,317.81,128,35.12,80373132
8192,2048,317.33,128,34.36,136750476
16384,2048,310.08,128,32.78,249505164
32768,2048,296.95,128,31.72,475014540
65536,2048,273.55,128,29.62,926033292
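One quick property worth sanity-checking in these numbers: the kvcache_bytes column should grow linearly with context length. A minimal Python sketch (rows copied from the table above; the per-token figure is just the difference quotient between consecutive rows, not anything ds4-bench itself reports):

```python
# Rows copied from the benchmark table above: (ctx_tokens, kvcache_bytes).
rows = [
    (2048, 52184460),
    (4096, 80373132),
    (8192, 136750476),
    (16384, 249505164),
    (32768, 475014540),
    (65536, 926033292),
]

# Per-token KV-cache cost between consecutive rows; every step should
# give the same bytes/token if the growth is linear.
per_token = [
    (b2 - b1) / (c2 - c1)
    for (c1, b1), (c2, b2) in zip(rows, rows[1:])
]
print(per_token)
```

If the values disagree between steps, the cache is not growing linearly and the kvcache_bytes column would be worth a closer look.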

@tao12345666333

Could you please share the complete operating steps and information about the machine resources?

This result differs significantly from the speed I tested.🤔

https://x.com/i/status/2054161265577308453

@d3y4n
Author

d3y4n commented May 12, 2026

@tao12345666333 good callout, it did indeed look a bit suspicious. I'm running this on a g7e.2xlarge AWS instance at the moment; I will test the card in my own PC in the upcoming weeks.

This is what I ran:

./ds4-bench \
  -m ds4flash.gguf \
  --prompt-file speed-bench/promessi_sposi.txt \
  --ctx-start 2048 \
  --ctx-max 65536 \
  --step-incr 2048 \
  --gen-tokens 128

@luoq

luoq commented May 14, 2026

Another result:

./ds4-bench \
  -m ds4flash.gguf \
  --prompt-file speed-bench/promessi_sposi.txt \
  --ctx-start 2048 \
  --ctx-max 65536 \
  --step-incr 2048 \
  --gen-tokens 128
ds4-bench: context buffers 1311.89 MiB (ctx=65665, backend=cuda, prefill_chunk=2048, raw_kv_rows=2304, compressed_kv_rows=16418)
ds4: CUDA backend initialized on NVIDIA RTX PRO 6000 Blackwell Workstation Edition (sm_120)
ds4: CUDA host registration skipped: operation not supported
ds4: CUDA loading model tensors into device cache: 80.04 GiB
ds4: CUDA startup model cache prepared 80.76 GiB of tensor spans in 19.100s
ds4: cuda backend initialized for graph diagnostics
ctx_tokens,prefill_tokens,prefill_tps,gen_tokens,gen_tps,kvcache_bytes
ds4: CUDA q8 fp16 cache budget exhausted; using q8 kernels (request=8.00 MiB cached=0.00 GiB free=3.42 GiB reserve=4.75 GiB total=94.96 GiB)
2048,2048,375.34,128,44.31,52184460
4096,2048,369.07,128,43.54,80373132
6144,2048,367.39,128,43.25,108561804
8192,2048,366.01,128,42.51,136750476
10240,2048,364.06,128,42.13,164939148
12288,2048,362.11,128,42.02,193127820
14336,2048,359.93,128,41.72,221316492
16384,2048,358.73,128,42.59,249505164
18432,2048,357.38,128,42.37,277693836
20480,2048,356.18,128,42.35,305882508
22528,2048,355.03,128,42.38,334071180
24576,2048,353.84,128,42.21,362259852
26624,2048,352.70,128,41.97,390448524
28672,2048,351.51,128,41.65,418637196
30720,2048,350.41,128,41.33,446825868
32768,2048,349.44,128,38.90,475014540
34816,2048,346.21,128,37.93,503203212
36864,2048,345.57,128,37.68,531391884
38912,2048,344.61,128,37.65,559580556
40960,2048,343.78,128,37.61,587769228
43008,2048,342.95,128,37.50,615957900
45056,2048,342.18,128,37.47,644146572
47104,2048,341.44,128,37.37,672335244
49152,2048,340.76,128,37.31,700523916
51200,2048,339.56,128,37.14,728712588
53248,2048,338.77,128,37.13,756901260
55296,2048,338.09,128,37.05,785089932
57344,2048,337.44,128,36.98,813278604
59392,2048,336.65,128,36.87,841467276
61440,2048,336.07,128,36.64,869655948
63488,2048,335.44,128,36.34,897844620
65536,2048,334.79,128,36.12,926033292
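For comparing runs like this one against the numbers in the PR description, the CSV can be parsed directly. A minimal Python sketch (column names taken from the ds4-bench header line; the two data rows are the first and last rows of the run above, so the drop is measured from 2K to 64K context):

```python
import csv
import io

# Benchmark CSV as emitted by ds4-bench (header plus first and last rows
# of the run above).
CSV_TEXT = """\
ctx_tokens,prefill_tokens,prefill_tps,gen_tokens,gen_tps,kvcache_bytes
2048,2048,375.34,128,44.31,52184460
65536,2048,334.79,128,36.12,926033292
"""

def throughput_drop(csv_text: str) -> tuple[float, float]:
    """Return (prefill_drop_pct, gen_drop_pct) from the first to the last row."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    first, last = rows[0], rows[-1]
    prefill_drop = 100 * (1 - float(last["prefill_tps"]) / float(first["prefill_tps"]))
    gen_drop = 100 * (1 - float(last["gen_tps"]) / float(first["gen_tps"]))
    return prefill_drop, gen_drop

p, g = throughput_drop(CSV_TEXT)
print(f"prefill drops {p:.1f}%, generation drops {g:.1f}% from 2K to 64K ctx")
```

Running the same function over another run's CSV makes the 4x discrepancy discussed below easy to quantify rather than eyeball.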

@d3y4n
Author

d3y4n commented May 15, 2026

Thanks @luoq, that's more in line with what I got. Wondering how @tao12345666333 got almost 4x the prompt-processing speed 🤔
