docs(bench): add M5 max #143

Open
fry69 wants to merge 3 commits into antirez:main from fry69:benchmarks

fry69 commented May 14, 2026

This is a benchmark run on my M5 Max (128 GB) machine, against current main (0610591).

Apart from the standard benchmark settings (which took around 7 minutes), I also included a "4x" run (which took around 40 minutes), with all relevant numbers multiplied by 4:

./ds4-bench \
  -m ds4flash.gguf \
  --prompt-file speed-bench/promessi_sposi.txt \
  --ctx-start 2048 \
  --ctx-max 262144 \
  --step-incr 8192 \
  --gen-tokens 512

IMHO this is a bit more illuminating than the base benchmark, where the lines stay mostly flat or linear, so the diagram may reveal a bit more (correct me if I am wrong on this).
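To make the "multiplied by 4" relationship concrete, here is a minimal sketch that back-computes the settings implied by the 4x command and counts how many context sizes each sweep would visit. It assumes every value in the command is exactly 4 times its standard counterpart, and that the sweep simply steps from `--ctx-start` in increments of `--step-incr` while staying at or below `--ctx-max`. The helper `count_points` is hypothetical, and `ds4-bench` may step differently.

```shell
#!/bin/sh
# Hypothetical helper: number of sweep points start, start+step, ... that are <= max.
# Assumes a plain arithmetic sweep; the real ds4-bench stepping may differ.
count_points() {
  # $1 = ctx start, $2 = ctx max, $3 = step increment
  echo $(( ($2 - $1) / $3 + 1 ))
}

# Values taken from the "4x" command above.
CTX_START=2048
CTX_MAX=262144
STEP_INCR=8192
GEN_TOKENS=512
FACTOR=4

# Settings implied for the standard run, ASSUMING each value is exactly 4x the standard one.
BASE_START=$(( CTX_START / FACTOR ))
BASE_MAX=$(( CTX_MAX / FACTOR ))
BASE_STEP=$(( STEP_INCR / FACTOR ))
BASE_GEN=$(( GEN_TOKENS / FACTOR ))

echo "implied base: ctx-start=$BASE_START ctx-max=$BASE_MAX step-incr=$BASE_STEP gen-tokens=$BASE_GEN"
echo "4x sweep points:   $(count_points "$CTX_START" "$CTX_MAX" "$STEP_INCR")"
echo "base sweep points: $(count_points "$BASE_START" "$BASE_MAX" "$BASE_STEP")"
```

Under these assumptions both sweeps visit the same number of context sizes, so the 4x run would not add data points. It spreads them over a context range four times as large and generates four times as many tokens at each one.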

This is mostly for illustration purposes, feel free to close if this is not relevant.

Related PRs: #103 (likely better and more comprehensive) and #97.


Update:

I added two new runs with a fan control app (Macs Fan Control) that sets the fans to "full blast".

This does seem to improve the numbers and shorten the 4x long run (down to ~35 minutes from ~45). Inspired by #126.

VadimDu (Contributor) commented May 14, 2026

Thanks!
I was actually expecting a much higher speed-up in prefill tps compared to M4 Max (I remember Apple claiming a 3x-4x increase...).
It looks like prefill is only 5-10% higher. I guess further optimizations for the new Metal are warranted!

fry69 (Author) commented May 14, 2026

> I was actually expecting a much higher speed-up in prefill tps compared to M4 Max

The M5's superpowers are not fully utilized yet. Have a look at #15 for M5 speedups, specifically for prefill.
