[cuda backend] splitk turboquant sdpa for decode #14122

Job Run time
4s
32s
55m 10s
47m 20s
18m 45s
17m 24s
24m 44s
34m 35s
25m 29s
31m 34s
16m 51s
20m 20s
22m 25s
16m 46s
30m 57s
27m 43s
19m 28s
25m 51s
43m 55s
27m 55s
23m 18s
53m 36s
17m 46s
3s
19m 36s
19m 35s
21m 31s
14m 6s
13m 59s
11m 16s
7m 28s
24m 51s
9m 57s
13m 44s
10m 51s
26m 35s
7m 32s
9m 34s
18m 40s
17m 35s
23m 54s
11m 44s
14h 44m 59s