[cuda backend] splitk turboquant sdpa for decode #14122
| Job | Run time |
|---|---|
| | 4s |
| | 32s |
| | 55m 10s |
| | 47m 20s |
| | 18m 45s |
| | 17m 24s |
| | 24m 44s |
| | 34m 35s |
| | 25m 29s |
| | 31m 34s |
| | 16m 51s |
| | 20m 20s |
| | 22m 25s |
| | 16m 46s |
| | 30m 57s |
| | 27m 43s |
| | 19m 28s |
| | 25m 51s |
| | 43m 55s |
| | 27m 55s |
| | 23m 18s |
| | 53m 36s |
| | 17m 46s |
| | 3s |
| | 19m 36s |
| | 19m 35s |
| | 21m 31s |
| | 14m 6s |
| | 13m 59s |
| | 11m 16s |
| | 7m 28s |
| | 24m 51s |
| | 9m 57s |
| | 13m 44s |
| | 10m 51s |
| | 26m 35s |
| | 7m 32s |
| | 9m 34s |
| | 18m 40s |
| | 17m 35s |
| | 23m 54s |
| | 11m 44s |
| Total | 14h 44m 59s |