[cuda backend] optimized L_kv threshold for sdpa implementation selection. #13898

Job Run time
2s
29s
46m 50s
46m 42s
19m 23s
16m 50s
22m 13s
21m 31s
1d 0h 0m 0s
31m 50s
34m 29s
34m 49s
31m 21s
22m 55s
27m 42s
22m 2s
18m 24s
33m 8s
30m 22s
28m 48s
1d 0h 0m 0s
23m 36s
28m 28s
3s
0s
0s
2d 9h 1m 57s