[cuda backend] optimized L_kv threshold for sdpa implementation selection. #13898
| Job | Run time |
|---|---|
| | 2s |
| | 29s |
| | 46m 50s |
| | 46m 42s |
| | 19m 23s |
| | 16m 50s |
| | 22m 13s |
| | 21m 31s |
| | 1d 0h 0m 0s |
| | 31m 50s |
| | 34m 29s |
| | 34m 49s |
| | 31m 21s |
| | 22m 55s |
| | 27m 42s |
| | 22m 2s |
| | 18m 24s |
| | 33m 8s |
| | 30m 22s |
| | 28m 48s |
| | 1d 0h 0m 0s |
| | 23m 36s |
| | 28m 28s |
| | 3s |
| | 0s |
| | 0s |
| Total | 2d 9h 1m 57s |