[cuda backend] int4/8 matvec: vectorized activation load (#20144) #1572
| Job | Run time |
|---|---|
|  | 33s |
|  | 44m 4s |
|  | 13m 14s |
|  | 38m 54s |
|  | 9m 24s |
|  | 10m 19s |
|  | 33m 34s |
|  | 41m 52s |
|  | 10m 38s |
|  | 12m 17s |
|  | 10m 46s |
|  | 10m 59s |
|  | 11m 24s |
|  | 11m 48s |
|  | 15m 1s |
|  | 11m 9s |
|  | 10m 44s |
|  | 10m 53s |
|  | 11m 51s |
|  | 11m 43s |
|  | 12m 3s |
|  | 10m 14s |
|  | 10m 26s |
|  | 16m 29s |
| Total | 6h 20m 19s |