Remove unnecessary cuda sync for better perf #9383
| Job | Run time |
|---|---|
| | 25m 3s |
| | 19m 28s |
| | 21m 40s |
| | 18m 33s |
| | 19m 8s |
| | 22m 35s |
| | 21m 10s |
| | 28m 48s |
| | 21m 21s |
| | 21m 22s |
| | 33m 14s |
| | 27m 56s |
| | 23m 29s |
| | 21m 23s |
| | 23m 15s |
| | 36m 49s |
| | 21m 15s |
| | 28m 14s |
| | 35m 48s |
| | 28m 19s |
| | 33m 41s |
| | 24m 57s |
| | 24m 10s |
| | 23m 57s |
| | 29m 51s |
| | 34m 42s |
| | 25m 16s |
| | 2s |
| | 24m 12s |
| | 25m 23s |
| | 17m 2s |
| | 18m 23s |
| | 31m 1s |
| | 31m 31s |
| | 17m 6s |
| | 31m 22s |
| | 31m 37s |
| | 31m 37s |
| | 16m 58s |
| | 31m 22s |
| | 18m 49s |
| | 17m 29s |
| | 18m 34s |
| | 16m 52s |
| Total | 17h 54m 44s |