https://github.com/tally-project/tally/commit/d4ce54fc5f87d637b6cf464ac0767eed7b5076df
This commit addresses an issue where, when running a PyTorch compiled model, some kernels throw an invalid resource error at launch.
The open question is why. The CUDA context is already initialized in this thread, so why do we need to create a stream here?