The user-facing APIs in `cuda.compute` should be annotated with NVTX ranges so that they show up in Nsight Systems traces.
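
A minimal sketch of what this could look like, assuming the `nvtx` Python package and its `nvtx.annotate` decorator are used. The helper name `annotated`, the domain string, and the stand-in function `example_reduce` are hypothetical and not part of `cuda.compute`; the fallback to a no-op when `nvtx` is missing is also just one possible design.

```python
try:
    import nvtx  # optional dependency; assumed to be the NVTX Python bindings
except ImportError:
    nvtx = None


def annotated(func):
    """Hypothetical helper: wrap `func` in an NVTX range named after it.

    If the `nvtx` package is not installed, the function is returned
    unchanged so that the annotation adds no hard dependency.
    """
    if nvtx is None:
        return func
    # nvtx.annotate works as a decorator; the range is pushed on entry
    # and popped on exit of each call.
    return nvtx.annotate(
        message=f"cuda.compute.{func.__name__}",
        domain="cuda.compute",  # assumed domain name, for grouping in the timeline
    )(func)


@annotated
def example_reduce(values):
    """Stand-in for a user-facing API; a real one would launch GPU work."""
    return sum(values)


result = example_reduce([1, 2, 3])
```

When the program is profiled with Nsight Systems and NVTX tracing is enabled, each call to a decorated function should then appear as a named range in the timeline.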