feat: add some of the kernels using cuda.compute #3981
The documentation preview is ready to be viewed at http://preview.awkward-array.org.s3-website.us-east-1.amazonaws.com/PR3981
Codecov Report
❌ Your patch check has failed because the patch coverage (43.53%) is below the target coverage (98.00%). You can increase the patch coverage or adjust the target coverage.
Title changed from "awkward_IndexedArray kernels using cuda.compute" to "awkward_IndexedArray and ByteMaskedArray kernels using cuda.compute"
ianna left a comment:
@maxymnaumchyk - Thanks! 12 more kernels migrated to cuda.compute! I'll enable auto-merge. The benchmarks will be updated after it is merged later today. Thanks.

Closes #3978, closes #3997, closes #3998, closes #3999, closes #4000, closes #4001, closes #4002, closes #4003, closes #4004, closes #4005, closes #4006, closes #4007
These kernels are only ~2 times faster:

IndexedArray_reduce_next_nonlocal_nextshifts_64 kernel, before and after: [benchmark screenshots]
IndexedArray_reduce_next_64 kernel, before and after: [benchmark screenshots]
IndexedArray_overlay_mask kernel, before and after: [benchmark screenshots]