Skip to content

feat: complete CPU kernel migration from parents to offsets (#3988)#4056

Open
ianna wants to merge 4 commits into
mainfrom
awkward3
Open

feat: complete CPU kernel migration from parents to offsets (#3988)#4056
ianna wants to merge 4 commits into
mainfrom
awkward3

Conversation

@ianna
Copy link
Copy Markdown
Member

@ianna ianna commented May 16, 2026

back port from awkward3 branch

* next step in kernel migration from parents to offsets

* linter fix

* add jax bincount

* add cpp kernel to convert parents to offsets

* fix typo

* fix typo in an auto-generated test

* almost there

* nearly there

* cleanup

* fix remaining kernels

* final cleanup

* format json data

* format

* initialize maxnextparents  = -1 (a sentinel meaning "no bin was touched")

* update the kernels to work on offsets!

* format

* migrate cupy rawkernels from parents to offsets

* migrate jax reducers

* fix for platfroms where the int64 counts can't be safely cast

* add bincount for cupy backend

* compact loc

* fix windows build and add optional OpenMP support

* update cuda kernels

* feat: add `missing_repeat` kernel implementation using cuda.compute (#3922)

* feat: add missing_repeat implementation using cuda.compute

* keep the same name as in cpu kernel

* add tests for repetitions>1 and regularsize>1

* style: pre-commit fixes

* add keywords

* style fixes

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* feat: add `index_rpad_and_clip` kernels implementations using cuda.compute (#3923)

* feat: add `index_rpad_and_clip` kernels implementation using cuda.compute

* style fix

* add a test for `index_rpad_and_clip_axis0` that would have target>length

* add keyword names

* style

* Apply suggestions from Ianna

Co-authored-by: Ianna Osborne <ianna.osborne@cern.ch>

* style: pre-commit fixes

---------

Co-authored-by: Ianna Osborne <ianna.osborne@cern.ch>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* fix: array attrs not being validated at creation and being of inconsistent type (#3996)

* convert dict attrs to Attrs type

* add test

* fix: ensure jax backend uses arrays on cpu only (#3990)

* ensure jax uses cpu by default

* remove now redundant jax_platform_name setting

* change error messages

* better errors

* remove DeviceArray mentions as that does not exist while we're at it too

* almost there

* cleanup

* cleanup

* remove old data file

* update cupy kernels

* remove exlicit test for depricated kernel

* add complex kernels and port them to use offsets

* fix complex reducers

* add remaining kernels

* add kernels for complex and bool sum

* move to segmented_reduce

* fix typo

* promote type

* fix complex bool reducer

* try another algo

* missed one

* use type inference

* make numba happy

* remove reducer overloads

* use sum op func for complex types

* remove test of depricated code

* avoid using == to compare floating-point products

* fix boundary tests

* cleanup

* cleanup

* remove dead code

* revert lexsort to argmin/max

* handrolled lexsort

* remove dead code

* remove bincount

* import numba cuda for jit

* fix tests

---------

Co-authored-by: maxymnaumchyk <maxymnaumchyk@gmail.com>
Co-authored-by: maxymnaumchyk <70752300+maxymnaumchyk@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Iason Krommydas <iason.krom@gmail.com>
@ianna ianna requested a review from ikrommyd May 16, 2026 19:57
Copy link
Copy Markdown
Member Author

@ianna ianna left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ikrommyd - please take your time to review this. Thanks!

@codecov
Copy link
Copy Markdown

codecov Bot commented May 16, 2026

Codecov Report

❌ Patch coverage is 96.48241% with 7 lines in your changes missing coverage. Please review.
✅ Project coverage is 83.64%. Comparing base (35f3e29) to head (496e74f).
⚠️ Report is 2 commits behind head on main.
✅ All tests successful. No failed tests found.

Files with missing lines Patch % Lines
src/awkward/contents/listoffsetarray.py 92.10% 3 Missing ⚠️
src/awkward/contents/unionarray.py 60.00% 2 Missing ⚠️
src/awkward/_connect/jax/reducers.py 95.65% 1 Missing ⚠️
src/awkward/_reducers.py 96.15% 1 Missing ⚠️
Additional details and impacted files
Files with missing lines Coverage Δ
src/awkward/_backends/cupy.py 100.00% <100.00%> (+4.05%) ⬆️
src/awkward/_connect/cuda/_compute.py 39.35% <ø> (-21.86%) ⬇️
src/awkward/_connect/cuda/reducers.py 61.11% <100.00%> (-19.89%) ⬇️
src/awkward/_do.py 84.14% <100.00%> (-0.56%) ⬇️
src/awkward/contents/bitmaskedarray.py 72.25% <100.00%> (ø)
src/awkward/contents/bytemaskedarray.py 88.61% <100.00%> (-0.03%) ⬇️
src/awkward/contents/content.py 77.45% <100.00%> (ø)
src/awkward/contents/emptyarray.py 78.28% <100.00%> (ø)
src/awkward/contents/indexedarray.py 86.56% <100.00%> (+1.20%) ⬆️
src/awkward/contents/indexedoptionarray.py 89.95% <100.00%> (ø)
... and 10 more

... and 4 files with indirect coverage changes

@ikrommyd
Copy link
Copy Markdown
Collaborator

@ariostas should definitely take a look too (when he's back to wor). I do not fully trust myself with c++ yet but can definitely review the python part changes.

@ikrommyd
Copy link
Copy Markdown
Collaborator

One question I have is how are the kernel test data generated? I'm not familiar with this part.
Are they generated from running the kernels? i.e if you have a bug in a kernel, will it propagate to the test data jsons? Or we can trust the test data for kernel correctness?

@ianna
Copy link
Copy Markdown
Member Author

ianna commented May 16, 2026

One question I have is how are the kernel test data generated? I'm not familiar with this part. Are they generated from running the kernels? i.e if you have a bug in a kernel, will it propagate to the test data jsons? Or we can trust the test data for kernel correctness?

yes, we can trust the data because these are the data from the tests.

@ikrommyd
Copy link
Copy Markdown
Collaborator

I've ran this against HiggsDNA and getting the same results for my analysis. Only small floating point differences that np.isclose coverts (but array_equal errors). I'm pretty sure the kernel changes can cause such numerical differeneces.

@ianna
Copy link
Copy Markdown
Member Author

ianna commented May 17, 2026

I've ran this against HiggsDNA and getting the same results for my analysis. Only small floating point differences that np.isclose coverts (but array_equal errors). I'm pretty sure the kernel changes can cause such numerical differeneces.

Thanks @ikrommyd ! Floating point comparison is indeed a tricky one.

@ikrommyd
Copy link
Copy Markdown
Collaborator

Yeah totally reasonable. Floating point ops are not commutative.

Anyways I'll test this out a bit more with analyses examples and I also wanna if there's any noticeable performance hit.
Then I'll actually read the code.

@ianna
Copy link
Copy Markdown
Member Author

ianna commented May 17, 2026

Yeah totally reasonable. Floating point ops are not commutative.

Anyways I'll test this out a bit more with analyses examples and I also wanna see if there's any noticeable performance hit. Then I'll actually read the code.

Thanks! Don’t worry about the performance yet — that’s the next on my list. First we must insure that the results are correct.👍

@github-actions
Copy link
Copy Markdown

The documentation preview is ready to be viewed at http://preview.awkward-array.org.s3-website.us-east-1.amazonaws.com/PR4056

ianna and others added 3 commits May 18, 2026 23:50
* feat: add `awkward_IndexedArray_overlay_mask` kernel using cuda.compute

* feat: add `awkward_IndexedArray_reduce_next_64` kernel using cuda.compute

* feat: add `IndexedArray_reduce_next_nonlocal_nextshifts_64` kernel using cuda.compute

* feat: add `ByteMaskedArray_getitem_nextcarry` kernel using cuda.compute

* feat: add `awkward_ByteMaskedArray_numnull` kernel using cuda.compute

* feat: add `awkward_RegularArray_getitem_jagged_expand` kernel using cuda.compute

* add an upper bound

* style: pre-commit fixes

* feat: add `awkward_RegularArray_getitem_jagged_expand` kernel using cuda.compute

* feat: add `awkward_UnionArray_simplify_one` kernel using cuda.compute

* feat: add `awkward_ListArray_broadcast_tooffsets` kernel using cuda.compute

* feat: add `awkward_ListArray_localindex` kernel using cuda.compute

* fix the impl

* feat: add `awkward_ListArray_compact_offsets` kernel using cuda.compute

* feat: add `awkward_ListArray_combinations_length` kernel using cuda.compute

* feat: add `awkward_ListArray_combinations` kernel using cuda.compute

* feat: add `awkward_UnionArray_nestedfill_tags_index` kernel using cuda.compute

* fix the tests for kernels that are deliberately raising errors

* compare `starts` and `stops` separately

* ignore `memptr` argument for pylint

* style: pre-commit fixes

* Add functions for indexing and repeating arrays

* style: pre-commit fixes

* return unary_transform call for segment_ids

* style: pre-commit fixes

* update the `awkward_IndexedArray_reduce_next_64` to work with offsets

* style: pre-commit fixes

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Ianna Osborne <ianna.osborne@cern.ch>
* feat: add reimplementation of the `awkward_RegularArray` kernels using cuda.compute

* reimplement more kernels using cuda.compute

* reimplement more kernels using cuda.compute

* style: pre-commit fixes

* style fix

* use the named functions instead of lambdas

* add new kernels

* add new kernels

* fix a type mismatch

* style: pre-commit fixes

* add new kernels

* fix the overflow error

* new batch of kernels (draft)

* add comments and cleanup

* fix the tests messages

* add new `BitMaskedArray` kernels

* modify the `test_1381_check_errors.py` test

* resolve the conflicts

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
@ikrommyd ikrommyd requested a review from ariostas May 28, 2026 03:12
@ariostas
Copy link
Copy Markdown
Member

@ianna what's the plan with this PR? Is this something that you would want to merge soon or is it to keep like a live comparison of the "Awkward3" changes?

Don’t worry about the performance yet

If this is going in main, we should worry about performance.

@ianna
Copy link
Copy Markdown
Member Author

ianna commented May 28, 2026

@ianna what's the plan with this PR? Is this something that you would want to merge soon or is it to keep like a live comparison of the "Awkward3" changes?

The plan is to discuss it at AS meeting.

Don’t worry about the performance yet

If this is going in main, we should worry about performance.

ASIS performance is good enough. And we already know the bottlenecks -- where we can optimize it -- I want to have confirmation from the users that validation passes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants