feat: complete CPU kernel migration from `parents` to `offsets` (#3988) by ianna · Pull Request #4056 · scikit-hep/awkward

ianna · 2026-05-16T19:55:27Z

back port from awkward3 branch

* next step in kernel migration from parents to offsets * linter fix * add jax bincount * add cpp kernel to convert parents to offsets * fix typo * fix typo in an auto-generated test * almost there * nearly there * cleanup * fix remaining kernels * final cleanup * format json data * format * initialize maxnextparents = -1 (a sentinel meaning "no bin was touched") * update the kernels to work on offsets! * format * migrate cupy rawkernels from parents to offsets * migrate jax reducers * fix for platfroms where the int64 counts can't be safely cast * add bincount for cupy backend * compact loc * fix windows build and add optional OpenMP support * update cuda kernels * feat: add `missing_repeat` kernel implementation using cuda.compute (#3922) * feat: add missing_repeat implementation using cuda.compute * keep the same name as in cpu kernel * add tests for repetitions>1 and regularsize>1 * style: pre-commit fixes * add keywords * style fixes --------- Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * feat: add `index_rpad_and_clip` kernels implementations using cuda.compute (#3923) * feat: add `index_rpad_and_clip` kernels implementation using cuda.compute * style fix * add a test for `index_rpad_and_clip_axis0` that would have target>length * add keyword names * style * Apply suggestions from Ianna Co-authored-by: Ianna Osborne <ianna.osborne@cern.ch> * style: pre-commit fixes --------- Co-authored-by: Ianna Osborne <ianna.osborne@cern.ch> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * fix: array attrs not being validated at creation and being of inconsistent type (#3996) * convert dict attrs to Attrs type * add test * fix: ensure jax backend uses arrays on cpu only (#3990) * ensure jax uses cpu by default * remove now redundant jax_platform_name setting * change error messages * better errors * remove DeviceArray mentions as that does not exist while we're at it too * almost there * cleanup * cleanup * remove old data file * update cupy kernels * remove exlicit test for depricated kernel * add complex kernels and port them to use offsets * fix complex reducers * add remaining kernels * add kernels for complex and bool sum * move to segmented_reduce * fix typo * promote type * fix complex bool reducer * try another algo * missed one * use type inference * make numba happy * remove reducer overloads * use sum op func for complex types * remove test of depricated code * avoid using == to compare floating-point products * fix boundary tests * cleanup * cleanup * remove dead code * revert lexsort to argmin/max * handrolled lexsort * remove dead code * remove bincount * import numba cuda for jit * fix tests --------- Co-authored-by: maxymnaumchyk <maxymnaumchyk@gmail.com> Co-authored-by: maxymnaumchyk <70752300+maxymnaumchyk@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Iason Krommydas <iason.krom@gmail.com>

ianna

@ikrommyd - please take your time to review this. Thanks!

codecov · 2026-05-16T20:01:13Z

Codecov Report

❌ Patch coverage is 96.48241% with 7 lines in your changes missing coverage. Please review.
✅ Project coverage is 83.64%. Comparing base (35f3e29) to head (496e74f).
⚠️ Report is 2 commits behind head on main.
✅ All tests successful. No failed tests found.

Files with missing lines	Patch %	Lines
src/awkward/contents/listoffsetarray.py	92.10%	3 Missing ⚠️
src/awkward/contents/unionarray.py	60.00%	2 Missing ⚠️
src/awkward/_connect/jax/reducers.py	95.65%	1 Missing ⚠️
src/awkward/_reducers.py	96.15%	1 Missing ⚠️

Additional details and impacted files

Files with missing lines	Coverage Δ
src/awkward/_backends/cupy.py	`100.00% <100.00%> (+4.05%)`	⬆️
src/awkward/_connect/cuda/_compute.py	`39.35% <ø> (-21.86%)`	⬇️
src/awkward/_connect/cuda/reducers.py	`61.11% <100.00%> (-19.89%)`	⬇️
src/awkward/_do.py	`84.14% <100.00%> (-0.56%)`	⬇️
src/awkward/contents/bitmaskedarray.py	`72.25% <100.00%> (ø)`
src/awkward/contents/bytemaskedarray.py	`88.61% <100.00%> (-0.03%)`	⬇️
src/awkward/contents/content.py	`77.45% <100.00%> (ø)`
src/awkward/contents/emptyarray.py	`78.28% <100.00%> (ø)`
src/awkward/contents/indexedarray.py	`86.56% <100.00%> (+1.20%)`	⬆️
src/awkward/contents/indexedoptionarray.py	`89.95% <100.00%> (ø)`
... and 10 more

... and 4 files with indirect coverage changes

ikrommyd · 2026-05-16T20:03:52Z

@ariostas should definitely take a look too (when he's back to wor). I do not fully trust myself with c++ yet but can definitely review the python part changes.

ikrommyd · 2026-05-16T20:04:46Z

One question I have is how are the kernel test data generated? I'm not familiar with this part.
Are they generated from running the kernels? i.e if you have a bug in a kernel, will it propagate to the test data jsons? Or we can trust the test data for kernel correctness?

ianna · 2026-05-16T20:14:13Z

One question I have is how are the kernel test data generated? I'm not familiar with this part. Are they generated from running the kernels? i.e if you have a bug in a kernel, will it propagate to the test data jsons? Or we can trust the test data for kernel correctness?

yes, we can trust the data because these are the data from the tests.

ikrommyd · 2026-05-17T06:19:48Z

I've ran this against HiggsDNA and getting the same results for my analysis. Only small floating point differences that np.isclose coverts (but array_equal errors). I'm pretty sure the kernel changes can cause such numerical differeneces.

ianna · 2026-05-17T06:29:27Z

I've ran this against HiggsDNA and getting the same results for my analysis. Only small floating point differences that np.isclose coverts (but array_equal errors). I'm pretty sure the kernel changes can cause such numerical differeneces.

Thanks @ikrommyd ! Floating point comparison is indeed a tricky one.

ikrommyd · 2026-05-17T06:35:56Z

Yeah totally reasonable. Floating point ops are not commutative.

Anyways I'll test this out a bit more with analyses examples and I also wanna if there's any noticeable performance hit.
Then I'll actually read the code.

ianna · 2026-05-17T06:50:16Z

Yeah totally reasonable. Floating point ops are not commutative.

Anyways I'll test this out a bit more with analyses examples and I also wanna see if there's any noticeable performance hit. Then I'll actually read the code.

Thanks! Don’t worry about the performance yet — that’s the next on my list. First we must insure that the results are correct.👍

github-actions · 2026-05-17T07:11:14Z

The documentation preview is ready to be viewed at http://preview.awkward-array.org.s3-website.us-east-1.amazonaws.com/PR4056

add benchmark script

* feat: add `awkward_IndexedArray_overlay_mask` kernel using cuda.compute * feat: add `awkward_IndexedArray_reduce_next_64` kernel using cuda.compute * feat: add `IndexedArray_reduce_next_nonlocal_nextshifts_64` kernel using cuda.compute * feat: add `ByteMaskedArray_getitem_nextcarry` kernel using cuda.compute * feat: add `awkward_ByteMaskedArray_numnull` kernel using cuda.compute * feat: add `awkward_RegularArray_getitem_jagged_expand` kernel using cuda.compute * add an upper bound * style: pre-commit fixes * feat: add `awkward_RegularArray_getitem_jagged_expand` kernel using cuda.compute * feat: add `awkward_UnionArray_simplify_one` kernel using cuda.compute * feat: add `awkward_ListArray_broadcast_tooffsets` kernel using cuda.compute * feat: add `awkward_ListArray_localindex` kernel using cuda.compute * fix the impl * feat: add `awkward_ListArray_compact_offsets` kernel using cuda.compute * feat: add `awkward_ListArray_combinations_length` kernel using cuda.compute * feat: add `awkward_ListArray_combinations` kernel using cuda.compute * feat: add `awkward_UnionArray_nestedfill_tags_index` kernel using cuda.compute * fix the tests for kernels that are deliberately raising errors * compare `starts` and `stops` separately * ignore `memptr` argument for pylint * style: pre-commit fixes * Add functions for indexing and repeating arrays * style: pre-commit fixes * return unary_transform call for segment_ids * style: pre-commit fixes * update the `awkward_IndexedArray_reduce_next_64` to work with offsets * style: pre-commit fixes --------- Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Ianna Osborne <ianna.osborne@cern.ch>

* feat: add reimplementation of the `awkward_RegularArray` kernels using cuda.compute * reimplement more kernels using cuda.compute * reimplement more kernels using cuda.compute * style: pre-commit fixes * style fix * use the named functions instead of lambdas * add new kernels * add new kernels * fix a type mismatch * style: pre-commit fixes * add new kernels * fix the overflow error * new batch of kernels (draft) * add comments and cleanup * fix the tests messages * add new `BitMaskedArray` kernels * modify the `test_1381_check_errors.py` test * resolve the conflicts --------- Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

ariostas · 2026-05-28T13:56:23Z

@ianna what's the plan with this PR? Is this something that you would want to merge soon or is it to keep like a live comparison of the "Awkward3" changes?

Don’t worry about the performance yet

If this is going in main, we should worry about performance.

ianna · 2026-05-28T16:46:23Z

@ianna what's the plan with this PR? Is this something that you would want to merge soon or is it to keep like a live comparison of the "Awkward3" changes?

The plan is to discuss it at AS meeting.

Don’t worry about the performance yet

If this is going in main, we should worry about performance.

ASIS performance is good enough. And we already know the bottlenecks -- where we can optimize it -- I want to have confirmation from the users that validation passes.

ianna requested a review from ikrommyd May 16, 2026 19:57

ianna commented May 16, 2026

View reviewed changes

ianna and others added 3 commits May 18, 2026 23:50

chore: profile: add benchmark script (#4073)

6f6d816

add benchmark script

ikrommyd requested a review from ariostas May 28, 2026 03:12

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: complete CPU kernel migration from `parents` to `offsets` (#3988)#4056

feat: complete CPU kernel migration from `parents` to `offsets` (#3988)#4056
ianna wants to merge 4 commits into
mainfrom
awkward3

ianna commented May 16, 2026 •

edited

Loading

Uh oh!

ianna left a comment

Uh oh!

codecov Bot commented May 16, 2026 •

edited

Loading

Uh oh!

ikrommyd commented May 16, 2026

Uh oh!

ikrommyd commented May 16, 2026

Uh oh!

ianna commented May 16, 2026

Uh oh!

ikrommyd commented May 17, 2026

Uh oh!

ianna commented May 17, 2026

Uh oh!

ikrommyd commented May 17, 2026

Uh oh!

ianna commented May 17, 2026 •

edited by ikrommyd

Loading

Uh oh!

github-actions Bot commented May 17, 2026

Uh oh!

ariostas commented May 28, 2026

Uh oh!

ianna commented May 28, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Conversation

ianna commented May 16, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

ianna left a comment

Choose a reason for hiding this comment

Uh oh!

codecov Bot commented May 16, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

ikrommyd commented May 16, 2026

Uh oh!

ikrommyd commented May 16, 2026

Uh oh!

ianna commented May 16, 2026

Uh oh!

ikrommyd commented May 17, 2026

Uh oh!

ianna commented May 17, 2026

Uh oh!

ikrommyd commented May 17, 2026

Uh oh!

ianna commented May 17, 2026 • edited by ikrommyd Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

github-actions Bot commented May 17, 2026

Uh oh!

ariostas commented May 28, 2026

Uh oh!

ianna commented May 28, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

ianna commented May 16, 2026 •

edited

Loading

codecov Bot commented May 16, 2026 •

edited

Loading

ianna commented May 17, 2026 •

edited by ikrommyd

Loading