
feat: add benchmarks for awkward#3548

Closed
pfackeldey wants to merge 12 commits into main from pfackeldey/benchmarks


@pfackeldey
Collaborator

This PR adds benchmarks to awkward.

For every PR, it runs a suite of benchmarks and compares performance metrics between main and the feature branch (merged with main). It reports any improvement or regression of more than 10% in cpu_time.

It would be nice to extend this to run over all tags so we can track how performance evolves across awkward releases.

An example notification for a performance regression of the ak.all operation (axis=None) running on a Jagged array with 65536 elements and dtype="float64" would look like this:
"""

🔹 ak.all(Jagged<65536,f64>, axis=None)

Relative CPU Time Difference: 692.1% — 🔴 Regression

Show full comparison
| Metric | benchmark.json | benchmark2.json |
| --- | --- | --- |
| cpu_time (ms) | 1.712867e-01 | 1.356710e+00 |
| real_time (ms) | 1.712866e-01 | 1.047834e+02 |
| elements/s (Hz) | 3.83e+08 | 4.83e+07 |
"""

This does not yet include all "number-crunching" operations. What we want to benchmark (e.g. which high-level ops) is up for discussion.

Opening this PR now to debug the CI.
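For reference, the 692.1% figure in the example notification above is consistent with a plain relative difference of the two cpu_time values, checked against the 10% threshold. The sketch below only illustrates that arithmetic; the function names are hypothetical and this is not necessarily how the PR's comparison script is written.

```python
def relative_cpu_time_diff(baseline_ms: float, candidate_ms: float) -> float:
    """Return the change in cpu_time relative to the baseline, in percent."""
    if baseline_ms <= 0:
        raise ValueError("baseline cpu_time must be positive")
    return (candidate_ms - baseline_ms) / baseline_ms * 100.0


def classify(diff_percent: float, threshold_percent: float = 10.0) -> str:
    """Label a change; anything within the threshold counts as unchanged."""
    if diff_percent > threshold_percent:
        return "regression"
    if diff_percent < -threshold_percent:
        return "improvement"
    return "unchanged"


# cpu_time values (ms) taken from the example notification above.
diff = relative_cpu_time_diff(1.712867e-01, 1.356710e+00)
print(f"{diff:.1f}% -> {classify(diff)}")  # prints "692.1% -> regression"
```

A symmetric threshold is assumed here, so a 10% speed-up and a 10% slow-down are treated alike.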

pfackeldey and others added 2 commits June 17, 2025 11:40
* debug test 1

* fix output piping for benchmark comparisons

* try handling multiline outputs

* hopefully fix paths

* another try to pass multiline output

* try fix json file names

* fix directory creation for benchmark results

* please precommit

* prettify table headers

* style: pre-commit fixes

* go back to original commit for comparison

* prettify branch name and SHA in table header

* style: pre-commit fixes

* try self-hosted runner

* another try with self-hosted runner

* another try

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
@pfackeldey pfackeldey marked this pull request as ready for review June 18, 2025 19:10
@pfackeldey
Collaborator Author

Hi @ianna,
I think this PR is ready for review. Once this is merged, there will be one extra CI job that runs benchmarks on our self-hosted runner, comparing the current main with the feature branch (merged with main). Potential regressions/improvements will be posted to the PR automatically by the bot, like this: #3550 (comment)
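A rough sketch of what such a comparison step could look like, assuming each run writes a JSON report with a top-level "benchmarks" list whose entries carry "name" and "cpu_time" fields. That layout, the helper names, and the ak.sum numbers are assumptions for illustration only; the ak.all values are the ones from the example notification.

```python
import json


def load_cpu_times(report_text: str) -> dict[str, float]:
    """Map benchmark name -> cpu_time from a JSON report (assumed layout)."""
    report = json.loads(report_text)
    return {entry["name"]: entry["cpu_time"] for entry in report["benchmarks"]}


def find_changes(main_times, branch_times, threshold=0.10):
    """Return the relative cpu_time change of every benchmark beyond the threshold."""
    changes = {}
    # Only benchmarks present in both reports can be compared.
    for name in main_times.keys() & branch_times.keys():
        rel = (branch_times[name] - main_times[name]) / main_times[name]
        if abs(rel) > threshold:
            changes[name] = rel
    return changes


# Illustrative reports; the ak.sum entries are made up.
main_report = """{"benchmarks": [
  {"name": "ak.all(Jagged<65536,f64>, axis=None)", "cpu_time": 1.712867e-01},
  {"name": "ak.sum(Jagged<65536,f64>, axis=None)", "cpu_time": 1.00}
]}"""
branch_report = """{"benchmarks": [
  {"name": "ak.all(Jagged<65536,f64>, axis=None)", "cpu_time": 1.356710e+00},
  {"name": "ak.sum(Jagged<65536,f64>, axis=None)", "cpu_time": 1.02}
]}"""

changes = find_changes(load_cpu_times(main_report), load_cpu_times(branch_report))
for name, rel in changes.items():
    print(f"{name}: {rel:+.1%}")
```

Only the ak.all entry is reported here, since the made-up ak.sum change of 2% stays under the 10% threshold.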

The question is: which things do we want to benchmark? Depending on how many functions we benchmark, the CI job may become rather expensive.

Also, there may be false positives if the runner is under additional load that affects the measured runtime. We can of course adjust the threshold at which the bot posts a comment in the future.
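One common way to reduce such false positives, independent of the threshold, is to repeat each benchmark and compare medians rather than single measurements. This is a minimal sketch of that idea, not something this PR implements; the function names and sample timings are made up for illustration.

```python
from statistics import mean, median


def robust_relative_diff(baseline_samples, candidate_samples):
    """Relative change between the medians of repeated timings."""
    base = median(baseline_samples)
    cand = median(candidate_samples)
    return (cand - base) / base


def should_report(baseline_samples, candidate_samples, threshold=0.10):
    """Report only if the median-based change exceeds the threshold."""
    return abs(robust_relative_diff(baseline_samples, candidate_samples)) > threshold


# Made-up timings (ms): one baseline run was slowed down by a busy runner.
baseline = [1.00, 1.02, 0.99, 1.01, 3.50]
candidate = [1.01, 1.00, 1.03, 0.98, 1.02]

# Comparing means would suggest a large (spurious) improvement.
mean_diff = (mean(candidate) - mean(baseline)) / mean(baseline)
print(f"mean-based change: {mean_diff:+.1%}")
print(f"median-based change: {robust_relative_diff(baseline, candidate):+.1%}")
print("report:", should_report(baseline, candidate))
```

With these samples the single outlier dominates the mean, while the medians of both runs agree, so nothing would be reported.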

Member

@ianna ianna left a comment


@pfackeldey - looks great! As discussed, it merits a separate repo. That would give us more flexibility in what we can profile. Thanks!

@pfackeldey
Collaborator Author

Yes, I'll close this for now and use it as a starting point for a separate repo!

@pfackeldey pfackeldey closed this Jun 26, 2025
@ianna ianna deleted the pfackeldey/benchmarks branch August 2, 2025 11:08
2 participants