feat: class-based shuffler #108

Merged: ilan-gold merged 39 commits into main from ig/class_shuffler on Jan 16, 2026

ilan-gold (Collaborator) commented Jan 13, 2026

Closes #94 and closes #66

codecov Bot commented Jan 14, 2026

Codecov Report

❌ Patch coverage is 90.57971% with 13 lines in your changes missing coverage. Please review.
✅ Project coverage is 92.18%. Comparing base (a7d661d) to head (b8b9601).
⚠️ Report is 59 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/annbatch/io.py | 90.32% | 12 Missing ⚠️ |
| src/annbatch/loader.py | 88.88% | 1 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #108      +/-   ##
==========================================
+ Coverage   90.22%   92.18%   +1.95%     
==========================================
  Files           6        6              
  Lines         583      614      +31     
==========================================
+ Hits          526      566      +40     
+ Misses         57       48       -9     

| Files with missing lines | Coverage Δ |
|---|---|
| src/annbatch/__init__.py | 100.00% <100.00%> (ø) |
| src/annbatch/utils.py | 86.81% <100.00%> (+11.32%) ⬆️ |
| src/annbatch/loader.py | 91.92% <88.88%> (-0.11%) ⬇️ |
| src/annbatch/io.py | 94.24% <90.32%> (-0.58%) ⬇️ |

ilan-gold marked this pull request as ready for review January 14, 2026 15:40

ilan-gold (Collaborator, Author) commented

@felix0097 I am requesting your review, but if you don't want to comb through the code changes, that's fine. Please have a look at the docs/API changes and make sure you're comfortable with them.

ilan-gold requested a review from felix0097 January 14, 2026 15:48

felix0097 (Collaborator) left a comment

Looks good to me now!

To keep track, these were the items we discussed:

  • Only support use_collection for now, so that the user isn't able to do weird stuff
  • Check whether removing the sample_rows function hurts performance for dense data; if yes, potentially add it back in
  • By default, yield var as well

ilan-gold (Collaborator, Author) commented Jan 15, 2026

> Check whether removing the sample_rows function hurts performance for dense data; if yes, potentially add it back in

#25, but let's wait for #101 to reimplement this much more simply

> By default, yield var as well

#109

> Only support use_collection for now, so that the user isn't able to do weird stuff

With one collection? Yes, that's in the implementation.

ilan-gold merged commit e051f31 into main Jan 16, 2026
9 checks passed
ilan-gold deleted the ig/class_shuffler branch January 16, 2026 12:23

Development

Successfully merging this pull request may close these issues.

  • Refactor shuffling functions to be methods on one class called Shuffler
  • Generalize preprocessing to handle custom chunking / X not as a dask Array
