Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughThis PR introduces a comprehensive mask builder framework for assembling tile-based image predictions into full masks, featuring multiple aggregation strategies (averaging, max), storage backends (in-memory, memory-mapped), receptive field manipulation (clipping, auto-scaling), and a factory for dynamic builder composition. Changes
Sequence Diagram(s)sequenceDiagram
participant Client
participant Factory as MaskBuilderFactory
participant Builder as Concrete Builder<br/>(e.g., ScalarMeanMB)
participant Storage as Storage Mixin<br/>(Array/MemMap)
participant Aggregator as Aggregator Mixin<br/>(Mean/Max)
Client->>Factory: create(use_memmap, aggregation,<br/>auto_scale, expand_scalars)
Factory-->>Client: Builder Class
Client->>Builder: __init__(extents, channels,<br/>tile_extents, strides, dtype)
Builder->>Storage: allocate_accumulator()
Storage-->>Builder: accumulator
Builder->>Builder: setup_memory()
loop For each batch of tiles
Client->>Builder: update_batch(data_batch,<br/>coords_batch)
Builder->>Builder: transform coordinates<br/>(expand/rescale if needed)
Builder->>Aggregator: update_batch_impl()
Aggregator->>Storage: accumulator[slices] OP= data
Storage-->>Aggregator: updated
end
Client->>Builder: finalize()
Builder->>Aggregator: finalize()
Aggregator-->>Builder: mask (and overlap_counter<br/>if averaging)
Builder-->>Client: assembled mask(s)
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes The PR introduces a substantial new subsystem with intricate mixin-based composition, multiple interacting abstract classes, platform-specific dependencies, and extensive public API surface expansion (23+ new classes). While individual components follow clear patterns, the cooperative MRO semantics, memory management strategies (memmap cleanup), and factory logic require careful verification across cohorts. The extensive test suite aids review but demands validation of coordinate transformations, batch accumulation logic, and overlap handling correctness. Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Summary of ChangesHello @AdamBajger, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly refactors the mask building infrastructure by introducing a highly modular and extensible Highlights
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request introduces a well-structured, mixin-based architecture for mask builders, which is a great improvement for modularity and reusability. The separation of concerns into aggregation, storage, and receptive field manipulation is clean. However, there are several critical issues, mostly related to missing imports, that will prevent the code from running. There are also some bugs, particularly in temporary file handling for memory-mapped arrays, and several inconsistencies in docstrings and tests. Once these issues are addressed, this will be a solid contribution.
There was a problem hiding this comment.
Actionable comments posted: 14
🤖 Fix all issues with AI agents
In @pyproject.toml:
- Line 30: The dependency entry for jaxtyping is missing a version constraint;
update the dependency list by replacing the bare "jaxtyping" entry with a
minimum-version specifier "jaxtyping>=0.3.5" so it matches the project's
convention of pinning minimum versions like numpy>=2.2.2 and torch>=2.6.0.
In @ratiopath/masks/mask_builders/__init__.py:
- Around line 231-245: The override of setup_memory in the class does all
allocation itself (calls allocate_accumulator and sets self.accumulator and
self.overlap_counter) but does not call super().setup_memory(...); either add a
call to super().setup_memory(mask_extents, channels, accumulator_filepath,
overlap_counter_filepath, **kwargs) at the start (or end) to preserve
cooperative mixin behavior, or add a short docstring/comment on this
setup_memory method explaining why the base implementation is intentionally
bypassed; ensure references to allocate_accumulator, self.accumulator and
self.overlap_counter remain correct if you call super().
- Around line 30-60: Update the docstring examples to use the correct package
name "ratiopath" and call the mask builder API using update_batch(data_batch,
coords_batch) instead of update_batch(features, xs, ys); specifically, change
imports to from ratiopath.masks.mask_builders import
AveragingScalarUniformTiledNumpyMaskBuilder, ensure generate_tiles_from_slide
remains as the tile producer, and pass coords_batch as a single value (e.g., a
tuple or list containing xs and ys) when calling
mask_builder.update_batch(data_batch, coords_batch); apply the same fixes to the
other examples mentioned (lines 99-127 and 175-205) and ensure the finalize()
usage and assembled_mask/overlap variable names remain consistent.
- Around line 1-9: The module uses the jaxtyping type alias Int64 in multiple
annotations but never imports it; add "Int64" from jaxtyping to the imports and
reorder imports into stdlib, third-party, then local groups. Specifically,
update the top-level import block to include Int64 (so Int64 is available for
annotations referenced throughout the file) and ensure import ordering follows
the stdlib / third-party / local convention while keeping existing symbols like
AveragingMaskBuilderMixin, MaxMaskBuilderMixin, AccumulatorType,
AutoScalingConstantStrideMixin, EdgeClippingMaskBuilder2DMixin,
ScalarUniformTiledMaskBuilder, NumpyArrayMaskBuilderAllocatorMixin, and
NumpyMemMapMaskBuilderAllocatorMixin unchanged.
In @ratiopath/masks/mask_builders/aggregation.py:
- Around line 65-104: MaxMaskBuilderMixin must override setup_memory() to
reinitialize the allocated accumulator to the proper minimum value (not zeros)
so max-reduction with negative tiles isn't clamped to 0: call
super().setup_memory(), obtain the accumulator dtype, compute the dtype min (use
np.finfo for floats and np.iinfo for ints), and then set every element of
self.accumulator to that min (via fill or np.full/assignment) so subsequent
np.maximum calls produce correct results for negative values.
- Around line 1-5: Add the missing imports: import numpy as np (used by
np.asarray and np.maximum) and import the Int64 and Shaped type names from the
project typing module (e.g., from ratiopath.typing import Int64, Shaped) so the
annotations and calls referencing Int64 and Shaped compile; update the top of
the file alongside the existing AccumulatorType, MaskBuilder, compute_acc_slices
import so np, Int64, and Shaped are available where functions like
compute_acc_slices, AccumulatorType, and MaskBuilder use them.
In @ratiopath/masks/mask_builders/mask_builder.py:
- Line 2: Remove the unused import of the math module in mask_builder.py: delete
the line importing math at the top of
ratiopath/masks/mask_builders/mask_builder.py since no functions or classes in
this file reference math; ensure no other references to math exist in functions
or classes within this module (e.g., any methods in MaskBuilder or related
helpers) before removing.
In @ratiopath/masks/mask_builders/receptive_field_manipulation.py:
- Line 56: Rename the misspelled variable adjusterd_coords_batch to
adjusted_coords_batch where it is defined and used (in the expression
coords_batch + self.clip_start_indices[:, np.newaxis]) so all references in
receptive_field_manipulation.py consistently use adjusted_coords_batch; update
any subsequent usages of adjusterd_coords_batch to the corrected name.
- Around line 1-4: The import ordering is incorrect: move the standard
library/third-party import "from typing import Any" to appear before the local
module import "from ratiopath.masks.mask_builders.mask_builder import
AccumulatorType, MaskBuilder", remove the extra blank line between imports, and
ensure imports follow the standard group order (stdlib, third-party, local) to
satisfy the linter/isort rules.
In @ratiopath/masks/mask_builders/storage.py:
- Around line 1-7: Imports in storage.py are mis-ordered (the local package
import from ratiopath.masks.mask_builders.mask_builder appears before
standard-library imports), causing linter I001; reorder the imports so
standard-library modules (tempfile, pathlib.Path, typing.Any) come first, then
third-party (if any), then local package imports (the line importing
AccumulatorType and MaskBuilder), remove the extra blank line between them, and
then run the project formatter/import sorter (e.g., isort/black) to satisfy CI.
- Around line 1-7: The file is missing numpy and typing imports (e.g., import
numpy as np, import numpy.typing as npt, and Int64 from pandas or appropriate
module) and allocate_accumulator incorrectly creates an np.memmap via
tempfile.NamedTemporaryFile which deletes the backing file when the context
exits; update imports to include np, npt, and Int64 (or the correct alias used
elsewhere) and change allocate_accumulator to create a stable tempfile path
(e.g., use tempfile.mkstemp or NamedTemporaryFile(delete=False) or
tempfile.TemporaryDirectory) and call np.lib.format.open_memmap or np.memmap on
that persistent path, ensure the file handle is closed and the path returned
alongside the memmap or the file is not deleted on context exit so the returned
memmap remains valid (modify function names allocate_accumulator and any use of
open_memmap/np.memmap accordingly).
In @ratiopath/masks/write_big_tiff.py:
- Around line 45-46: Validate the mpp inputs before computing xres/yres: check
that mpp_x and mpp_y are provided and greater than zero (e.g., in the same
function where xres = 1000 / mpp_x and yres = 1000 / mpp_y, likely in
write_big_tiff or its caller), and if not raise a clear ValueError (or
TypeError) with a message like "mpp_x/mpp_y must be positive numbers"; this
prevents ZeroDivisionError and invalid resolutions and ensures downstream code
receives valid resolution values.
In @tests/test_mask_builders.py:
- Around line 259-299: The test test_numpy_memmap_tempfile_management currently
only checks the temp file is gone after del assembler and doesn't verify that
the temp file existed while the builder was alive; update the test to assert
that the captured path in captured_files (from intercepting_namedtempfile)
exists immediately after constructing
AutoScalingAveragingClippingNumpyMemMapMaskBuilder2D (and/or after calling
assembler.update_batch) before deleting the assembler, so you prove the
memmap/tempfile was created and lived during the builder's lifetime; if the
builder unlinks immediately on creation, adjust the expectation to reflect that
behavior and assert accordingly.
- Around line 1-12: Tests import block is unsorted causing I001 lint; run the
project's import sorter (e.g., isort/ruff -a or the configured formatter) on
tests/test_mask_builders.py to reorder and group imports: standard library
(tempfile, Path), third-party (numpy as np, pytest), then local package imports.
Also combine multiple imports from ratiopath.masks.mask_builders into
consolidated, alphabetized import statements (including
AveragingScalarUniformTiledNumpyMaskBuilder,
MaxScalarUniformTiledNumpyMaskBuilder,
AutoScalingScalarUniformValueConstantStrideMaskBuilder,
AutoScalingAveragingClippingNumpyMemMapMaskBuilder2D) so the import block
follows the project's style.
🧹 Nitpick comments (4)
ratiopath/masks/write_big_tiff.py (1)
39-43: Adjust stacklevel for more useful warning location.The
stacklevel=1points to this function itself. Usingstacklevel=2would point to the caller, making the warning more actionable.📍 Proposed fix
- "Make sure the resolution is in µm/pixel not in pixels/mm.", stacklevel=1 + "Make sure the resolution is in µm/pixel not in pixels/mm.", stacklevel=2ratiopath/masks/mask_builders/mask_builder.py (1)
84-96: Consider documenting dtype handling in setup_memory.The method calls
allocate_accumulatorwithout explicitly passingdtype, relying on it being provided via**kwargsor using the default in the implementation. While this works (as seen in storage.py implementations withdtype=np.float32), explicitly documenting this behavior would improve clarity.Consider adding a note to the docstring:
def setup_memory(self, mask_extents, channels, **kwargs) -> None: """This method sets up memory structures needed for mask building. ...existing documentation... Note: The dtype for the accumulator can be specified via kwargs and will default to np.float32 if not provided. """ratiopath/masks/mask_builders/receptive_field_manipulation.py (1)
162-199: Consider validating source_extents >= source_tile_extents.The calculation at line 185 assumes
source_extents >= source_tile_extents. If this assumption is violated,total_stridescould be negative, leading to unexpected behavior. Adding an explicit check would make the constraint clear and fail fast with a helpful error message.🛡️ Proposed validation
) -> None: + if not np.all(source_extents >= source_tile_extents): + raise ValueError( + f"source_extents must be >= source_tile_extents in all dimensions, " + f"got source_extents={source_extents}, source_tile_extents={source_tile_extents}" + ) + # self.source_extents = source_extents self.source_tile_extents = source_tile_extentsratiopath/masks/mask_builders/__init__.py (1)
258-263: Decide whetherAutoScalingScalarUniformValueConstantStrideMaskBuildershould be in__all__.
Tests import it explicitly (works), but__all__suggests it’s not part of the intended public surface.
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (9)
pyproject.tomlratiopath/masks/mask_builders/__init__.pyratiopath/masks/mask_builders/aggregation.pyratiopath/masks/mask_builders/mask_builder.pyratiopath/masks/mask_builders/receptive_field_manipulation.pyratiopath/masks/mask_builders/storage.pyratiopath/masks/write_big_tiff.pytests/conftest.pytests/test_mask_builders.py
🧰 Additional context used
🧬 Code graph analysis (6)
tests/test_mask_builders.py (4)
ratiopath/masks/mask_builders/__init__.py (3)
AveragingScalarUniformTiledNumpyMaskBuilder(11-77)MaxScalarUniformTiledNumpyMaskBuilder(80-145)AutoScalingScalarUniformValueConstantStrideMaskBuilder(265-323)ratiopath/masks/mask_builders/mask_builder.py (2)
update_batch(98-119)finalize(145-158)ratiopath/masks/mask_builders/receptive_field_manipulation.py (3)
update_batch(41-60)update_batch(201-209)update_batch(257-268)ratiopath/masks/mask_builders/aggregation.py (2)
finalize(59-62)finalize(103-104)
ratiopath/masks/mask_builders/mask_builder.py (4)
ratiopath/masks/mask_builders/__init__.py (1)
setup_memory(231-244)ratiopath/masks/mask_builders/aggregation.py (5)
setup_memory(28-33)update_batch_impl(35-57)update_batch_impl(76-100)finalize(59-62)finalize(103-104)ratiopath/masks/mask_builders/storage.py (2)
allocate_accumulator(29-55)allocate_accumulator(65-72)ratiopath/masks/mask_builders/receptive_field_manipulation.py (3)
update_batch(41-60)update_batch(201-209)update_batch(257-268)
ratiopath/masks/mask_builders/aggregation.py (3)
ratiopath/masks/mask_builders/mask_builder.py (6)
MaskBuilder(43-158)compute_acc_slices(18-40)setup_memory(84-96)allocate_accumulator(74-82)update_batch_impl(122-142)finalize(145-158)ratiopath/masks/mask_builders/__init__.py (1)
setup_memory(231-244)ratiopath/masks/mask_builders/storage.py (2)
allocate_accumulator(29-55)allocate_accumulator(65-72)
ratiopath/masks/mask_builders/storage.py (1)
ratiopath/masks/mask_builders/mask_builder.py (2)
MaskBuilder(43-158)allocate_accumulator(74-82)
ratiopath/masks/mask_builders/__init__.py (4)
ratiopath/masks/mask_builders/aggregation.py (3)
AveragingMaskBuilderMixin(7-62)MaxMaskBuilderMixin(65-104)setup_memory(28-33)ratiopath/masks/mask_builders/receptive_field_manipulation.py (2)
AutoScalingConstantStrideMixin(119-209)EdgeClippingMaskBuilder2DMixin(63-116)ratiopath/masks/mask_builders/storage.py (4)
NumpyArrayMaskBuilderAllocatorMixin(58-72)NumpyMemMapMaskBuilderAllocatorMixin(9-55)allocate_accumulator(29-55)allocate_accumulator(65-72)ratiopath/masks/mask_builders/mask_builder.py (2)
setup_memory(84-96)allocate_accumulator(74-82)
ratiopath/masks/mask_builders/receptive_field_manipulation.py (1)
ratiopath/masks/mask_builders/mask_builder.py (2)
MaskBuilder(43-158)update_batch(98-119)
🪛 GitHub Actions: Lint
tests/test_mask_builders.py
[warning] 1-1: I001 Import block is un-sorted or un-formatted. Organize imports.
ratiopath/masks/mask_builders/mask_builder.py
[warning] 2-2: F401 Unused import 'math'. Remove to fix lint.
ratiopath/masks/mask_builders/aggregation.py
[warning] 24-25: F821 Undefined name 'Int64' (likely missing import/alias).
ratiopath/masks/mask_builders/storage.py
[warning] 1-1: I001 Import block is un-sorted or un-formatted. Organize imports.
ratiopath/masks/mask_builders/__init__.py
[error] 65-70: Undefined name 'Int64' (F821) reported by Ruff in multiple type hints.
ratiopath/masks/mask_builders/receptive_field_manipulation.py
[warning] 1-1: I001 Import block is un-sorted or un-formatted. Organize imports.
🪛 GitHub Actions: Tests
tests/test_mask_builders.py
[error] 1-1: pytest collection failed due to missing libvips dependencies. ModuleNotFoundError: No module named '_libvips'. Additionally, OSError: cannot load library 'libvips.so.42' during import of pyvips. Process completed with exit code 2. Command: uv run pytest.
🔇 Additional comments (6)
tests/conftest.py (1)
5-8: LGTM!The fixture correctly establishes deterministic behavior for NumPy-based tests.
ratiopath/masks/mask_builders/mask_builder.py (1)
18-40: LGTM!The function correctly computes slice objects for accumulator indexing. The logic is clear and the implementation handles batch processing efficiently.
ratiopath/masks/mask_builders/receptive_field_manipulation.py (2)
89-116: LGTM!The parameter parsing logic correctly handles all three clip formats (single int, tuple of 2, tuple of 4) and provides clear error messages for invalid inputs.
242-268: LGTM!The GCD-based compression strategy is elegant and correctly reduces memory usage for uniform tiles. The broadcasting logic in
update_batchcorrectly expands scalar values into tiles.tests/test_mask_builders.py (2)
14-75: > Likely an incorrect or invalid review comment.
1-12: Remove incorrect CI diagnosis; address minor import and seeding issues.The imports from
ratiopath.masks.mask_buildersdo not triggerpyvipsunconditional imports—consolidate multiple imports from the same module into a single statement (lines 7–11). Additionally, the explicitnp.random.seed(0)at line 155 overrides the autouse fixture seed; either remove it to rely on the fixture or document the intentional override.Likely an incorrect or invalid review comment.
There was a problem hiding this comment.
Pull request overview
This PR introduces a comprehensive mask building framework for assembling large masks from tiled data. The implementation provides a flexible, modular architecture using multiple inheritance patterns with mixins.
Key changes include:
- A base
MaskBuilderclass with abstract interfaces for accumulator allocation, batch updates, and finalization - Mixin classes for different aggregation strategies (averaging, max-pooling), storage backends (numpy arrays, memory-mapped files), and preprocessing operations (edge clipping, auto-scaling)
- Concrete builder classes combining mixins for specific use cases
- BigTIFF export functionality via pyvips
- Comprehensive test suite covering various parameter combinations
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 19 comments.
Show a summary per file
| File | Description |
|---|---|
ratiopath/masks/mask_builders/mask_builder.py |
Defines base abstract class with cooperative multiple inheritance pattern for mask building |
ratiopath/masks/mask_builders/aggregation.py |
Implements averaging and max aggregation mixins for overlapping tiles |
ratiopath/masks/mask_builders/receptive_field_manipulation.py |
Provides edge clipping and auto-scaling mixins for coordinate transformation |
ratiopath/masks/mask_builders/storage.py |
Defines storage allocation strategies (numpy arrays and memory-mapped files) |
ratiopath/masks/mask_builders/__init__.py |
Exports concrete builder classes combining various mixins |
ratiopath/masks/write_big_tiff.py |
Adds utility function for exporting masks as BigTIFF files with pyramids |
tests/test_mask_builders.py |
Comprehensive test suite covering various builder configurations and edge cases |
tests/conftest.py |
Adds pytest fixture for deterministic random seed management |
pyproject.toml |
Adds jaxtyping dependency for type annotations |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
There was a problem hiding this comment.
Actionable comments posted: 8
🤖 Fix all issues with AI agents
In @ratiopath/masks/mask_builders/__init__.py:
- Around line 1-146: The examples in the docstrings for
AveragingScalarUniformTiledNumpyMaskBuilder and
MaxScalarUniformTiledNumpyMaskBuilder are out of sync: update imports to use
ratiopath (not rationai) and change the update_batch call to the new signature
by constructing a single coords_batch (stack ys and xs into shape (2, B) or (N,
B) as required) and call update_batch(data_batch=features,
coords_batch=coords_batch); keep the rest of the example text but replace the
old update_batch(features, xs, ys) usage and show building coords via
np.stack([ys, xs], axis=0) (or the correct (y,x) order) before calling
update_batch.
- Around line 231-256: Update the get_vips_scale_factors docstring to include a
blank line before the "Returns:" section to satisfy D411, and ensure the
returned values are plain Python floats (not NumPy scalar types) by casting the
computed scale_factors derived from self.overflow_buffered_source_extents /
self.accumulator.shape[1:] to float before returning (e.g., convert each element
of scale_factors to float and return a tuple of two floats).
In @ratiopath/masks/mask_builders/aggregation.py:
- Around line 1-65: Remove the unused "numpy.typing as npt" import and reorder
imports to stdlib → third-party → relative (move the local ratiopath import
after third-party numpy/jaxtyping), and change
AveragingMaskBuilderMixin.finalize to avoid allocating a clipped temporary by
performing in-place guarded division using numpy.divide with out= and where=
(divide self.accumulator by self.overlap_counter in-place where overlap_counter
!= 0), then return the accumulators as before.
In @ratiopath/masks/mask_builders/receptive_field_manipulation.py:
- Around line 1-62: Rename the misspelled local variable adjusterd_coords_batch
to adjusted_coords_batch in EdgeClippingMaskBuilderMixin.update_batch and update
its use in the super().update_batch call; also reorder imports to follow stdlib
→ third-party → local (e.g., from typing import Any; from jaxtyping import
Shaped, Int64; import numpy as np; then from
ratiopath.masks.mask_builders.mask_builder import AccumulatorType, MaskBuilder),
removing extraneous blank lines so import groups are grouped correctly.
- Around line 213-269: Convert mask_extents, mask_tile_extents, and
mask_tile_strides to numpy arrays at the start of
ScalarUniformTiledMaskBuilder.__init__ (e.g., via np.asarray(..., dtype=int)) to
defensively handle tuple inputs and ensure subsequent array ops (np.gcd, //)
behave as expected; then proceed to compute self.compression_factors,
adjusted_mask_extents, adjusted_tile_extents and call super() as before, leaving
update_batch unchanged.
- Around line 120-211: The constructor of AutoScalingConstantStrideMixin assumes
NumPy array semantics but accepts tuple inputs; coerce source_extents,
source_tile_extents, source_tile_strides, and mask_tile_extents to NumPy arrays
(e.g. via np.asarray(..., dtype=np.int64) or np.array(..., dtype=np.int64)) at
the very start of __init__() so subsequent arithmetic (the multiplied_ check,
total_strides calc, overflow_buffered_* math, and update_batch scaling) works
reliably; update references to self.source_tile_extents and
self.mask_tile_extents after coercion (matching the pattern used in
EdgeClippingMaskBuilderMixin).
In @ratiopath/masks/mask_builders/storage.py:
- Around line 1-56: The allocate_accumulator implementation in
NumpyMemMapMaskBuilderAllocatorMixin is unsafe on Windows and imports are
misordered: replace the tempfile.NamedTemporaryFile context with an
instance-level TemporaryDirectory kept as self._memmap_tmpdir, create temp files
inside that dir with delete=False, close the file handle before calling
np.lib.format.open_memmap, and keep the backing file alive for the object's
lifetime so memmaps can be reopened (use open_memmap on the file path string);
also reorder imports so stdlib (pathlib, typing, tempfile) come first, then
third-party (numpy, numpy.typing, jaxtyping), then local
(ratiopath.masks.mask_builders.mask_builder).
In @ratiopath/masks/write_big_tiff.py:
- Around line 6-51: The write_big_tiff function must validate mpp_x and mpp_y
and pass a string filename to pyvips: check that mpp_x > 0 and mpp_y > 0 and
raise a ValueError with a clear message if not, compute xres/yres as before
(1000 / mpp_x, 1000 / mpp_y), and call image.tiffsave using str(path) instead of
the Path object; update references to path, mpp_x, mpp_y, write_big_tiff, and
image.tiffsave accordingly.
🧹 Nitpick comments (1)
ratiopath/masks/mask_builders/aggregation.py (1)
67-106: Use in-place max to avoid per-tile temporary allocations.
np.maximum(a, b)allocates; for large tiles/batches this can add up. Useout=against the accumulator view.Proposed fix
for acc_slices, data in zip( zip(*acc_slices_all_dims, strict=True), data_batch, strict=True, ): - self.accumulator[ - :, - *acc_slices, - ] = np.maximum( - self.accumulator[ - :, - *acc_slices, - ], - data, - ) + acc_view = self.accumulator[:, *acc_slices] + np.maximum(acc_view, data, out=acc_view)
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
ratiopath/masks/mask_builders/__init__.pyratiopath/masks/mask_builders/aggregation.pyratiopath/masks/mask_builders/receptive_field_manipulation.pyratiopath/masks/mask_builders/storage.pyratiopath/masks/write_big_tiff.py
🧰 Additional context used
🧬 Code graph analysis (3)
ratiopath/masks/mask_builders/storage.py (1)
ratiopath/masks/mask_builders/mask_builder.py (2)
MaskBuilder(43-158)allocate_accumulator(74-82)
ratiopath/masks/mask_builders/aggregation.py (3)
ratiopath/masks/mask_builders/mask_builder.py (6)
MaskBuilder(43-158)compute_acc_slices(18-40)setup_memory(84-96)allocate_accumulator(74-82)update_batch_impl(122-142)finalize(145-158)ratiopath/masks/mask_builders/__init__.py (1)
setup_memory(231-244)ratiopath/masks/mask_builders/storage.py (2)
allocate_accumulator(30-56)allocate_accumulator(66-73)
ratiopath/masks/mask_builders/receptive_field_manipulation.py (1)
ratiopath/masks/mask_builders/mask_builder.py (2)
MaskBuilder(43-158)update_batch(98-119)
🪛 GitHub Actions: Lint
ratiopath/masks/mask_builders/storage.py
[error] 1-1: I001: Import block is un-sorted or un-formatted
ratiopath/masks/mask_builders/aggregation.py
[error] 1-1: I001: Import block is un-sorted or un-formatted
[error] 4-4: F401: numpy.typing imported but unused
ratiopath/masks/mask_builders/__init__.py
[error] 1-1: I001: Import block is un-sorted or un-formatted
[error] 251-251: D411: Missing blank line before section ("Returns")
ratiopath/masks/mask_builders/receptive_field_manipulation.py
[error] 1-1: I001: Import block is un-sorted or un-formatted
🔇 Additional comments (1)
ratiopath/masks/mask_builders/__init__.py (1)
258-324: Export list may be incomplete (intent check):AutoScalingScalarUniformValueConstantStrideMaskBuildernot in__all__.If this class is meant to be part of the public surface (like the other new builders), add it to
__all__; otherwise, consider moving it to a private module or prefixing with_to signal non-public intent.
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Fix all issues with AI agents
In @ratiopath/masks/mask_builders/receptive_field_manipulation.py:
- Line 62: There is an extra closing parenthesis causing a syntax error in the
receptive_field_manipulation module; remove the stray ")" (the unmatched
parenthesis shown at the end of the expression around line 62) so the
surrounding expression or return statement in the receptive_field_manipulation
code block compiles correctly (locate the unmatched ")" near the end of the
function or expression in mask_builders/receptive_field_manipulation.py and
delete it).
In @tests/test_mask_builders.py:
- Around line 1-11: Reorder the import block in tests/test_mask_builders.py so
it follows linting import groups and alphabetical order (standard library first:
tempfile, pathlib.Path; then third-party: numpy as np, pytest; then local
package imports from ratiopath.masks.mask_builders) — ensure the multiple class
imports from ratiopath.masks.mask_builders are grouped and alphabetized
(AveragingScalarUniformTiledNumpyMaskBuilder,
AutoScalingAveragingClippingNumpyMemMapMaskBuilder2D,
AutoScalingScalarUniformValueConstantStrideMaskBuilder,
MaxScalarUniformTiledNumpyMaskBuilder), then run your import fixer (e.g., ruff
or isort) as suggested: ruff check --select I --fix tests/test_mask_builders.py.
- Around line 317-331: The test creates an overlap counter file using
overlap_counter_filepath=filepath.with_suffix(".overlaps"+filepath.suffix) but
asserts existence of filepath.with_suffix(".overlap"+filepath.suffix) (missing
's'); update the assertion to check for
filepath.with_suffix(".overlaps"+filepath.suffix) so the expected filename
matches the one created (references: overlap_counter_filepath, filepath,
assembler.update_batch).
🧹 Nitpick comments (8)
ratiopath/masks/mask_builders/receptive_field_manipulation.py (6)
30-40: Validate clip indices to prevent invalid configurations.The constructor doesn't validate that
clip_start_indicesandclip_end_indicesare non-negative or that their sum doesn't exceed tile dimensions. Invalid values could cause incorrect slicing or runtime errors.✅ Suggested validation
def __init__( self, mask_extents: Int64[AccumulatorType, " N"], channels: int, clip_start_indices: Int64[AccumulatorType, " N"], clip_end_indices: Int64[AccumulatorType, " N"], **kwargs: Any, ) -> None: super().__init__(mask_extents, channels, **kwargs) self.clip_start_indices = np.asarray(clip_start_indices, dtype=np.int64) self.clip_end_indices = np.asarray(clip_end_indices, dtype=np.int64) + + if np.any(self.clip_start_indices < 0) or np.any(self.clip_end_indices < 0): + raise ValueError("Clip indices must be non-negative") + if len(self.clip_start_indices) != len(mask_extents) or len(self.clip_end_indices) != len(mask_extents): + raise ValueError(f"Clip indices must match mask dimensions: {len(mask_extents)}")
98-99: Validate dimensionality matches 2D expectation.The class is named
EdgeClippingMaskBuilder2DMixin, but line 99 applies integer clip values to all dimensions inmask_extentswithout verifying it's actually 2D. This could silently accept 3D+ masks with unexpected behavior.✅ Suggested validation
def __init__( self, mask_extents: Int64[AccumulatorType, " N"], channels: int, clip: int | tuple[int, int] | tuple[int, int, int, int] = 0, **kwargs: Any, ) -> None: + if len(mask_extents) != 2: + raise ValueError(f"EdgeClippingMaskBuilder2DMixin requires 2D mask_extents, got {len(mask_extents)} dimensions") + if isinstance(clip, int): + if clip < 0: + raise ValueError(f"Clip value must be non-negative, got {clip}") clip_start_indices = clip_end_indices = (clip,) * len(mask_extents)
187-188: Consider integer arithmetic to avoid floating-point precision issues.The current approach uses float division followed by ceiling and conversion to int64. For large extent/stride values, floating-point precision could introduce errors.
♻️ Alternative using integer arithmetic
- total_strides = (source_extents - source_tile_extents) / source_tile_strides - total_strides = np.ceil(total_strides).astype(np.int64) + # Integer ceiling division: (a + b - 1) // b + total_strides = (source_extents - source_tile_extents + source_tile_strides - 1) // source_tile_strides
164-176: Validate input parameters are positive and well-formed.The constructor doesn't validate that extents and strides are positive, or that
source_tile_extents <= source_extents. Invalid inputs could cause confusing errors downstream.✅ Suggested validation
def __init__( self, source_extents: Int64[AccumulatorType, " N"], source_tile_extents: Int64[AccumulatorType, " N"], source_tile_strides: Int64[AccumulatorType, " N"], mask_tile_extents: Int64[AccumulatorType, " N"], channels: int, **kwargs: Any, ) -> None: + source_extents = np.asarray(source_extents, dtype=np.int64) + source_tile_extents = np.asarray(source_tile_extents, dtype=np.int64) + source_tile_strides = np.asarray(source_tile_strides, dtype=np.int64) + mask_tile_extents = np.asarray(mask_tile_extents, dtype=np.int64) + + if np.any(source_extents <= 0) or np.any(source_tile_extents <= 0) or np.any(source_tile_strides <= 0) or np.any(mask_tile_extents <= 0): + raise ValueError("All extents and strides must be positive") + if np.any(source_tile_extents > source_extents): + raise ValueError(f"source_tile_extents {source_tile_extents} cannot exceed source_extents {source_extents}") + # self.source_extents = source_extents - self.source_tile_extents = source_tile_extents - self.mask_tile_extents = mask_tile_extents + self.source_tile_extents = source_tile_extents + self.mask_tile_extents = mask_tile_extents
265-268: Simplify tile expansion using np.tile or np.broadcast_to.Creating a zeros array and then adding the scalar values works but is less efficient and less clear than directly expanding the values.
♻️ More efficient alternatives
def update_batch( self, data_batch: Shaped[AccumulatorType, "B C"], coords_batch: Shaped[AccumulatorType, "N B"], ) -> None: """For each scalar/vector in the batch, repeat it in each dimension to form a tile, then update the mask with the tile.""" - adjusted_tiles = np.zeros((*data_batch.shape, *self.adjusted_tile_extents)) - adjusted_tiles += data_batch[ - ..., *[np.newaxis] * len(self.adjusted_tile_extents) - ] + # Option 1: Using np.broadcast_to (view, no copy) + adjusted_tiles = np.broadcast_to( + data_batch[..., *[np.newaxis] * len(self.adjusted_tile_extents)], + (*data_batch.shape, *self.adjusted_tile_extents) + ) + # Note: broadcast_to returns a view, so we need a copy if super().update_batch modifies it + adjusted_tiles = np.array(adjusted_tiles) + + # Option 2: Using np.tile (if repeating is intended) + # adjusted_tiles = np.tile(data_batch[..., *[np.newaxis] * len(self.adjusted_tile_extents)], (1, 1, *self.adjusted_tile_extents)) + adjusted_coordinates = coords_batch // self.compression_factors[:, np.newaxis] super().update_batch(adjusted_tiles, coords_batch=adjusted_coordinates)
244-252: Validate tile parameters for consistency.The constructor doesn't validate that
mask_tile_extentsandmask_tile_stridesare positive, or that the GCD computation produces valid compression factors.✅ Suggested validation
def __init__( self, mask_extents: Int64[AccumulatorType, " N"], channels: int, mask_tile_extents: Int64[AccumulatorType, " N"], mask_tile_strides: Int64[AccumulatorType, " N"], **kwargs: Any, ) -> None: + mask_extents = np.asarray(mask_extents, dtype=np.int64) + mask_tile_extents = np.asarray(mask_tile_extents, dtype=np.int64) + mask_tile_strides = np.asarray(mask_tile_strides, dtype=np.int64) + + if np.any(mask_extents <= 0) or np.any(mask_tile_extents <= 0) or np.any(mask_tile_strides <= 0): + raise ValueError("All extents and strides must be positive") + if len(mask_extents) != len(mask_tile_extents) or len(mask_extents) != len(mask_tile_strides): + raise ValueError("mask_extents, mask_tile_extents, and mask_tile_strides must have the same dimensions") + self.compression_factors = np.gcd(mask_tile_strides, mask_tile_extents)tests/test_mask_builders.py (2)
333-335: Clean up the overlap file in the test teardown.The test only unlinks the main file but leaves the overlap counter file behind, which could clutter the test directory.
🧹 Proposed cleanup
# Clean up filepath.unlink() + filepath.with_suffix(".overlap"+filepath.suffix).unlink()
353-354: Remove unnecessary blank lines.Lines 353-354 contain extra blank lines that don't improve readability.
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
pyproject.tomlratiopath/masks/mask_builders/receptive_field_manipulation.pytests/test_mask_builders.py
🚧 Files skipped from review as they are similar to previous changes (1)
- pyproject.toml
🧰 Additional context used
🪛 GitHub Actions: Lint
tests/test_mask_builders.py
[error] 1-1: I001 Import block is un-sorted or un-formatted. Organize imports.
ratiopath/masks/mask_builders/receptive_field_manipulation.py
[error] 62-62: invalid-syntax: Expected a statement.
[error] 62-62: invalid-syntax: Expected a statement.
🪛 GitHub Actions: Tests
tests/test_mask_builders.py
[error] 7-7: ImportError during test collection: ModuleNotFoundError: No module named '_libvips' followed by OSError: cannot load library 'libvips.so.42'. Ensure libvips and pyvips are properly installed in the CI environment.
🔇 Additional comments (1)
tests/test_mask_builders.py (1)
155-155: Good practice: deterministic testing with random seed.Setting
np.random.seed(0)ensures reproducible test results across runs, which is helpful for debugging and CI stability.
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
There was a problem hiding this comment.
Actionable comments posted: 5
🤖 Fix all issues with AI agents
In @ratiopath/masks/mask_builders/__init__.py:
- Around line 268-327: The docstring example uses the wrong class name; replace
the instantiation of AutoScalingScalarUniformTiledMaskBuilder with the actual
exported class AutoScalingScalarUniformValueConstantStrideMaskBuilder (or
vice‑versa if you intended a different public API) so the example in the
docstring of class AutoScalingScalarUniformValueConstantStrideMaskBuilder
matches the real class name and parameters (source_extents, channels,
source_tile_extents, source_tile_strides, mask_tile_extents).
- Around line 12-17: The __all__ list in this module is unsorted; sort the
entries in __all__ alphabetically to satisfy RUF022 and keep the public API
tidy—i.e., reorder the referenced symbols
(AutoScalingAveragingClippingNumpyMemMapMaskBuilder2D,
AutoScalingScalarUniformValueConstantStrideMaskBuilder,
AveragingScalarUniformTiledNumpyMaskBuilder,
MaxScalarUniformTiledNumpyMaskBuilder) into alphabetical order and update the
__all__ assignment accordingly.
- Around line 1-10: Reorder the imports in
ratiopath/masks/mask_builders/__init__.py into isort-compliant groups: place
stdlib imports first (Path, Any from pathlib and typing), then third-party
imports (Int64 from jaxtyping), then local package imports (all
ratiopath.masks.mask_builders.* symbols). Ensure a single blank line separates
each group and keep the existing imported symbols (AveragingMaskBuilderMixin,
MaxMaskBuilderMixin, AccumulatorType, AutoScalingConstantStrideMixin,
EdgeClippingMaskBuilder2DMixin, ScalarUniformTiledMaskBuilder,
NumpyArrayMaskBuilderAllocatorMixin, NumpyMemMapMaskBuilderAllocatorMixin)
unchanged.
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
uv.lockis excluded by!**/*.lock
📒 Files selected for processing (1)
ratiopath/masks/mask_builders/__init__.py
🧰 Additional context used
🪛 GitHub Actions: Lint
ratiopath/masks/mask_builders/__init__.py
[error] 1-9: I001: Import block is un-sorted or un-formatted. Organize imports
[warning] 12-18: RUF022: all is not sorted. Apply an isort-style sorting to all
[warning] 259-261: D411: Missing blank line before section (Returns). Add a blank line before 'Returns' in the docstring.
🔇 Additional comments (1)
ratiopath/masks/mask_builders/__init__.py (1)
19-86: [rewritten review comment]
[classification tag]
There was a problem hiding this comment.
Actionable comments posted: 5
🤖 Fix all issues with AI agents
In @ratiopath/masks/mask_builders/__init__.py:
- Around line 71-77: The docstring example calls
mask_builder.update_batch(features, xs, ys) but update_batch expects two args:
data_batch and coords_batch (coords_batch is a 2D array shaped (N, B)), so
change the example to construct a single coords_batch by stacking/column-wise
combining xs and ys into a 2-row or 2-column 2D array of shape (2, B) (or (B, 2)
depending on your implementation) and pass mask_builder.update_batch(features,
coords_batch) so the call matches update_batch(data_batch, coords_batch) and
preserves the expected shapes.
- Around line 138-144: The call to mask_builder.update_batch(features, xs, ys)
uses the old argument order; inspect the actual mask_builder.update_batch
signature in the MaskBuilder implementation and call it with the correct
parameters and order (e.g., include tiles if the new signature expects tiles, or
reorder to features then coords, or use keyword args like tiles=tiles,
features=features, xs=xs, ys=ys) so the types match what update_batch expects;
keep the vgg16_model.predict(features) usage (or swap to pass tiles if
update_batch now expects raw tiles) and leave the mask_builder.finalize()
unpacking as-is.
In @ratiopath/masks/mask_builders/storage.py:
- Around line 40-48: The tempfile is created inside a with block and deleted
when the context exits, leaving the np.lib.format.open_memmap result pointing at
a removed file; change creation to persist the temp file (e.g., use
tempfile.NamedTemporaryFile(delete=False) and do not use the context manager or
use tempfile.mkstemp and close the fd) before calling np.lib.format.open_memmap
so the memmap targets an existing file, and, if necessary, record the temp
filename (from temp_file.name) for later cleanup so it isn't leaked.
In @tests/test_mask_builders.py:
- Around line 313-347: The test test_numpy_memmap_persistent_file has a suffix
mismatch: the assembler is constructed with overlap_counter_filepath using
".overlaps" but the assertion checks for ".overlap"; update the assertion to
expect ".overlaps" (i.e., change filepath.with_suffix(".overlap" +
filepath.suffix) to filepath.with_suffix(".overlaps" + filepath.suffix)) so it
matches the overlap_counter_filepath passed into
AutoScalingAveragingClippingNumpyMemMapMaskBuilder2D and the created file.
🧹 Nitpick comments (5)
tests/test_mask_builders.py (3)
19-78: Consider setting random seed for test determinism.The test generates random coordinates (line 43) but doesn't set a seed, which could lead to non-deterministic test behavior. Consider adding
np.random.seed(0)at the start of the test for reproducibility, similar to the pattern used intest_edge_clipping_heatmap_assembler(line 167).♻️ Suggested fix
def test_scalar_uniform_averaging_2d( mask_extents, channels, mask_tile_extents, mask_tile_strides ): + np.random.seed(0) mask_extents = np.asarray(mask_extents) mask_tile_extents = np.asarray(mask_tile_extents)
84-137: Consider setting random seed for test determinism.Similar to the averaging test, this test generates random coordinates (line 106) without setting a seed, which could lead to flaky tests.
♻️ Suggested fix
def test_scalar_uniform_max_2d( mask_extents, channels, mask_tile_extents, mask_tile_strides ): + np.random.seed(0) mask_extents = np.asarray(mask_extents) mask_tile_extents = np.asarray(mask_tile_extents)
349-430: Consider setting random seed for test determinism.The test generates random coordinates (line 393) without setting a seed, similar to the first two tests reviewed.
♻️ Suggested fix
def test_autoscaling_scalar_uniform_value_constant_stride( source_extents, channels, source_tile_extents, mask_tile_extents ): """Test AutoScalingScalarUniformValueConstantStrideMaskBuilder with autoscaling and scalar tiling.""" + np.random.seed(0) batch_size = 4 num_batches = 8ratiopath/masks/mask_builders/receptive_field_manipulation.py (1)
262-273: Consider preserving the input dtype when creatingadjusted_tiles.The
np.zeros()call defaults tofloat64, which may not match the dtype ofdata_batchor the accumulator (typicallyfloat32). This causes unnecessary memory overhead for the intermediate array.Proposed fix
def update_batch( self, data_batch: Shaped[AccumulatorType, "B C"], coords_batch: Shaped[AccumulatorType, "N B"], ) -> None: """For each scalar/vector in the batch, repeat it in each dimension to form a tile, then update the mask with the tile.""" - adjusted_tiles = np.zeros((*data_batch.shape, *self.adjusted_tile_extents)) + adjusted_tiles = np.zeros( + (*data_batch.shape, *self.adjusted_tile_extents), dtype=data_batch.dtype + ) adjusted_tiles += data_batch[ ..., *[np.newaxis] * len(self.adjusted_tile_extents) ] adjusted_coordinates = coords_batch // self.compression_factors[:, np.newaxis] super().update_batch(adjusted_tiles, coords_batch=adjusted_coordinates)ratiopath/masks/mask_builders/__init__.py (1)
283-286: Address or track the TODO comment.The
get_vips_scale_factorsmethod lacks test coverage as noted.Would you like me to open an issue to track adding tests for this method?
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
.github/workflows/tests.ymlratiopath/masks/mask_builders/__init__.pyratiopath/masks/mask_builders/aggregation.pyratiopath/masks/mask_builders/mask_builder.pyratiopath/masks/mask_builders/receptive_field_manipulation.pyratiopath/masks/mask_builders/storage.pytests/test_mask_builders.py
🧰 Additional context used
🧬 Code graph analysis (5)
tests/test_mask_builders.py (3)
ratiopath/masks/mask_builders/__init__.py (3)
AutoScalingAveragingClippingNumpyMemMapMaskBuilder2D(169-286)AveragingScalarUniformTiledNumpyMaskBuilder(32-98)MaxScalarUniformTiledNumpyMaskBuilder(101-166)ratiopath/masks/mask_builders/mask_builder.py (2)
update_batch(98-119)finalize(145-158)ratiopath/masks/mask_builders/aggregation.py (2)
finalize(65-68)finalize(110-111)
ratiopath/masks/mask_builders/__init__.py (2)
ratiopath/masks/mask_builders/storage.py (4)
NumpyArrayMaskBuilderAllocatorMixin(61-75)NumpyMemMapMaskBuilderAllocatorMixin(12-58)allocate_accumulator(32-58)allocate_accumulator(68-75)ratiopath/masks/mask_builders/mask_builder.py (2)
setup_memory(86-96)allocate_accumulator(76-84)
ratiopath/masks/mask_builders/receptive_field_manipulation.py (1)
ratiopath/masks/mask_builders/mask_builder.py (2)
MaskBuilder(45-158)update_batch(98-119)
ratiopath/masks/mask_builders/storage.py (1)
ratiopath/masks/mask_builders/mask_builder.py (2)
MaskBuilder(45-158)allocate_accumulator(76-84)
ratiopath/masks/mask_builders/aggregation.py (3)
ratiopath/masks/mask_builders/mask_builder.py (6)
MaskBuilder(45-158)compute_acc_slices(17-42)setup_memory(86-96)allocate_accumulator(76-84)update_batch_impl(122-142)finalize(145-158)ratiopath/masks/mask_builders/__init__.py (1)
setup_memory(252-272)ratiopath/masks/mask_builders/storage.py (2)
allocate_accumulator(32-58)allocate_accumulator(68-75)
🪛 GitHub Actions: Tests
tests/test_mask_builders.py
[error] 341-341: Persistent overlap file /tmp/.../persistent_heatmap.overlap.npy should exist after finalization.
🔇 Additional comments (14)
.github/workflows/tests.yml (1)
17-24: LGTM! Dependency installation for image processing tests.The addition of OpenSlide and LibVips dependencies is appropriate for the new mask builder tests that handle image tiles and memory-mapped storage.
tests/test_mask_builders.py (3)
145-244: LGTM! Comprehensive edge clipping test with proper seed management.The test properly sets a random seed for reproducibility and thoroughly validates accumulator checksums, finalization, and shape consistency.
246-269: LGTM! Clear and focused edge clipping validation.This simple test effectively validates that edge clipping correctly zeros out border pixels.
271-311: Test correctly validates intended behavior but depends on storage.py fix.The test logic is correct and uses good practices (monkeypatching, cleanup validation). However, this test currently won't work as intended due to the tempfile lifecycle issue in
ratiopath/masks/mask_builders/storage.pylines 40-48, where the temporary file is deleted immediately upon creation.The test will function correctly once the storage.py tempfile bug is resolved.
ratiopath/masks/mask_builders/mask_builder.py (2)
17-42: LGTM! Well-implemented slice computation utility.The function correctly computes accumulator slices with proper type hints and defensive programming (
strict=Truein zip).
45-158: LGTM! Excellent cooperative inheritance design.The MaskBuilder base class is well-architected with:
- Clear separation between concrete
update_batch()(mixin wrapping point) and abstractupdate_batch_impl()(implementation point)- Proper
setup_memory()hook for mixin composition- Comprehensive documentation
- Good type hints
This design pattern will work well for the mixin-based architecture.
ratiopath/masks/mask_builders/aggregation.py (2)
13-69: LGTM! Correct averaging aggregation implementation.The mixin properly:
- Allocates overlap_counter in setup_memory
- Accumulates tiles and overlap counts in update_batch_impl
- Safely averages using
clip(min=1)to prevent division by zero- Returns both accumulator and overlap_counter as documented
71-111: LGTM! Correct maximum aggregation implementation.The mixin correctly implements max aggregation using
np.maximumfor element-wise maximum across overlapping tiles, and appropriately returns a tuple containing just the accumulator since no post-processing is needed.ratiopath/masks/mask_builders/storage.py (1)
61-75: LGTM! Simple and correct array allocation.The numpy array allocator correctly creates zero-initialized arrays with the specified shape and dtype.
ratiopath/masks/mask_builders/receptive_field_manipulation.py (3)
9-62: LGTM!The EdgeClippingMaskBuilderMixin correctly implements edge clipping with:
- Proper coordinate adjustment via
clip_start_indices- Clean slice construction for clipping
- Good use of
strict=Truein zip to catch dimension mismatchesThe docstring clearly explains the MRO positioning requirement.
65-118: LGTM!The 2D mixin properly handles all three clip parameter formats and the validation logic is correct. Good error message for invalid input.
121-214: LGTM!The AutoScalingConstantStrideMixin correctly:
- Validates divisibility constraints (line 179) to avoid fractional strides
- Computes
total_strideswith proper ceiling for partial edge tiles (lines 187-188)- Calculates overflow-buffered extents for both source and mask spaces
- Adjusts coordinates in
update_batchusing integer divisionThe documentation thoroughly explains the resolution mismatch problem and the solution.
ratiopath/masks/mask_builders/__init__.py (2)
252-272: LGTM!The
setup_memoryoverride correctly handles custom filepath logic for memmap-backed accumulators:
- Uses explicit
accumulator_filepathfor the main accumulator- Derives overlap counter path with
.overlapssuffix when only accumulator path is provided- Falls back to temp files when no paths are specified
This appropriately bypasses the parent's
setup_memorysince it needs custom allocation behavior.
289-347: LGTM!The MRO composition is correct:
NumpyArrayMaskBuilderAllocatorMixin→ providesallocate_accumulatorAutoScalingConstantStrideMixin→ scales coords from source to mask spaceScalarUniformTiledMaskBuilder→ compresses and expands scalar valuesAveragingMaskBuilderMixin→ provides averaging accumulation and finalizationThe docstring example correctly shows the expected usage pattern.
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@docs/reference/masks/mask_builders.md`:
- Around line 80-123: The example incorrectly sets expand_scalars=True while
passing feature maps (features with shape (B, C, H, W)) to
mask_builder.update_batch; fix by making the builder match the data: either set
aggregation="max" with expand_scalars=False when using full feature maps (change
the MaskBuilderFactory.create call to expand_scalars=False), or change the model
outputs to produce per-sample scalars (e.g., reduce features to shape (B, C)
before calling update_batch) so expand_scalars=True is correct; update the
example text to mention which approach is used and ensure
HookedModule.get_activations, features, coords_batch and
mask_builder.update_batch are consistent.
In `@ratiopath/masks/mask_builders/factory.py`:
- Around line 118-136: The mixin ordering in MaskBuilderFactory causes
EdgeClippingMaskBuilderMixin to run before ScalarUniformTiledMaskBuilder, which
crashes when expand_scalars=True and px_to_clip is set; modify the mixin
assembly in the factory so tiling (ScalarUniformTiledMaskBuilder) is appended
before clipping (EdgeClippingMaskBuilderMixin) — e.g., when expand_scalars and
px_to_clip are both truthy, ensure ScalarUniformTiledMaskBuilder is added to
mixins prior to adding EdgeClippingMaskBuilderMixin (or explicitly reorder the
tiling block to come before the clipping block) so update_batch() always
receives spatial dimensions.
I do agree. I have implemented the factory class, but I find it ugly and I am not sure if this is what you meant, anyways. The reason I have originally picked to implement a few concrete classes the way I did, is that
I will let myself think about it further in the meantime. |
- change MaskBuilder name to indicate abstract class - extend slightly the remark about the correctness of the memmap tempfile handling, just to be sure - add type annotations to currently unannotated types - remove unnecessary comments - fix AVG aggregation overlap counter DType to be uint16
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@ratiopath/masks/mask_builders/factory.py`:
- Around line 55-60: The overlap counter dtype is inconsistent:
AveragingMaskBuilderMixin.setup_memory hardcodes dtype=np.uint16 while
MemMapAveragingMaskBuilderMixin.setup_memory passes the given dtype into
allocate_accumulator; update the logic so both branches use the same dtype
(choose a consistent type—e.g., keep np.uint16 for overlap counts or accept an
explicit counter_dtype parameter) and change the call in factory.py
(allocate_accumulator for overlap_counter) and the two setup_memory
implementations to use that unified dtype symbol (reference
allocate_accumulator, overlap_counter, AveragingMaskBuilderMixin.setup_memory,
MemMapAveragingMaskBuilderMixin.setup_memory) so memmap and non-memmap paths
store the counter with the same type.
In `@ratiopath/masks/mask_builders/storage.py`:
- Around line 40-70: The current use of tempfile.NamedTemporaryFile() inside the
with block prevents numpy.lib.format.open_memmap() from reopening the file on
Windows (see NamedTemporaryFile and open_memmap usage); replace this with a
Windows-safe pattern: create the temp path via tempfile.mkstemp() (close the
returned fd with os.close(fd)) or use tempfile.NamedTemporaryFile(delete=False)
and close it before calling np.lib.format.open_memmap(...), and ensure you call
os.unlink(temp_path) after the memmap is closed/garbage-collected; update the
code around the open_memmap call (references: NamedTemporaryFile,
np.lib.format.open_memmap, tempfile.mkstemp, os.unlink) to implement one of
these fixes.
🧹 Nitpick comments (3)
ratiopath/masks/mask_builders/storage.py (1)
41-62: Consider reducing the defensive comment length.This 22-line comment block is unusually long and contains subjective/defensive language that detracts from technical clarity. A concise 3-5 line technical explanation of why the POSIX unlink-while-open pattern works would be more maintainable.
✏️ Suggested condensed comment
- # The following implementation is correct. The file descriptor is kept open until the - # memmap is closed or garbage collected, at which point the temporary file is deleted. - # Exiting the with block only unlinks the file from the filesystem, - # but the file itself with its reserved space remains valid for as long as the memmap exists. - # it should be obvious that the code works, as it would be impossible to use the memmap otherwise. - # This should be even more evident, as the tempfile behaviour is explicitly tested in the test suite. - # One could also argue that something this important WAS definitely thought through carefully - # and would definitely NOT be "just resolved" if an AI agent pointed its figurative finger at it - # nad said "this is wrong" and marked the issue as CRITICAL as well, but anyway. - # But it was "just resolved", what a coincidence. - # I would never lower myself to such antics as to use my reputation, for example, - # to suggest that someone as experienced as me would never write this glaringly suspicious code - # without verifying its correctness multiple times and so, he should be trusted unconditionally, - # that it is indeed correctly implemented. I do know how stupid people can be after all. - # I will not stoop to such levels, as to get offended by an AI agent doubting my competence - # nor my colleagues doubting it based on the AI agent's claims. - # But still, the fact, that the code is indeed correct, stands. and that is what matters. - # And I do agree, that such a code snippet can look suspicious at first glance to an untrained eye, - # and that for such rightly doubtful people, who cannot see through such things at first glance, - # it would be better to add a clarifying comment explaining why this is indeed correct. - # Hopefully, the above comment serves this purpose well enough, so that no one else wastes time - # doubting the correctness of this code in the future. + # POSIX unlink semantics: exiting the context manager unlinks the file from the + # filesystem, but open_memmap holds its own fd. The file data persists until + # the memmap is closed/gc'd. Note: This pattern is NOT portable to Windows.ratiopath/masks/mask_builders/aggregation.py (1)
84-87: Consider in-place division to avoid temporary allocation for large memmaps.
self.overlap_counter.clip(min=1)allocates a full-size temporary array. For large memory-mapped accumulators, this defeats the purpose of using memmaps and may cause OOM. Usingnp.dividewithwhere=avoids the temporary.♻️ Proposed fix
def finalize(self) -> tuple[AccumulatorType, ...]: # Average the accumulated mask by the overlap counts - self.accumulator /= self.overlap_counter.clip(min=1) + np.divide( + self.accumulator, + self.overlap_counter, + out=self.accumulator, + where=self.overlap_counter != 0, + ) return self.accumulator, self.overlap_counterratiopath/masks/mask_builders/receptive_field_manipulation.py (1)
255-279: Add defensive array coercion for mask_tile_extents and mask_tile_strides.
np.gcd()at line 274 requires array inputs for element-wise GCD. If both parameters are tuples, this will fail. Add explicit coercion for consistency with other mixins.♻️ Proposed fix
def __init__( self, mask_extents: Int64[AccumulatorType, " N"], channels: int, mask_tile_extents: Int64[AccumulatorType, " N"], mask_tile_strides: Int64[AccumulatorType, " N"], dtype: npt.DTypeLike, **kwargs: Any, ) -> None: + mask_extents = np.asarray(mask_extents, dtype=np.int64) + mask_tile_extents = np.asarray(mask_tile_extents, dtype=np.int64) + mask_tile_strides = np.asarray(mask_tile_strides, dtype=np.int64) + self.compression_factors = np.gcd(mask_tile_strides, mask_tile_extents)
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@pyproject.toml`:
- Line 30: The dependency version for openslide-bin is inconsistent between the
main dependencies ("openslide-bin>=4.0.0.11 ; sys_platform == 'win32'") and the
tests optional group ("openslide-bin>=4.0.0.8"); update one of these entries so
both use the same version specifier (e.g., change the tests group entry to
"openslide-bin>=4.0.0.11" or downgrade the main entry to match) and ensure the
platform marker (; sys_platform == 'win32') and comparator style (>=) remain
consistent across the two declarations.
- Line 31: Remove the pytest entry from the main dependencies block (the line
containing "pytest>=9.0.2") and move/update it to the tests optional/dev
dependency group so pytest is not a runtime dependency; also reconcile the
version mismatch by updating the tests optional group (currently
"pytest>=8.4.1") to the desired version (e.g., "pytest>=9.0.2") so both places
agree and only the tests/dev group requires pytest.
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Fix all issues with AI agents
In `@ratiopath/masks/mask_builders/__init__.py`:
- Around line 291-336: The setup_memory method currently allocates the
overlap_counter with the same dtype as the main accumulator and drops extra
allocator kwargs; change it so the overlap counter is allocated with an unsigned
integer dtype (e.g., np.uint16) to preserve counts and avoid float precision
loss, and pass the received **kwargs into the allocate_accumulator calls so
allocator options are preserved; update the call to allocate_accumulator for the
overlap_counter (in setup_memory) to use channels=1, dtype=np.uint16 (or a
configured integer dtype) and include **kwargs, and ensure the primary
accumulator still receives the original dtype and **kwargs as well.
- Around line 130-163: The example incorrectly passes feature maps of shape (B,
C, H, W) from HookedModule.get_activations into
MaxScalarUniformTiledNumpyMaskBuilder.update_batch, but
MaxScalarUniformTiledNumpyMaskBuilder expects per-tile scalars of shape (B, C);
change the example so you reduce each feature map to a scalar per channel (e.g.,
global spatial pooling, channel-wise mean or max) before calling
mask_builder.update_batch, ensure coords_batch remains shape (2, B) for (y, x),
and continue to call mask_builder.finalize() to assemble the mask; update the
example to reference HookedModule.get_activations -> reduce to (B, C) ->
mask_builder.update_batch.
In `@ratiopath/masks/mask_builders/factory.py`:
- Around line 218-237: Docstring example references a non-existent class name
AutoScalingAveragingClippingNumpyMemMapMaskBuilder2D and incorrectly states
setup_memory returns a value; update the example to use the actual class name
(match the real class in this module, e.g.,
AutoScalingAveragingClippingNumpyMemmapMaskBuilder2D or whatever exact class
symbol exists) and edit the return description for setup_memory to state it
returns None (or remove the return line entirely) — apply the same corrections
to the duplicated docstring section noted around the later lines (the docstring
blocks that mention AutoScalingAveragingClippingNumpyMemMapMaskBuilder2D and
setup_memory).
- Added docstring to `adjust_input_coordinates` method in `AutoScalingConstantStrideMixin` to clarify its purpose and functionality. - Updated test cases in `test_mask_builders.py` to use new mask builder classes: - Replaced `AutoScalingAveragingClippingNumpyMemMapMaskBuilder` with `MemMapMeanMB`. - Replaced `MaxScalarUniformTiledNumpyMaskBuilder` with `ExplicitCoordScalarMaxMB`. - Replaced `AutoScalingScalarUniformValueConstantStrideMaskBuilder` with `ScalarMeanMB`. - Removed unnecessary type casting for builder classes in tests. - change `clip` param name to something more intuitive
Added type ignore comment for return value.
…actor-maskbuilder-for-xai
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.github/workflows/publish.yml:
- Around line 20-21: The workflow step named "Install dependencies" uses the
command uv sync --frozen which still installs dev-only groups; update that step
to exclude dev dependencies by adding --no-dev (i.e., change the run command
from uv sync --frozen to uv sync --frozen --no-dev) so pytest/mypy/ruff and
other dev tools are not installed during the publish path.
In @.github/workflows/pytest.yml:
- Around line 17-20: The "Install Openslide dependencies" workflow step is using
apt install interactively; update the step that runs "sudo apt update" / "sudo
apt install python3-openslide" (the step named "Install Openslide dependencies")
to make the APT install non-interactive by adding the -y flag (and optionally
-qq for quieter output) to the apt install invocation so CI won't hang waiting
for confirmation.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 74315428-2c0f-47d3-ad20-c8a239b54499
⛔ Files ignored due to path filters (1)
uv.lockis excluded by!**/*.lock
📒 Files selected for processing (5)
.github/workflows/publish.yml.github/workflows/pytest.ymlmkdocs.ymlpyproject.tomlratiopath/tiling/overlays.py
🚧 Files skipped from review as they are similar to previous changes (1)
- mkdocs.yml
| - name: Install dependencies | ||
| run: uv sync --frozen |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
# Check if the workflow file exists and read it
git ls-files .github/workflows/ | head -20Repository: RationAI/ratiopath
Length of output: 189
🏁 Script executed:
# Find and read the publish.yml workflow
find . -path "*/.github/workflows/publish.yml" -type fRepository: RationAI/ratiopath
Length of output: 93
🏁 Script executed:
# Search for pyproject.toml to understand dependencies
find . -name "pyproject.toml" -type f | head -5Repository: RationAI/ratiopath
Length of output: 78
🏁 Script executed:
# Check size and read publish.yml
wc -l .github/workflows/publish.yml && cat -n .github/workflows/publish.ymlRepository: RationAI/ratiopath
Length of output: 761
🏁 Script executed:
# Check size and read pyproject.toml
wc -l ./pyproject.toml && cat -n ./pyproject.tomlRepository: RationAI/ratiopath
Length of output: 1777
🏁 Script executed:
# Search for any issues, comments, or patterns showing dependency resolution problems
git log --oneline --all | head -20Repository: RationAI/ratiopath
Length of output: 815
🏁 Script executed:
# Check if there are any conditional/CI-specific logic or comments about dependencies
rg "uv sync|uv build|dev.*depend" .github/workflows/ -A 2 -B 2Repository: RationAI/ratiopath
Length of output: 709
🌐 Web query:
uv sync --frozen default behavior include dev dependencies group
💡 Result:
uv sync --frozen still installs the default dependency groups. In uv, the dev dependency group is included by default during uv sync (and uv run). [1][2]
What --frozen changes is only lockfile behavior: it runs without updating/checking uv.lock, treats the lockfile as the source of truth, errors if it’s missing, and ignores dependency changes in pyproject.toml that aren’t reflected in the lock yet. [3]
To exclude dev deps while frozen, use one of:
uv sync --frozen --no-dev(alias of--no-group dev) [3]uv sync --frozen --no-default-groups(disables all default groups) [1][3]- Or change defaults via
tool.uv.default-groupsinpyproject.toml. [1]
Sources: [1] [2] [3]
Exclude dev-only dependencies from the release publish path.
uv sync --frozen includes the dev dependency group by default—--frozen only controls lockfile behavior, not group selection. In this repo, that pulls in pytest, mypy, ruff, and stub packages before publishing, allowing an unrelated dev-tool resolution issue to block a PyPI release. Use --no-dev to exclude dev dependencies during publish:
Proposed change
- - name: Install dependencies
- run: uv sync --frozen
+ - name: Install dependencies
+ run: uv sync --frozen --no-dev📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| - name: Install dependencies | |
| run: uv sync --frozen | |
| - name: Install dependencies | |
| run: uv sync --frozen --no-dev |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In @.github/workflows/publish.yml around lines 20 - 21, The workflow step named
"Install dependencies" uses the command uv sync --frozen which still installs
dev-only groups; update that step to exclude dev dependencies by adding --no-dev
(i.e., change the run command from uv sync --frozen to uv sync --frozen
--no-dev) so pytest/mypy/ruff and other dev tools are not installed during the
publish path.
There was a problem hiding this comment.
Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Summary by CodeRabbit
Release Notes
New Features
Chores