Skip to content

feat: Add regrid functionality based on match geometry#1597

Open
eliascapriles-NOAA wants to merge 29 commits intoOSOceanAcoustics:mainfrom
eliascapriles-NOAA:regrid
Open

feat: Add regrid functionality based on match geometry#1597
eliascapriles-NOAA wants to merge 29 commits intoOSOceanAcoustics:mainfrom
eliascapriles-NOAA:regrid

Conversation

@eliascapriles-NOAA
Copy link
Copy Markdown

Implements regrid_all_channel function that would allow for users to match the sampling rates of a channels in a ds_Sv dataset to a specific channel

@codecov-commenter
Copy link
Copy Markdown

codecov-commenter commented Jan 20, 2026

Codecov Report

❌ Patch coverage is 74.60317% with 32 lines in your changes missing coverage. Please review.
✅ Project coverage is 60.63%. Comparing base (6cf6cee) to head (cd50d19).
⚠️ Report is 27 commits behind head on main.

Files with missing lines Patch % Lines
echopype/commongrid/utils.py 70.58% 20 Missing ⚠️
echopype/commongrid/api.py 78.94% 12 Missing ⚠️

❗ There is a different number of reports uploaded between BASE (6cf6cee) and HEAD (cd50d19). Click for more details.

HEAD has 6 uploads less than BASE
Flag BASE (6cf6cee) HEAD (cd50d19)
integration 3 0
unittests 3 0
Additional details and impacted files
@@             Coverage Diff             @@
##             main    #1597       +/-   ##
===========================================
- Coverage   85.58%   60.63%   -24.95%     
===========================================
  Files          79       79               
  Lines        6998     7139      +141     
===========================================
- Hits         5989     4329     -1660     
- Misses       1009     2810     +1801     
Flag Coverage Δ
integration ?
unit 60.63% <74.60%> (+0.22%) ⬆️
unittests ?

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@leewujung
Copy link
Copy Markdown
Member

Hey @eliascapriles-NOAA : Thanks for the PR! Below are some comments based on quick look:

  • Could you add a few tests for this function? Both unit test using mock data and integration data using a small segment real data would be very useful.
  • commingrid/utils.py is meant for util functions used by those in commongrid/api.py, so functions users interact with will be in api.py. What do you think about below?
    • rename regrid_all_channels to regrid in api.py
    • change the argument target_channel_idx to target_channel and require users to put in the channel_id (it's more robust to index with name compared to sequential indices)
    • expand the functionality to include not only specifying a target channel (what you already have) but also allowing specifying a set of target grid? So the function signature would become something like:
    # only one of target_channel or target_grid can be used -- so raise an error if both are provided
    regrid(ds_Sv, target_channel="CHANNEL_ID", target_grid="TARGET_GRID")

@eliascapriles-NOAA
Copy link
Copy Markdown
Author

Sounds good. I will get started on implementing the changes !

@eliascapriles-NOAA
Copy link
Copy Markdown
Author

Hi Wu-Jung sorry for the delay ! The tests uncovered a couple of bugs that I wanted to fix before submitting for a new PR. I have added unit tests for my helper function, and integration test using SV data for the regridding function. Let me know if there are any changes !

@leewujung leewujung changed the title "Add regrid functionality based on match geometry" feat: Add regrid functionality based on match geometry Feb 3, 2026
@leewujung
Copy link
Copy Markdown
Member

Thanks @eliascapriles-NOAA ! I'm going to ask @LOCEANlloydizard to help review this since you had most of the discussions with him.

@LOCEANlloydizard - Could you please take a look? feel free to ping me for discussions. Thanks!

@eliascapriles-NOAA
Copy link
Copy Markdown
Author

eliascapriles-NOAA commented Feb 3, 2026

Sounds good ! Thanks @leewujung

Copy link
Copy Markdown
Collaborator

@LOCEANlloydizard LOCEANlloydizard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @eliascapriles-NOAA, thx for the PR!
Following our discussion, a small recap (you probably noted more!):

  • the errors from CI are now gone with the merge of pandas < 3, just need to address the assertionError
  • remove the +20 padding at the end of the function
  • double-check that pings are actually aligned before regridding (there’s already a helper in echopype that does this, see getting_started notebook)
  • update the notebook to call the function from your PR, so I can run it on my side too
  • we can go from there for other points !

One thing I wanted to flag more generally: right now the regridding is done by looping over channel in Python, and inside that running an apply_ufunc that vectorises over ping_time × range_sample.

This is more a design question than a bug: are we happy with looping over channels in Python and vectorising over pings with apply_ufunc, or should we try to make the whole regridding run as one channel-aware vectorised operation? I know you looked into it.., and i could have a look as well! and maybe @leewujung would have recommendations for this?

Cheers!

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This error "TypeError: only integer scalar arrays can be converted to a scalar index" is due to pandas update to version 3.0. Could you merge the latest version of the main echopype repo (pandas is pinned there!)

@leewujung
Copy link
Copy Markdown
Member

Hey @eliascapriles-NOAA @LOCEANlloydizard : Not sure if I am really making a good suggestion, but since apply_ufunc is already configured to be run in parallel, it seems to me that it would be useful to have all channels to be processed at once (instead of in a loop). My thought is then this way the parallelization component is delegated to the xarray-dask combination, rather than us making specific choices ourselves (unless we know for sure that it is better). Let me know what you think!

@eliascapriles-NOAA
Copy link
Copy Markdown
Author

Hi @leewujung LLoyd brought this up yesterday. The reason I currently have the channels in a loop is because my apply_ufunc is parallelizing the function across ping_time as specificed by the Echoview algorithm. However, I will try to rework my function to parallelize across the channel dimension as well

Fixture is moved to the correct spot
Added a check to make sure that only Sv values will be converted from log
Fixed test to fit with new structure
@leewujung leewujung moved this to In Progress in Echopype 2026 Apr 2, 2026
@leewujung leewujung moved this from In Progress to In Review in Echopype 2026 Apr 2, 2026
@github-project-automation github-project-automation Bot moved this from In Review to Done in Echopype 2026 Apr 3, 2026
Comment thread echopype/commongrid/utils.py Outdated
Returns
-------
deepest_ping: int
index of the "deepest" sample
Copy link
Copy Markdown
Collaborator

@LOCEANlloydizard LOCEANlloydizard Apr 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

something like

"Index of the ping containing the deepest valid range sample."

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about:

deepest_ping: int
Index of the ping containing with most amount of data before trailing NaN values.

Comment thread echopype/commongrid/api.py Outdated
Comment thread echopype/commongrid/utils.py Outdated
Comment thread echopype/commongrid/utils.py Outdated
Comment thread echopype/commongrid/utils.py
Comment thread echopype/commongrid/utils.py Outdated
Comment thread echopype/commongrid/api.py Outdated
Comment thread echopype/commongrid/api.py Outdated
Comment thread echopype/commongrid/api.py Outdated
Comment thread echopype/commongrid/api.py Outdated
Comment thread echopype/commongrid/api.py Outdated
Comment thread echopype/commongrid/api.py Outdated
target_range_da = target_grid.copy()

deepest_ping_index = get_valid_max_depth_ping(ds_Sv, target_grid=target_grid)
valid_range_sample = np.argmax(target_grid.isel(ping_time=deepest_ping_index).values)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel here we could be more concise if 1) we don't trim ? (to discuss) and no need for copy()?

something like:

if target_channel is not None:
    target_range_da = ds_Sv["echo_range"].sel(channel=target_channel)
else:
    target_range_da = target_grid

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason for trimming was to originally save memory and have a visual way to notice that the regridding initially worked by printing the dimension size. However, the helper function masks each nan ping to only take into account the pings. I am going to test the output without the trimming to make sure everything still works, and get back to you.

I agree that copy does not contribute anything. I will delete them.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After testing the regrid without any trimming everything still works and regrid simply. @leewujung and @LOCEANlloydizard which of these approaches do you think better reflects echopype's current functions.

Comment thread echopype/commongrid/api.py Outdated
Comment thread echopype/commongrid/api.py Outdated
Comment thread echopype/tests/commongrid/test_commongrid_api.py Outdated
Comment thread echopype/tests/commongrid/test_commongrid_api.py Outdated
Comment thread echopype/tests/commongrid/test_commongrid_api.py Outdated
@LOCEANlloydizard
Copy link
Copy Markdown
Collaborator

Hey @eliascapriles-NOAA, thanks for the modifications! I’ve left a few comments above, and also added some broader thoughts below to check first =>

  1. Naming: maybe resample would be clearer than regrid, which feels a bit generic? thoughts?

  2. Do we actually want the trimming step after the resampling? forgot if we discussed this

  • If not, we could remove this entirely
  • If yes, we could likely still replace get_valid_max_depth_ping with a xarray based approach?
  1. Output dataset:

we now returns a new xarray object which is nice, but we might want to align it more with other regridding like compute_MVBS?

-propagate key variables:
-- water_level (to allow add_depth() downstream)
-- frequency_nominal
-- position variables (lat/lon, etc)

-add provenance:
-- processing_function = "commongrid.regrid" (or resample)?

-add variable-level attrs on Sv:
-- resampling_mode ("target_channel" / "target_grid")
-- target_channel
-- short description?

  1. .copy() usage: here (and other places, see comments)
ds_target = ds_Sv.sel(...).copy()
target_range_da = target_grid.copy()

does not seem necessary here since nothing is mutated afterwards? (but could be missing the obvious!)

  1. Some small fixes in the tests
  • in some places there's a mix of positional indexing and label-based selection? It could be better to consistently use label selection?
  • Maybe we can compare the target channel element by element before and after regridding onto its own geometry? should be the same
  • If we add water_level, frequency_nominal, lat/lon, attrs, or processing_function, we’ll probably want explicit tests for metadata/provenance too?
  • Maybe it would be worth adding a test for cases where target_grid has NaNs/partially valid geometry? just to check the expected behaviour?

It's the final adjusments! Thank you very much for this!
Cheers!

@LOCEANlloydizard LOCEANlloydizard moved this from Done to In Review in Echopype 2026 Apr 7, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

Status: In Review

Development

Successfully merging this pull request may close these issues.

4 participants