Mask boundary nodes in spatial loss maps during test_step #568
RajdeepKushwaha5 wants to merge 2 commits into mllam:main
Pull request overview
This PR fixes an inconsistency in ARModel test-time spatial loss map generation by ensuring boundary grid nodes are excluded (as they already are in the optimized training/validation losses), so saved/visualized spatial loss maps reflect the same interior-only objective.
Changes:
- Mask boundary nodes in `test_step` spatial loss maps by setting boundary entries to `NaN` while preserving the full-grid shape for plotting.
- Switch aggregation in `on_test_epoch_end` from `torch.mean` to `torch.nanmean` so boundary-node `NaN`s are ignored when computing the mean spatial loss map.
- Add a corresponding entry to the unreleased `CHANGELOG.md`.
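The two code changes listed above can be sketched on toy tensors. This is a minimal illustration, not the code from `ar_model.py`: the tensor sizes, the mask values, and the standalone `interior_mask_bool` variable (standing in for `self.interior_mask_bool`) are assumptions made for the example.

```python
import torch

# Hypothetical toy sizes; in the model these come from the batch and datastore.
B, pred_steps, num_grid_nodes = 2, 3, 5

# Stand-in for self.interior_mask_bool: True marks interior nodes.
interior_mask_bool = torch.tensor([False, True, True, True, False])

# Stand-in for the per-node loss returned with average_grid=False.
spatial_loss = torch.ones(B, pred_steps, num_grid_nodes)

# Overwrite boundary entries with NaN; the tensor keeps its full-grid shape.
spatial_loss[..., ~interior_mask_bool] = float("nan")

# Average over the sample dimension while skipping NaNs.
# Nodes that are NaN in every sample (the boundary) remain NaN in the result.
mean_spatial_loss = torch.nanmean(spatial_loss, dim=0)

print(mean_spatial_loss.shape)  # (pred_steps, num_grid_nodes)
```

Boundary nodes stay `NaN` in the averaged map, so a plotting routine that handles `NaN` can leave them blank while interior nodes carry the mean loss.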
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| `neural_lam/models/ar_model.py` | Masks boundary nodes in test spatial loss maps and uses nanmean when aggregating, aligning evaluation maps with interior-masked losses. |
| `CHANGELOG.md` | Documents the bug fix in the unreleased “Fixed” section. |
### Fixed

- Mask boundary nodes in spatial loss maps computed during `test_step`, consistent with all other loss calls that use `interior_mask_bool` [\#TBD](https://github.com/mllam/neural-lam/pull/TBD) @RajdeepKushwaha5
The CHANGELOG entry still contains a placeholder PR reference ("#TBD" and a "/pull/TBD" URL). Please replace this with the actual PR number (and/or the real issue link) before merging so the link is valid in release notes.
Suggested change:
Before: - Mask boundary nodes in spatial loss maps computed during `test_step`, consistent with all other loss calls that use `interior_mask_bool` [\#TBD](https://github.com/mllam/neural-lam/pull/TBD) @RajdeepKushwaha5
After: - Mask boundary nodes in spatial loss maps computed during `test_step`, consistent with all other loss calls that use `interior_mask_bool` @RajdeepKushwaha5
    spatial_loss = self.loss(
        prediction, target, pred_std, average_grid=False
    )  # (B, pred_steps, num_grid_nodes)
    # Exclude boundary nodes, consistent with training/validation loss
    spatial_loss[..., ~self.interior_mask_bool] = float("nan")
Test coverage: this change introduces NaN-masking for boundary nodes and switches the aggregation to nanmean, but there doesn’t appear to be a unit/integration test asserting that boundary nodes are excluded from the saved/plot-ready spatial loss maps. Adding a small test (e.g., with a minimal module/mocked masks similar to existing ARModel tests) would prevent regressions in evaluation outputs.
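One possible shape for such a test is sketched below. It does not instantiate `ARModel`; instead it exercises the masking logic through a hypothetical helper, `mask_boundary_nodes`, that mirrors the line added in this PR. The helper name, the toy mask, and the tensor sizes are all assumptions for illustration, and a real test would need to follow the repository's existing fixtures.

```python
import torch


def mask_boundary_nodes(spatial_loss, interior_mask_bool):
    """Hypothetical helper mirroring the masking line added in test_step."""
    masked = spatial_loss.clone()
    masked[..., ~interior_mask_bool] = float("nan")
    return masked


def test_boundary_nodes_are_nan():
    # Toy mask: nodes 0 and 2 are interior, nodes 1 and 3 are boundary.
    interior_mask_bool = torch.tensor([True, False, True, False])
    spatial_loss = torch.rand(2, 3, 4)  # (B, pred_steps, num_grid_nodes)

    masked = mask_boundary_nodes(spatial_loss, interior_mask_bool)

    # The full-grid shape is preserved for plotting.
    assert masked.shape == spatial_loss.shape
    # Boundary nodes are NaN, interior nodes are untouched.
    assert torch.isnan(masked[..., ~interior_mask_bool]).all()
    assert torch.equal(
        masked[..., interior_mask_bool], spatial_loss[..., interior_mask_bool]
    )
    # Averaging over samples leaves no NaN at interior nodes.
    mean_map = torch.nanmean(masked, dim=0)
    assert not torch.isnan(mean_map[..., interior_mask_bool]).any()
```

Run under pytest, this would fail if the masking line were removed or if interior values were altered.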
The spatial_loss call in test_step did not pass mask=self.interior_mask_bool, unlike every other loss call in training_step, validation_step, and test_step. This caused boundary grid nodes to be included in spatial loss maps and the saved mean_spatial_loss.pt, inconsistent with the loss the model optimises.

Set boundary node values to NaN after computing the full-grid loss (to keep the num_grid_nodes shape needed by plot_spatial_error) and switch to torch.nanmean when averaging over samples in on_test_epoch_end.

Co-authored-by: GitHub Copilot
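The shape argument in the commit message can be illustrated with a small comparison. This is a sketch on made-up values, not repository code: selecting interior nodes by boolean indexing shrinks the last dimension, whereas writing `NaN` into boundary positions keeps one entry per grid node.

```python
import torch

# Toy mask (assumed): nodes 1 and 2 are interior, nodes 0 and 3 are boundary.
interior_mask_bool = torch.tensor([False, True, True, False])

# Toy per-node loss with shape (pred_steps, num_grid_nodes) = (2, 4).
full_loss = torch.arange(8, dtype=torch.float32).reshape(2, 4)

# Option A: boolean indexing drops boundary nodes, so the grid dimension shrinks.
interior_only = full_loss[..., interior_mask_bool]

# Option B: NaN assignment keeps one entry per grid node.
nan_masked = full_loss.clone()
nan_masked[..., ~interior_mask_bool] = float("nan")

print(interior_only.shape)  # grid dimension reduced to the interior node count
print(nan_masked.shape)     # grid dimension unchanged
```

Both options carry the same interior values; only the second still lines up with the full grid.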
bd5464c to 0c7fe9a
Describe your changes
The `spatial_loss` call in `test_step` did not pass `mask=self.interior_mask_bool`, unlike every other `self.loss()` call in `training_step`, `validation_step`, and `test_step` itself. This caused boundary grid nodes to be included in the spatial loss maps and the saved `mean_spatial_loss.pt`, inconsistent with the loss the model optimises.

Set boundary node values to NaN after computing the full-grid loss (to keep the `num_grid_nodes` shape needed by `plot_spatial_error`) and switch to `torch.nanmean` when averaging over samples in `on_test_epoch_end`.

Issue Link
closes #569