Skip to content

Mask boundary nodes in spatial loss maps during test_step#568

Open
RajdeepKushwaha5 wants to merge 2 commits intomllam:mainfrom
RajdeepKushwaha5:fix/spatial-loss-maps-interior-mask
Open

Mask boundary nodes in spatial loss maps during test_step#568
RajdeepKushwaha5 wants to merge 2 commits intomllam:mainfrom
RajdeepKushwaha5:fix/spatial-loss-maps-interior-mask

Conversation

@RajdeepKushwaha5
Copy link
Copy Markdown
Contributor

@RajdeepKushwaha5 RajdeepKushwaha5 commented Apr 1, 2026

Describe your changes

The spatial_loss call in test_step did not pass mask=self.interior_mask_bool, unlike every other self.loss() call in training_step, validation_step, and test_step itself. This caused boundary grid nodes to be included in the spatial loss maps and the saved mean_spatial_loss.pt, inconsistent with the loss the model optimises.

Set boundary node values to NaN after computing the full-grid loss (to keep the num_grid_nodes shape needed by plot_spatial_error) and switch to torch.nanmean when averaging over samples in on_test_epoch_end.

Issue Link

closes #569

Type of change

  • Bug fix (non-breaking change that fixes an issue)
  • New feature (non-breaking change that adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation (Addition or improvements to documentation)

Checklist before requesting a review

  • My branch is up-to-date with the target branch
  • I have performed a self-review of my code
  • For any new/modified functions/classes I have added docstrings that clearly describe its purpose, expected inputs and returned values
  • I have placed in-line comments to clarify the intent of any hard-to-understand passages of my code
  • I have updated the README to cover introduced code changes
  • I have added tests that prove my fix is effective or that my feature works
  • I have given the PR a name that clearly describes the change, written in imperative form
  • I have requested a reviewer and an assignee

Checklist for reviewers

  • the code is readable
  • the code is well tested
  • the code is documented (including return types and parameters)
  • the code is easy to maintain

Author checklist after completed review

  • I have added a line to the CHANGELOG describing this change, in a section reflecting type of change

Checklist for assignee

  • PR is up to date with the base branch

Copilot AI review requested due to automatic review settings April 1, 2026 16:04
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes an inconsistency in ARModel test-time spatial loss map generation by ensuring boundary grid nodes are excluded (as they already are in the optimized training/validation losses), so saved/visualized spatial loss maps reflect the same interior-only objective.

Changes:

  • Mask boundary nodes in test_step spatial loss maps by setting boundary entries to NaN while preserving the full-grid shape for plotting.
  • Switch aggregation in on_test_epoch_end from torch.mean to torch.nanmean so boundary-node NaNs are ignored when computing the mean spatial loss map.
  • Add a corresponding entry to the unreleased CHANGELOG.md.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File Description
neural_lam/models/ar_model.py Masks boundary nodes in test spatial loss maps and uses nanmean when aggregating, aligning evaluation maps with interior-masked losses.
CHANGELOG.md Documents the bug fix in the unreleased “Fixed” section.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread CHANGELOG.md Outdated

### Fixed

- Mask boundary nodes in spatial loss maps computed during `test_step`, consistent with all other loss calls that use `interior_mask_bool` [\#TBD](https://github.com/mllam/neural-lam/pull/TBD) @RajdeepKushwaha5
Copy link

Copilot AI Apr 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The CHANGELOG entry still contains a placeholder PR reference ("#TBD" and a "/pull/TBD" URL). Please replace this with the actual PR number (and/or the real issue link) before merging so the link is valid in release notes.

Suggested change
- Mask boundary nodes in spatial loss maps computed during `test_step`, consistent with all other loss calls that use `interior_mask_bool` [\#TBD](https://github.com/mllam/neural-lam/pull/TBD) @RajdeepKushwaha5
- Mask boundary nodes in spatial loss maps computed during `test_step`, consistent with all other loss calls that use `interior_mask_bool` @RajdeepKushwaha5

Copilot uses AI. Check for mistakes.
Comment on lines 447 to +451
spatial_loss = self.loss(
prediction, target, pred_std, average_grid=False
) # (B, pred_steps, num_grid_nodes)
# Exclude boundary nodes, consistent with training/validation loss
spatial_loss[..., ~self.interior_mask_bool] = float("nan")
Copy link

Copilot AI Apr 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test coverage: this change introduces NaN-masking for boundary nodes and switches the aggregation to nanmean, but there doesn’t appear to be a unit/integration test asserting that boundary nodes are excluded from the saved/plot-ready spatial loss maps. Adding a small test (e.g., with a minimal module/mocked masks similar to existing ARModel tests) would prevent regressions in evaluation outputs.

Copilot uses AI. Check for mistakes.
The spatial_loss call in test_step did not pass mask=self.interior_mask_bool,
unlike every other loss call in training_step, validation_step, and test_step.
This caused boundary grid nodes to be included in spatial loss maps and the
saved mean_spatial_loss.pt, inconsistent with the loss the model optimises.

Set boundary node values to NaN after computing the full-grid loss (to keep
the num_grid_nodes shape needed by plot_spatial_error) and switch to
torch.nanmean when averaging over samples in on_test_epoch_end.

Co-authored-by: GitHub Copilot
@RajdeepKushwaha5 RajdeepKushwaha5 force-pushed the fix/spatial-loss-maps-interior-mask branch from bd5464c to 0c7fe9a Compare April 1, 2026 16:14
@sadamov sadamov added the enhancement New feature or request label Apr 13, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug] spatial_loss_maps in test_step computed without interior_mask, inconsistent with all other loss calls

3 participants