@Zeitsperre
Collaborator

Description

This is a testing PR to figure out what dependency combination may be causing the testing failures. Be sure to squash and merge commits when a solution is found.

To-Do List

  • Determine what the issues around the now-failing tests are
  • Determine whether the issues are fixable within climpred or xskillscore
  • Remove interim GitHub Workflow steps to install xskillscore@main
  • Update Changelog

Type of change

  • Bug fix (non-breaking change which fixes an issue)

How Has This Been Tested?

TBD

References

See #870

@codecov

codecov bot commented Oct 21, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
⚠️ Please upload report for BASE (updates-2025@60bf4ec). Learn more about missing BASE report.

Additional details and impacted files
@@               Coverage Diff               @@
##             updates-2025     #880   +/-   ##
===============================================
  Coverage                ?   91.55%           
===============================================
  Files                   ?       59           
  Lines                   ?     6261           
  Branches                ?        0           
===============================================
  Hits                    ?     5732           
  Misses                  ?      529           
  Partials                ?        0           

@Zeitsperre
Collaborator Author

@aaronspring No tests failing with numpy@v1.26 and xarray@v2023.12.0 (doctests are only failing due to changes in the presentation of DataArrays). Will try some other combinations as well.

@Zeitsperre
Collaborator Author

No tests fail with either numpy@v1.26 or numpy@v2.3 as long as xarray<2025.0. I'll start looking over this year's xarray changelogs, but I have a hunch that one way to solve this would be to tackle the DeprecationWarnings raised by the tests that fail under more recent versions.

@Zeitsperre
Collaborator Author

Zeitsperre commented Oct 21, 2025

@aaronspring

Comparing the builds, I can see that the regression shows up sometime between xarray@v2024.11.0 and xarray@v2025.1.2 (nothing to do with numpy).

I'm not sure how to chase it down further than that. I do recall the jump to xarray v2025.x being somewhat complicated for us with xclim due to changes with cf-xarray, but those have been dealt with in more recent releases.

Hopefully this helps with the debugging efforts.
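
One way to narrow the window further would be to bisect the xarray releases between the last passing and first failing versions. The following is only a minimal sketch: `first_failing`, `CANDIDATES`, and `fake_passes` are hypothetical names, the two middle entries are placeholders for whatever releases actually sit between the endpoints, and the pass/fail outcomes are made up for demonstration. A real predicate would install the given xarray version and run the failing tests.

```python
from typing import Callable, Sequence


def first_failing(versions: Sequence[str], passes: Callable[[str], bool]) -> str:
    """Return the earliest version for which `passes` is False.

    Assumes `versions` is sorted oldest to newest, the first entry passes,
    the last entry fails, and there is a single pass-to-fail transition.
    """
    lo, hi = 0, len(versions) - 1  # invariant: versions[lo] passes, versions[hi] fails
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if passes(versions[mid]):
            lo = mid
        else:
            hi = mid
    return versions[hi]


# Endpoints come from the comparison above; the middle entries are placeholders
# to be filled in from the real xarray release history.
CANDIDATES = ["2024.11.0", "<release A>", "<release B>", "2025.1.2"]


def fake_passes(version: str) -> bool:
    # Hypothetical stand-in for "install this xarray version and run the tests".
    # The outcome below is invented purely to exercise the search.
    return version in {"2024.11.0", "<release A>"}


print(first_failing(CANDIDATES, fake_passes))
```

The same search could also be run over xarray commits with `git bisect` once the first failing release is known.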

@Zeitsperre
Collaborator Author

Hey @aaronspring, happy new year!

I just wanted to point out that this PR is the one we should play around with in order to get to the bottom of these failing tests. I think we're close, and if we can resolve this, most of the open Pull Requests can finally be merged, domino-style.

@aaronspring
Collaborator

Agreed. I lost track and wanted to start fresh to see whether an error crept in early, hence #882. I'm still unsure about a testing and verification strategy. Maybe we can draft a plan here and only execute it afterwards?

@Zeitsperre
Collaborator Author

For sure, I offered this PR since most of the work to bring the code up-to-date is already done in #870.

I'm somewhat convinced that the relatively recent changes from xarray that touched how attrs are saved might be responsible for the errors. Whatever plan we come up with will likely require changes in xskillscore as well.

@aaronspring
Collaborator

aaronspring commented Jan 16, 2026

py3.9 passing was puzzling to me but now I understand

Analysis Summary

Root Cause: The failures are NOT due to Python version differences. The real issue is NumPy 2.x compatibility in xskillscore v0.0.27.

Why py3.9 passes but py3.10-3.13 fail:

  • py3.9 → defaults to NumPy 1.x → tests pass
  • py3.10-3.13 → default to NumPy 2.x → tests fail

The actual problem: xskillscore v0.0.27 added np.atleast_1d() calls that changed numerical results for p-value calculations, causing doctest failures.
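
As a small illustration of what `np.atleast_1d()` does to a scalar result, the sketch below wraps a made-up p-value (0.05 is an arbitrary number, not taken from any test). It only shows the change in dimensionality; it does not reproduce the xskillscore code path or confirm that this is the mechanism behind the failures.

```python
import numpy as np

# A 0-d value, standing in for a scalar p-value (the number is arbitrary).
p_scalar = np.float64(0.05)

# np.atleast_1d promotes it to a 1-d array of length 1.
p_wrapped = np.atleast_1d(p_scalar)

print(np.ndim(p_scalar), p_scalar.shape)  # 0 ()
print(p_wrapped.ndim, p_wrapped.shape)    # 1 (1,)

# The stored value is unchanged; only the container differs, which is enough
# to alter how a result is displayed in doctest output.
print(p_wrapped[0] == p_scalar)           # True
```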

Fix Plan

  1. Wait for xskillscore v0.0.28 (PR #440, "DeprecationWarning: getattr uses apply, but apply deprecated, should use map", is already merged and removes the problematic np.atleast_1d() calls)
  2. Update climpred CI to require xskillscore >= 0.0.28 and remove the FIXME workaround
  3. Remove skip logic in conftest.py for numpy>=2.0 + xarray<=2024.6.0 once dependencies align
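
For reference, the version condition named in step 3 can be written as a small predicate. This is a sketch under stated assumptions, not the actual contents of conftest.py: `parse_version` and `needs_skip` are hypothetical helpers, and the parser only handles plain dotted release strings (no pre-release or dev suffixes).

```python
def parse_version(text: str) -> tuple[int, ...]:
    """Parse a plain dotted release string such as '2024.6.0' into a tuple of ints.

    Simplification: pre-release or dev suffixes are not supported.
    """
    return tuple(int(part) for part in text.split("."))


def needs_skip(numpy_version: str, xarray_version: str) -> bool:
    """True for the combination described above: numpy>=2.0 with xarray<=2024.6.0."""
    return (
        parse_version(numpy_version) >= (2, 0)
        and parse_version(xarray_version) <= (2024, 6, 0)
    )


print(needs_skip("2.0.0", "2024.6.0"))   # True
print(needs_skip("1.26.4", "2024.6.0"))  # False
print(needs_skip("2.3.0", "2024.11.0"))  # False
```

In a real test suite, `packaging.version.Version` would be a more robust parser, and the result would typically feed a `pytest.mark.skipif` condition.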

Should you drop py3.9?
No. Here's why:

  1. py3.9 is not the cause - it's a NumPy version issue, not Python
  2. py3.9 provides valuable regression testing against NumPy 1.x behavior
  3. Many users (especially on HPC systems) still run Python 3.9
  4. The fix is already available - just wait for xskillscore v0.0.28 release

The correlation with Python versions is misleading - check climpred_testing.yml which already has a FIXME workaround installing xskillscore from git main to bypass the broken v0.0.27.

https://opncd.ai/share/YRNcwidm

why didn't we publish xskillscore 0.0.28? @Zeitsperre

@aaronspring
Collaborator

> For sure, I offered this PR since most of the work to bring the code up-to-date is already done in #870.
>
> I'm somewhat convinced that the relatively recent changes from xarray that touched how attrs are saved might be responsible for the errors. Whatever plan we come up with will likely require changes in xskillscore as well.

Maybe we need to go test by test. Which test in particular do you think is impacted by xr attrs?

@Zeitsperre
Collaborator Author

> why didn't we publish xskillscore 0.0.28?

I just need your rubber stamp on xarray-contrib/xskillscore#441 to release v0.0.28. I honestly forgot to check in on it.

> Maybe we need to go test by test. Which test in particular do you think is impacted by xr attrs?

Can start taking a look next week. Would love to get to the bottom of this. Perhaps I can tag in a colleague to help.

@aaronspring
Collaborator

aaronspring commented Jan 17, 2026

Count me in also. Can devote some time to debug, triage and decide

@Zeitsperre
Collaborator Author

Zeitsperre commented Jan 19, 2026

@aaronspring xskillscore v0.0.28 is out on PyPI. It just needs a release on conda-forge now. Would you be comfortable adding me to the feedstock?

Update: I pushed a few changes/workarounds to deal with the attributes' behaviour in #883, as well as updated the doctest outputs. I needed to pin numpy below v2.4.0 in xskillscore due to breaking changes in the API, so that pin will propagate here. I'll speak with a colleague about the remaining issues this week if I can.
