
maintenance updates #125

Merged
jameshiebert merged 31 commits into master from py-sprint-25q1
Jun 11, 2025

@corviday
Contributor

@corviday corviday commented May 6, 2025

Task list:

  • poetry-ize
  • update to sqlalchemy 2.0
  • upgrade github scripts, test and build with poetry on Python 3.10, 3.11, 3.12
  • upgrade from make to poe
  • fix (most of) deprecation warnings during testing
  • upgrade to PEP 621
  • upgrade tests to ubuntu 24.04?
  • fix table relationship warnings

resolves #124
resolves #110

black should be run after review and before merge, but we don't need a bunch of formatting changes cluttering up this PR.

@corviday corviday marked this pull request as draft May 6, 2025 19:18
@corviday corviday marked this pull request as ready for review May 8, 2025 21:52
Contributor

@jameshiebert jameshiebert left a comment

I'm concerned about the warnings SQLAlchemy is raising about relationships not being properly specified. If we're going to the trouble of doing the upgrades, these warnings should be much sparser than they are. Can you double-check that the relationships are coded correctly? The warnings advise: "consider if these relationships should be linked with back_populates, or if viewonly=True should be applied to one or more if they are read-only"

Comment thread README.md
Comment thread tests/conftest.py Outdated
Comment thread tests/mm_cataloguer/test_associate_ensemble.py
jameshiebert and others added 3 commits May 21, 2025 11:20
An attempt to mitigate the following errors:

Exception ignored in: 'netCDF4._netCDF4.Dataset.__dealloc__'
AttributeError: __getattribute__
AttributeError: __getattribute__

This patch includes code to attempt to avoid memoizing open NetCDF
files as well as some timing code to benchmark test results using (or
not) memoization.

The memoization code in index_netcdf.py was both problematic and
ineffective. The cache keys included open NetCDF file objects, and the
cache values were NetCDF variable objects (which also held references
to the NetCDF files). On exit, the main branch of the code would close
and clean up the files, leaving everything in the cache stranded. When
the cache got cleaned up, it would leave a long string of error
messages:

Exception ignored in: 'netCDF4._netCDF4.Dataset.__dealloc__'
AttributeError: __getattribute__
AttributeError: __getattribute__

for each stranded object. While this didn't necessarily cause any real
errors at execution time, it was messy.

Furthermore, the expensive part of indexing is I/O: going out to read
data from the disk and going out to the database. The memoization, as
written, didn't actually mitigate this. It only cached objects that
referenced the file. I measured the difference in test suite run time
between memoized and non-memoized and it was on the order of
milliseconds (noise). I recommend simply removing this code, as this
patch does.
@corviday
Contributor Author

Demo using this version, with data indexed by this version. Seems to be working.

@jameshiebert
Contributor

Amazing! Everything looks great to me. Thanks so much @corviday. I'll go ahead and merge this, run black, and do the release.

@jameshiebert jameshiebert merged commit 63b6e72 into master Jun 11, 2025
3 checks passed
Development

Successfully merging this pull request may close these issues.

Maintenance updates
sqlalchemy overlap warning

2 participants