
Release v1.1.0: Pyproject migration, Pixi support, and CMIP7 CVs#231

Draft
pgierz wants to merge 286 commits into main from prep-release

Conversation


@pgierz commented Nov 5, 2025

Summary

This PR consolidates several major improvements for the pycmor 1.1.0 release:

Major Changes

✅ PR #212 - Pyproject Migration

  • Migrates from setup.py/setup.cfg to modern pyproject.toml configuration
  • Consolidates all project metadata and dependencies
  • Maintains versioneer for version management
  • Updates CI configuration to exclude CMIP7 DReq submodule from linting

✅ PR #224 - Pixi Support

  • Adds pixi package manager support with pixi.lock for reproducible environments
  • Configures pixi in pyproject.toml with conda-forge channel
  • Supports Python 3.10-3.12 on osx-arm64, osx-64, linux-64
  • Installs pycmor as editable PyPI dependency
  • Includes dev environment with pytest
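The pixi bullets above could correspond to a `pyproject.toml` section roughly like the following. This is a hedged sketch: the table names and keys follow pixi's documented pyproject integration, but the exact contents of this PR's configuration are not shown here.

```toml
[tool.pixi.project]
channels = ["conda-forge"]
platforms = ["linux-64", "osx-64", "osx-arm64"]

# pycmor itself installed as an editable PyPI dependency
[tool.pixi.pypi-dependencies]
pycmor = { path = ".", editable = true }

# dev feature providing pytest
[tool.pixi.feature.dev.dependencies]
pytest = "*"

[tool.pixi.environments]
dev = ["dev"]
```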

✅ PR #222 - CMIP7 Controlled Vocabularies Implementation

  • Adds CMIP7-CVs as git submodule (WCRP-CMIP/CMIP7-CVs, src-data branch)
  • Enhances ControlledVocabularies class to support both CMIP6 and CMIP7
  • Adds comprehensive unit tests for CV functionality
  • Updates yamllint config to exclude CMIP7-CVs directory
  • Adds documentation for CMIP7 CV implementation

Already Incorporated

The following PRs were already merged into prep-release:

Breaking Changes

  • Minimum Python version may change based on pixi configuration
  • Package now uses modern pyproject.toml (backwards compatible for users)

Testing

  • CI will run on Python 3.9, 3.10, 3.11, 3.12
  • All linting checks (black, isort, flake8, yamllint)
  • Full test suite including integration tests

Checklist

pgierz and others added 30 commits November 5, 2025 22:24
- Add missing imports to config.py doctest examples
- Convert file-writing example to code-block to avoid side effects
- Add proper imports (xarray, numpy) to bounds.py doctest examples
- Add Rule object creation in __init__.py doctest example for add_vertical_bounds
- Change print() assertions to direct boolean checks for cleaner output
- Update config.py expected xarray_engine from netcdf4 to h5netcdf (matches Dockerfile env)
- Add +ELLIPSIS directive to bounds functions to ignore INFO log output
- Keeps tests strict on actual functionality while allowing log format variations
…n pipelines

- Replace matrix-based jobs with individual named jobs per Python version
- Each version now flows independently: build-X-Y → meta-X-Y → [unit, integration, doctest]-X-Y
- Python 3.9 can complete entire pipeline while 3.12 is still building
- Reduces pipeline latency and improves parallelization
- Total jobs: 4 builds + 16 tests (4 versions × 4 test types)
- Add CMIP7_DReq_Software and cmip6-cmor-tables to flake8 exclusions
- Add both submodules to isort skip list
- Update black exclude pattern to cover all three submodules
- Prevents linting failures from third-party code in git submodules
- Use ellipsis wildcards in expected output lines instead of bare '...'
- Match actual logging output structure with '...INFO → message...'
- Avoids doctest ambiguity where '...' is interpreted as continuation prompt
- Properly validates that bounds are added while allowing variable formatting
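The `+ELLIPSIS` directive referenced above lets a doctest's expected output use `...` as a wildcard for parts that vary between runs. A minimal self-contained illustration (not the actual `bounds.py` doctests):

```python
import doctest

def describe(x):
    """Return a description containing a memory address that varies per run.

    With +ELLIPSIS enabled, the ``...`` in the expected output matches the
    variable hex digits instead of being read as a continuation prompt:

    >>> describe(42)  # doctest: +ELLIPSIS
    'int at 0x...'
    """
    return f"{type(x).__name__} at {hex(id(x))}"

# Run the docstring examples in this module; zero failures expected.
results = doctest.testmod()
print(results.failed)
```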
- Set PYTHONLOGLEVEL=CRITICAL for all doctest jobs in CI
- Prevents logging output from interfering with doctest expected output
- Cleaner solution than modifying doctest examples or pytest config
- Applies to all four Python versions (3.9, 3.10, 3.11, 3.12)
- Add Docker login step to authenticate with ghcr.io
- Push images with two tags per Python version:
  - ghcr.io/esm-tools/pycmor-testground:py3.X-<commit-sha>
  - ghcr.io/esm-tools/pycmor-testground:py3.X-<branch-name>
- Upload images as workflow artifacts for same-run access
- Enables reproducible test environments via container registry
- Infrastructure as Code: Dockerfile.test defines test infrastructure
Document the Infrastructure as Code approach for test environments:
- Container image publishing to GitHub Container Registry
- Tagging scheme for reproducibility (commit SHA, branch, semver)
- CI/CD workflow for building and distributing testgrounds
- Local usage examples for developers
- Future improvements (conditional publishing, cleanup policies, multi-arch)
- Infrastructure as Code principles and traceability
- Troubleshooting guide for common issues

The testground system treats test infrastructure as code, with Dockerfile.test
as the declarative specification and container images as infrastructure artifacts.
- Use substring(github.sha, 0, 7) for 7-character short SHA
- Use github.head_ref || github.ref_name to get actual branch name
  in both PR and push contexts (avoids '231/merge' format)

This fixes the invalid tag error caused by github.ref_name returning
'231/merge' for pull requests instead of the source branch name.
Remove tar-based artifact workflow in favor of direct GHCR pulls.

Changes:
- Remove load: true and local unprefixed tags from all build jobs
- Remove tar export, cache, and artifact upload steps
- Update all test jobs to pull directly from ghcr.io
- Add GHCR login to all test jobs

Benefits:
- Fixes Docker Hub authentication error (no unprefixed tags)
- Simplifies workflow (-60 lines)
- Better performance (GHCR layer caching vs tar artifacts)
- Perfect CI/local parity - same images available locally
Bumps [pypa/gh-action-pypi-publish](https://github.com/pypa/gh-action-pypi-publish) from 1.9.0 to 1.13.0.
- [Release notes](https://github.com/pypa/gh-action-pypi-publish/releases)
- [Commits](pypa/gh-action-pypi-publish@v1.9.0...v1.13.0)

---
updated-dependencies:
- dependency-name: pypa/gh-action-pypi-publish
  dependency-version: 1.13.0
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
…github/workflows/pypa/gh-action-pypi-publish-1.13.0

chore(deps): bump pypa/gh-action-pypi-publish from 1.9.0 to 1.13.0 in /.github/workflows
The >>> prompts in the .. code-block:: python directives were being
interpreted by pytest's --doctest-modules as actual doctests, even though
they're inside Sphinx code-block directives. This caused doctest failures due
to the ... continuation lines being misinterpreted as doctest continuation
markers.

Changed the examples to plain Python code without the interactive prompts,
which is more appropriate for Sphinx code-block directives anyway. The examples
are now for documentation purposes only, not executable doctests.

Fixes doctest errors in:
- add_bounds_from_coords()
- add_vertical_bounds()
The logger was hardcoded to INFO level, which meant that legitimate INFO
log statements (side effects of normal operation) would appear in doctest
output even when PYTHONLOGLEVEL=CRITICAL was set in CI.

Now the logger respects the PYTHONLOGLEVEL environment variable, allowing
doctests to run with logging suppressed while keeping the logging statements
in the actual code (which is correct - logging is a valid side effect).

Changes:
- Read PYTHONLOGLEVEL from environment, default to INFO if not set
- Apply the log level when configuring the RichHandler
- This allows CI doctest runs to suppress all logs below CRITICAL
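The change described above can be sketched as follows. This is an illustrative reconstruction, not pycmor's actual code: the handler level is read from `PYTHONLOGLEVEL` instead of being hardcoded to `INFO` (pycmor uses a `RichHandler`; a plain `StreamHandler` stands in here).

```python
import logging
import os

def make_logger(name="pycmor.sketch"):
    # Read PYTHONLOGLEVEL from the environment, defaulting to INFO if unset
    level_name = os.environ.get("PYTHONLOGLEVEL", "INFO").upper()
    level = getattr(logging, level_name, logging.INFO)
    logger = logging.getLogger(name)
    logger.setLevel(level)
    handler = logging.StreamHandler()  # pycmor uses RichHandler instead
    handler.setLevel(level)
    logger.handlers = [handler]
    return logger

# With PYTHONLOGLEVEL=CRITICAL (as set for CI doctest jobs), INFO calls are
# suppressed while the logging statements stay in the code.
os.environ["PYTHONLOGLEVEL"] = "CRITICAL"
logger = make_logger()
logger.info("suppressed in doctest runs")  # below CRITICAL: not emitted
print(logger.level == logging.CRITICAL)
```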
Re-added doctest prompts (>>>) to bounds.py examples now that logging is
properly suppressed via PYTHONLOGLEVEL. The examples now show both input
and output datasets with structured representations, making it much easier
to understand what the functions do.

Changes:
- Restored >>> prompts for executable doctests
- Added print() statements for input datasets before transformation
- Added print() statements for output datasets after transformation
- Used doctest directives (+ELLIPSIS, +NORMALIZE_WHITESPACE) for flexibility
- Shows full xarray Dataset structure: dimensions, coordinates, data variables

This provides clear before/after visualization while maintaining executable
tests that verify the functions work correctly.
ARM64 builds take 3-4x longer due to QEMU emulation, so make them optional
to speed up CI. Builds now default to linux/amd64 only.

To build ARM64 images:
1. Go to Actions tab in GitHub
2. Select 'Run Basic Tests' workflow
3. Click 'Run workflow'
4. Check the 'Build ARM64 images' option

This allows:
- Fast CI for most PRs and commits (amd64 only)
- Manual ARM64 builds when needed for M1/M2/M3 Mac users
- ARM64 builds still happen on tags (for releases)

Changes:
- Add workflow_dispatch trigger with build_arm64 boolean input
- Conditionally set platforms based on input (defaults to amd64 only)
- Applied to all 4 Python version build jobs
- Implement three chunking algorithms (simple, even_divisor, iterative)
  inspired by dynamic_chunks library
- Add chunking module (src/pycmor/std_lib/chunking.py) with functions for
  calculating optimal chunk sizes based on target size and access patterns
- Integrate chunking into save_dataset() with automatic encoding generation
- Add 7 new configuration options for chunking and compression control
- Support global and per-rule chunking configuration via YAML
- Include comprehensive test suite (13 tests, all passing)
- Add user documentation with examples and troubleshooting guide
- Default: 100MB chunks, time-dimension preference, level 4 compression

This enables users to optimize NetCDF file I/O performance by configuring
internal chunking strategies that match their data access patterns.
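The "even_divisor" strategy named above can be sketched like this (an illustrative reconstruction, not pycmor's implementation in `std_lib/chunking.py`): choose a chunk length along the preferred dimension that divides the dimension evenly and brings the chunk's byte size closest to the target (100 MB by default).

```python
def even_divisor_chunk(dim_len, bytes_per_step, target_bytes=100 * 1024**2):
    """Pick an even divisor of dim_len whose chunk size is nearest the target.

    bytes_per_step is the size in bytes of one slice along the chunked
    dimension (e.g. one time step of the variable).
    """
    divisors = [d for d in range(1, dim_len + 1) if dim_len % d == 0]
    # A chunk length of d yields chunks of d * bytes_per_step bytes
    return min(divisors, key=lambda d: abs(d * bytes_per_step - target_bytes))

# 3650 time steps of ~1 MB each: aim for ~100 steps per chunk.
# Divisors of 3650 near 100 are 73 and 146; 73 is closer to the target.
chunk = even_divisor_chunk(3650, 1024**2)
print(chunk)  # -> 73
```

Even divisors avoid a small ragged final chunk, at the cost of sometimes landing further from the byte target than the "simple" strategy would.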
pgierz and others added 30 commits December 12, 2025 09:54
…ss()

- Replace auto-import with enable_xarray_accessor() for lazy registration
- Add _build_rule() helper for interactive Rule construction
- Add StdLibAccessor with tab-completable std_lib steps via ds.pycmor.stdlib
- Add .process() method for running full pipelines interactively
- Add BaseModelRun ABC in pycmor.tutorial for test infrastructure
- Update existing tests to use enable_xarray_accessor()
- Add comprehensive test suite in test_accessor_api.py
# Conflicts:
#	src/pycmor/core/cmorizer.py
- Add required compound_name field to all CMIP7 test config rules
  (validator requires it for cmor_version=CMIP7)
- Add setuptools to Dockerfile.test (pyfesom2 imports pkg_resources)
The vendored all_var_info.json does not populate cmip7_compound_name or
cmip6_compound_name on DRVs. So variable_id falls back to the short
name (e.g., "tas"). The matching logic compared the full compound name
"Amon.tas" against the plain "tas" when only one side had a dot,
which always failed.

Fix: always extract the short name from compound_name for comparison,
regardless of whether the DRV also has dots. Also add a fallback match
against drv.name directly.

Add CMIP7 DRV fixtures (dr_cmip7_tas, dr_cmip7_thetao) for testing.
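The matching fix described above can be illustrated with a small sketch (hypothetical helper names; not pycmor's exact code): always compare short names extracted from compound names, with a fallback against the DRV's own name.

```python
def short_name(name):
    """'Amon.tas' -> 'tas'; a plain 'tas' is returned unchanged."""
    return name.rsplit(".", 1)[-1]

def matches(compound_name, drv_variable_id, drv_name=None):
    # Compare short names on both sides, so 'Amon.tas' vs 'tas' succeeds
    wanted = short_name(compound_name)
    if wanted == short_name(drv_variable_id):
        return True
    # Fallback: match against the DRV's name directly
    return drv_name is not None and wanted == short_name(drv_name)

print(matches("Amon.tas", "tas"))      # True: the case that previously failed
print(matches("Omon.thetao", "tas"))   # False
```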
Pipeline._run_prefect() now uses return_state=True and checks for
failures, re-raising the original exception. Previously, Prefect
swallowed exceptions via on_failure callbacks that only logged.

CMORizer._parallel_process_prefect() also checks both the flow-level
state and individual rule future states for failures.

This ensures integration tests correctly fail when pipeline steps
raise exceptions.
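The failure-propagation pattern described above can be sketched generically, without the Prefect dependency (the `State` class here mimics the idea of `return_state=True`; it is not Prefect's API): capture per-step states, then re-raise the original exception instead of only logging it.

```python
class State:
    """Minimal stand-in for a flow/task state object."""
    def __init__(self, failed=False, exception=None):
        self.failed, self.exception = failed, exception

    def is_failed(self):
        return self.failed

def run_steps(steps):
    states = []
    for step in steps:
        try:
            step()
            states.append(State())
        except Exception as exc:  # capture instead of letting it escape
            states.append(State(failed=True, exception=exc))
    # After the run, surface the first failure rather than swallowing it
    for state in states:
        if state.is_failed():
            raise state.exception
    return states

def ok():
    pass

def boom():
    raise ValueError("pipeline step failed")

try:
    run_steps([ok, boom])
except ValueError as e:
    print(e)  # the original exception reaches the caller
```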
DefaultPipeline had both handle_unit_conversion (correct pipeline step
taking data+rule) and units.convert (low-level function taking
da+from_unit+to_unit). The latter was called with (data, rule) args,
causing ParameterBindError: missing required argument 'to_unit'.

handle_unit_conversion already calls convert() internally, so the
duplicate step was both wrong and redundant.
- dimension_mapping.py: use getattr(rule, "dimension_mapping") instead
  of rule._pycmor_cfg("dimension_mapping", default={}) -- dimension_mapping
  is a rule attribute, not a config option, and everett rejects non-string
  defaults
- CMIP7 test configs: add activity_id="CMIP" to rules that need it for
  global attribute generation
- cmorizer.py: fix parallel error checking to handle both PrefectFuture
  and State objects from different Prefect versions
…_run

- dimension_mapping.py: check isinstance(user_mapping, dict) to handle
  Mock objects in tests (getattr on Mock returns Mock, not None)
- base_model_run.py: convert doctest example to code-block to prevent
  pytest from trying to execute it
Cherry-picked from PR #194 by @mzapponi (adapted for src/pycmor/ paths):
- gather_inputs.py: if rule has time_dimname and dataset uses that
  dimension instead of "time", rename it automatically on load
- pipeline.py: defensive getattr for _cluster attribute

Co-authored-by: Martina Zapponi <mzapponi@users.noreply.github.com>
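The rename-on-load behavior from the cherry-picked change can be sketched as follows. This is illustrative only: the `time_dimname` name follows the commit message, but the function operates on a plain list of dimension names rather than an xarray dataset.

```python
def normalize_time_dim(dims, time_dimname=None):
    """Rename a rule-specified time dimension to 'time' if present.

    Mirrors the described gather_inputs.py behavior: if the rule declares
    time_dimname and the dataset uses that dimension instead of 'time',
    rename it automatically on load.
    """
    if time_dimname and time_dimname in dims and "time" not in dims:
        return ["time" if d == time_dimname else d for d in dims]
    return list(dims)

print(normalize_time_dim(["T", "lat", "lon"], time_dimname="T"))
# -> ['time', 'lat', 'lon']
```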
fix: accessor API with lazy registration and BaseModelRun infrastructure