398 add run rhime run rhime multisector and a cli entry point enhancement#420

Merged
brendan-m-murphy merged 5 commits into devel from 398-add-run-rhime-run-rhime-multisector-and-a-cli-entry-point-enhancement
May 17, 2026

Conversation

@brendan-m-murphy brendan-m-murphy commented May 1, 2026

  • Summary of changes

Summary

Closes #398.

This PR adds modern RHIME public runners alongside the legacy fixedbasisMCMC path:

  • Adds run_rhime(...) for standard single-sector RHIME inversions.
  • Adds run_rhime_multisector(...) for shared-basis multi-sector RHIME inversions.
  • Adds openghg-inversions run-rhime ... and openghg-inversions run-rhime-multisector ... CLI entry points for config-file driven runs.
  • Adds lightweight RHIME result/spec dataclasses and a RHIME config template.
  • Adds direct modern InversionOutput construction for the standard RHIME path.
  • Adds sector-aware diagnostic output for multi-sector runs.

The new RHIME runners reuse the existing data preparation and component-based PyMC model pieces, but do not route public modern behavior through fixedbasisMCMC or the legacy inferpymc output adapter.

Notes

Multi-sector RHIME currently supports shared basis regions across sectors, but each sector keeps its own flux field and state vector. The PR therefore only creates modern sector diagnostics for multi-sector output; full PARIS/basic postprocessing for multi-sector runs should be handled in a follow-up with a sector-aware output adapter.

This PR intentionally does not add tracer support, 6 km support, or a full config system redesign.

This PR does not add full PARIS outputs for the multi-sector model, just a proof-of-concept output including posterior flux means.

Notes for reviewers

The main new code is in the top-level rhime.py, which contains parallel inversion run paths to fixedbasisMCMC.

The code in models/rhime.py somewhat duplicates the code in hbmcmc/inversion_pymc.py.

In general, this is a fairly messy PR, but I wanted to get something I could test on real data. There are further PRs on the milestone for multi-sector models https://github.com/openghg/openghg_inversions/milestone/9 that will clean up the code.

How to test on real data

Setting up slurm.sh

The new run_rhime function runs via a CLI entrypoint (instead of a script like run_hbmcmc.py):

openghg-inversions run-rhime -c "${CONFIG_FILE}" --output-path "${RUN_ROOT}"

You do not need to use variables like CONFIG_FILE and RUN_ROOT; that is just what I did. You can add start and end dates as positional arguments, as with run_hbmcmc.py.

To get this to work, you might need to update your environment. I'm using uv, so after I checked out this branch, I did uv sync.

The part of my slurm.sh that activates this venv looks like:

module purge
module load git
source "${REPO_DIR}/.venv/bin/activate"

where for me, REPO_DIR=/user/work/bm13805/openghg_inversions, which is where I checked out the branch and ran uv sync.

Note that once you have activated your venv, the command openghg-inversions will be available. You can check this by calling

openghg-inversions -h

after you sync and activate your venv.
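
The same availability check can be done from Python, for example at the top of a job-submission helper. This is a minimal sketch using only the standard library; the function name cli_available is made up for illustration and is not part of openghg_inversions.

```python
import shutil


def cli_available(command: str = "openghg-inversions") -> bool:
    """Return True if `command` resolves to an executable on the current PATH."""
    # shutil.which returns the full path of the executable, or None if not found.
    return shutil.which(command) is not None


# Illustrative usage: fail early with a readable hint instead of a shell error.
if not cli_available():
    print("openghg-inversions not found on PATH; activate the venv first")
```

If this prints the hint inside a Slurm job, the most likely cause is that the venv was not activated in the batch script itself.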

Config file

The new run_rhime function (which is called by openghg-inversions run-rhime) accepts a subset of what run_hbmcmc.py accepts. This is to try to keep the first implementation small(ish).

The default output format is a netCDF saved from the InversionOutput object. You can also use output_format = "paris". The full set of options is "none", "inv_out", "basic", and "paris"; "inv_out" is the default.
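
As a rough illustration of how these options behave, here is a small sketch that falls back to the default and rejects unknown names. The helper resolve_output_format and the constants are hypothetical; only the four option names and the default come from this PR.

```python
from typing import Optional

# Option names and default as listed in this PR description.
VALID_OUTPUT_FORMATS = ("none", "inv_out", "basic", "paris")
DEFAULT_OUTPUT_FORMAT = "inv_out"


def resolve_output_format(value: Optional[str] = None) -> str:
    """Hypothetical helper: use the default if unset, reject unknown formats."""
    if value is None:
        return DEFAULT_OUTPUT_FORMAT
    if value not in VALID_OUTPUT_FORMATS:
        options = ", ".join(repr(name) for name in VALID_OUTPUT_FORMATS)
        raise ValueError(f"Unknown output_format {value!r}; expected one of {options}")
    return value
```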

Instead of passing reparameterise_log_normal=True, you should add "reparameterise": True to the prior dictionary, e.g.:

xprior = {"pdf": "lognormal", "stdev": 1.0, "reparameterise": True}

This is supported by run_hbmcmc.py and fixedbasisMCMC too, and reparameterise_log_normal just adds the reparameterise key to the prior args dictionary.
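
The translation from the old flag to the new key can be pictured like this. It is a sketch only: the function with_reparameterise is hypothetical, and the choice to let an explicit "reparameterise" entry in the dictionary take precedence is an assumption, not something stated in this PR.

```python
def with_reparameterise(prior: dict, reparameterise_log_normal: bool = False) -> dict:
    """Hypothetical helper: fold the legacy flag into a copy of the prior dict."""
    updated = dict(prior)  # copy so the caller's dict is not mutated
    if reparameterise_log_normal:
        # Assumption: an explicit "reparameterise" key already in the dict wins.
        updated.setdefault("reparameterise", True)
    return updated


legacy_prior = {"pdf": "lognormal", "stdev": 1.0}
xprior = with_reparameterise(legacy_prior, reparameterise_log_normal=True)
```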

You need to remove mcmc_type from your ini file (although perhaps we should add this back with the new options).

Also, other deprecated arguments include calculate_min_error, which was deprecated in favour of just passing the method name (e.g. "residual", "percentile") to min_error.
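
To make the min_error change concrete, here is a sketch of how a single min_error argument could carry either a method name or a number, with the deprecated calculate_min_error mapped onto it. The function normalise_min_error is hypothetical and is not the code in inversion_inputs.py; the handling of integers reflects the "accepts integer min_error" change listed in the file summary.

```python
import warnings
from numbers import Real


def normalise_min_error(min_error, calculate_min_error=None):
    """Hypothetical helper: return a method name (str) or a float value."""
    if calculate_min_error is not None:
        warnings.warn(
            "calculate_min_error is deprecated; pass the method name to min_error",
            DeprecationWarning,
            stacklevel=2,
        )
        return calculate_min_error
    if isinstance(min_error, str):
        return min_error  # e.g. "residual" or "percentile"
    # bool is a subclass of int, so exclude it explicitly.
    if isinstance(min_error, Real) and not isinstance(min_error, bool):
        return float(min_error)
    raise TypeError(f"min_error must be a method name or a number, got {min_error!r}")
```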

If any arguments are not accepted, you'll get an error message saying which ones.
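
The check behind that error message can be sketched as a set difference between the keys in the config and the keys the runner accepts. Both the function check_config_keys and the example set of accepted keys are illustrative assumptions; the real list of accepted arguments lives in the package.

```python
def check_config_keys(config: dict, accepted: set) -> None:
    """Hypothetical helper: raise an error naming every unsupported key."""
    unknown = sorted(set(config) - set(accepted))
    if unknown:
        raise ValueError("Unsupported config arguments: " + ", ".join(unknown))


# Illustrative subset of keys mentioned in this PR, not the full accepted list.
accepted_keys = {"xprior", "min_error", "output_format"}
config = {"xprior": {"pdf": "lognormal"}, "mcmc_type": "fixed_basis"}

try:
    check_config_keys(config, accepted_keys)
except ValueError as err:
    message = str(err)
```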

Multi-sector inversions

For multi-sector inversions, use the command

openghg-inversions run-rhime-multisector -c "${CONFIG_FILE}" --output-path "${RUN_ROOT}"

in your slurm sbatch script.

The same rules for configs apply, except that you can't specify an output format, and you can specify different priors for different sectors using a dictionary from sector names to prior args dicts.

I used the following as a test:

sector_priors = {"edgar-annual-total": {"pdf": "lognormal", "mean": 0.5, "stdev": 8.0, "reparameterise": True}, "edgarv80_wetchartsv131": {"pdf": "lognormal", "mean": 0.5, "stdev": 8.0, "reparameterise": True}}

Note that you need to be careful to close the outer braces; I forgot a } and got a weird error message (there is an issue to track this now).
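
One way to catch this before submitting a job is to parse the value yourself. The sketch below assumes the value is a Python literal and uses ast.literal_eval from the standard library; how openghg_inversions actually parses the ini file may differ, and parse_sector_priors is a made-up name.

```python
import ast


def parse_sector_priors(raw: str) -> dict:
    """Hypothetical helper: parse a sector -> prior-args mapping from a string."""
    try:
        value = ast.literal_eval(raw)
    except (SyntaxError, ValueError) as exc:
        raise ValueError(
            f"sector_priors is not a valid Python literal (check for unbalanced braces): {exc}"
        ) from exc
    if not isinstance(value, dict) or not all(isinstance(v, dict) for v in value.values()):
        raise ValueError("sector_priors must map sector names to prior-argument dicts")
    return value


good = (
    '{"edgar-annual-total": {"pdf": "lognormal", "mean": 0.5, "stdev": 8.0, "reparameterise": True}, '
    '"edgarv80_wetchartsv131": {"pdf": "lognormal", "mean": 0.5, "stdev": 8.0, "reparameterise": True}}'
)
priors = parse_sector_priors(good)
```

Calling parse_sector_priors(good[:-1]), which drops the final brace, raises a ValueError that points at the unbalanced braces.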

Testing

  • pytest tests/test_rhime.py
  • pytest tests/test_get_data.py -k "add_averaging_error or add_obs_error"
  • pytest tests/test_inversion_inputs.py focused cases
  • Focused ruff check / ruff format --check on touched files

Repository-wide ruff still reports pre-existing unrelated lint/format issues.

  • Please check if the PR fulfills these requirements

Copilot AI review requested due to automatic review settings May 1, 2026 08:23
@brendan-m-murphy brendan-m-murphy force-pushed the 398-add-run-rhime-run-rhime-multisector-and-a-cli-entry-point-enhancement branch from c0f5ed6 to 5c52e09 May 1, 2026 08:27

Copilot AI left a comment

Pull request overview

This PR adds modern public RHIME runner APIs and an installed CLI entry point, enabling config-driven RHIME runs without routing through the legacy fixedbasisMCMC path.

Changes:

  • Introduces run_rhime(...) and run_rhime_multisector(...) runners returning a modern RhimeResult with specs, canonical inputs, and InferenceData.
  • Adds openghg-inversions console script with run-rhime and run-rhime-multisector subcommands.
  • Updates data prep / postprocessing glue for region-dimension traces and improves obs-error metadata handling; adds tests and a RHIME config template.

Reviewed changes

Copilot reviewed 15 out of 15 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| openghg_inversions/rhime.py | New modern RHIME runners, config normalization/validation, sampling, and output writing. |
| openghg_inversions/models/rhime.py | New public RHIME model builders (single-sector + shared-basis multisector). |
| openghg_inversions/cli.py | New argparse-based CLI wired to RHIME runners. |
| openghg_inversions/config/templates/rhime_template.ini | New RHIME config template supporting modern parameter names. |
| openghg_inversions/postprocessing/make_outputs.py | Supports region-dimension traces when computing flux stats. |
| openghg_inversions/postprocessing/inversion_output.py | Deserialization updated to support region dimension as well as legacy nx. |
| openghg_inversions/inversion_data/get_data.py | Ensures obs error variables carry consistent long_name/units attrs. |
| openghg_inversions/inversion_inputs.py | Accepts integer min_error values as numeric scalars. |
| pyproject.toml | Adds openghg-inversions script entry point; includes config templates in package data. |
| README.md | Documents new Python and CLI RHIME entry points. |
| openghg_inversions/__init__.py | Adds package __init__. |
| tests/test_rhime.py | New tests covering model builders, config normalization, API/CLI smoke tests. |
| tests/test_get_data.py | Tightens regression assertions around obs error long_name behavior and formatting. |
| tests/test_inversion_inputs.py | Adds coverage for integer min_error handling. |

Comment thread openghg_inversions/rhime.py
Comment thread openghg_inversions/rhime.py
Comment thread openghg_inversions/rhime.py Outdated
Comment thread openghg_inversions/rhime.py Outdated

Copilot AI left a comment

Pull request overview

This PR introduces a modern, public RHIME execution pathway (single-sector and shared-basis multi-sector) with a first-class Python API and an installed CLI entry point, without routing through the legacy fixedbasisMCMC / inferpymc adapter path.

Changes:

  • Added modern RHIME runners (run_rhime, run_rhime_multisector) plus lightweight result/spec dataclasses and multisector diagnostics.
  • Added openghg-inversions CLI with run-rhime and run-rhime-multisector subcommands, plus a RHIME config template distributed in the package.
  • Updated postprocessing / IO compatibility for both legacy nx and modern region basis dimension naming, and expanded tests around RHIME + data error metadata.
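
The nx versus region compatibility mentioned in the last bullet amounts to picking whichever basis dimension name is present. The sketch below shows that idea on a plain tuple of dimension names; basis_dim is a hypothetical helper and not the code in inversion_output.py or make_outputs.py, and the preference for "region" over "nx" when both exist is an assumption.

```python
def basis_dim(dims, modern: str = "region", legacy: str = "nx") -> str:
    """Hypothetical helper: return the basis dimension name found in `dims`."""
    # Assumption: prefer the modern name if both happen to be present.
    if modern in dims:
        return modern
    if legacy in dims:
        return legacy
    raise KeyError(f"Expected a {modern!r} or {legacy!r} dimension, got {tuple(dims)}")


# Illustrative dimension tuples for a modern and a legacy trace.
modern_name = basis_dim(("draw", "region"))
legacy_name = basis_dim(("draw", "nx"))
```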

Reviewed changes

Copilot reviewed 15 out of 15 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| tests/test_rhime.py | New unit/smoke tests for RHIME model builders, parameter normalization/validation, API smoke runs, and CLI plumbing. |
| tests/test_inversion_inputs.py | Adds regression coverage for accepting integer min_error. |
| tests/test_get_data.py | Tightens assertions around observation error metadata (long_name) and applies formatting cleanups. |
| README.md | Documents the new RHIME Python and CLI entry points and config template location. |
| pyproject.toml | Adds console script entry point and packages the RHIME template ini file. |
| openghg_inversions/rhime.py | Core implementation of modern RHIME runners, config normalization, sampling, and output writing. |
| openghg_inversions/postprocessing/make_outputs.py | Adjusts stats chunking to handle region-based traces as well as legacy nx. |
| openghg_inversions/postprocessing/inversion_output.py | Makes DataTree deserialization robust to nx vs region basis dims. |
| openghg_inversions/models/rhime.py | Adds modern RHIME PyMC model builder(s), including multisector shared-basis variant. |
| openghg_inversions/inversion_inputs.py | Makes min_error accept integer scalars and treats numeric scalars more robustly. |
| openghg_inversions/inversion_data/get_data.py | Normalizes/propagates long_name + units for obs error components. |
| openghg_inversions/config/templates/rhime_template.ini | Adds a new RHIME config template preferring flux_sources. |
| openghg_inversions/cli.py | Implements the openghg-inversions CLI and RHIME subcommands. |
| openghg_inversions/__init__.py | Adds package initializer docstring. |
| CHANGELOG.md | Records the addition of modern RHIME runners + CLI + template. |

Comment thread openghg_inversions/rhime.py Outdated
Comment thread openghg_inversions/rhime.py Outdated
@brendan-m-murphy brendan-m-murphy merged commit e669c89 into devel May 17, 2026
5 checks passed
@brendan-m-murphy brendan-m-murphy deleted the 398-add-run-rhime-run-rhime-multisector-and-a-cli-entry-point-enhancement branch May 17, 2026 12:08

Development

Successfully merging this pull request may close these issues.

Add run_rhime, run_rhime_multisector, and a CLI entry point

3 participants