Data catalog for HighResMIP (compatibility with CMIP convention?) #815

@weiming9115

What problem will this feature solve?
This request addresses compatibility issues between the MDTF framework and HighResMIP simulations (a subset of CMIP). The catalog CSV file generated by catalog_builder.py cannot be successfully processed by preprocessor.py, even though the variables and data frequency follow the same conventions used by another CMIP simulation that runs correctly.
HighResMIP datasets appear to follow the CMIP conventions. However, there may be subtle differences that I have not identified, which could be causing the unexpected error shown below.

Describe the solution you'd like
I would like to request support for an additional convention option, such as highresmip, alongside the currently supported options (cmip, gfdl, and cesm).
If there are convention differences between HighResMIP and standard CMIP outputs, corresponding preprocessing procedures could be implemented for HighResMIP, similar to those already established for CMIP datasets.

Describe alternatives you've considered
N/A

Additional context
@aradhakrishnanGFDL @bitterbark
I am reporting this as a feature request because it is unclear whether the issue represents a bug or simply a feature that has not yet been implemented. Previous PODs appear to have been developed primarily using standard CMIP outputs, rather than HighResMIP datasets.
The first error indicates:
"No cmip fieldlist entry found..."
However, the HighResMIP catalog CSV file was successfully generated when the cmip convention was specified (see attached), which makes the subsequent preprocessing failure somewhat confusing. Thank you!


ERROR: No cmip fieldlist entry found for variable hus
ERROR: translation for varlistEntry hus failed
Preprocessing data for MCS_precip_buoy_stats
v: <#None:MCS_precip_buoy_stats.pr> (='pr' @ 6hr)
v type: <class 'src.varlist_util.VarlistEntry'>
v: <#None:MCS_precip_buoy_stats.rlut> (='rlut' @ 6hr)
v type: <class 'src.varlist_util.VarlistEntry'>
v: <#None:MCS_precip_buoy_stats.ta> (='tas' @ 6hr)
v type: <class 'src.varlist_util.VarlistEntry'>
v: <#None:MCS_precip_buoy_stats.hus> (='(not translated)' @ 6hr)
v type: <class 'src.varlist_util.VarlistEntry'>
Querying /scratch/wmtsai/mdtf_miniforge/catalogs/EC-Earth3P-HR_historical.json for variable pr for case EC-Earth3P-HR_historical.
WARNING: /scratch/wmtsai/conda/envs/_MDTF_base/lib/python3.12/site-packages/intake_esm/_search.py:50: UserWarning: This pattern is interpreted as a regular expression, and has match groups. To actually get the groups, use str.extract.
  mask = df[column].str.contains(value, regex=True, case=True, flags=0)
CRITICAL: **********************************************************************
Uncaught exception:
Traceback (most recent call last):
  File "/scratch/wmtsai/mdtf_miniforge/MDTF-diagnostics/mdtf_framework.py", line 244, in <module>
    exit_code = main(prog_name='MDTF-diagnostics')
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/scratch/wmtsai/conda/envs/_MDTF_base/lib/python3.12/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/scratch/wmtsai/conda/envs/_MDTF_base/lib/python3.12/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
         ^^^^^^^^^^^^^^^^
  File "/scratch/wmtsai/conda/envs/_MDTF_base/lib/python3.12/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/scratch/wmtsai/conda/envs/_MDTF_base/lib/python3.12/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/scratch/wmtsai/conda/envs/_MDTF_base/lib/python3.12/site-packages/click/decorators.py", line 33, in new_func
    return f(get_current_context(), *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/scratch/wmtsai/mdtf_miniforge/MDTF-diagnostics/mdtf_framework.py", line 200, in main
    cat_subset = data_pp.process(cases, ctx.config, model_paths.MODEL_WORK_DIR)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/scratch/wmtsai/mdtf_miniforge/MDTF-diagnostics/src/preprocessor.py", line 1649, in process
    cat_subset = self.query_catalog(case_list, config.DATA_CATALOG)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/scratch/wmtsai/mdtf_miniforge/MDTF-diagnostics/src/preprocessor.py", line 1197, in query_catalog
    raise util.DataRequestError(
src.util.exceptions.DataRequestError: Unable to find match or alternate for pr for case EC-Earth3P-HR_historical in /scratch/wmtsai/mdtf_miniforge/catalogs/EC-Earth3P-HR_historical.json

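The UserWarning in the log means that at least one value passed to the catalog search compiles to a regular expression with match groups (pandas raises this warning from `str.contains` when the compiled pattern has one or more groups). The log does not show which value that was, so the following is only a diagnostic sketch: `regex_pitfalls` is a hypothetical helper, not part of MDTF or intake-esm, that reports how a given search value behaves when read as a regex.

```python
import re


def regex_pitfalls(value):
    """Describe how `value` behaves when used as a regular expression.

    Hypothetical diagnostic helper; not part of MDTF or intake-esm.
    """
    try:
        pattern = re.compile(value)
    except re.error as err:
        return {"valid_regex": False, "error": str(err)}
    return {
        "valid_regex": True,
        # A non-zero group count is the condition behind the pandas warning.
        "match_groups": pattern.groups,
        # False when the regex reading no longer matches the literal text.
        "matches_itself": pattern.search(value) is not None,
        # Escaped form that matches the text character for character.
        "literal_pattern": re.escape(value),
    }


if __name__ == "__main__":
    # Values taken from the log above; any other query value can be checked too.
    for candidate in ("pr", "(not translated)"):
        print(candidate, regex_pitfalls(candidate))
```

Running this over the values the preprocessor uses in its query (variable names, frequency, case name) may help narrow down which one contains regex metacharacters.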
EC-Earth3P-HR_historical.csv
MPI-ESM1-2-HR_historical_r11i1p1f1.csv
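
One way to look for the subtle convention differences mentioned above is to compare the two attached catalogs column by column. The sketch below uses only the standard library and makes no assumption about the catalog column names, except for the default `skip=("path",)`, which assumes the file-path column is called `path` and is expected to differ anyway. `column_values` and `compare_catalogs` are hypothetical helper names.

```python
import csv


def column_values(csv_path):
    """Return {column name: set of distinct values} for one catalog CSV."""
    with open(csv_path, newline="") as handle:
        reader = csv.DictReader(handle)
        values = {name: set() for name in (reader.fieldnames or [])}
        for row in reader:
            for column, value in row.items():
                if column is None:  # surplus cells beyond the header
                    continue
                values[column].add(value or "")
    return values


def compare_catalogs(working_csv, failing_csv, skip=("path",)):
    """List columns that exist in only one catalog or whose values differ."""
    working = column_values(working_csv)
    failing = column_values(failing_csv)
    report = {}
    for column in sorted(set(working) | set(failing)):
        if column in skip:
            continue
        if column not in working or column not in failing:
            report[column] = "present in only one catalog"
        elif working[column] != failing[column]:
            report[column] = {
                "only_in_working": sorted(working[column] - failing[column]),
                "only_in_failing": sorted(failing[column] - working[column]),
            }
    return report
```

Assuming the MPI-ESM1-2-HR catalog is the one that runs correctly, `compare_catalogs("MPI-ESM1-2-HR_historical_r11i1p1f1.csv", "EC-Earth3P-HR_historical.csv")` would return only the columns whose values differ. Model and member identifiers will differ by construction, so the entries worth inspecting are those describing variables, frequency, and tables.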
