Point obs height corrections

When comparing any forecast model against point observations of surface temperature, we need to apply a height correction (e.g. a lapse rate of ~6.5 K/km) to account for the difference between the actual height of the observing platform and the height of the forecast grid.
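As a minimal sketch of the correction described above, assuming a constant lapse rate and heights in metres: the function name, argument names, and example values below are hypothetical and not part of ExtremeWeatherBench. It uses only arithmetic, so the same expression should broadcast over array-like inputs as well as scalars.

```python
# Assumed constant lapse rate of ~6.5 K/km, expressed per metre.
STANDARD_LAPSE_RATE_K_PER_M = 6.5 / 1000.0


def correct_temperature_to_station_height(
    model_temperature_k,
    model_height_m,
    station_height_m,
    lapse_rate_k_per_m=STANDARD_LAPSE_RATE_K_PER_M,
):
    """Shift a model temperature from the grid height to the station height.

    A station above the model surface gets a colder value; a station below
    the model surface gets a warmer one.
    """
    height_difference_m = station_height_m - model_height_m
    return model_temperature_k - lapse_rate_k_per_m * height_difference_m


# Illustrative values only: the station sits 300 m above the model surface,
# so the model temperature is lowered by roughly 1.95 K.
corrected_k = correct_temperature_to_station_height(285.0, 1200.0, 1500.0)
```

A fixed 6.5 K/km is only a first approximation; keeping the lapse rate as a parameter leaves room to swap in something else later.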

The spatiotemporal interpolations happen far into the pipeline, so after thinking about it for a bit I'm inclined to advocate against a preprocessing approach. The correction should occur after interpolation but before metric computations. I also want to integrate this computation in a way that lets other operations, such as weighting, run in the same stage of processing (if that turns out to be the optimal place in the DAG for them).

https://github.com/brightbandtech/ExtremeWeatherBench/blob/2f37abd464eb339d3775e2e1537e2f823d87efad/src/extremeweatherbench/evaluate.py#L885-L904

Two options for where this could go in the pipeline linked above:

**After `maybe_derive_variables()`**

I think this approach is the easiest: add a drop-in `postprocess()` step right after it in the pipeline. I would do some more rigorous testing to confirm that differing patterns work as expected. [Earthmover's recent blog post using Hypothesis](https://www.earthmover.io/blog/engineering-rigor-in-icechunk-part-1) is a great reference, and I would love to spend some time building this out. That level of rigor isn't critical, though.

The only risk I see here is the DAG becoming significantly larger, which could slow down computation.
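A rough sketch of what a drop-in `postprocess()` step might look like, assuming it simply applies a list of callables in order so that the height correction and other work (such as weighting) can share the same stage. The signature and both step functions are hypothetical, and plain dicts stand in for xarray Datasets to keep the example self-contained.

```python
from functools import reduce


def postprocess(data, steps=()):
    """Apply each step to the data in order and return the final result."""
    return reduce(lambda current, step: step(current), steps, data)


# Hypothetical steps; each takes the data and returns a new copy.
def apply_height_correction(ds):
    corrected = dict(ds)
    height_difference_m = ds["station_height_m"] - ds["model_height_m"]
    corrected["temperature_k"] = ds["temperature_k"] - 0.0065 * height_difference_m
    return corrected


def apply_weighting(ds):
    weighted = dict(ds)
    weighted["weight"] = 1.0  # placeholder weight, for illustration only
    return weighted


result = postprocess(
    {"temperature_k": 285.0, "model_height_m": 1200.0, "station_height_m": 1500.0},
    steps=[apply_height_correction, apply_weighting],
)
```

With real Datasets, a function shaped like this could be chained as `.pipe(postprocess, steps=[...])` after the `maybe_derive_variables` step, since `Dataset.pipe` forwards extra keyword arguments to the function it calls.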

**Replace `maybe_derive_variables()` with `postprocess()` handler**

This approach might be the more elegant long-term solution, paired with a rework of `maybe_derive_variables()` toward a stricter, more invariant-based workflow (e.g. requiring a derived variable to return only a DataArray).
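One way the return-type invariant could be enforced is a small decorator that checks what each derived-variable function returns. This is only a sketch: `returns_instance_of`, `StandInDataArray`, and the two example functions are hypothetical, and in practice the expected type would presumably be `xarray.DataArray` rather than the stand-in class used here.

```python
import functools


def returns_instance_of(expected_type):
    """Build a decorator that rejects results not of the expected type."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            result = func(*args, **kwargs)
            if not isinstance(result, expected_type):
                raise TypeError(
                    f"{func.__name__} returned {type(result).__name__}, "
                    f"expected {expected_type.__name__}"
                )
            return result

        return wrapper

    return decorator


class StandInDataArray:
    """Stand-in for xarray.DataArray so the sketch has no dependencies."""


@returns_instance_of(StandInDataArray)
def derive_valid_variable(ds):
    return StandInDataArray()


@returns_instance_of(StandInDataArray)
def derive_invalid_variable(ds):
    return {"not": "a DataArray"}  # violates the invariant on purpose
```

Calling `derive_invalid_variable` raises a `TypeError` at the point where the invariant is broken, rather than letting an unexpected type surface later in the pipeline.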


	valid_data = (
	    inputs.maybe_subset_variables(
	        data,
	        variables=input_data.variables,
	        source_module=source_module,
	    )
	    .pipe(
	        lambda ds: input_data.subset_data_to_case(ds, case_metadata, **kwargs)
	    )
	    .pipe(input_data.maybe_convert_to_dataset)
	    .pipe(input_data.add_source_to_dataset_attrs)
	    .pipe(
	        lambda ds: derived.maybe_derive_variables(
	            ds,
	            variables=input_data.variables,
	            case_metadata=case_metadata,
	            **kwargs,
	        )
	    )
	)

Point obs height corrections #371