Cloud Base Height Retrieval from NASA ER-2 Airborne Observations

Machine learning for cloud base height (CBH) retrieval using NASA ER-2 aircraft data from the WHySMIE (Oct 2024) and GLOVE (Feb 2025) campaigns, with honest assessment of performance and failure modes across atmospheric regimes.

Author: Rylan Malarchick | Embry-Riddle Aeronautical University
Contact: malarchr@my.erau.edu

Two Papers

This repository supports two complementary papers:

Paper 1: CNN-Based CBH from Thermal IR Imagery

Approach: ResNet-18 and EfficientNet-B0 on 20×22px thermal IR cutouts
Dataset: 380 samples from 7 flights (2 campaigns), 80/20 train/val 5-fold CV
Best result: ResNet-18 pretrained, R² = 0.432 ± 0.094, MAE = 172.7 ± 17.6 m
Key finding: CNNs on small thermal crops struggle — limited spatial context and small sample size yield modest performance

Paper 2: ERA5 Feature Engineering & Domain Shift

Approach: GBDT on 34 ERA5-derived features (5 base + 29 thermodynamic)
Dataset: 5,500 ocean-only BL cloud observations from 6 flights
Key finding: Catastrophic domain shift — LOFO R² = −5.36 across flights
Adaptation: Few-shot (50 samples) recovers R² = +0.35; other methods fail

Key Findings

Performance Summary (Paper 2)

Validation Strategy	R²	MAE	Assessment
Pooled 5-fold CV	−2.05	—	Inflated (cross-flight leakage)
Within-flight 5-fold CV	−0.51	—	Per-flight mean, high variance
Leave-one-flight-out (LOFO)	−5.36	518 m	True cross-regime performance
Few-shot (50 samples)	+0.35	—	Best adaptation method

Performance Summary (Paper 1)

Model	R²	MAE (m)	RMSE (m)
ResNet-18 pretrained	0.432 ± 0.094	172.7 ± 17.6	239.5 ± 23.7
ResNet-18 scratch	0.414 ± 0.127	169.5 ± 15.8	242.7 ± 28.4
EfficientNet-B0 pretrained	0.311 ± 0.109	201.4 ± 26.9	263.9 ± 26.3

Why This Matters

Domain shift is catastrophic: Models trained on one atmospheric regime fail completely on another (LOFO R² = −5.36); all 6 held-out flights produce negative R²
Validation methodology is critical: Pooled CV (R² = −2.05) substantially overestimates cross-regime performance relative to LOFO (R² = −5.36)
Few-shot adaptation works: 50 labeled samples from a target flight recovers R² = +0.35 (from −5.36)
Conformal prediction fails under shift: Cross-flight coverage = 34% (target: 90%); within-flight calibration recovers 90%
Feature engineering aids interpretation, not accuracy: 34 features vs 5 base: per-flight CV R² = −0.51 vs −2.04 (marginal, both negative)

Dataset

Paper 2 (ERA5 Tabular)

Samples: 5,500 ocean-only boundary-layer cloud observations
Flights: 6 flights across 2 campaigns
Features: 5 base ERA5 (t2m, d2m, sp, blh, tcwv) + 29 derived = 34 total
Target: CBH from CPL lidar (≤ 2 km, ocean only)

Flight	Campaign	Samples	CBH Mean (m)
Oct 23, 2024	WHySMIE	857	138
Oct 30, 2024	WHySMIE	1,808	941
Nov 4, 2024	WHySMIE	1,388	89
Feb 10, 2025	GLOVE	608	380
Feb 12, 2025	GLOVE	654	783
Feb 18, 2025	GLOVE	185	94

Paper 1 (Vision)

Samples: 380 from 7 flights (5-fold CV, 304 train / 76 val)
Input: 20×22 pixel thermal IR cutouts
Models: ResNet-18, EfficientNet-B0 (pretrained/scratch, with/without augmentation)

Domain Shift Analysis (Paper 2)

K-S Divergence (Oct 23 vs Feb 10)

14 of 34 features show K-S = 1.0 (completely non-overlapping distributions). Only solar angle features (sza, saa) show K-S = 0.0.

Adaptation Methods

Method	Mean R²	Assessment
No adaptation (LOFO)	−5.36	Baseline
Instance weighting (KNN)	−3.5	Marginal improvement
Instance weighting (density)	−5.5	Comparable to baseline
MMD alignment	−7.9	Worse
Feature selection	−6.9	Worse
TrAdaBoost	+0.04	Marginal positive
Few-shot (50 samples)	+0.35	Effective

Feature Importance (Full 34-Feature Model)

Feature	Importance
blh_sq	32.3%
blh	16.9%
stability_tcwv	8.0%
moisture_gradient	8.0%
blh_lcl_ratio	4.4%

Uncertainty Quantification

Method	Coverage	Target	Width (m)
Split conformal (cross-flight)	34%	90%	557
Per-flight calibration	90%	90%	538

Conformal prediction fails across flights due to exchangeability violations. Within-flight calibration recovers the 90% target.

Repository Structure

programDirectory/
├── preprint/                           # Both papers (LaTeX)
│   ├── paper1_nasa_er2_cbh.tex        # Paper 1: CNN vision
│   └── paper2_era5_domain_shift.tex   # Paper 2: ERA5 domain shift
├── scripts/
│   ├── paper2_rerun_v2.py             # Paper 2 reproducible pipeline
│   ├── feature_engineering.py         # Feature derivation
│   └── train_image_model.py           # Paper 1 training
├── results/
│   └── paper2_rerun_v2/               # Paper 2 v2 rerun results
├── outputs/
│   └── vision_baselines/reports/      # Paper 1 results (380-sample)
└── data/ (../data/)                   # CPL HDF5 flight data

Reproducibility

All experiments use np.random.seed(42) and are fully reproducible from raw data.

Paper 2 Rerun

# Requires ERA5 data on desktop at /mnt/two/research/NASA/ERA5_data_root/surface/
ssh desktop "cd /path/to/programDirectory && python3 -u scripts/paper2_rerun_v2.py"
# Results → results/paper2_rerun_v2/paper2_all_results_v2.json

Key Result Files

File	Contents
`results/paper2_rerun_v2/paper2_all_results_v2.json`	All Paper 2 metrics (v2 audit)
`outputs/vision_baselines/reports/*.json`	Paper 1 per-model results (380 samples)

Citation

@article{malarchick2026cbh_vision,
  title={CNN-Based Cloud Base Height Retrieval from Thermal Infrared Imagery: 
         Lessons from NASA ER-2 Observations},
  author={Malarchick, Rylan},
  year={2026}
}

@article{malarchick2026cbh_domain,
  title={Physics-Informed Feature Engineering and Domain Shift Challenges 
         for Atmospheric Machine Learning},
  author={Malarchick, Rylan},
  year={2026}
}

Acknowledgments

This work was conducted independently following the author's NASA OSTEM internship (May–August 2025) with NASA Goddard Space Flight Center. ERA5 data from ECMWF Copernicus Climate Data Store. CPL lidar data from NASA Goddard.

License

MIT License — See LICENSE for details.

Contact

Rylan Malarchick
Embry-Riddle Aeronautical University
Email: malarchr@my.erau.edu
GitHub: @rylanmalarchick

Name		Name	Last commit message	Last commit date
Latest commit History 143 Commits
.github/workflows		.github/workflows
archive		archive
configs		configs
diagnostics		diagnostics
docs		docs
experiments/paper2		experiments/paper2
outputs		outputs
preprint		preprint
scripts		scripts
src		src
tests		tests
.gitignore		.gitignore
.pre-commit-config.yaml		.pre-commit-config.yaml
AGENTBIBLE_EFFECTIVENESS_LOG.md		AGENTBIBLE_EFFECTIVENESS_LOG.md
CloudML-Future-Work.md		CloudML-Future-Work.md
LICENSE		LICENSE
README.md		README.md
pyproject.toml		pyproject.toml
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Cloud Base Height Retrieval from NASA ER-2 Airborne Observations

Two Papers

Paper 1: CNN-Based CBH from Thermal IR Imagery

Paper 2: ERA5 Feature Engineering & Domain Shift

Key Findings

Performance Summary (Paper 2)

Performance Summary (Paper 1)

Why This Matters

Dataset

Paper 2 (ERA5 Tabular)

Paper 1 (Vision)

Domain Shift Analysis (Paper 2)

K-S Divergence (Oct 23 vs Feb 10)

Adaptation Methods

Feature Importance (Full 34-Feature Model)

Uncertainty Quantification

Repository Structure

Reproducibility

Paper 2 Rerun

Key Result Files

Citation

Acknowledgments

License

Contact

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Cloud Base Height Retrieval from NASA ER-2 Airborne Observations

Two Papers

Paper 1: CNN-Based CBH from Thermal IR Imagery

Paper 2: ERA5 Feature Engineering & Domain Shift

Key Findings

Performance Summary (Paper 2)

Performance Summary (Paper 1)

Why This Matters

Dataset

Paper 2 (ERA5 Tabular)

Paper 1 (Vision)

Domain Shift Analysis (Paper 2)

K-S Divergence (Oct 23 vs Feb 10)

Adaptation Methods

Feature Importance (Full 34-Feature Model)

Uncertainty Quantification

Repository Structure

Reproducibility

Paper 2 Rerun

Key Result Files

Citation

Acknowledgments

License

Contact

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages