Consolidate bed picks / layer / morton methods and documentation by espg · Pull Request #96 · englacial/xopr

espg · 2026-04-28T23:27:37Z

Addresses #94 and #95 along with a few other quality of life enhancements. Mostly aimed at improving documentation and centralizing related methods.

espg · 2026-04-28T23:28:47Z

This also handles our bedmachine document notebook correctly-- locally rendering it, and then flagging to use the cached output rather than rerunning (which will fail with the earth access login requirement).

espg · 2026-04-28T23:30:44Z

/preview

github-actions · 2026-04-28T23:30:52Z

🚀 Starting documentation preview deployment for commit 284cb48...

This may take a few minutes. I'll update this comment when ready.

espg · 2026-04-29T00:17:48Z

Heads-up: `app-id` deprecation in `actions/create-github-app-token@v3`

This PR bumps actions/create-github-app-token@v2 → v3 for Node 24 compatibility. The bump is non-breaking for our current usage, but worth flagging a soft deprecation that we should migrate before they remove the old input.

What changed

v3.1.0 introduced a new client-id input and deprecated app-id (release notes, PR #353). Both inputs still work in v3 — app-id just emits a deprecation warning in the runner logs and may be removed in a future major version.

What we use today

Two workflows authenticate via the release bot app token:

.github/workflows/publish.yml (line 30): app-id: ${{ vars.RELEASE_APP_ID }}
.github/workflows/update-km-badge.yml (line 22): app-id: ${{ vars.RELEASE_APP_ID }}

Both reference the repo variable vars.RELEASE_APP_ID.

Migration path (when ready)

Find the GitHub App's Client ID in the App's settings page (Settings → Developer settings → GitHub Apps → englacial-release-bot → General → Client ID). It looks like Iv1.abc123... or Iv23li....
Add a new repo variable RELEASE_APP_CLIENT_ID with that value (keep RELEASE_APP_ID around for now in case rollback is needed).
Swap app-id: ${{ vars.RELEASE_APP_ID }} → client-id: ${{ vars.RELEASE_APP_CLIENT_ID }} in both workflow files.
Once both workflows have run successfully against client-id, drop the old RELEASE_APP_ID variable.

This is out of scope for this PR — recording it here so it doesn't get lost. Suggest tracking as a small follow-up issue, since it's a one-line workflow change plus one new repo variable.

…ne depreciation cutoff)

…uild again

github-actions · 2026-04-29T22:57:00Z

📚 Documentation Preview Ready!

🌐 Live Preview

URL: https://xopr-pr-96.surge.sh/xopr/

Preview updates automatically with new commits while the preview-docs label is present.

📦 Download Artifact

You can also download the docs for local viewing.

Commit: 8f4a9ab

To trigger a preview, add the preview-docs label or use the /preview command.

…rmation in conduct.md

…ebook call

espg · 2026-04-29T23:54:19Z

Test infrastructure audit — findings + phased plan

After the load_bed_picks regression earlier in this PR (a hand-rolled fake catalog GDF was missing the frame column the production code reads, leading to a CI-stage failure that unit tests didn't catch), I ran a 4-agent parallel audit of the entire test suite to find similar over-mocked patterns. Full report attached
(test_audit.md); summary below.

Headline numbers

Metric	Count	%
Total tests audited	~155	—
Pure-logic (no I/O)	~94	~61%
Real-data tests (already correct)	~24	~16%
Mocked, but appropriately scoped (error paths, control flow, per-frame fetches)	~25	~16%
Mocked, fabricates production schema (drift hazard)	~30 across 7 fixture clusters	~19%

Hot-spot fixture clusters

The 19% drift surface concentrates in fewer than a dozen helper functions, ranked by leverage:

stac/tests/common.py::create_mock_dataset — fake xr.Dataset standing in for radar .mat (~9 tests)
tests/test_matching.py::_granule_from_bbox — fake CMR UMM granules (~10 tests)
bedmap/tests/test_morton_match.py::_make_stac_gdf — fake STAC catalog (6 tests; same shape as the regression we just patched)
tests/test_layer_converter.py::_make_layer_ds — fake layer xr.Dataset (~6 tests)
bedmap/tests/test_bedmap.py::TestQueryBedmapCachedCollections._mock_catalog — fake bedmap catalog (5 tests)
tests/test_stac_cache.py S3-listing tests — hand-rolled S3 XML (4 tests)
bedmap/tests/test_bedmap.py::test_build_bedmap_geoparquet_catalog — hand-rolled metadata payload (1 test, encodes a non-trivial converter contract)

What's already correct (don't refactor)

All of qc/tests/ — the synthetic xr.Dataset fixtures use deliberately-controlled values (zero headings, exact bed-pick brightness) where real data would just add uncontrolled variance. More diagnostic, not less.
tests/test_matlab_char_encoding.py synthetic HDF5 fixtures — they target specific Unicode bytecodes that real files can't reproduce on demand.
stac/tests/test_morton.py — pure morton math on inline GeoJSON dicts (the dicts are the production shape).
tests/test_opr_tools.py::compute_crossover_error — pure functions on shapely Points.
All filesystem-and-path mocks in test_stac_cache.py (~12 tests).
All sync-flag mocks in test_opr_access.py (verifying boolean toggles).
All per-frame OPRConnection.get_layers mocks — each call is one HTTP request, mocking is legitimate.

Migration plan

All phases follow the test_opr_access.py reference pattern: @pytest.fixture(scope='session') + @pytest.mark.slow. No new dependencies. Net LOC ≈ zero.

Phase	Scope	Effort	Tests touched	Risk caught
A	Replace `_make_stac_gdf`, `TestQueryBedmapCachedCollections._mock_catalog`, `test_build_bedmap_geoparquet_catalog` metadata literal	~½ day	12	Catalog row-shape drift (the exact `load_bed_picks` regression class)
B	Add `real_small_frame` fixture, migrate 6 contract tests in `test_metadata.py` + 2 `*_end_to_end` tests in `test_integration.py`	~½ day	8	Dataset attr-name / `param_records` / `slow_time` shape drift
C	Add `real_cmr_granules` live-smoke fixture; cross-validate STRtree vs prefix backends on real CMR data. Keep both the synthetic fast tests and the new live ones	~2 hr	+2 new	CMR UMM schema drift
D	Real S3-listing fixture; companion tests for `test_parses_xml`, `test_handles_pagination`, `test_downloads_new_files`. Keep both.	~1 hr	+3 new	S3 protocol-format drift
E	Replace `_make_layer_ds` with real layer fixture (sliced to a few rows)	~1 hr	6	layer-Dataset schema drift
F	Rename misleading `test_*_end_to_end` tests in `test_integration.py` (they patch `extract_item_metadata` mid-flight, so they're not actually end-to-end). Coordinate with Phase B since both touch the same file.	~30 min	2	Test-name truth-in-advertising
G	Marker-hygiene sweep — several tests hit real network without `@pytest.mark.slow` (`test_basic_load_data`, `test_ops_api`, `test_bench_walltime`, parts of `test_opr_access`). Apply markers consistently per `pyproject.toml` so `pytest -m \"not slow\"` actually delivers a fast offline run.	~1 hr	~25	Marker policy now matches reality

Total scope: ~6–9 hours across 7 phases. ~25 tests migrated from synthetic to real, ~5 added (live-network smoke companions), ~5 new session fixtures.

Why phase A first

It's the smallest blast radius (12 tests, 3 changes), needs no new infra beyond what test_opr_access.py established this PR, and addresses the exact regression class that motivated the audit. Starting on it now.

/* thanks claude for crawling this slug of test drudgery...

phase 1 changes on api and tests consolidation

7e46f7d

phase 2 changes (notebook updates)

284cb48

espg added the preview-docs label Apr 28, 2026

ruff fixes, caching earth access outputs for bedmachine notebook

d1bf941

espg mentioned this pull request Apr 28, 2026

Determine overlap between Bedmap and OPR data #69

Open

narrative docs clarification / update, fixing #94 (workflow)

0c4c45d

espg added 3 commits April 28, 2026 17:25

ruff and node fixes for all of our workflows (getting ahead of the Ju…

a0844af

…ne depreciation cutoff)

(hopefully) fixing our problematic bedmachine notebook so that docs b…

0bf8631

…uild again

removing new notebook to see if docs build cleanly

9713c02

espg added 2 commits April 29, 2026 16:12

fixing example notebook to fit in CI/CL limits, updating contact info…

fffc9b7

…rmation in conduct.md

pointing unit tests to real data; conforming library api to match not…

6990269

…ebook call

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Consolidate bed picks / layer / morton methods and documentation#96

Consolidate bed picks / layer / morton methods and documentation#96
espg wants to merge 9 commits into
mainfrom
api_bed_morton

espg commented Apr 28, 2026

Uh oh!

espg commented Apr 28, 2026

Uh oh!

espg commented Apr 28, 2026

Uh oh!

github-actions Bot commented Apr 28, 2026

Uh oh!

espg commented Apr 29, 2026

Uh oh!

github-actions Bot commented Apr 29, 2026 •

edited

Loading

Uh oh!

espg commented Apr 29, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

espg commented Apr 28, 2026

Uh oh!

espg commented Apr 28, 2026

Uh oh!

espg commented Apr 28, 2026

Uh oh!

github-actions Bot commented Apr 28, 2026

Uh oh!

espg commented Apr 29, 2026

Heads-up: app-id deprecation in actions/create-github-app-token@v3

What changed

What we use today

Migration path (when ready)

Uh oh!

github-actions Bot commented Apr 29, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

📚 Documentation Preview Ready!

🌐 Live Preview

📦 Download Artifact

Uh oh!

espg commented Apr 29, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Test infrastructure audit — findings + phased plan

Headline numbers

Hot-spot fixture clusters

What's already correct (don't refactor)

Migration plan

Why phase A first

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Heads-up: `app-id` deprecation in `actions/create-github-app-token@v3`

github-actions Bot commented Apr 29, 2026 •

edited

Loading

espg commented Apr 29, 2026 •

edited

Loading