Problem
`crates/astrodyn_verif_jeod/test_data/baselines.json` is a manual `tier3_report --freeze-baselines` artifact; CI never refreshes it. The `tier3_baseline_diff` gate fails only on regressions of entries already present; a brand-new `tier3_` test with no baseline entry is reported as an informational `new` and passes.
There is no forcing function keeping the frozen set in sync with the actual `tier3_` test set, so every tier3 test added since the last freeze silently accumulates unguarded against future regression.
This was the exact drift #627 had to clean up by hand: the frozen set had fallen behind the passing-test set, and two tests (`tier3_sim_drag_ver_const` from #621, `tier3_simulation_tide_run02` from #625) had landed on main with no baseline coverage. It will recur.
Proposed fix
Make `tier3_baseline_diff` (or a thin CI step) fail when a `tier3_` test runs but has no baseline entry, instead of reporting it as a passing `new` notice. A test legitimately excluded from a CI lane (e.g. the `earth_moon` suite on the fast bucket) is already handled by `.github/tier3-allow-missing.txt`, so the new failure mode would be: ran, not allow-listed, not in baselines → fail with "refreeze with `tier3_report --freeze-baselines`".
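The rule above (ran, not allow-listed, not in baselines → fail) could be sketched roughly as follows. This is only an illustration, not the existing `tier3_baseline_diff` code: `BaselineStatus`, `classify`, and `check_coverage` are hypothetical names, and it assumes the baseline keys and the allow-missing entries can each be loaded into a set of test names, which may not match how the real files are parsed.

```rust
use std::collections::BTreeSet;

/// Outcome for a single tier3_ test that ran in this lane.
/// (Hypothetical type, introduced only for this sketch.)
#[derive(Debug, PartialEq, Eq)]
pub enum BaselineStatus {
    /// A frozen baseline entry exists; the usual regression diff applies.
    Covered,
    /// No baseline entry, but the test name is allow-listed.
    AllowListed,
    /// Ran, not allow-listed, not in baselines: this should fail the lane.
    MissingBaseline,
}

/// Classify one test name against the frozen baselines and the allow-list.
/// Assumption: both inputs are plain sets of test names.
pub fn classify(
    test: &str,
    baselines: &BTreeSet<String>,
    allow_missing: &BTreeSet<String>,
) -> BaselineStatus {
    if baselines.contains(test) {
        BaselineStatus::Covered
    } else if allow_missing.contains(test) {
        BaselineStatus::AllowListed
    } else {
        BaselineStatus::MissingBaseline
    }
}

/// Return Err with a refreeze hint if any test that ran has no baseline
/// entry and is not allow-listed; otherwise Ok.
pub fn check_coverage(
    ran: &[&str],
    baselines: &BTreeSet<String>,
    allow_missing: &BTreeSet<String>,
) -> Result<(), String> {
    let missing: Vec<&str> = ran
        .iter()
        .copied()
        .filter(|t| classify(t, baselines, allow_missing) == BaselineStatus::MissingBaseline)
        .collect();

    if missing.is_empty() {
        Ok(())
    } else {
        Err(format!(
            "{} tier3 test(s) have no baseline entry: {}; refreeze with `tier3_report --freeze-baselines`",
            missing.len(),
            missing.join(", ")
        ))
    }
}
```

A caller in the gate or in a thin CI step would map `Err` to a non-zero exit code so the lane turns red, and print the message so the refreeze command is visible in the log.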
Acceptance
- A new `tier3_` test merged without a baseline entry turns the relevant CI lane red, with a message pointing at the refreeze command.
- Allow-missing entries continue to be honored (no false failures for intentionally-excluded lanes).
Context
Follow-up from #627 (see its "Follow-up worth considering" section).