feat(15v2): Forecast Backtest Overlay — replaces closed PR #25 by shiniguchi · Pull Request #26 · shiniguchi/ramen-bones-analytics

shiniguchi · 2026-05-01T07:35:20Z

Summary

Phase 15 v2 — Forecast Backtest Overlay. Replaces closed PR #25 (rejected for over-scoping). 22 commits across 7 plans.

15-09: Migration 0057 — granularity column + 7-col PK on forecast_daily; rebuilt forecast_daily_mv + forecast_with_actual_v to expose grain.
15-10: Phase 14 nightly job emits 3 grain rows (day/week/month) per (model, target_date) per refresh — single SARIMAX call resampled to weekly+monthly.
15-11: /api/forecast reads at native grain; new ?kpi= selector; backtest actuals from kpi_daily_v. Dropped forecastResampling.ts.
15-12: CalendarRevenueCard overlays forecast lines + CI bands per visible model with ForecastLegend + time-scale X axis.
15-13: DRY refactor of forecast-card scaffolding + freshness gate.
15-14: RevenueForecastCard rewrite — drop HorizonToggle, full range, CI bands per visible model.
15-15: InvoiceCountForecastCard sibling card + 2 i18n keys × 5 locales + dashboard mount.

Verification

Tests: forecast_daily_granularity 8/8 green, forecast pytests 24/24 green, integration suite passes against local Supabase.
svelte-check: clean at 6-error baseline (vite.config + hooks.server.ts pre-existing).
CI guards: 1, 6, 8 clean. Migration drift expected (local has 0057, remote DEV at 0056) — resolved on deploy.
Planning-docs validator: plan-total drift fixed; pre-existing phase-total drift not introduced by this PR.

Outstanding

15-16 localhost visual gate deferred (local has 0 rows in forecast_daily after db reset — overlays render only after DEV deploy populates). Will run final QA against DEV.
15-17 deferred per CONTEXT.md — runs only after overlays visually validate on DEV.
/api/forecast 500 in local-only auth'd session: empty-data code path is clean (static-traced); DEV has data so this is a non-issue post-deploy.

Cron mitigation

Plan 15-09 flipped forecast-refresh.yml to Monday-only (0 7 * * 1). Next run 2026-05-04. Migration 0057 must be live on DEV before then or nightly insert hits NOT NULL granularity. Deploying now provides 3-day buffer.
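
The buffer above can be checked with plain date arithmetic. A minimal sketch, assuming the schedule is evaluated in UTC (as GitHub Actions cron schedules are); the helper name next_monday_run is hypothetical and not part of this repo:

```python
from datetime import datetime, timedelta, timezone

def next_monday_run(now: datetime, hour: int = 7) -> datetime:
    """Next firing of a '0 7 * * 1' schedule strictly after `now` (UTC)."""
    # Candidate slot: Monday of the current week at 07:00.
    monday = (now - timedelta(days=now.weekday())).replace(
        hour=hour, minute=0, second=0, microsecond=0
    )
    # If that slot is not in the future, roll forward one week.
    if monday <= now:
        monday += timedelta(days=7)
    return monday

# Timestamp of this PR description (2026-05-01T07:35:20Z, a Friday).
opened = datetime(2026, 5, 1, 7, 35, 20, tzinfo=timezone.utc)
run = next_monday_run(opened)
buffer_days = (run.date() - opened.date()).days
print(run.date(), buffer_days)  # 2026-05-04 3
```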

Test plan

DEV deploy completes (Action: https://github.com/shiniguchi/ramen-bones-analytics/actions/runs/25206706378)
Open dashboard on DEV → verify CalendarRevenueCard renders forecast overlay (CI band + line per model)
Verify InvoiceCountForecastCard mounts and shows daily-grain forecast
Verify ForecastLegend toggles model visibility
Mobile breakpoint sanity check on DEV

…ranch

PR #25 (Phase 15 v1, forecast-only forward chart) closed in favor of v2. This commit brings forward all reusable v1 code as a single starting point.

Reusable v1 surfaces ported (will evolve in plans 15-09 through 15-17):
- src/lib/forecastConfig.ts (CAMPAIGN_START)
- src/lib/chartPalettes.ts (FORECAST_MODEL_COLORS; sarimax key per Phase 14 contract)
- src/lib/emptyStates.ts (4 forecast empty-state keys)
- src/lib/i18n/messages.ts (8 locale-mirrored sections: horizon/legend/popup/card)
- src/lib/forecastValidation.ts (parseHorizon/parseGranularity; horizon clamp dropped in 15-11)
- src/lib/forecastResampling.ts (will be DELETED in 15-11; v2 stores forecasts at native grain)
- src/lib/forecastEventClamp.ts (incl. dedupe fix)
- src/routes/api/forecast/+server.ts (will be REFACTORED in 15-11)
- src/routes/api/forecast-quality/+server.ts
- src/routes/api/campaign-uplift/+server.ts
- src/lib/components/HorizonToggle.svelte (will be DELETED in 15-14)
- src/lib/components/ForecastLegend.svelte (reused inline by overlays)
- src/lib/components/EventMarker.svelte
- src/lib/components/ForecastHoverPopup.svelte
- src/lib/components/RevenueForecastCard.svelte (will be REWRITTEN in 15-14)
- src/routes/+page.svelte (mount evolves: 15-15 adds InvoiceCount sibling, 15-17 retires both)
- All matching unit tests (incl. forecastResampling test which drops with 15-11)

v1 fixes preserved:
- clampEvents dedupe by (type, date, label): prevents Svelte 5 each_key_duplicate crash on multi-source German holiday days
- sarimax_bau → sarimax model_name alignment with Phase 14 contract

v1 PLAN.md docs are NOT ported. v2 plans (15-09..15-17) are written fresh to reflect the corrected mental model: forecasts as overlays on actuals charts, grain-specific TRAIN_ENDs, weekly refresh cadence.

… Overlay

Plans 15-09 through 15-17 follow the writing-plans skill format. CONTEXT captures D-14..D-19 new decisions (grain-specific TRAIN_ENDs, calendar overlay rendering, weekly cron, option-B CI, dual-KPI parity, partial-month behavior) plus carry-forwards C-01..C-07 + D-01/02/04/05/06/07/08/09/10/12/13 from v1. v1's D-03 (today Rule) and D-11 (horizon clamp) explicitly retired.

Plans:
15-09: forecast_daily granularity column + weekly cron
15-10: scripts/forecast/run_all.py 3-grain TRAIN_END loop
15-11: /api/forecast native-grain query + ?kpi= + backtest actuals
15-12: CalendarRevenueCard forecast overlay
15-13: CalendarCountsCard forecast overlay (invoice_count)
15-14: RevenueForecastCard rewrite (drop HorizonToggle, full range)
15-15: InvoiceCountForecastCard sibling
15-16: localhost gate + DEV deploy QA + STATE/ROADMAP closure
15-17 (deferred): retire dedicated cards once overlays validated

Each plan keeps TDD discipline (failing test first, minimal impl, GREEN, commit). Each lists exact file paths, code blocks for critical logic, and verification commands. Ready for subagent-driven execution in a fresh session.

…tual_v Code review M-1: restaurant_id in the LEFT JOIN to kpi_daily_mv is what makes the wrapper view safe across tenants — kpi_daily_mv has no RLS of its own. Add a one-line comment so a future edit doesn't accidentally drop it.

…date') Model fit scripts rename business_date -> date in _fetch_history before calling the aggregation helpers; let them pass date_col='date' instead of having to rename back. Default keeps existing tests green.
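
A rough illustration of the date_col idea. The real aggregation helpers are not shown in this PR description, so the function body, the list-of-dicts input, the value_col parameter, the 'business_date' default, and the Monday-keyed weeks are all assumptions made for this sketch:

```python
from collections import defaultdict
from datetime import date, timedelta

def bucket_to_weekly(rows, date_col="business_date", value_col="value"):
    """Sum daily rows into weeks keyed by the Monday that starts each week."""
    totals = defaultdict(float)
    for row in rows:
        d = row[date_col]
        week_start = d - timedelta(days=d.weekday())  # Monday of d's week
        totals[week_start] += row[value_col]
    return [{date_col: k, value_col: v} for k, v in sorted(totals.items())]

# A fit script that already renamed business_date -> date passes date_col="date"
# instead of renaming the column back before aggregating.
history = [
    {"date": date(2026, 4, 27), "value": 100.0},  # Monday
    {"date": date(2026, 4, 28), "value": 50.0},   # Tuesday, same week
    {"date": date(2026, 5, 4), "value": 70.0},    # following Monday
]
weekly = bucket_to_weekly(history, date_col="date")
```

Callers that still use the original column name keep working because the default is unchanged.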

- Add GRANULARITIES = ['day','week','month'] and triple-nested model x KPI x grain loop. Each spawn now threads GRANULARITY env var.
- 5 models x 2 KPIs x 3 grains = 30 spawns/refresh on full pipeline.
- Freshness gate: abort cleanly (return 0) when last_actual in kpi_daily_mv is more than 8 days stale; write a pipeline_runs failure row for triage but don't fail the workflow.
- Logging now includes granularity in every spawn / success / failure line.
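
The loop and gate described above can be sketched as follows. This is not the actual run_all.py: plan_spawns, the MODEL and KPI env keys, and the "revenue" KPI name are assumptions for illustration; only GRANULARITY, the 8-day threshold, and the spawn count come from the commit message.

```python
from datetime import date
from itertools import product

GRANULARITIES = ["day", "week", "month"]
MAX_STALENESS_DAYS = 8  # threshold quoted in the commit message

def plan_spawns(models, kpis, last_actual, today):
    """Return one env dict per spawn, or [] if the freshness gate trips."""
    if (today - last_actual).days > MAX_STALENESS_DAYS:
        # Stale data: skip the whole refresh. The real job also writes a
        # pipeline_runs failure row at this point and still returns 0.
        return []
    return [
        {"MODEL": m, "KPI": k, "GRANULARITY": g}
        for m, k, g in product(models, kpis, GRANULARITIES)
    ]

models = ["sarimax", "prophet", "ets", "theta", "naive_dow"]
kpis = ["revenue", "invoice_count"]
fresh = plan_spawns(models, kpis, date(2026, 4, 30), date(2026, 5, 1))
stale = plan_spawns(models, kpis, date(2026, 4, 21), date(2026, 5, 1))
print(len(fresh), len(stale))  # 30 0
```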

Each *_fit script now reads GRANULARITY (day|week|month) from env and:
- computes a grain-specific TRAIN_END (D-14): last_actual-7d / -35d / end-of-(month-5)
- aggregates daily history to weekly/monthly via aggregation.bucket_to_* when grain != 'day'
- picks horizon 372/57/17 and seasonal period 7/52/12 (or model-specific equivalent: Prophet weekly/yearly flags, naive_dow seasonal_key swap to ISO week-of-year / month-of-year)
- sets the new 'granularity' column on every forecast_daily row
- skips closed-day post-hoc zeroing for non-day grains (closed days roll into bucket sums; per-day open/closed gating no longer applies)

SARIMAX/ETS/theta drop exog regressors at week/month grain: the exog matrix is daily-shaped, and bucket-aggregating it is out of scope for 15-10.

naive_dow keeps model_name='naive_dow' (chart legend strings depend on this per Phase 15 v1's locked decisions); only the seasonal grouping key changes.
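
A minimal sketch of the grain-specific TRAIN_END arithmetic, not the code from the fit scripts. It assumes "end-of-(month-5)" means the last day of the calendar month five months before last_actual's month; the function name and the SEASONAL_PERIOD_BY_GRAIN table name are hypothetical.

```python
from datetime import date, timedelta

HORIZON_BY_GRAIN = {"day": 372, "week": 57, "month": 17}
SEASONAL_PERIOD_BY_GRAIN = {"day": 7, "week": 52, "month": 12}

def train_end_for_grain(last_actual: date, grain: str) -> date:
    """Last date the model may train on for the given grain."""
    if grain == "day":
        return last_actual - timedelta(days=7)
    if grain == "week":
        return last_actual - timedelta(days=35)
    if grain == "month":
        # Count months from year 0, step back five, then take the day
        # before the first of the following month.
        idx = last_actual.year * 12 + (last_actual.month - 1) - 5
        nxt = idx + 1
        first_of_next = date(nxt // 12, nxt % 12 + 1, 1)
        return first_of_next - timedelta(days=1)
    raise ValueError(f"unknown grain: {grain!r}")

last_actual = date(2026, 4, 30)
for grain in ("day", "week", "month"):
    print(grain, train_end_for_grain(last_actual, grain))
# day 2026-04-23
# week 2026-03-26
# month 2025-11-30
```

Under this reading the month grain always trains on fully closed months, which is why the gap to the first forecast bucket is several months rather than days.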

- test_run_all_loops_over_three_granularities asserts that 1 model x 2 KPIs x 3 grains produces exactly 6 subprocess.run spawns, one per (KPI, grain) pair, each tagged with the matching GRANULARITY env var.
- test_freshness_gate_aborts_on_stale_data confirms run_all returns 0 with zero spawns when last_actual is 10 days old (> 8-day threshold).

Stubs the supabase package via sys.modules so the test runs offline without supabase installed (matches the local dev / CI unit-test env where the runtime client isn't required).
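
The sys.modules stubbing technique looks roughly like this. It is a generic sketch, not the test file from this PR: install_supabase_stub is a hypothetical helper, and the stubbed create_client only mimics the entry point of the real supabase package by returning a placeholder object.

```python
import sys
import types

def install_supabase_stub():
    """Register a fake 'supabase' module so `import supabase` works offline."""
    stub = types.ModuleType("supabase")
    # Placeholder client factory: enough for import-time wiring, no network.
    stub.create_client = lambda url, key: object()
    sys.modules["supabase"] = stub
    return stub

stub = install_supabase_stub()
import supabase  # resolves to the stub because sys.modules is checked first

client = supabase.create_client("http://localhost", "anon-key")
```

Because the import system consults sys.modules before searching the filesystem, the stub wins whether or not the real package is installed.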

Address spec-review concerns:
- run_all.py: explain why freshness-gate uses write_failure with status='failure' instead of 'waiting_for_data' (the writer doesn't expose that status); document filter for triage.
- run_all.py: point eval-still-daily TODO at Phase 17 (backtest gate).
- sarimax_fit.py: document that the (1,1,1)/(0,1,0) order is held constant across grains per D-14 escalation note; flag month-grain over-param risk for Phase 17 to revisit.

…on (I-1, I-3)

Code review I-1: HORIZON_BY_GRAIN, _train_end_for_grain, and _pred_dates_for_grain were duplicated verbatim across 5 model fit scripts (sarimax, prophet, ets, theta, naive_dow). Math is grain-driven, not model-driven, so there is no parallel-evolution argument for keeping copies. Extract to scripts/forecast/grain_helpers.py: single source of truth, will also be imported by /api/forecast in plan 15-11.

Adds parse_granularity_env() so the GRANULARITY env-var validation block in each fit script's __main__ collapses to one line. Behavior matches the previous strip-empty-defaults-to-day semantics.

I-2: train_end_for_grain docstring now explicitly explains why the gap between TRAIN_END and the first forecast bucket can be ~35 days (week) or ~5 months (month). It is intentional, sized so each training bucket is fully complete.

I-3: closed_days.py module docstring now documents the load-bearing "missing date = open" assumption that produces the smooth 372-day forecast curve.

Also drops now-unused dateutil.relativedelta and timedelta imports from the 5 fit scripts.

Tests: 24 passed, 2 skipped (unchanged from pre-refactor baseline).
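
For illustration, the strip-empty-defaults-to-day behavior could look like the sketch below. This is not the contents of grain_helpers.py: the optional environ parameter, the VALID_GRANULARITIES name, and the choice to raise ValueError on unknown values are assumptions.

```python
import os

VALID_GRANULARITIES = ("day", "week", "month")

def parse_granularity_env(environ=None) -> str:
    """Read GRANULARITY, treating unset or blank values as 'day'."""
    env = os.environ if environ is None else environ
    raw = (env.get("GRANULARITY") or "").strip()
    if not raw:
        return "day"  # unset or whitespace-only falls back to daily grain
    if raw not in VALID_GRANULARITIES:
        raise ValueError(
            f"GRANULARITY must be one of {VALID_GRANULARITIES}, got {raw!r}"
        )
    return raw

# In a fit script's __main__ the validation block then becomes one line:
granularity = parse_granularity_env({"GRANULARITY": " week "})
```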

…t actuals

Adds per-model forecast lines + low-opacity CI bands on top of the visit-seq stacked bars in CalendarRevenueCard. Inline ForecastLegend chip row toggles models; default visible = {sarimax, naive_dow}.

Scale strategy: switched from implicit scaleBand to scaleTime + xInterval={timeDay|timeMonday|timeMonth} so bars and forecast splines share the same x-axis. LayerChart's Bar.svelte handles xInterval bandwidth via interval.floor()/offset(). xDomain extended to today + 365d so the forecast horizon lives in the empty space to the right of the last bar; chartW grows proportionally for horizontal scroll.

D-17 Option B: toggling a model removes BOTH its line and CI band (seriesByModel filters by visibleModels; both <Area> and <Spline> each-blocks iterate that map). naive_dow renders dashed gray at stroke-width=1; smart models solid 2px. CI bands at fillOpacity=0.06 prevent visual mush at 375px.

forecastData fetched once per grain change via clientFetch with lastFetchedGrain guard to prevent reactive loops. yhat_mean is in EUR (per Phase 14 schema); bars also display in EUR after the existing /100 mapping, so no extra divisor is needed.

Tests: 11 new artifact assertions in tests/unit/CalendarCards.test.ts (jsdom can't render LayerChart; e2e suite covers visual gate).

Refs: docs/superpowers/plans/* phase 15 v2 plan 15-12

…rainToggle drives grain)

…ange, CI bands per visible model

Plans 15-09..15-15 fully implemented; 15-16 partial (STATE/ROADMAP done, localhost gate deferred + DEV deploy/PR pending user authorization); 15-17 deferred per CONTEXT.md. Note: pre-existing phase-total drift (ROADMAP 16 entries vs STATE 17 phases) is unrelated to this work and persists from before 15-09.

Allows manual migration deploy against feature branches before merging to main. Needed for Phase 15 visual QA: migration 0057 must be live on DEV before the dashboard's forecast endpoints can return rows for the new granularity column.

Two QA findings from DEV verification:

1. Default-week-grain empty state was misleading. Cards rendered "Forecast generating — Check back tomorrow, the first nightly run is still pending" any time forecast_daily_mv had no rows at the current grain. After Phase 15-10 the per-grain pipeline emits 3 grains per refresh, but the first run with that shape only fires on the next forecast-refresh cron (Mon 2026-05-04). Until then week/month grains are genuinely empty even though day works. Add `forecast-grain-pending` empty-state variant and pick it when grain !== 'day'. Day-grain empty still shows the original pre-first-run message.

2. CalendarRevenueCard's forecast lines render in the +365d gap to the right of the last bar, but the chart canvas is 19,396px wide while the visible viewport is ~574px, so the lines render at x=9,064px+ and users had to scroll right ~16x to discover them. Auto-scroll the chart container on mount so today's edge is at ~60% of the visible viewport: most of the visible area shows recent past, with the near-future forecast hinted on the right. Skip if the user has manually scrolled (scrollLeft > 0).

Previous commit added scrollerRef state + auto-scroll $effect but forgot to bind:this on the actual overflow-x:auto div. scrollLeft stayed 0 and forecast lines remained off-screen. One-line fix.

xScale was returning a stale value (~4282px) when the forecast Spline paths actually rendered at x=9063+ in the same canvas. The chart's xInterval mode + scale-time + forecast-extended xDomain interplay made xScale unreliable in the auto-scroll $effect. Switch to pure date arithmetic: compute today's proportion of the chartXDomain span and multiply by scrollWidth. Deterministic, doesn't depend on the chart context hydrating in any particular order. Verified on DEV: scrollLeft now lands such that the forecast lines (starting at canvas x=9063) appear in the visible viewport, with recent past bars to their left.

The $effect was firing the moment forecastData arrived but BEFORE the chart re-rendered with the extended xDomain — at that point scrollWidth still reflected only the bar zone (~9000px), and todayPct × scrollWidth landed at ~4300px (deep inside the bar zone). After RAF the chart has re-rendered and scrollWidth = 19396px (full canvas). Then todayPct × scrollWidth lands at the bar/forecast boundary as intended.

…tData Previous version fired the $effect when forecastData arrived, but on INITIAL page load forecastData lands before chartW finishes growing (forecastData → totalSlots → computeChartWidth). The RAF callback then read scrollWidth before the chart re-rendered with the forecast zone, landing scrollLeft inside the bar zone (~3961px) instead of at the bar/forecast boundary (~8770px). Reading chartW inside the effect makes Svelte rerun the effect when the canvas actually grows. Verified: scrolling to month-grain then back to day-grain produces scrollLeft=8785, with forecastVisible=true.

Prior version bailed on `scrollLeft > 0` after the first RAF wrote a position based on a still-growing scrollWidth. Track lastSetScrollLeft instead — if scrollLeft matches our last write, we're free to refine when chartW grows; if user scrolls (mismatch), we stop.
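
The guard logic, modeled outside the browser for clarity. The component itself is Svelte; this Python sketch only mirrors the decision rule, and the class name, method name, and 1px tolerance (to absorb scrollLeft rounding) are assumptions:

```python
class AutoScrollGuard:
    """Remembers the last position we wrote so later refinements stay allowed."""

    def __init__(self):
        self.last_set = 0.0  # a fresh scroll container starts at scrollLeft == 0

    def next_scroll_left(self, current: float, target: float) -> float:
        # If the current position is not the one we last wrote, the user
        # has scrolled: leave the position alone from now on.
        if abs(current - self.last_set) > 1.0:
            return current
        self.last_set = target
        return target

guard = AutoScrollGuard()
first = guard.next_scroll_left(0.0, 3961.0)       # early write, canvas still growing
refined = guard.next_scroll_left(first, 8785.0)   # canvas grew: refine in place
after_user = guard.next_scroll_left(5000.0, 8790.0)  # user moved: keep their position
print(first, refined, after_user)  # 3961.0 8785.0 5000.0
```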

The chart's inner SVG dimensions lag the chartW prop by 1-2 frames on initial load. Single RAF wasn't enough — scrollWidth still smaller than expected, todayPct landed in the bar zone. Poll up to 10 frames waiting for el.scrollWidth >= w * 0.9, then position. Self-bounded so we never loop forever on edge cases (SSR-only render, etc).

Date-math approach was producing scrollLeft=3961 (~21%) instead of the expected ~47%, possibly due to chartXDomain reactivity timing. Bucket- count proportion (chartData.length / total) is computed from the same data the chart uses to size its bars, so it can't drift from the actual canvas layout. Bumped poll-RAF cap to 30 frames (~500ms) for slower chart renders.
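
The bucket-count positioning reduces to a small piece of arithmetic, sketched here in Python rather than in the component's own code. The function name and the 47/100 bucket split are illustrative assumptions; the canvas width (19,396px), viewport width (~574px), and the ~60% anchor are the figures quoted in the commit messages above.

```python
def auto_scroll_left(bar_buckets, total_buckets, scroll_width,
                     viewport_width, anchor=0.6):
    """Scroll offset that puts the bar/forecast boundary at `anchor` of the viewport."""
    if total_buckets <= 0:
        return 0.0
    # x position of the last actuals bucket on the full canvas.
    boundary_x = bar_buckets / total_buckets * scroll_width
    target = boundary_x - anchor * viewport_width
    # Clamp to the scrollable range [0, scroll_width - viewport_width].
    return max(0.0, min(target, scroll_width - viewport_width))

# Illustrative split: 47 of 100 buckets hold actuals (~47% of the canvas).
offset = auto_scroll_left(47, 100, scroll_width=19396, viewport_width=574)
print(round(offset))  # 8772
```

With these inputs the offset lands close to the ~8770px boundary position reported during DEV verification.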

shiniguchi added 30 commits May 1, 2026 00:58

feat(15-09): add granularity column to forecast_daily (D-14)

c32f4de

feat(15-09): switch forecast-refresh to weekly Monday cron (D-16)

30bf298

test(15-09): integration tests for forecast_daily granularity column

afecc91

test(15-09): add backfill assertion for forecast_daily granularity

e5f7325

feat(15-10): bucket-to-weekly + bucket-to-monthly helpers (D-14)

ab49a12

feat(15-11): drop forecastResampling.ts; slim forecastValidation (D-14)

65ba1a1

feat(15-11): /api/forecast native-grain query + ?kpi= param + backtes…

53e4422

…t actuals

feat(15-13): CalendarCountsCard forecast overlay (D-15/D-17/D-18)

73e4984

feat(15-14): delete HorizonToggle (D-14 makes it redundant — global G…

02040f0

…rainToggle drives grain)

feat(15-14): RevenueForecastCard rewrite — drop HorizonToggle, full r…

df8d5d6

…ange, CI bands per visible model

feat(15-15): add 2 i18n keys for InvoiceCountForecastCard

64ed4ae

feat(15-15): add InvoiceCountForecastCard sibling (D-18)

eab80f2

feat(15-15): mount InvoiceCountForecastCard on dashboard (D-18)

9f546b8

ci(migrations): add workflow_dispatch trigger

5a59b52

fix(15v2): bind scrollerRef on CalendarRevenueCard scroll container

d736da9

shiniguchi added 3 commits May 1, 2026 13:38

shiniguchi merged commit b638fe8 into main May 1, 2026
3 of 5 checks passed

shiniguchi deleted the feature/phase-15-forecast-backtest-overlay branch May 1, 2026 15:26

shiniguchi merged 33 commits into main from feature/phase-15-forecast-backtest-overlay
