Skip to content

refactor(stommel): use create_space_time_figure + add_magnitude_labels in build_desert_farm.py#24

Closed
madsCodeBuddy wants to merge 10 commits intomainfrom
refactor/build-desert-farm-package-api
Closed

refactor(stommel): use create_space_time_figure + add_magnitude_labels in build_desert_farm.py#24
madsCodeBuddy wants to merge 10 commits intomainfrom
refactor/build-desert-farm-package-api

Conversation

@madsCodeBuddy
Copy link
Copy Markdown
Collaborator

Summary

Conservative refactor of docs/build_desert_farm.py to call package functions for the parts that were duplicating package code, while preserving the parts that exist for UX-specific reasons.

Replaced with package calls

Figure setup (~15 lines → 9 lines + overrides). figure(...)
create_space_time_figure(space_on_x=False) plus per-attribute overrides for the figure-specific defaults (width 900, height 650, ranges 1e-3..1e13 / 1e-28..1e22, 11pt axis labels, 16pt bold title, #fafafa background, toolbar above, original 4-tool toolbar).

Reference grid (~28 lines → 1 line). The hand-rolled for time_val in TIME_MARKERS: ... for space_val in SPACE_MARKERS: ... loop with manually-positioned dashed Spans and Labels is replaced by add_magnitude_labels(p, font_size=LABEL_FONT_SIZE, space_on_x=False). Bonus: the package version has CustomJS callbacks that move the magnitude labels to follow the visible range edge on pan/zoom — the previous hand-rolled version anchored them at fixed positions.

Kept hand-rolled (with explanatory comment)

Manual ETL. Already calls package functions directly; no further consolidation possible without forfeiting Energy_type-based color assignment or losing the original Name column to create_name's ShortName fallback.

Patches + hover. add_processes produces patches but its data source has only x/y — no metadata for tooltips. Replacing the hand-rolled patches would lose the rich hover (Name, Scale, Energy, formatted ranges).

4-row energy-grouped legend. add_processes(category_col=, category_colors=) produces a 28-row legend (4 category headers + 24 per-process rows). The current 4-row hand-rolled version groups all glyphs of an energy type under one LegendItem so click-to-hide toggles the whole group at once. Materially better UX for a blog figure.

Verification

  • Refactored script runs: Built docs/desert_farm_stommel.html (219,746 bytes).
  • Spot-checked RuBisCO catalytic cyclex_coords range matches Time (1e-1..1e0), y_coords range matches Space (1e-27..1e-24). Boyd convention preserved through ETL.
  • Figure config: x-axis Time (s) at top, y-axis Space (m³), x_range = 1e-3..1e13, y_range = 1e-28..1e22. Toolbar tools = Pan/WheelZoom/BoxZoom/Reset (matches pre-refactor).

Stats

  • 368 → 346 lines (-22, -6%).
  • Diff: +37 / -59.

Dependency

Branched off PR #22 because it uses space_on_x=False through transform_process_response_sheet only via the manual ETL — actually this PR doesn't strictly depend on PR #22 (the script doesn't call transform_process_response_sheet at all). PR #22 should still merge first to fix the existing leverage-points HTML, but this refactor would also work on plain main. Will rebase if you'd prefer.

Possible follow-ups (not in this PR)

  • Add a hover_tooltips parameter to add_processes so future refactors of this kind can also replace the hand-rolled patches.
  • Add an add_compact_legend(category_col=) helper that produces the 4-row energy-grouped legend with click-to-hide.
  • Move the energy-type coloring into add_processes via a new color_col argument.

@madsCodeBuddy
Copy link
Copy Markdown
Collaborator Author

Updated to actually use transform_process_response_sheet. You called out that my justification for keeping the manual ETL was thin, and you were right.

What changed:

  1. etl.py: added n_points parameter to transform_process_response_sheet (default 1000 preserves existing behavior). Without this, ETL would have produced ~2 MB of HTML for the 24-process desert farm figure (10× the current 217 KB), since it hardcoded n_points=1000.
  2. docs/build_desert_farm.py: load_processes now calls transform_process_response_sheet(... space_on_x=False, n_points=100) and drops the imports of process_magnitude_column, classify_process_geometry, create_ellipse_data, set_fill_alpha. Two pre-ETL hops remain: pre-computing Color from Energy_type (ETL doesn't know domain colors), and renaming NameFullName so create_name's fallback doesn't clobber the original full name. Two post-ETL hops for label_x / label_y geometric centers.

Updated stats: 368 → 336 lines (-32, -9%). Same rendered output (219,746 B; matches previous refactor commit because the visible behavior is unchanged).

ETL bonus picked up for free: drops rows where Time_min > Time_max or Space_min > Space_max — data-quality guard the manual version skipped.

Wait remaining: ~30 hand-rolled lines for patches+hover and ~12 for the compact legend. Both still need package additions to be replaceable (noted in the PR description's follow-ups section).

Replace the hand-rolled figure() setup and the TIME_MARKERS/SPACE_MARKERS reference-grid loop with the package's create_space_time_figure(space_on_x=False) and add_magnitude_labels(p, space_on_x=False). Override the package defaults that don't fit this figure (smaller width/height, wider x/y ranges, tighter axis font sizes, custom title styling, custom background color, explicit tool set without SaveTool/HelpTool).

Bonus: add_magnitude_labels has CustomJS sticky callbacks so the magnitude labels follow the visible edge on pan/zoom — better than the previous hand-rolled labels which were anchored at fixed positions.

Keep hand-rolled the manual ETL, patches+hover, and 4-row energy legend, documented in a comment block. add_processes(category_col=) would produce a 28-row legend and lose the hover.

Net: 368 → 346 lines (-22), no behavior change visible to readers.
ETL was hardcoding n_points=1000 (the create_ellipse_data default), which produces 2000 vertices per ellipse. Fine for figures with a few ellipses, but on the 24-process desert farm figure it bloats the rendered HTML 10x (217 KB → ~2 MB). Adding n_points lets callers tune density. Default 1000 preserves existing behavior for the leverage-points plot and any other current caller; build_desert_farm.py passes 100.
…_sheet

The previous load_processes() was duplicating the ETL pipeline: process_magnitude_column, classify_process_geometry, create_ellipse_data, set_fill_alpha. Replace with transform_process_response_sheet(... space_on_x=False, n_points=100). Two pre-ETL hops needed:
- Pre-compute Color from Energy_type (ETL doesn't know domain colors).
- Rename Name → FullName so create_name's ShortName fallback doesn't   clobber the original full name (used for hover tooltip).
Plus two post-ETL hops for label_x/label_y geometric centers (also not in ETL).

Drops imports of process_magnitude_column, classify_process_geometry, create_ellipse_data, set_fill_alpha — all called via ETL now.

ETL bonus: filters rows where Time_min > Time_max or Space_min > Space_max (data quality guard; the manual version skipped this).

Net: 368 → 336 lines (-32, -9%). Output identical to pre-refactor (219,746 B vs ~217 KB previously; +3 KB is sticky-pan callbacks from add_magnitude_labels).
…nges)

label_x = sqrt(Time_min * Time_max), label_y = sqrt(Space_min * Space_max). These are the visually-centered positions on a log axis, useful as label anchor points. If a CSV provides label_x or label_y directly, ETL preserves those values rather than overwriting.
…tool imports

label_x and label_y now come from ETL. Move PanTool / WheelZoomTool / BoxZoomTool / ResetTool imports to module level. Strip change-history comments from prior refactors.
Covers: column production, unit application, inverted-range row filtering, space_on_x orientation (Stommel default and Boyd), n_points vertex count, label_x/label_y geometric-mean computation, and CSV override preservation for label_x/label_y. 10 new tests, all passing.
@madsCodeBuddy madsCodeBuddy force-pushed the refactor/build-desert-farm-package-api branch from e3a8519 to 9e0de58 Compare April 28, 2026 07:04
@madsCodeBuddy
Copy link
Copy Markdown
Collaborator Author

Closing — branch is diverged (ahead=10, behind=10) and per branch hygiene rule the right move is recreate from current main rather than rebase via API. Replacement PR coming on a fresh branch.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant