Add make build-docs target; align build_explorer.py with package ETL #33

Open
madsCodeBuddy wants to merge 2 commits into main from feat/build-docs-target

@madsCodeBuddy madsCodeBuddy commented Apr 29, 2026

Summary

Adds a make build-docs target and brings docs/build_explorer.py up to date with the current package. The script's hand-rolled data loading had drifted, and it no longer ran end-to-end against the current repo layout.

Commits

1. Add make build-docs target; fix build_explorer.py end-to-end

  • Makefile: new build-docs target running build_desert_farm.py, build_desert_farm_summary.py, build_explorer.py.
  • docs/build_explorer.py — two latent bugs surfaced trying to run end-to-end:
    • Default CSV path was data/local_data/time_space_reference_objects.csv from before the data reorg. Now points at data/datasets/.
    • Ellipse coord assignment was missing .rename(columns={0: "x_coords", 1: "y_coords"}) after result_type="expand" — without it, pandas alignment-by-name was producing NaN coords and _patch_coords was crashing with AttributeError: 'float' object has no attribute 'tolist'. Patched inline as a minimal fix.
  • docs/*.html: regenerated to verify the target works.
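A minimal reproduction of the NaN-coordinate failure described above. It assumes the script assigned the expanded result through `.loc` on the ellipse rows; the PR does not show the original statement, so this is one plausible shape of the bug rather than the script's actual code. The frame, the `shape` / `cx` / `cy` columns, and the `centre` helper are hypothetical, and scalar values stand in for the real coordinate arrays.

```python
import pandas as pd

# Hypothetical stand-in for the reference-object table.
df = pd.DataFrame({
    "Name": ["a", "b", "c"],
    "shape": ["ellipse", "rect", "ellipse"],
    "cx": [1.0, 2.0, 3.0],
    "cy": [10.0, 20.0, 30.0],
    "x_coords": [0.0, 0.0, 0.0],
    "y_coords": [0.0, 0.0, 0.0],
})
is_ellipse = df["shape"] == "ellipse"


def centre(row):
    """Hypothetical per-row computation returning two values."""
    return row["cx"] + 0.5, row["cy"] + 0.5


# result_type="expand" spreads the returned pair over columns labelled 0 and 1.
expanded = df[is_ellipse].apply(centre, axis=1, result_type="expand")

# Without the rename, .loc aligns the right-hand frame by column label.
# Neither "x_coords" nor "y_coords" exists in `expanded`, so NaN is written.
broken = df.copy()
broken.loc[is_ellipse, ["x_coords", "y_coords"]] = expanded

# With the rename, the labels match and the values land where intended.
fixed = df.copy()
fixed.loc[is_ellipse, ["x_coords", "y_coords"]] = expanded.rename(
    columns={0: "x_coords", 1: "y_coords"}
)
```

In this sketch only the ellipse rows are affected; the non-ellipse row keeps its original value in both frames.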

2. Align build_explorer.py with package ETL pipeline

Replaces the hand-rolled load_reference_objects with a thin adapter around transform_process_response_sheet, mirroring PR #30's pattern for build_desert_farm.py. The minimal .rename fix in commit 1 was a band-aid; this is the proper fix — the underlying issue was duplicated-and-stale code.

Adapter pattern (matches load_processes in build_desert_farm.py):

  • Rename Name → FullName so create_name's ShortName fallback inside the ETL doesn't overwrite the descriptive name used for hover and labels.
  • Map Category → Color (uppercase to match POSSIBLE_COL_LIST).
  • Set ShortName = FullName (reference objects don't have separate short forms, but create_name needs the column to exist).
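The three adapter steps above can be sketched as a small column-mapping function. This is an illustration only: `adapt_reference_objects` is a hypothetical name, and the hand-off to `transform_process_response_sheet` is omitted because its signature is not shown in this PR.

```python
import pandas as pd


def adapt_reference_objects(raw: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical adapter: reshape reference-object columns for the ETL."""
    # Name -> FullName keeps the descriptive name from being overwritten;
    # Category -> Color supplies the capitalised column name the ETL expects.
    adapted = raw.rename(columns={"Name": "FullName", "Category": "Color"})
    # Reference objects have no separate short form, but the column must exist.
    adapted["ShortName"] = adapted["FullName"]
    return adapted


# Example input with made-up values.
example = pd.DataFrame({"Name": ["Raindrop"], "Category": ["water"]})
adapted_example = adapt_reference_objects(example)
```

Because `rename` returns a new frame, the input passed to the adapter is left unchanged.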

Things the script was missing vs the package ETL:

  • Filtering rows where Time_min > Time_max or Space_min > Space_max (current CSV has no such rows, but the safety net is there now)
  • FillAlpha / TextAlpha columns (script uses fixed 0.0 for hidden state, so these are unused but harmless)
  • Time Max / Space Min derived columns (also unused, harmless)

The numpy import is also dropped: it was only used for the manual label_x / label_y computation, which the package ETL now handles.
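The inverted-range filter from the list above could look roughly like this. It is a sketch under the assumption that the four bound columns are numeric; `drop_inverted_ranges` is a hypothetical name, not the package's function.

```python
import pandas as pd


def drop_inverted_ranges(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical filter: drop rows whose min bound exceeds the max bound."""
    inverted = (df["Time_min"] > df["Time_max"]) | (
        df["Space_min"] > df["Space_max"]
    )
    return df.loc[~inverted]


# Made-up rows: the second has an inverted time range, the third an inverted space range.
sample = pd.DataFrame({
    "Time_min": [1.0, 5.0, 2.0],
    "Time_max": [2.0, 4.0, 3.0],
    "Space_min": [0.1, 0.1, 9.0],
    "Space_max": [0.2, 0.2, 1.0],
})
kept = drop_inverted_ranges(sample)
```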

Behavior verification: explorer.html is byte-for-byte identical to the output from commit 1 (688,092 B, 159 coord lists with at least 100 characters of content), so this is a pure code-quality change with no difference in rendered output.
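One way to reproduce a byte-for-byte check like this is to compare file size and a SHA-256 digest of two saved copies of the output. The helper names below are hypothetical, and this is not necessarily how the comparison was done for this PR.

```python
import hashlib
from pathlib import Path


def file_fingerprint(path):
    """Return (size in bytes, SHA-256 hex digest) for the file at `path`."""
    data = Path(path).read_bytes()
    return len(data), hashlib.sha256(data).hexdigest()


def same_bytes(path_a, path_b):
    """True when both files have identical size and digest."""
    return file_fingerprint(path_a) == file_fingerprint(path_b)
```

A typical use would be to save a copy of explorer.html after each commit and pass both paths to `same_bytes`.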

Note on the size delta vs prior committed explorer.html

The regenerated explorer.html is 688 KB (159 coord lists), versus 880 KB (213 coord lists) for the previously committed file. I investigated this in the second commit: the refactor does not cause the size drop, because the refactored output is identical to the output of commit 1 (the single .rename fix). Most likely the previously committed file was generated under different conditions (older n_points, older create_ellipse_data, or older CSV). The squashed history makes those conditions unrecoverable, so a visual check before merge is worthwhile.

Branch hygiene

compare/main...feat/build-docs-target: status=ahead, ahead_by=2, behind_by=0

- Makefile: new 'build-docs' target that regenerates all three
  docs/*.html artifacts from build_desert_farm.py,
  build_desert_farm_summary.py, and build_explorer.py.

- docs/build_explorer.py: two latent bugs that surfaced when
  attempting to actually run the script end-to-end:
  (1) Default CSV path was 'data/local_data/time_space_reference_objects.csv'
      but the canonical file moved to 'data/datasets/' during the
      data reorg. Updated default to point at canonical location.
  (2) Ellipse coord assignment was missing the
      .rename(columns={0: 'x_coords', 1: 'y_coords'}) step that
      etl.transform_process_response_sheet uses; without it,
      pandas alignment fails and ellipse rows get NaN coords,
      causing AttributeError downstream. Added the rename to
      mirror the etl.py pattern.

- docs/*.html: regenerated via 'make build-docs' to verify the
  target works end-to-end.

Note: regenerated explorer.html is 688KB vs the prior 880KB
committed file (213 -> 159 coord lists). The script now runs
cleanly but visual verification is recommended; the size
difference may indicate the prior file was generated under
different conditions (older CSV, different package version, etc).

Replaces the hand-rolled load_reference_objects with a thin adapter
around timeSpace.etl.transform_process_response_sheet, mirroring the
pattern PR #30 established for build_desert_farm.py.

Why: the script's hand-rolled ETL had drifted from the package version.
The previous commit on this branch fixed one symptom (missing
.rename(columns={0: 'x_coords', 1: 'y_coords'}) after result_type=
'expand'), but the underlying issue was duplicated-and-stale code.
Other things load_reference_objects was missing vs the package ETL:
  - Filtering rows where Time_min > Time_max or Space_min > Space_max
  - FillAlpha / TextAlpha (script uses fixed 0.0 for hidden state, so
    these are unused but harmless additions)
  - Time Max / Space Min derived columns (also unused, harmless)

Adapter pattern (same as build_desert_farm.py's load_processes):
  - Rename Name -> FullName so create_name's ShortName fallback inside
    the ETL doesn't overwrite the descriptive name used for hover/labels
  - Map Category -> Color (uppercase to match POSSIBLE_COL_LIST)
  - Set ShortName = FullName (reference objects don't have separate
    short forms, but create_name needs the column to exist)

Behavior verification: regenerated explorer.html is byte-for-byte
identical to the previous commit (688,092 B, 159 coord lists with
content >= 100 chars), so this is a pure code-quality change with
no rendered output difference.

Drops numpy import (was only used in the manual label_x / label_y
computation that the package ETL now handles).

@madsCodeBuddy changed the title from "Add make build-docs target; fix build_explorer.py end-to-end" to "Add make build-docs target; align build_explorer.py with package ETL" on Apr 29, 2026