feat: GTFS streaming import, realtime retention, heatmap perf, frontend UI overhaul, and reliability fixes #243

Merged
rburketaylor merged 77 commits into main from develop
May 1, 2026
@rburketaylor

Summary

76 commits, 150 files changed (+15,416 / -3,108). Merge base: Feb 10, 2026.


GTFS Import & Storage Overhaul

  • Streaming stop_times import mode with live progress tracking (visible in Ingestion tab)
  • Compact static GTFS schema (5 Alembic migrations: schema compaction, stop_times seconds conversion, parent station FK, search indexes, FK cascade fix)
  • Streaming GTFS download with improved error logging
  • Heavy rewrite of gtfs_feed.py (~996 lines changed), new gtfs_import_progress.py, gtfs_import_lock.py refactored
  • Tests: test_gtfs_feed_importer.py expanded (+1,775), test_gtfs_import_progress.py new

Real-time Processing & Retention

  • New validated hourly retention cleanup job (realtime_retention_service.py, +314 lines)
  • GTFS-RT locking and processing semantics stabilized
  • gtfs_realtime_harvester.py refactored (+331/-)
  • Alembic migration for FK cascade fix

Heatmap Performance

  • Station breakdown aggregation pushed into SQL (eliminates N+1)
  • heatmap_service.py refactored (+460/-), daily_aggregation_service.py refactored (+225/-)
  • Cache warmup hardened for heatmap
  • New heatmap indexes migration

Frontend UI/UX Overhaul

  • App shell theming refresh and modernized layouts
  • Shared Radix UI primitives: Accordion, Avatar, Dialog, Progress, Select, Separator, Skeleton, Tooltip
  • Dialog transitions and select error states
  • StationPage major rewrite (+820/-), HeatmapControls rewrite (+456/-)
  • Live time range in station stats and heatmap popups
  • Heatmap legend/marker style unification
  • Cancellation query key canonicalization
  • useAutoRefresh hook with tests

Backend Reliability Fixes

  • GTFS schema constraint enforcement and deterministic upserts
  • Cache single-flight and miss hardening
  • GTFS-RT lock/processing semantics fixes
  • Transit validation and aggregation edge case tightening
  • Frontend API refresh overlap prevention
  • Readiness endpoint error payload standardization

Shared Infrastructure

  • Efficiency controls and API timing in core/config.py and core/metrics.py
  • Health and metrics endpoints
  • Main app lifecycle changes

Bolt Performance Optimizations (merged PRs)

  • JSON serialization optimization in cache service
  • GTFSScheduleService.get_stop_departures query optimization
  • Datetime calculation optimization in schedule service
  • StopInfo/RouteInfo explicit to_dict serialization

Dependencies & Build

  • 11 dependabot merge commits across Python/Node/Docker
  • CI: trivy-action bump, Dockerfile socat fix + alembic inclusion
  • Frontend dep bumps with minimatch override

Docs & Skills

  • GTFS-RT monitoring operations guide (+326 lines)
  • gh-code-scanning-triage skill + check script
  • atomic-commits skill workflow refinements
  • Plans under docs/plans/ (efficiency optimizations, fraud, profiling, silent-fallbacks, voiceover demo)

Risk Assessment

| Risk | What | Mitigation |
| --- | --- | --- |
| High | GTFS schema compaction + migration chain (5 migrations) | Must run in order; requires verifiable rollback. Test against production-sized data. |
| High | Streaming import mode changes ingestion path | Ensure backward compat with non-streaming env var default. |
| Medium | Heatmap cache warmup and SQL aggregation | Rerun full warmup after deploy; verify metric parity. |
| Medium | Radix UI dep additions | Check bundle size and theme compatibility. |

Merge Strategy

Merge commit — preserves topology of 76 commits including 11 feature/dependabot merge commits.

google-labs-jules Bot and others added 30 commits February 2, 2026 00:52
Replaced `dataclasses.asdict` with explicit `to_dict` methods in `StopInfo` and `RouteInfo` to eliminate reflection overhead during cache operations.
Implemented `from_dict` methods to ensure correct object reconstruction, including handling nested `ServiceAlert` objects and non-JSON types like sets and datetimes.
Added `ServiceAlert.from_dict` to centralize deserialization logic.
Benchmarks show ~10x improvement in serialization speed for complex objects.

Co-authored-by: rburketaylor <54682710+rburketaylor@users.noreply.github.com>
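A minimal sketch of the explicit `to_dict` / `from_dict` pattern this commit describes. The field names (`stop_id`, `route_ids`, `header`, `starts_at`) are hypothetical and only illustrate how a set, a datetime, and a nested `ServiceAlert` might round-trip through JSON; the project's real classes may differ.

```python
import json
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Set


@dataclass
class ServiceAlert:
    header: str            # hypothetical field
    starts_at: datetime    # hypothetical field

    def to_dict(self) -> dict:
        # datetimes are not JSON types, so store an ISO 8601 string
        return {"header": self.header, "starts_at": self.starts_at.isoformat()}

    @classmethod
    def from_dict(cls, data: dict) -> "ServiceAlert":
        return cls(
            header=data["header"],
            starts_at=datetime.fromisoformat(data["starts_at"]),
        )


@dataclass
class StopInfo:
    stop_id: str
    name: str
    route_ids: Set[str] = field(default_factory=set)
    alerts: List[ServiceAlert] = field(default_factory=list)

    def to_dict(self) -> dict:
        # Explicit field access instead of dataclasses.asdict, which
        # walks the object recursively via reflection.
        return {
            "stop_id": self.stop_id,
            "name": self.name,
            "route_ids": sorted(self.route_ids),  # set -> sorted list
            "alerts": [alert.to_dict() for alert in self.alerts],
        }

    @classmethod
    def from_dict(cls, data: dict) -> "StopInfo":
        return cls(
            stop_id=data["stop_id"],
            name=data["name"],
            route_ids=set(data.get("route_ids", [])),
            alerts=[ServiceAlert.from_dict(a) for a in data.get("alerts", [])],
        )


if __name__ == "__main__":
    stop = StopInfo(
        "s1", "Hauptbahnhof", {"U2", "U1"},
        [ServiceAlert("Delay", datetime(2026, 2, 2, 0, 52, tzinfo=timezone.utc))],
    )
    restored = StopInfo.from_dict(json.loads(json.dumps(stop.to_dict())))
    print(restored == stop)
```

Sorting the set on the way out keeps the cached payload deterministic, which is a design choice of this sketch rather than something the commit states.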
Resolves a linting error caused by removing `asdict` usage in the previous optimization commit.

Co-authored-by: rburketaylor <54682710+rburketaylor@users.noreply.github.com>
Ran `ruff format` on `backend/tests/services/test_transit_data.py` to fix formatting errors detected by CI.

Co-authored-by: rburketaylor <54682710+rburketaylor@users.noreply.github.com>
Bumps [ruff](https://github.com/astral-sh/ruff) from 0.14.14 to 0.15.0.
- [Release notes](https://github.com/astral-sh/ruff/releases)
- [Changelog](https://github.com/astral-sh/ruff/blob/main/CHANGELOG.md)
- [Commits](astral-sh/ruff@0.14.14...0.15.0)

---
updated-dependencies:
- dependency-name: ruff
  dependency-version: 0.15.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Bumps [globals](https://github.com/sindresorhus/globals) from 17.1.0 to 17.3.0.
- [Release notes](https://github.com/sindresorhus/globals/releases)
- [Commits](sindresorhus/globals@v17.1.0...v17.3.0)

---
updated-dependencies:
- dependency-name: globals
  dependency-version: 17.3.0
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Bumps [autoprefixer](https://github.com/postcss/autoprefixer) from 10.4.23 to 10.4.24.
- [Release notes](https://github.com/postcss/autoprefixer/releases)
- [Changelog](https://github.com/postcss/autoprefixer/blob/main/CHANGELOG.md)
- [Commits](postcss/autoprefixer@10.4.23...10.4.24)

---
updated-dependencies:
- dependency-name: autoprefixer
  dependency-version: 10.4.24
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Bumps [msw](https://github.com/mswjs/msw) from 2.12.7 to 2.12.9.
- [Release notes](https://github.com/mswjs/msw/releases)
- [Changelog](https://github.com/mswjs/msw/blob/main/CHANGELOG.md)
- [Commits](mswjs/msw@v2.12.7...v2.12.9)

---
updated-dependencies:
- dependency-name: msw
  dependency-version: 2.12.9
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
- Pre-calculate `service_midnight` in `get_stop_departures` to avoid redundant `datetime.combine` calls for every row.
- Update `interval_to_datetime` to accept an optional `base_datetime`.
- Add test case verifying the optimization.

Reduces object creation overhead in the hot loop of departure scheduling.

Co-authored-by: rburketaylor <54682710+rburketaylor@users.noreply.github.com>
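The optimization above can be sketched as follows. The signature of `interval_to_datetime` and the helper `departure_datetimes` are assumptions for illustration; only the idea (compute `service_midnight` once and pass it in as `base_datetime`) comes from the commit message.

```python
from datetime import date, datetime, time, timedelta
from typing import Iterable, List, Optional


def interval_to_datetime(
    service_date: date,
    offset: timedelta,
    base_datetime: Optional[datetime] = None,
) -> datetime:
    """Convert an offset from service-day midnight into a datetime.

    If base_datetime is supplied, the datetime.combine call is skipped.
    """
    if base_datetime is None:
        base_datetime = datetime.combine(service_date, time.min)
    return base_datetime + offset


def departure_datetimes(
    service_date: date, offsets: Iterable[timedelta]
) -> List[datetime]:
    # Hypothetical stand-in for the hot loop in get_stop_departures:
    # midnight is computed once, not once per row.
    service_midnight = datetime.combine(service_date, time.min)
    return [
        interval_to_datetime(service_date, offset, service_midnight)
        for offset in offsets
    ]


if __name__ == "__main__":
    # GTFS offsets may exceed 24 hours for trips that run past midnight.
    print(departure_datetimes(date(2026, 2, 2), [timedelta(hours=25, minutes=10)]))
```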
…-datetime-6044823311474853755

⚡ Bolt: Optimize datetime calculations in GTFSScheduleService
…serialization-8331816051012334399

⚡ Bolt: Optimize Transit Data Serialization
…updates

Bumps the production-npm group with 3 updates in the /frontend directory: [@tanstack/react-query](https://github.com/TanStack/query/tree/HEAD/packages/react-query), [react](https://github.com/facebook/react/tree/HEAD/packages/react) and [react-dom](https://github.com/facebook/react/tree/HEAD/packages/react-dom).


Updates `@tanstack/react-query` from 5.90.20 to 5.90.21
- [Release notes](https://github.com/TanStack/query/releases)
- [Changelog](https://github.com/TanStack/query/blob/main/packages/react-query/CHANGELOG.md)
- [Commits](https://github.com/TanStack/query/commits/@tanstack/react-query@5.90.21/packages/react-query)

Updates `react` from 19.2.3 to 19.2.4
- [Release notes](https://github.com/facebook/react/releases)
- [Changelog](https://github.com/facebook/react/blob/main/CHANGELOG.md)
- [Commits](https://github.com/facebook/react/commits/v19.2.4/packages/react)

Updates `react-dom` from 19.2.3 to 19.2.4
- [Release notes](https://github.com/facebook/react/releases)
- [Changelog](https://github.com/facebook/react/blob/main/CHANGELOG.md)
- [Commits](https://github.com/facebook/react/commits/v19.2.4/packages/react-dom)

---
updated-dependencies:
- dependency-name: "@tanstack/react-query"
  dependency-version: 5.90.21
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: production-npm
- dependency-name: react
  dependency-version: 19.2.4
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: production-npm
- dependency-name: react-dom
  dependency-version: 19.2.4
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: production-npm
...

Signed-off-by: dependabot[bot] <support@github.com>
… 2 updates

Bumps the production-python group with 2 updates in the /backend directory: [fastapi](https://github.com/fastapi/fastapi) and [alembic](https://github.com/sqlalchemy/alembic).


Updates `fastapi` from 0.128.0 to 0.128.5
- [Release notes](https://github.com/fastapi/fastapi/releases)
- [Commits](fastapi/fastapi@0.128.0...0.128.5)

Updates `alembic` from 1.18.1 to 1.18.3
- [Release notes](https://github.com/sqlalchemy/alembic/releases)
- [Changelog](https://github.com/sqlalchemy/alembic/blob/main/CHANGES)
- [Commits](https://github.com/sqlalchemy/alembic/commits)

---
updated-dependencies:
- dependency-name: fastapi
  dependency-version: 0.128.5
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: production-python
- dependency-name: alembic
  dependency-version: 1.18.3
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: production-python
...

Signed-off-by: dependabot[bot] <support@github.com>
…pdates

Bumps the docker-images group with 6 updates in the / directory:

| Package | From | To |
| --- | --- | --- |
| prom/prometheus | `v3.5.0` | `v3.9.1` |
| grafana/grafana | `12.1.1` | `12.3.2` |
| cadvisor/cadvisor | `v0.52.1` | `v0.55.1` |
| prometheuscommunity/postgres-exporter | `v0.18.1` | `v0.19.0` |
| oliver006/redis_exporter | `v1.74.0` | `v1.80.2` |
| alpine | `3.23.2` | `3.23.3` |



Updates `prom/prometheus` from v3.5.0 to v3.9.1

Updates `grafana/grafana` from 12.1.1 to 12.3.2

Updates `cadvisor/cadvisor` from v0.52.1 to v0.55.1

Updates `prometheuscommunity/postgres-exporter` from v0.18.1 to v0.19.0

Updates `oliver006/redis_exporter` from v1.74.0 to v1.80.2

Updates `alpine` from 3.23.2 to 3.23.3

---
updated-dependencies:
- dependency-name: prom/prometheus
  dependency-version: v3.9.1
  dependency-type: direct:production
  dependency-group: docker-images
- dependency-name: grafana/grafana
  dependency-version: 12.3.2
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: docker-images
- dependency-name: cadvisor/cadvisor
  dependency-version: v0.55.1
  dependency-type: direct:production
  dependency-group: docker-images
- dependency-name: prometheuscommunity/postgres-exporter
  dependency-version: v0.19.0
  dependency-type: direct:production
  dependency-group: docker-images
- dependency-name: oliver006/redis_exporter
  dependency-version: v1.80.2
  dependency-type: direct:production
  dependency-group: docker-images
- dependency-name: alpine
  dependency-version: 3.23.3
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: docker-images
...

Signed-off-by: dependabot[bot] <support@github.com>
…ntend/develop/production-npm-97c96886d3

build(deps): bump the production-npm group across 1 directory with 3 updates
…ocker-images-58da121820

build(deps): bump the docker-images group across 1 directory with 6 updates
…ntend/msw-2.12.9

build(deps-dev): bump msw from 2.12.7 to 2.12.9 in /frontend
…uction-python-df6cb66f74

build(deps): bump the production-python group across 1 directory with 2 updates
…ntend/globals-17.3.0

build(deps-dev): bump globals from 17.1.0 to 17.3.0 in /frontend
…-0.15.0

build(deps): bump ruff from 0.14.14 to 0.15.0 in /backend
…ntend/autoprefixer-10.4.24

build(deps-dev): bump autoprefixer from 10.4.23 to 10.4.24 in /frontend
Bumps [jsdom](https://github.com/jsdom/jsdom) from 27.4.0 to 28.0.0.
- [Release notes](https://github.com/jsdom/jsdom/releases)
- [Changelog](https://github.com/jsdom/jsdom/blob/main/Changelog.md)
- [Commits](jsdom/jsdom@27.4.0...28.0.0)

---
updated-dependencies:
- dependency-name: jsdom
  dependency-version: 28.0.0
  dependency-type: direct:development
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
…ntend/jsdom-28.0.0

build(deps-dev): bump jsdom from 27.4.0 to 28.0.0 in /frontend
Refactors the departures query to pre-fetch active service IDs for the day, removing complex joins with `gtfs_calendar` and `gtfs_calendar_dates` from the main query. This simplifies the execution plan for the large `gtfs_stop_times` table.

- Adds `get_active_service_ids` helper method
- Updates `get_stop_departures` to use `service_id IN (...)`
- Updates tests to mock the new query structure

Co-authored-by: rburketaylor <54682710+rburketaylor@users.noreply.github.com>
Refactors the departures query to pre-fetch active service IDs for the day, removing complex joins with `gtfs_calendar` and `gtfs_calendar_dates` from the main query. This simplifies the execution plan for the large `gtfs_stop_times` table.

- Adds `get_active_service_ids` helper method
- Updates `get_stop_departures` to use `service_id IN (...)`
- Updates tests to mock the new query structure
- Formats test file to satisfy linter

Co-authored-by: rburketaylor <54682710+rburketaylor@users.noreply.github.com>
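A rough sketch of the two-step query shape described in the two commits above, using `sqlite3` so it runs anywhere. The table layout is deliberately simplified and partly hypothetical (ISO date strings, a `departure_seconds` column, `service_id` reached through `gtfs_trips`); the `exception_type` values 1 (added) and 2 (removed) follow the GTFS `calendar_dates.txt` convention.

```python
import sqlite3
from datetime import date
from typing import List, Set, Tuple

WEEKDAY_COLUMNS = (
    "monday", "tuesday", "wednesday", "thursday",
    "friday", "saturday", "sunday",
)


def get_active_service_ids(conn: sqlite3.Connection, day: date) -> Set[str]:
    """Resolve the services running on `day` in a small, separate step."""
    weekday_col = WEEKDAY_COLUMNS[day.weekday()]  # fixed whitelist, safe to interpolate
    iso = day.isoformat()
    active = {
        row[0]
        for row in conn.execute(
            f"SELECT service_id FROM gtfs_calendar "
            f"WHERE {weekday_col} = 1 AND start_date <= ? AND end_date >= ?",
            (iso, iso),
        )
    }
    exceptions = conn.execute(
        "SELECT service_id, exception_type FROM gtfs_calendar_dates WHERE date = ?",
        (iso,),
    )
    for service_id, exception_type in exceptions:
        if exception_type == 1:      # service added for this date
            active.add(service_id)
        elif exception_type == 2:    # service removed for this date
            active.discard(service_id)
    return active


def get_stop_departures(
    conn: sqlite3.Connection, stop_id: str, day: date
) -> List[Tuple[str, int]]:
    service_ids = sorted(get_active_service_ids(conn, day))
    if not service_ids:
        return []
    placeholders = ", ".join("?" for _ in service_ids)
    # The main query no longer joins the calendar tables; it only filters
    # on the pre-fetched IDs with service_id IN (...).
    return conn.execute(
        f"SELECT st.trip_id, st.departure_seconds "
        f"FROM gtfs_stop_times st JOIN gtfs_trips t ON t.trip_id = st.trip_id "
        f"WHERE st.stop_id = ? AND t.service_id IN ({placeholders}) "
        f"ORDER BY st.departure_seconds",
        (stop_id, *service_ids),
    ).fetchall()
```

The early return on an empty ID set avoids emitting an invalid `IN ()` clause.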
…-query-8087736883063608687

⚡ Bolt: Optimize GTFS Schedule Query
Add OpenAI agent interface metadata for the existing atomic-commits skill.
This makes the skill discoverable with a display name, description, and default prompt.
Add a new gh-code-scanning-triage skill with workflow guidance and agent metadata.
Include an executable check-alerts.sh script to fetch, normalize, and summarize GitHub code scanning alerts.
Update the atomic-commits skill to run pre-commit on staged files per commit,
then run a final pre-commit --all-files check after all commits are created.
This catches new files earlier while keeping full-repo validation before handoff.
rburketaylor and others added 24 commits March 8, 2026 09:33
Delete completed logic-bugs analysis and remediation plan documents
from docs/plans/refactor now that the remediation scope is closed.
* Optimize JSON serialization in cache service

Leverage `to_dict()` methods on objects for fast JSON serialization,
avoiding the slow recursive traversal overhead of FastAPI's
`jsonable_encoder`.

Adds `_fast_encoder` default handler to `json.dumps` calls in
`mset_json` and `set_json`.

Co-authored-by: rburketaylor <54682710+rburketaylor@users.noreply.github.com>

* Add tests for _fast_encoder to improve coverage

The previous commit introduced _fast_encoder to optimize JSON serialization
but did not include tests for it, causing a codecov patch coverage
failure. This commit adds a unit test for _fast_encoder.

Co-authored-by: rburketaylor <54682710+rburketaylor@users.noreply.github.com>

* Fix codecov patch coverage by testing edge cases for fast encoder

The previous commit introduced _fast_encoder but did not fully
cover all edge cases, leading to a patch coverage of 60%. This commit
adds tests for `None`, lists, and dictionaries.

Co-authored-by: rburketaylor <54682710+rburketaylor@users.noreply.github.com>

* Fix GitHub Actions deprecation warnings and E2E timeouts

This commit addresses:
1. Node 20 deprecation warnings by updating `aquasecurity/trivy-action`
   to `v0.35.0`, `docker/setup-buildx-action` to `v3.10.0`, and
   `docker/build-push-action` to `v6.15.0`.
2. E2E test failures in `monitoring.spec.ts` by updating locators for
   tabs from `getByRole('button')` to `getByRole('tab')` in accordance
   with Radix UI semantics.
3. E2E test failures in `station.spec.ts` by updating heading level
   assertions from `level: 1` to `level: 3` and using `.first()`.

Co-authored-by: rburketaylor <54682710+rburketaylor@users.noreply.github.com>

* Fix E2E test locator timeouts

This commit updates locators in E2E tests:
1. `monitoring.spec.ts`: use `getByRole('tab')` for tabs instead of `button`.
2. `station.spec.ts` & `flows/user-journeys.spec.ts`: use `getByRole('heading', { level: 3 })` for station headings (due to `CardTitle` wrapper) instead of `level: 1`.
3. `station.spec.ts`: narrow the error text match from "Error|failed|departures" to "Error fetching" based on what the UI actually renders.

Co-authored-by: rburketaylor <54682710+rburketaylor@users.noreply.github.com>

* Fix formatting issues in E2E tests

Run Prettier to fix formatting violations in tests/e2e/pages/station.spec.ts
that were causing the `npm run lint` step to fail during CI.

Co-authored-by: rburketaylor <54682710+rburketaylor@users.noreply.github.com>

---------

Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
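The `_fast_encoder` handler from the first item of this squashed commit might look roughly like the sketch below. The function body, the `dumps` wrapper, and the `RouteInfo` fields are assumptions; what the commit confirms is only that a `default` handler is passed to `json.dumps` and that it leans on `to_dict()`.

```python
import json
from typing import Any


def _fast_encoder(obj: Any) -> Any:
    """json.dumps `default` hook: called only for objects json cannot encode."""
    to_dict = getattr(obj, "to_dict", None)
    if callable(to_dict):
        return to_dict()
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")


class RouteInfo:
    # Hypothetical stand-in with an explicit to_dict method.
    def __init__(self, route_id: str, mode: str) -> None:
        self.route_id = route_id
        self.mode = mode

    def to_dict(self) -> dict:
        return {"route_id": self.route_id, "mode": self.mode}


def dumps(value: Any) -> str:
    # Native types (dict, list, str, None, ...) never reach _fast_encoder,
    # so plain payloads pay no extra cost.
    return json.dumps(value, default=_fast_encoder)


if __name__ == "__main__":
    print(dumps({"routes": [RouteInfo("U2", "subway")], "note": None}))
```

Because `json.dumps` re-encodes whatever the hook returns, nested objects with their own `to_dict()` are handled without a separate recursive pre-pass.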
Replace in-memory GTFS feed download with streaming to a temporary
.part file to reduce memory pressure on large feeds. Add progress
tracking (bytes downloaded) and content-length logging for
observability. Use logger.exception instead of logger.error to
preserve full tracebacks on failures. Fix downloaded_at to be
timezone-naive UTC for consistent database storage.

Update tests to mock httpx stream API and async context managers.
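A minimal sketch of the write side of the streaming download described above. It takes any iterable of byte chunks (in the real code these would presumably come from the httpx streaming response), so it runs without a network. The function name, the progress callback, and the atomic rename and cleanup steps are assumptions, not confirmed details of `gtfs_feed.py`.

```python
import os
from pathlib import Path
from typing import Callable, Iterable, Optional


def write_stream_to_file(
    chunks: Iterable[bytes],
    dest: Path,
    on_progress: Optional[Callable[[int], None]] = None,
) -> int:
    """Write chunks to `<dest>.part`, then move the file into place.

    Only one chunk is held in memory at a time. Returns bytes written.
    """
    part = dest.with_name(dest.name + ".part")
    downloaded = 0
    try:
        with part.open("wb") as fh:
            for chunk in chunks:
                fh.write(chunk)
                downloaded += len(chunk)
                if on_progress is not None:
                    on_progress(downloaded)  # e.g. log bytes downloaded
        os.replace(part, dest)  # atomic when both paths share a filesystem
    except BaseException:
        # Assumed cleanup: never leave a half-written .part file behind.
        part.unlink(missing_ok=True)
        raise
    return downloaded
```

Readers of `dest` then see either the previous complete file or the new complete file, never a partial download.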
… deps

Remove erroneous -r2 suffix from alpine/socat image tags in
docker-compose.yml. Copy alembic files into the backend Docker image
and run migrations on container startup before uvicorn.

Bump vulnerable dependencies: pytest 9.0.2→9.0.3, black 26.1.0→26.3.1,
python-dotenv 1.2.1→1.2.2, requests 2.32.5→2.33.0, pygments 2.19.2→2.20.0.
…popups

Add 'live' to the StationStatsTimeRange union type so the popup can
request live stats when the heatmap is in live mode. Remove the
intermediate remapping of 'live' → '24h' in HeatmapPage and pass the
selected timeRange directly to useStationStats.

Add tests verifying the hook accepts 'live' and that the page wires
live station detail requests correctly.
Export BVV_POINT_COLOR_STOPS and BVV_CLUSTER_COLOR_STOPS from
markerStyles.ts so the map expressions and legend share a single source
of truth. Drive HeatmapLegend gradient and swatch colors from these
constants via getBVVMarkerColor, eliminating duplicated theme-dependent
color arrays and removing the ThemeContext dependency.

Update HeatmapLegend tests to assert swatch colors match the shared
marker style function.
…alls

Ignore .jules/ directory and root-level node_modules/package.json from
accidental npm installs. Remove the previously tracked .jules/bolt.md
file from version control.
Update aquasecurity/trivy-action from v0.35.0 to v0.36.0 for
both backend and frontend container image scanning steps.
- dompurify: 3.3.1 -> 3.4.1 (production)
- postcss: 8.5.6 -> 8.5.10 (dev)
- vite: 7.3.1 -> 7.3.2 (dev)
- Add minimatch override (^10.2.5) to resolve transitive dependency
…bleshooting

Add GUARD annotations preventing destructive git resets, introduce
per-file unstaging with `git reset HEAD -- <file>`, move pre-commit
checking before plan presentation, add a comprehensive troubleshooting
table, and clarify rules for HEREDOC, amend, and push operations.
Add compiled efficiency optimization planning materials, archived model inputs, and an implementation audit for the backend optimization work.
Add backend efficiency configuration, bounded fallback cache support, cache pattern invalidation, API latency metrics, Server-Timing headers, departure cache minute bucketing, and a pinned Valkey image.
Remove the static stops cascade dependency from realtime station stats and add a retention service that validates daily rollups before deleting old hourly realtime rows.
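One way to picture the "validate daily rollups before deleting" rule is the in-memory sketch below. The row shapes, the use of `trip_count` as the compared metric, and the function name are all hypothetical; the actual service works against database tables and may validate differently.

```python
from collections import defaultdict
from datetime import date, datetime
from typing import Dict, List, Tuple

HourlyRow = Tuple[str, datetime, int]   # (stop_id, hour bucket start, trip_count)
DailyKey = Tuple[str, date]             # (stop_id, service day)


def hourly_rows_safe_to_delete(
    hourly_rows: List[HourlyRow],
    daily_rollups: Dict[DailyKey, int],
    cutoff: datetime,
) -> List[Tuple[str, datetime]]:
    """Return hourly keys older than `cutoff` whose daily rollup matches.

    Assumes `cutoff` falls on a day boundary; a day split by the cutoff
    fails validation and is kept, which errs on the side of retention.
    """
    totals: Dict[DailyKey, int] = defaultdict(int)
    for stop_id, bucket, trip_count in hourly_rows:
        if bucket < cutoff:
            totals[(stop_id, bucket.date())] += trip_count

    # A day is validated only if a rollup exists and equals the hourly sum.
    validated = {
        key for key, total in totals.items() if daily_rollups.get(key) == total
    }
    return [
        (stop_id, bucket)
        for stop_id, bucket, _ in hourly_rows
        if bucket < cutoff and (stop_id, bucket.date()) in validated
    ]
```

Days with a missing or mismatched rollup are simply skipped, so a failed aggregation run cannot cause data loss in this sketch.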
Store GTFS stop times as integer seconds, compact static GTFS table metadata, add stop search indexes, validate feeds before final replacement, batch stop_times imports, clean feed archives, and cover migration revision id limits.
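For the integer-seconds storage mentioned above, the conversion can be sketched as below. The helper names are hypothetical. GTFS allows `HH:MM:SS` values past `24:00:00` for trips that continue after midnight, which is why a plain time-of-day type does not fit and seconds since the start of the service day do.

```python
def gtfs_time_to_seconds(value: str) -> int:
    """Parse a GTFS `H:MM:SS` / `HH:MM:SS` string into seconds.

    Hours may exceed 23 (e.g. "25:10:00" is 01:10 on the next calendar day).
    """
    hours, minutes, seconds = (int(part) for part in value.strip().split(":"))
    if hours < 0 or not (0 <= minutes < 60 and 0 <= seconds < 60):
        raise ValueError(f"invalid GTFS time: {value!r}")
    return hours * 3600 + minutes * 60 + seconds


def seconds_to_gtfs_time(total: int) -> str:
    """Inverse of gtfs_time_to_seconds, zero-padded to HH:MM:SS."""
    if total < 0:
        raise ValueError("seconds must be non-negative")
    hours, remainder = divmod(total, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"
```

Integer seconds also sort and compare correctly with ordinary numeric operators, which string times with mixed `8:05:30` and `08:05:30` forms do not.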
Use SQL CTEs to aggregate heatmap station breakdowns, add a supporting realtime stats index, cache GTFS route type maps by feed, and tighten daily aggregation consistency coverage.
Update the local atomic-commits skill to validate before planning, repair failures first, preserve staging intent, and run staged-file hooks for each commit.
Add Valkey-backed GTFSImportProgressTracker that reports import state
(phase, percent, row counts) with 24h TTL and process-local fallback.
Wire tracker into GTFSFeedImporter lifecycle (start/update/succeed/fail
per phase), scheduler, CLI script, and /ingestion-status API endpoint.

Also refactor GTFS time parsing from map_elements to native Polars
expressions and extract batch task error propagation into a shared
helper for cleaner stop_times COPY error handling.
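A simplified sketch of a progress tracker with a 24h TTL and a process-local fallback, as described in this commit. The class name, key, and payload fields are hypothetical, and the injected `client` is only assumed to expose `set(key, value, ex=seconds)` and `get(key)` in the style of redis-py compatible clients; the real `GTFSImportProgressTracker` may be structured differently.

```python
import json
import time
from typing import Any, Optional, Tuple

PROGRESS_TTL_SECONDS = 24 * 60 * 60  # 24h, per the commit message


class ImportProgressTracker:
    def __init__(self, client: Any = None, key: str = "gtfs:import:progress") -> None:
        self._client = client          # None means "no shared store available"
        self._key = key                # hypothetical key name
        self._local: Optional[Tuple[float, str]] = None  # (expires_at, payload)

    def update(self, phase: str, percent: float, rows: int) -> None:
        payload = json.dumps({"phase": phase, "percent": percent, "rows": rows})
        # Always keep a process-local copy so reads survive a store outage.
        self._local = (time.monotonic() + PROGRESS_TTL_SECONDS, payload)
        if self._client is not None:
            try:
                self._client.set(self._key, payload, ex=PROGRESS_TTL_SECONDS)
            except Exception:
                pass  # progress reporting must never fail the import

    def read(self) -> Optional[dict]:
        if self._client is not None:
            try:
                raw = self._client.get(self._key)
                if raw is not None:
                    return json.loads(raw)
            except Exception:
                pass  # fall through to the local copy
        if self._local is not None and self._local[0] > time.monotonic():
            return json.loads(self._local[1])
        return None
```

Swallowing store errors is a deliberate choice in this sketch: a progress bar is diagnostic, so losing an update is preferable to aborting a long import.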
Add ImportProgressPanel component that displays a live progress bar
with percentage and stop_times row counts during running imports, and
an error panel with error type/message for failed imports. Bumps the
polling interval to 5s while an import is active (30s idle).
Add a streaming import strategy for GTFS stop_times that uses Polars
lazy scan_csv + sink_csv to transform rows into a temp CSV, then a
single asyncpg COPY to load them — avoiding the eager read_csv_batched
path that materialises all batches in memory.

- New config setting GTFS_STOP_TIMES_IMPORT_MODE (streaming|batched),
  default "streaming". Both zip and directory import paths dispatch on
  this setting.
- Streaming path drops PK before COPY and recreates it afterwards for
  fastest bulk load; replaces the redundant trip_id index with a
  (trip_id, stop_sequence) primary key that covers the same lookups.
- Refactor _validate_route_and_service_ids to use Polars expressions
  instead of Python sets for route/service ID validation.
- Tests: update _make_settings with import_mode param, fix mock targets
  (streaming by default, batched where needed), add 6 new streaming
  tests covering dispatch, temp-file cleanup, and output parity.
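The mode dispatch described above could be sketched like this. Only the setting name `GTFS_STOP_TIMES_IMPORT_MODE`, its two values, and the `"streaming"` default come from the commit; the enum, the resolver, and the importer callables are hypothetical placeholders for the real Polars and asyncpg code paths.

```python
import os
from enum import Enum
from typing import Callable, Mapping, Optional


class StopTimesImportMode(str, Enum):
    STREAMING = "streaming"
    BATCHED = "batched"


def resolve_import_mode(env: Optional[Mapping[str, str]] = None) -> StopTimesImportMode:
    """Read GTFS_STOP_TIMES_IMPORT_MODE, defaulting to streaming."""
    env = os.environ if env is None else env
    raw = env.get("GTFS_STOP_TIMES_IMPORT_MODE", "streaming").strip().lower()
    try:
        return StopTimesImportMode(raw)
    except ValueError:
        allowed = ", ".join(mode.value for mode in StopTimesImportMode)
        raise ValueError(
            f"GTFS_STOP_TIMES_IMPORT_MODE must be one of: {allowed}; got {raw!r}"
        ) from None


def import_stop_times(
    path: str,
    streaming_import: Callable[[str], int],
    batched_import: Callable[[str], int],
    env: Optional[Mapping[str, str]] = None,
) -> int:
    # Both the zip and directory import paths would call through a single
    # dispatch point like this one, so the mode is decided in one place.
    mode = resolve_import_mode(env)
    if mode is StopTimesImportMode.STREAMING:
        return streaming_import(path)
    return batched_import(path)
```

Failing loudly on an unknown value, rather than silently falling back to one mode, makes a mistyped setting visible at startup.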
Add GTFS_STOP_TIMES_IMPORT_MODE description and update
GTFS_STOP_TIMES_BATCH_SIZE to reflect batched vs streaming semantics.
Comment threads (all marked Fixed):
- backend/alembic/versions/convert_gtfs_stop_times_to_seconds.py (4 threads)
- backend/app/services/gtfs_realtime_harvester.py (1 thread)
- Replace f-strings in op.execute() with plain string literals in
  convert_gtfs_stop_times_to_seconds.py and
  remove_realtime_station_stats_stop_fk_cascade.py
- Add CREATE INDEX IF NOT EXISTS to add_heatmap_indexes.py to make
  migration idempotent across upgrade/downgrade cycles
- Add missing realtime_station_stats tables to reset_database.py drop
  list to ensure clean migration test resets
- Replace MD5 with SHA256 in _hash_trip_id_legacy (remove noqa bandaid)
- Add docker compose diagnostic logging to CI e2e job
Postgres 18+ Docker images now store data in a version-specific subdirectory.
The mount point must be /var/lib/postgresql (parent) instead of
/var/lib/postgresql/data to allow the entrypoint to create the
proper layout with major-version-specific directory names.
@rburketaylor rburketaylor merged commit 296fe07 into main May 1, 2026
28 checks passed