
Phase 13.1: asyncio.gather concurrent fetcher orchestration (REVIEW C-22) #19

@shiniguchi

Background

Surfaced by /gstack-review on Phase 13. See 13-REVIEW.md finding C-22.

Problem

main() in scripts/external/run_all.py runs the 6 fetchers serially:

statuses = [
    _run_weather(...),
    _run_holidays(...),
    _run_school(...),
    _run_transit(...),
    _run_events(...),
    _run_shop_calendar(...),
]

Wall time is the sum of fetcher times, not the max. With Python startup, supabase-py upserts, and a slow upstream (Open-Meteo's documented ~5s p99 under load), the nightly target of <5 min is at risk. Backfill compounds this: ~322 days in 30-day chunks is about 11 chunks for weather alone.

The 6 fetchers are independent — they share only the Supabase client. Textbook asyncio.gather candidate.
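To make the sum-versus-max point concrete, here is a minimal sketch that times three simulated fetchers serially and then under asyncio.gather. The names (fake_fetch, DELAYS) and the latencies are hypothetical stand-ins, not measurements from run_all.py.

```python
import asyncio
import time

# Hypothetical per-fetcher latencies in seconds (illustrative, not measured).
DELAYS = {"weather": 0.20, "holidays": 0.05, "school": 0.10}


async def fake_fetch(name: str, delay: float) -> str:
    # asyncio.sleep stands in for network and upsert time.
    await asyncio.sleep(delay)
    return name


async def run_serial() -> float:
    # Await each fetcher in turn: elapsed time is roughly the sum of delays.
    start = time.perf_counter()
    for name, delay in DELAYS.items():
        await fake_fetch(name, delay)
    return time.perf_counter() - start


async def run_concurrent() -> float:
    # Schedule all fetchers at once: elapsed time is roughly the largest delay.
    start = time.perf_counter()
    await asyncio.gather(*(fake_fetch(n, d) for n, d in DELAYS.items()))
    return time.perf_counter() - start


serial_s = asyncio.run(run_serial())
concurrent_s = asyncio.run(run_concurrent())
print(f"serial={serial_s:.2f}s concurrent={concurrent_s:.2f}s")
```

With these made-up delays the serial run should take roughly 0.35 s and the concurrent run roughly 0.20 s; the real gap depends on how uneven the actual fetcher times are.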

Proposed fix

Convert each fetcher to async (or use concurrent.futures.ThreadPoolExecutor(max_workers=6)). Per-fetcher exception isolation already exists, so parallelization is structurally safe.
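The threadpool variant could look roughly like the sketch below, which keeps the fetchers synchronous. The wrapper _isolated, the stub fetchers, and the status tuples are all hypothetical; the real _run_X functions and their return type are not shown in this issue.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def _isolated(name, fn):
    """Run one fetcher and turn any exception into a failure status."""
    try:
        fn()
        return (name, "ok")
    except Exception as exc:
        return (name, f"failed: {exc}")


def _ok():
    # Stand-in for a fetcher that succeeds after some I/O.
    time.sleep(0.05)


def _boom():
    # Stand-in for a fetcher whose upstream is down.
    raise RuntimeError("upstream 503")


# Hypothetical registry; the real code calls _run_weather(...) and friends.
FETCHERS = {"weather": _ok, "holidays": _ok, "transit": _boom}


def run_all_threaded(fetchers):
    # One worker per fetcher, matching max_workers=6 from the proposal.
    with ThreadPoolExecutor(max_workers=6) as pool:
        futures = [pool.submit(_isolated, n, f) for n, f in fetchers.items()]
        # Collect in submission order so statuses keep a stable ordering.
        return [f.result() for f in futures]


statuses = run_all_threaded(FETCHERS)
# Exit code: 0 if any source succeeded, 1 if all failed.
exit_code = 0 if any(s == "ok" for _, s in statuses) else 1
print(statuses, exit_code)
```

This variant avoids touching the fetcher modules at all. Whether the shared Supabase client is safe to use from several threads at once is not established in this issue and would need checking first.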

Affects:

  • Every _run_X function in scripts/external/run_all.py — make async (or schedule via threadpool).
  • The fetcher modules: weather.py, school.py, transit.py use httpx which has both sync and async APIs — minimal refactor needed.
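  • holidays.py (CPU-bound, python-holidays), events.py + shop_calendar.py (file I/O): schedule via asyncio.to_thread or threadpool. For the CPU-bound case, threads keep the event loop unblocked but do not run in parallel under the GIL.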
  • tests/external/test_run_all.py — coverage for concurrent-execution semantics.
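The split described in the list above (natively async fetchers alongside synchronous ones pushed to threads) might be wired together as in this sketch. The function names and the events.yaml filename are hypothetical, and asyncio.sleep stands in for an httpx.AsyncClient request. asyncio.to_thread requires Python 3.9 or later.

```python
import asyncio
import time


async def fetch_weather() -> str:
    # Stand-in for an I/O-bound call made with an async HTTP client.
    await asyncio.sleep(0.05)
    return "weather"


def compute_holidays() -> str:
    # Stand-in for the synchronous python-holidays computation.
    time.sleep(0.05)
    return "holidays"


def load_events() -> str:
    # Stand-in for a synchronous file read that fails.
    raise FileNotFoundError("events.yaml")


async def main() -> list:
    # return_exceptions=True makes gather return exceptions as results,
    # so one failing fetcher does not cancel or hide the others.
    return await asyncio.gather(
        fetch_weather(),
        asyncio.to_thread(compute_holidays),
        asyncio.to_thread(load_events),
        return_exceptions=True,
    )


results = asyncio.run(main())
print(results)
```

Results come back in the order the awaitables were passed to gather, so they can be zipped with the fetcher names to build the statuses list.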

Acceptance

  • Wall time ≈ max(fetcher_time) plus scheduling overhead, not the sum.
  • Per-fetcher exception isolation preserved — one failure does not abort others.
  • Existing exit-code semantics preserved (0 if any source succeeded, 1 if all failed).
  • Benchmark before/after on a representative backfill range.
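The acceptance criteria above could be checked with tests along these lines. This is a sketch against a simulated run_all, not the real module: run_all, exit_code, the plan format, and the timing threshold are all assumptions made for illustration.

```python
import asyncio
import time


def exit_code(statuses: dict) -> int:
    """Return 0 if any source succeeded, 1 if all failed."""
    return 0 if any(statuses.values()) else 1


async def _fetch(delay: float, fail: bool = False) -> None:
    # Simulated fetcher: waits, then optionally raises.
    await asyncio.sleep(delay)
    if fail:
        raise RuntimeError("simulated failure")


async def run_all(plan: dict) -> dict:
    """plan maps name -> (delay, fail); returns name -> succeeded."""
    names = list(plan)
    results = await asyncio.gather(
        *(_fetch(*plan[n]) for n in names), return_exceptions=True
    )
    return {n: not isinstance(r, BaseException) for n, r in zip(names, results)}


def test_one_failure_does_not_abort_others():
    statuses = asyncio.run(run_all({"a": (0.01, False), "b": (0.01, True)}))
    assert statuses == {"a": True, "b": False}
    assert exit_code(statuses) == 0


def test_all_failed_exits_nonzero():
    statuses = asyncio.run(run_all({"a": (0.01, True), "b": (0.01, True)}))
    assert exit_code(statuses) == 1


def test_wall_time_tracks_slowest():
    plan = {"a": (0.10, False), "b": (0.10, False), "c": (0.10, False)}
    start = time.perf_counter()
    asyncio.run(run_all(plan))
    elapsed = time.perf_counter() - start
    # Serial execution would take about 0.30 s; concurrent about 0.10 s.
    assert elapsed < 0.25
```

Timing assertions like the last one can be flaky on loaded CI machines, so the threshold here is deliberately loose. A real benchmark on a representative backfill range would still be needed for the before/after comparison.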

Effort

~45 min CC. Not a blocker — today's nightly run is well under the 5-min budget.

Metadata


Labels: enhancement
