Refactor Supabase persistence: single injected client + non-blocking I/O #8

@nmogil

Summary

The Supabase persistence layer has two problems that are cheap to fix together: (1) it uses the synchronous Supabase client inside async def methods, blocking the event loop; and (2) multiple independent client instances are created across the codebase instead of one shared, injected client.

Current state (verified)

Blocking sync client inside async methods:

  • src/state_supabase.py:22 calls create_client(...) (the sync client), and every async def then calls .table(...).execute() directly (e.g. get_task line 65, update_task line 60, create_task_with_source_date line 198). These are blocking network calls running on the event loop: while one is in flight, no other coroutine in the process can run, so under concurrency they stall the whole API/worker.
  • By contrast, src/fetch_service.py already offloads some blocking work via loop.run_in_executor(...) (lines 273, 313), showing the codebase is aware of the issue but applies it inconsistently.
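
To make the stall concrete, here is a minimal sketch that measures how long a background "heartbeat" coroutine is starved while one query runs. It does not touch Supabase: blocking_execute is a hypothetical stand-in that sleeps for 200 ms in place of a real .execute() call, and the timings are arbitrary.

```python
import asyncio
import time


def blocking_execute() -> str:
    """Hypothetical stand-in for a sync `.execute()`; sleeps instead of doing I/O."""
    time.sleep(0.2)
    return "row"


async def heartbeat(ticks: list, stop: asyncio.Event) -> None:
    """Record a timestamp roughly every 20 ms while the loop is free to run us."""
    while not stop.is_set():
        ticks.append(time.monotonic())
        await asyncio.sleep(0.02)


async def measure(offload: bool) -> float:
    """Return the longest gap between heartbeats while one 'query' runs."""
    ticks: list = []
    stop = asyncio.Event()
    hb = asyncio.create_task(heartbeat(ticks, stop))
    await asyncio.sleep(0.05)  # let the heartbeat get going
    if offload:
        await asyncio.to_thread(blocking_execute)  # runs in a worker thread
    else:
        blocking_execute()  # runs on the loop thread; the heartbeat cannot tick
    await asyncio.sleep(0.05)
    stop.set()
    await hb
    return max(later - earlier for earlier, later in zip(ticks, ticks[1:]))


if __name__ == "__main__":
    print(f"inline:    longest heartbeat gap {asyncio.run(measure(False)):.3f}s")
    print(f"offloaded: longest heartbeat gap {asyncio.run(measure(True)):.3f}s")
```

In the inline case the longest gap should be at least the full duration of the blocking call; in the offloaded case it should stay close to the heartbeat interval.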

Multiple client instances:

  • src/main.py:14 imports and creates a global Supabase client in lifespan.
  • src/state_supabase.py:6,22 creates its own client from env vars (TaskStateManager.__init__).
  • src/fetch_service.py:7, src/circuit_breaker.py:7, src/cache_supabase.py:4 all take a Client — these are injected (good), but TaskStateManager is the odd one out, self-instantiating a second client and its own connection pool.

Why this matters

  • Blocking calls on the event loop cap real concurrency and make latency spiky; this gets worse after the worker split (heavier sustained load).
  • Two client instances mean two connection pools, duplicated config reads, and a second place where env vars must be present. That is a foot-gun for consistency and for testing (you can't inject a mock into TaskStateManager).

Proposed approach

  1. Single client, injected. Create the Supabase client once (in main.py lifespan, and once in the new worker.py) and pass it into TaskStateManager(client) like the other components already receive it. Remove the env-var/create_client logic from TaskStateManager.__init__.
  2. Stop blocking the loop. Either:
    • wrap the blocking .execute() calls in await asyncio.to_thread(...) (minimal, and consistent with fetch_service's executor usage; requires Python 3.9+), or
    • migrate to the async Supabase client (create_async_client / AsyncClient) if the pinned supabase==2.11.0 supports it cleanly. Verify version support before committing to this path.
  3. Make the choice consistent across state_supabase.py, cache_supabase.py, and circuit_breaker.py (all do Supabase I/O inside async contexts).
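
A minimal sketch of steps 1 and 2 (the to_thread option), not the actual implementation. Assumptions: the table name "tasks", the column "id", and the body of get_task are invented for illustration. FakeClient and FakeQuery are hypothetical stand-ins so the sketch runs without credentials; they mimic only the .table(...).select(...).eq(...).execute() chain.

```python
import asyncio
import threading
from types import SimpleNamespace
from typing import Any, Optional


class TaskStateManager:
    """Sketch: the client is injected; no create_client or env-var reads here."""

    def __init__(self, client: Any) -> None:
        self._client = client

    async def get_task(self, task_id: str) -> Optional[dict]:
        def _query() -> Any:
            # Blocking call, executed in a worker thread by asyncio.to_thread.
            # Table "tasks" and column "id" are assumptions for illustration.
            return (
                self._client.table("tasks").select("*").eq("id", task_id).execute()
            )

        response = await asyncio.to_thread(_query)
        return response.data[0] if response.data else None


class FakeQuery:
    """Hypothetical stand-in for the sync query builder (demo only)."""

    def __init__(self, rows: list, threads: list) -> None:
        self._rows = rows
        self._threads = threads

    def select(self, *_columns: str) -> "FakeQuery":
        return self

    def eq(self, column: str, value: Any) -> "FakeQuery":
        self._rows = [r for r in self._rows if r.get(column) == value]
        return self

    def execute(self) -> SimpleNamespace:
        self._threads.append(threading.current_thread())  # record where we ran
        return SimpleNamespace(data=self._rows)


class FakeClient:
    """Hypothetical fake exposing only .table(), which is all the sketch uses."""

    def __init__(self, rows: list) -> None:
        self._rows = rows
        self.execute_threads: list = []

    def table(self, _name: str) -> FakeQuery:
        return FakeQuery(list(self._rows), self.execute_threads)


if __name__ == "__main__":
    manager = TaskStateManager(FakeClient([{"id": "t1", "status": "pending"}]))
    print(asyncio.run(manager.get_task("t1")))
```

With this option the sync client is called from worker threads, so it is worth confirming that sharing one client instance across threads is safe for the pinned version before relying on it.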

Files likely involved

  • src/state_supabase.py (remove self-instantiation; accept injected client; de-block I/O)
  • src/main.py (inject the single client into TaskStateManager)
  • src/cache_supabase.py, src/circuit_breaker.py (apply the same de-blocking approach for consistency)
  • src/worker.py (the new worker from the queue issue must also inject the shared client)
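
The wiring across these files could look roughly like the sketch below. Everything here is illustrative: build_client is a hypothetical factory standing in for the one create_client(url, key) call per process, CacheStore is a made-up name for whatever cache_supabase.py defines, the lifespan function assumes an ASGI-style lifespan context manager, and worker_main is a guess at the shape of the new worker's entry point.

```python
import asyncio
from contextlib import asynccontextmanager
from types import SimpleNamespace
from typing import Any, AsyncIterator


def build_client() -> Any:
    """Hypothetical factory: the single create_client(url, key) call per
    process would live here."""
    return object()  # placeholder so the sketch runs without credentials


class TaskStateManager:
    def __init__(self, client: Any) -> None:
        self.client = client


class CacheStore:  # hypothetical name for the cache_supabase.py component
    def __init__(self, client: Any) -> None:
        self.client = client


@asynccontextmanager
async def lifespan(app: Any) -> AsyncIterator[None]:
    """main.py: build the client once, hand the same object to every component."""
    client = build_client()  # the only construction in this process
    app.state = SimpleNamespace(
        supabase=client,
        task_state=TaskStateManager(client),
        cache=CacheStore(client),
    )
    yield


async def worker_main() -> SimpleNamespace:
    """worker.py (hypothetical shape): same pattern, in its own process."""
    client = build_client()
    return SimpleNamespace(supabase=client, task_state=TaskStateManager(client))
```

The point of the pattern is that no component reads env vars or constructs a client itself; each process has exactly one place where that happens.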

Acceptance criteria

  • Exactly one Supabase client is constructed per process and injected everywhere (no create_client inside TaskStateManager).
  • Supabase I/O in async methods no longer blocks the event loop (via to_thread or async client), applied consistently.
  • TaskStateManager can be constructed with a mock/fake client for tests (supports the testing issue).
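
As an illustration of the last criterion, a test along these lines becomes possible once the client is injected. This is a sketch under assumptions: the TaskStateManager shown is a cut-down stand-in, and the update_task signature, the "tasks" table, and the "id" column are hypothetical. Only the standard library's unittest.mock is used.

```python
import asyncio
from typing import Any
from unittest.mock import MagicMock


class TaskStateManager:
    """Cut-down stand-in for the refactored class (signature is an assumption)."""

    def __init__(self, client: Any) -> None:
        self._client = client

    async def update_task(self, task_id: str, fields: dict) -> None:
        def _query() -> Any:
            # Blocking chain, run off the event loop via asyncio.to_thread.
            return (
                self._client.table("tasks").update(fields).eq("id", task_id).execute()
            )

        await asyncio.to_thread(_query)


def test_update_task_uses_injected_client() -> None:
    client = MagicMock()  # no env vars, no network
    manager = TaskStateManager(client)

    asyncio.run(manager.update_task("t1", {"status": "done"}))

    # Each step of the call chain is checked on the mock's auto-created children.
    client.table.assert_called_once_with("tasks")
    update = client.table.return_value.update
    update.assert_called_once_with({"status": "done"})
    eq = update.return_value.eq
    eq.assert_called_once_with("id", "t1")
    eq.return_value.execute.assert_called_once_with()


if __name__ == "__main__":
    test_update_task_uses_injected_client()
    print("ok")
```

If the async client is chosen instead of to_thread, the same test shape should still apply, with unittest.mock.AsyncMock standing in for the awaited execute call.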

Gotchas

  • Respect the dependency pins — supabase==2.11.0 and realtime==2.0.0 are pinned to avoid the importlib_metadata boot break (see requirements.lightweight.txt comments and the Fly rebuild trap). If you switch to the async client, re-verify the image boots before deploying.
  • This is a good prerequisite for the testing issue: injectable clients make mocking trivial.

Metadata

    Labels

    reliability (Reliability / operational robustness), tech-debt (Cleanup, refactor, paying down debt)
