perf: parallel per-channel poll fan-out + batched ref-ensure #77
Merged
The periodic beat polled every due channel sequentially on one DB session, so a single slow yt-dlp/RSS call delayed every other channel, and _ensure_user_refs ran an N+1 (per video, then per subscriber).

Fan the poll out instead: poll_all_channels_task now does only the cheap indexed due-check and dispatches a Celery group of poll_channel_task jobs, one per channel. Each job runs on its own DB session and absorbs its own failures, so one stuck or failing channel no longer blocks or aborts the others; each job reschedules its own adaptive cadence and backs a failing channel off instead of hot-looping. The pull-to-refresh "poll all" endpoint dispatches the same per-channel jobs.

Batch the ref-ensure: _ensure_user_refs_bulk issues a bounded two queries (subscribers + existing pairs) across the whole subscriber x video set, replacing the per-video/per-subscriber loop, and the per-video existence SELECT is collapsed into one in_() lookup.

poll_all_channels stays as the synchronous reference path the adaptive tests exercise. No schema change.

Tests: per-channel fan-out dispatches exactly the due channels; a failing channel job is isolated while the healthy one polls and reschedules; the batched ref-ensure stays at 2 SELECTs regardless of scale.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01RXMKM1rDWn8wNh93MMUtxY
Parallelize channel polling (roadmap LATER tier, #8). Behavioral only — no migration.
Fan-out
poll_all_channels_task (the beat + the pull-to-refresh POST /api/channels/poll target) now does only the cheap indexed due-check (list_channel_ids_to_poll), then dispatches a Celery group of poll_channel_task(channel_id) jobs, one per channel, each with its own DB session. So a slow channel no longer delays the rest. Both the beat (due_only=True) and manual poll-all (due_only=False) get parallelism; single-channel POST /api/channels/{id}/poll keeps its synchronous path.

Isolation
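The fan-out above and the per-job isolation described just below can be pictured with a small stand-in. This is only a sketch, not the PR's code: it swaps the Celery group for a standard-library thread pool so it runs anywhere, and the names `cadence`, `BROKEN`, `discover`, `poll_channel`, and `poll_all`, as well as the doubling/halving numbers, are assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory stand-in for the channels table: id -> poll interval (s).
cadence = {1: 600, 2: 600, 3: 600}
BROKEN = {2}  # channel ids whose discovery call fails in this demo


def discover(channel_id: int) -> int:
    """Pretend RSS/yt-dlp discovery; returns the number of new videos."""
    if channel_id in BROKEN:
        raise TimeoutError(f"channel {channel_id} timed out")
    return 1


def poll_channel(channel_id: int) -> str:
    """One per-channel job: its failure is absorbed here, never re-raised."""
    try:
        new_videos = discover(channel_id)
    except Exception:
        # Back off: widen the cadence so a failing channel is not hot-looped.
        cadence[channel_id] = min(cadence[channel_id] * 2, 86_400)
        return "backed_off"
    if new_videos:
        # Adaptive cadence (assumed rule): active channels are polled sooner.
        cadence[channel_id] = max(cadence[channel_id] // 2, 300)
    return "polled"


def poll_all(channel_ids: list[int]) -> dict[int, str]:
    """Dispatcher: does no polling itself, only fans out independent jobs."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        return dict(zip(channel_ids, pool.map(poll_channel, channel_ids)))


results = poll_all([1, 2, 3])
```

In this toy run channel 2 fails and has its interval widened, while channels 1 and 3 still complete and shorten theirs. With Celery the same shape would presumably be one task per channel dispatched as a group, with the try/except living inside the task body.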
Each per-channel job has its own session + try/except; on failure it rolls back and calls _backoff_failed_channel (preserving the hot-loop protection by widening cadence); a sibling's transaction is never touched. Discovery stays timeout-bounded (RSS 15s, yt-dlp 120s), so a slow channel pins one bounded worker slot, not the loop. RSS/304 fast path, first-catalog fallback, new_episode events, and auto-download gating are all preserved.

Batched ref-ensure
Replaced the per-video _ensure_user_refs (N+1) with _ensure_user_refs_bulk: 1 query for subscribers + 1 for existing (subscriber×video) pairs + a set-difference insert, i.e. 2 SELECTs regardless of scale (verified at (1,1)/(3,5)/(8,8)). Also collapsed the per-video existence check into one in_() lookup.

Verification (local)
pytest -q → 190 passed (+8: beat dispatches exactly due channels; a failing-channel job is isolated while a healthy one still catalogs + shortens cadence; bulk ref-ensure stays at 2 SELECTs). ruff clean · no migration (sanity-checked alembic round-trip anyway).

FYI: per-channel poll jobs share the worker pool with downloads; a dedicated poll queue could guarantee throughput later if desired.
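For readers who want the batched ref-ensure spelled out, here is a minimal sketch of the "two SELECTs plus a set-difference insert" shape. It is not the PR's implementation: it uses the standard-library sqlite3 module instead of the project's ORM, and the table names, column names, and the `ensure_user_refs_bulk` signature are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema, only for this sketch.
conn.executescript("""
    CREATE TABLE subscriptions (user_id INTEGER, channel_id INTEGER);
    CREATE TABLE user_video_refs (user_id INTEGER, video_id INTEGER,
                                  PRIMARY KEY (user_id, video_id));
""")
conn.executemany("INSERT INTO subscriptions VALUES (?, ?)",
                 [(1, 10), (2, 10), (3, 10)])
conn.execute("INSERT INTO user_video_refs VALUES (1, 100)")  # pre-existing ref

select_count = 0


def count_selects(sql: str) -> None:
    """Trace hook: count every SELECT statement the connection runs."""
    global select_count
    if sql.lstrip().upper().startswith("SELECT"):
        select_count += 1


conn.set_trace_callback(count_selects)


def ensure_user_refs_bulk(conn, channel_id, video_ids):
    """Insert missing (user, video) refs using two SELECTs in total."""
    if not video_ids:
        return 0
    # SELECT 1: every subscriber of the channel.
    subscribers = [row[0] for row in conn.execute(
        "SELECT user_id FROM subscriptions WHERE channel_id = ?", (channel_id,))]
    if not subscribers:
        return 0
    # SELECT 2: all existing pairs for these videos, in one IN (...) lookup.
    marks = ",".join("?" * len(video_ids))
    existing = set(conn.execute(
        "SELECT user_id, video_id FROM user_video_refs "
        f"WHERE video_id IN ({marks})", list(video_ids)))
    # Set difference: only the pairs that do not exist yet get inserted.
    wanted = {(u, v) for u in subscribers for v in video_ids}
    missing = sorted(wanted - existing)
    conn.executemany(
        "INSERT INTO user_video_refs (user_id, video_id) VALUES (?, ?)", missing)
    return len(missing)


inserted = ensure_user_refs_bulk(conn, 10, [100, 101])
selects_used = select_count
```

The SELECT count depends only on the code path, not on how many subscribers or videos are involved, which is the property the PR's scale test checks.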
🤖 Generated with Claude Code
https://claude.ai/code/session_01RXMKM1rDWn8wNh93MMUtxY