Problem
The Scrapy pipeline (pipelines.py) writes new opportunities only to Google Sheets. The WhatsApp broadcast daemon (broadcast_daemon.py) reads from a local SQLite file — distribution-bridge/whatsapp_queue.db — specifically from a pending_broadcasts table.
These two are never connected: nothing writes to whatsapp_queue.db, so the broadcast daemon always finds an empty queue and silently skips every group.
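To confirm the empty queue on the server, a quick check along these lines may help. It is only a diagnostic sketch: the helper name count_pending is hypothetical, and the only things taken from this issue are the database path and the pending_broadcasts table name.

```python
import sqlite3


def count_pending(db_path: str, table: str = "pending_broadcasts") -> int:
    """Return the number of rows in the queue table, or 0 if the table is missing."""
    # Note: sqlite3.connect() creates an empty file if db_path does not exist yet.
    conn = sqlite3.connect(db_path)
    try:
        exists = conn.execute(
            "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?",
            (table,),
        ).fetchone()
        if exists is None:
            return 0
        # The table name was checked against sqlite_master above,
        # so interpolating it into the query is safe here.
        return conn.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]
    finally:
        conn.close()
```

Running count_pending("distribution-bridge/whatsapp_queue.db") on the server should return 0 if the diagnosis above is right.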
Current flow (broken)
GitHub Actions (daily):
Scrapy → SheetsPipeline → Google Sheets ✅
↛ whatsapp_queue.db ← nothing writes here
USA Server (continuous):
broadcast_daemon.py → reads whatsapp_queue.db → always empty → no broadcast ❌
Fix options
Option A — Feed from Google Sheets (recommended, no infra change)
Modify broadcast_daemon.py to take its items from Google Sheets, the same source used by broadcast.py --source sheets, instead of reading whatsapp_queue.db. The fetch_from_sheets() function already exists in broadcast.py, so this requires no new pipeline code.
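A minimal sketch of what one daemon cycle could look like under Option A. The names run_cycle, fetch_items, send and groups are hypothetical. The fetcher is passed in as a callable because the real signature and return type of fetch_from_sheets() are not shown in this issue; the sketch assumes it can be called without arguments and returns an iterable of items.

```python
import logging
from typing import Any, Callable, Iterable

log = logging.getLogger("broadcast_daemon")


def run_cycle(
    fetch_items: Callable[[], Iterable[Any]],
    send: Callable[[str, Any], None],
    groups: Iterable[str],
) -> int:
    """Fetch items once, send each item to each group, return the number of sends."""
    items = list(fetch_items())
    if not items:
        # Make the empty case visible instead of skipping silently.
        log.warning("No opportunities returned by the source; nothing to broadcast")
        return 0
    sent = 0
    for group in groups:
        for item in items:
            send(group, item)
            sent += 1
    return sent
```

In the daemon, fetch_items would be broadcast.fetch_from_sheets (assuming it is importable as a module function) and send would be whatever function currently posts a message to a WhatsApp group. The sketch does not track which items were already broadcast in an earlier cycle; whether fetch_from_sheets() handles that is not stated here.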
Option B — Add a SQLite writer to Scrapy pipeline
Add a WhatsAppQueuePipeline in pipelines.py that writes new items to whatsapp_queue.db alongside the Sheets write. Because Scrapy runs in GitHub Actions and the daemon runs on the USA server, this requires either making the SQLite file accessible from the GitHub Actions runner or adding a separate sync step.
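For reference, a rough sketch of such a pipeline. The table schema (title, url, created_at) and the item fields are assumptions made for illustration; the real pending_broadcasts schema expected by broadcast_daemon.py is not shown in this issue and would need to be matched.

```python
import sqlite3


class WhatsAppQueuePipeline:
    """Scrapy item pipeline that appends scraped items to a local SQLite queue."""

    def __init__(self, db_path="distribution-bridge/whatsapp_queue.db"):
        self.db_path = db_path
        self.conn = None

    def open_spider(self, spider):
        self.conn = sqlite3.connect(self.db_path)
        # Hypothetical schema; adjust to whatever the daemon actually reads.
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS pending_broadcasts ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, "
            "title TEXT, "
            "url TEXT UNIQUE, "
            "created_at TEXT DEFAULT CURRENT_TIMESTAMP)"
        )
        self.conn.commit()

    def process_item(self, item, spider):
        # INSERT OR IGNORE skips items whose url is already queued.
        self.conn.execute(
            "INSERT OR IGNORE INTO pending_broadcasts (title, url) VALUES (?, ?)",
            (item.get("title"), item.get("url")),
        )
        self.conn.commit()
        return item  # pass the item on to the next pipeline

    def close_spider(self, spider):
        if self.conn is not None:
            self.conn.close()
            self.conn = None
```

The class would be registered in Scrapy's ITEM_PIPELINES setting next to SheetsPipeline. On its own it only writes to a file on the GitHub Actions runner, so the access or sync problem described above still has to be solved.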
Option C — Scrapy pipeline writes JSON; daemon reads it
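Have Scrapy export new items to opportunities.json and have broadcast_daemon.py read that file. This works only if both run on the same machine, which is not the case in the current flow (Scrapy runs in GitHub Actions, the daemon on the USA server), so the file would also have to be synced to the server. Already partially supported via broadcast.py --source json.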
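A small sketch of the JSON handoff, assuming opportunities.json holds a single JSON array of item objects (the actual format expected by broadcast.py --source json is not shown here). The function names are hypothetical. The writer replaces the file atomically so the daemon never reads a half-written file.

```python
import json
import os
import tempfile


def write_opportunities(items, path):
    """Write items as a JSON array, replacing the target file atomically."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump(list(items), fh)
        # os.replace is atomic when source and target are on the same filesystem.
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


def read_opportunities(path):
    """Return the list of items, or an empty list if the file does not exist."""
    try:
        with open(path, encoding="utf-8") as fh:
            data = json.load(fh)
    except FileNotFoundError:
        return []
    return data if isinstance(data, list) else []
```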
Recommended action
@olamidefasogbon — Option A is the simplest and requires no new infrastructure. The fetch_from_sheets() call already handles Google Sheets auth via service account. The daemon just needs to be updated to use broadcast.py as a library call rather than reading the local DB.
Files involved
broadcast_daemon.py — main daemon loop (reads SQLite)
distribution-bridge/broadcast.py — already has fetch_from_sheets()
scoutbot/pipelines.py — writes to Sheets only
distribution-bridge/whatsapp_queue.db — never populated by any current code