
fix(scheduler): seed recurring_jobs via migration + correct worker healthcheck#388

Merged
remyluslosius merged 3 commits into main from fix/seed-recurring-jobs on Apr 14, 2026

Conversation

@remyluslosius
Contributor

Fixes #383.

Summary

  • Adds Alembic migration 054_seed_recurring_jobs that inserts the 9 baseline recurring schedules into the recurring_jobs table on every deploy, idempotently
  • Overrides the worker container healthcheck in docker-compose.yml (was inheriting the backend Dockerfile's curl localhost:8000, which hits nothing in the worker container and reports unhealthy forever)

Why

The PostgreSQL job queue's scheduler polls recurring_jobs every 10s and enqueues due entries. On a fresh deploy, the table was empty and stayed empty — no scheduled jobs, no host monitoring, no compliance scans. A silent failure, with no errors logged.
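For illustration only, the poll step amounts to roughly the following (function and column names here are assumptions, not the project's actual API):

```python
from datetime import datetime, timezone

POLL_INTERVAL_S = 10  # the 10s poll cadence described above


def poll_once(fetch_due_jobs, enqueue):
    """One scheduler tick: enqueue every recurring job whose next run is due.

    fetch_due_jobs(now) is assumed to return rows from recurring_jobs with
    next_run_at <= now; enqueue(job) pushes a job onto the queue table.
    With recurring_jobs empty, the loop body never executes -- hence the
    silent failure: nothing errors, nothing is due.
    """
    now = datetime.now(timezone.utc)
    for job in fetch_due_jobs(now):
        enqueue(job)
```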

The app/services/job_queue/seed_schedule.py module exists and works, but nothing invoked it: not the worker entrypoint, not docker-compose, not a FastAPI startup hook, not a migration.

Discovered in production 2026-04-13 when worker had been up 5 hours with zero jobs dequeued, last host liveness ping 5 hours stale, last scan 5.5 hours overdue.

Why a migration (vs. entrypoint hook or FastAPI startup event)

  • Migration: runs once, on schema upgrade, naturally fits Alembic's existing DB provisioning flow. No redundant round-trips on every worker restart. Downgrade path well-defined.
  • Entrypoint hook (rejected): runs on every worker container start. Wastes DB round-trips. Couples schedule availability to worker lifecycle rather than DB lifecycle.
  • FastAPI startup event (rejected): wrong layer — couples API startup to scheduler state. If seed fails, does the API refuse to serve?

Idempotency

ON CONFLICT (name) DO NOTHING means the migration is safe to re-run against a DB where someone manually invoked python -m app.services.job_queue.seed_schedule. Validated against the production DB here, which had 8 rows from yesterday's manual seed plus one missing (retention policies, added later); the migration correctly inserted only the missing row.
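As a sketch of the idempotent insert, the statement can be built with SQLAlchemy's PostgreSQL dialect (the table below is a minimal stand-in; the real recurring_jobs columns may differ):

```python
from sqlalchemy import Column, Integer, MetaData, String, Table
from sqlalchemy.dialects import postgresql

metadata = MetaData()
# Minimal stand-in for recurring_jobs; only the unique `name` column matters here.
recurring_jobs = Table(
    "recurring_jobs",
    metadata,
    Column("name", String, primary_key=True),
    Column("interval_seconds", Integer),
)


def seed_stmt(rows):
    """Build INSERT ... ON CONFLICT (name) DO NOTHING for the given rows."""
    stmt = postgresql.insert(recurring_jobs).values(rows)
    return stmt.on_conflict_do_nothing(index_elements=["name"])


sql = str(
    seed_stmt([{"name": "dispatch_host_checks", "interval_seconds": 30}])
    .compile(dialect=postgresql.dialect())
)
```

Inside the migration, the statement would run via `op.execute()` in `upgrade()`, with a `downgrade()` that deletes only the named rows.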

Future schedule changes

If a new recurring schedule is added to SCHEDULE in app/services/job_queue/seed_schedule.py, add a follow-up migration (055_add_<name>_schedule.py) rather than editing this one. Keeps the migration history honest about what was seeded when.

Worker healthcheck

Replaces curl localhost:8000/health with a SQLAlchemy SELECT 1 against the configured DB URL. Rationale: the worker's only hard dependency is DB connectivity — without it, it can't dequeue or enqueue anything. A "worker is alive" probe that doesn't touch the thing the worker needs is not a healthcheck.
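A minimal sketch of such a probe, returning an exit-code-style result (the `DATABASE_URL` env var name and sqlite fallback are assumptions for illustration):

```python
import os
import sys

from sqlalchemy import create_engine, text


def db_healthcheck() -> int:
    """Return 0 iff SELECT 1 succeeds against the configured DB, else 1."""
    # Assumed env var; in-memory sqlite fallback so the sketch runs anywhere.
    url = os.environ.get("DATABASE_URL", "sqlite://")
    try:
        engine = create_engine(url, pool_pre_ping=True)
        with engine.connect() as conn:
            conn.execute(text("SELECT 1"))
        return 0
    except Exception:
        return 1


if __name__ == "__main__":
    sys.exit(db_healthcheck())
```

docker-compose would then point the worker service's healthcheck at this script, overriding the curl probe inherited from the backend Dockerfile.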

Test plan

  • Applied migration SQL directly to running DB; RETURNING name reported only the one missing row inserted, 8 conflicts silently skipped
  • black --check passes on the migration file
  • Schedulers verified running: dispatch_host_checks + dispatch_compliance_scans executing every 30s/2min, check_host_connectivity fanout working across 7 hosts, run_scheduled_kensa_scan firing for overdue hosts
  • CI pipeline passes
  • Fresh-deploy smoke test (spin up clean DB, run alembic upgrade head, verify SELECT count(*) FROM recurring_jobs returns 9)

remyluslosius and others added 3 commits April 13, 2026 22:00
…althcheck

Fixes #383.

The adaptive schedulers (host monitoring, compliance scanning) were silently
dormant on fresh deploys because recurring_jobs was never populated. The
seed_schedule module existed but was never invoked by any startup path.

Migration 054 inserts the 9 baseline schedules with ON CONFLICT (name)
DO NOTHING so it is idempotent against manual invocations of the seed
script and safe to re-run. Downgrade removes only these 9 named rows,
leaving operator-added schedules untouched.

Also overrides the worker container healthcheck, which inherited the
backend Dockerfile's curl-localhost-8000 probe and reported unhealthy
forever. The new probe verifies DB connectivity via SQLAlchemy, which
is the actual precondition for worker function.
@remyluslosius remyluslosius merged commit 7286c82 into main Apr 14, 2026
26 checks passed
@remyluslosius remyluslosius deleted the fix/seed-recurring-jobs branch April 14, 2026 10:08


Development

Successfully merging this pull request may close these issues.

Bug: recurring_jobs table never seeded — adaptive schedulers silently dormant on fresh deploy
