Post-deploy (one-time): backfill nest-hole snips for captures already on the server #189

Description

@cofade

Blocked: do not start until the YOLO26n-seg detector (ADR-027) is deployed to the production server. This is a one-time post-deploy cleanup, not part of the feature PR.

Context

The learned hole detector now runs only on /upload (image-service UploadPipeline._detect → HoleDetector.detect). Captures already on the server before the deploy are never reprocessed, so existing modules show no nest-hole snip grid until their next upload.

Because each module uploads roughly once a day, every module self-populates within ~24 h of the deploy with no action needed. No incorrect data is shown in the meantime: the old HoughCircles detector recorded zero detections on real captures, so the grid is simply empty until the next upload, never wrong. This issue exists purely to fill the grids immediately if we don't want to wait for the daily cycle.

Why this must run after deploy

The backfill runs against the deployed image-service: it uses the live ONNX detector and records through the deployed /record_detections, which only accepts the new undetermined state after this version ships. Running it against the old image-service would no-op or hit the previous contract.

Task — one-time backfill script

A one-shot script (e.g. image-service/scripts/backfill_snips.py, run via
docker compose -f docker-compose.prod.yml exec image-service python -m scripts.backfill_snips) that:

  1. Queries duckdb image_uploads for the latest filename per module. The dashboard GET /detections read folds to the latest capture per module, so only the most-recent stored image affects the grid — no need to grind the whole archive.
  2. Runs HoleDetector().detect() on /data/images/<filename> for each module.
  3. Records the resulting snips via record_detections (the same persistence path /upload uses), writing the snip JPEGs to /data/images/snips/.
  4. Accepts two flags: --dry-run (report what it would do, no writes) and --all-images (process full history instead of latest-per-module).
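
A minimal sketch of the selection step and the two flags, to make the intended shape concrete. The issue only names the image_uploads table, so the column names (module_id, filename, uploaded_at), the Upload tuple, and both helper names are assumptions for illustration, not the real schema or code. The SQL string is not executed here; latest_per_module is a pure-Python equivalent so the folding logic can be checked without a database.

```python
import argparse
from typing import Iterable, NamedTuple

# Hypothetical schema: only the table name image_uploads comes from the issue.
# module_id, filename and uploaded_at are assumed column names.
LATEST_PER_MODULE_SQL = """
SELECT module_id, arg_max(filename, uploaded_at) AS filename
FROM image_uploads
GROUP BY module_id
"""


class Upload(NamedTuple):
    module_id: str
    filename: str
    uploaded_at: str  # assumed ISO-8601 in one timezone, so string order is time order


def latest_per_module(rows: Iterable[Upload]) -> dict[str, str]:
    """Fold upload rows to the newest filename per module (mirrors the SQL above)."""
    latest: dict[str, Upload] = {}
    for row in rows:
        current = latest.get(row.module_id)
        if current is None or row.uploaded_at > current.uploaded_at:
            latest[row.module_id] = row
    return {module_id: row.filename for module_id, row in latest.items()}


def parse_args(argv=None) -> argparse.Namespace:
    """Parse the two flags named in the task list."""
    parser = argparse.ArgumentParser(prog="backfill_snips")
    parser.add_argument("--dry-run", action="store_true",
                        help="report what would be done without writing anything")
    parser.add_argument("--all-images", action="store_true",
                        help="process the full history instead of latest-per-module")
    return parser.parse_args(argv)
```

With --all-images the script would skip the fold and iterate every row instead. The default stays latest-per-module because only the most recent capture affects the grid.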

Acceptance criteria

  • Running on prod populates the snip grid for every module with at least one detectable stored capture.
  • Idempotent — re-running doesn't duplicate or crash (the /detections per-nest dedup already covers same-capture re-records).
  • --dry-run reports counts without writing.
  • Graceful — a capture the model finds nothing in is skipped, never fatal.
  • A short post-deploy step is documented in docs/07-deployment-view/production-deployment.md.
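
The dry-run and graceful-skip criteria above could be met by a loop along these lines. This is a sketch under assumptions: run_backfill, BackfillReport, and the detect/record callables are hypothetical stand-ins for HoleDetector().detect() and the record_detections path, whose real signatures are not given in this issue.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Sequence, Tuple


@dataclass
class BackfillReport:
    captures: int = 0  # captures examined
    snips: int = 0     # snips found (and written, unless dry_run)
    empty: int = 0     # captures the detector found nothing in


def run_backfill(
    captures: Iterable[Tuple[str, str]],            # (module_id, filename) pairs
    detect: Callable[[str], Sequence[object]],      # stand-in for the detector
    record: Callable[[str, str, Sequence[object]], None],  # stand-in for persistence
    dry_run: bool = False,
) -> BackfillReport:
    report = BackfillReport()
    for module_id, filename in captures:
        report.captures += 1
        found = detect(filename)
        if not found:
            # Nothing detected: skip this capture and carry on, never fatal.
            report.empty += 1
            continue
        if not dry_run:
            record(module_id, filename, found)
        report.snips += len(found)
    return report
```

In dry-run mode the detector still runs so the counts are real, but record is never called. Idempotency is not handled in the loop itself; the sketch relies on the existing per-nest dedup in /detections, as the criterion above states.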
