fix(ingestion): commit every 25 rows in summaries and embeddings jobs #66

Merged
erincon01 merged 1 commit into develop from fix/063-intermediate-commits
Apr 21, 2026
@erincon01

Summary

run_generate_summaries_job and run_rebuild_embeddings_job each ran inside a single database transaction. If a job failed at row 1500 of 1986, every summary generated up to that point was lost.

Fix: call conn.commit() every 25 rows (aligned with the existing progress logging interval).
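
A minimal sketch of the periodic-commit pattern described above, not the code from this PR. It assumes a DB-API style connection (sqlite3 is used here only so the example runs standalone); the docs table, the summarize helper, make_demo_db, and the fail_at parameter are all hypothetical names introduced for illustration.

```python
import sqlite3

COMMIT_INTERVAL = 25  # interval stated in the PR description


def summarize(text: str) -> str:
    """Hypothetical stand-in for the real summary generation call."""
    return text.upper()


def make_demo_db(n_rows: int) -> sqlite3.Connection:
    """Build an in-memory table of rows that still need a summary."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE docs (id INTEGER PRIMARY KEY, body TEXT, summary TEXT)"
    )
    conn.executemany(
        "INSERT INTO docs (id, body) VALUES (?, ?)",
        [(i, f"document {i}") for i in range(1, n_rows + 1)],
    )
    conn.commit()
    return conn


def generate_summaries(conn, commit_interval=COMMIT_INTERVAL, fail_at=None):
    """Summarize pending rows, committing every `commit_interval` rows.

    `fail_at` is a test hook (an assumption of this sketch): it raises
    before processing that 1-based row to simulate a mid-run crash.
    """
    # Fetch the work list up front so no read cursor is open across commits.
    rows = conn.execute(
        "SELECT id, body FROM docs WHERE summary IS NULL ORDER BY id"
    ).fetchall()
    for i, (doc_id, body) in enumerate(rows, start=1):
        if fail_at is not None and i == fail_at:
            raise RuntimeError(f"simulated failure at row {i}")
        conn.execute(
            "UPDATE docs SET summary = ? WHERE id = ?",
            (summarize(body), doc_id),
        )
        if i % commit_interval == 0:
            conn.commit()  # everything up to row i is now durable
    conn.commit()  # flush the final partial batch
    return len(rows)
```

Because the query selects only rows whose summary is still NULL, rerunning the sketch after a crash picks up from the first uncommitted row. Whether the real jobs resume this way is not stated in the PR.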

Impact

  • Summaries job: ~5 min of work survives a crash instead of 0
  • Embeddings job: same improvement
  • No behavior change for successful jobs

Test plan

  • Backend: 530 passed, 0 failed

Both jobs ran inside a single transaction, so if the job failed or the
server restarted mid-run, all progress was lost. The jobs now commit every
25 rows so progress is durable. A failure at row 1500 preserves
the first 1475 summaries/embeddings.
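
The 1475 figure follows from the commit interval. A small worked check, assuming "failure at row N" means row N itself did not complete, so only rows 1 through N-1 were processed; rows_preserved is a hypothetical helper, not part of the codebase.

```python
def rows_preserved(failed_row: int, commit_interval: int = 25) -> int:
    """Rows that survive a crash while processing `failed_row` (1-based).

    Assumes a commit happens after every `commit_interval` completed rows
    and that the failing row itself was never written.
    """
    completed = failed_row - 1
    # Only whole batches have been committed; the partial batch is rolled back.
    return (completed // commit_interval) * commit_interval


print(rows_preserved(1500))  # 1475
```

Under the same assumption, at most 24 completed rows are lost in any crash, compared with all of them under the single-transaction design.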
@erincon01 erincon01 merged commit 3f85854 into develop Apr 21, 2026
5 checks passed
@erincon01 erincon01 deleted the fix/063-intermediate-commits branch April 21, 2026 21:34