feat: enforce per-subscription retention policy#78
Merged
Conversation
retention_policy/retention_count on UserSubscription were persisted but never applied — a self-hoster who set a policy got nothing. Add a periodic sweep that makes them real. A new Celery beat task (enforce_retention_task, every retention_interval_hours, default 6h) applies each subscription's policy: - KEEP_LAST_N: keep the newest N downloaded (COMPLETE) videos per (user, channel) and soft-remove the user's UserVideoRefs for the rest, ordered by upload date (falling back to catalog time) with a stable id tiebreak. - KEEP_ALL (default): no-op. - KEEP_WATCHED: recognized but intentionally left unenforced — its only sensible reading is destructive — pending product sign-off. - A missing/non-positive count, or an unknown policy, is treated as a no-op. Reclamation reuses the existing orphan cleanup rather than deleting files here: the sweep soft-removes refs, then runs check_and_delete_orphan, which only deletes the file and resets the Video row once no active ref remains. So shared downloads stay ref-counted — soft-removing one user's ref never removes a video another active ref still wants — and only videos the user actually downloaded are ever touched. check_and_delete_orphan gains a synchronous twin (check_and_delete_orphan_sync) for the Celery worker's sync Session; both share one reclaim helper so the file-deletion + row-reset logic lives in a single place. No schema change (the columns already existed). Tests: KEEP_LAST_N drops refs beyond N (newest kept) and orphan cleanup reclaims; KEEP_ALL is a no-op; a video another active ref still wants is not reclaimed; only downloaded videos count; a policy with no count is a no-op. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01RXMKM1rDWn8wNh93MMUtxY
|
You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Make
retention_policy/retention_countreal (roadmap LATER tier, #12) — they were accepted + persisted but never enforced. Additive — no schema change.Where / semantics
The fields live on
UserSubscription(per-user, per-channel). The real wire values come from the clients:KEEP_ALL,KEEP_LAST_N,KEEP_WATCHED(the audit's "KEEP_LATEST_N" was a guess — matched the clients).Sweep (
app/services/retention.py→enforce_retention)KEEP_LAST_N: for each subscription, keep the newestretention_countdownloaded (COMPLETE) videos (bycoalesce(uploaded_at, created_at)desc) and soft-remove the user'sUserVideoReffor the rest. Missing/non-positive count → no-op (never interpreted as "delete all").check_and_delete_orphan_sync, a new sync twin of the async one sharing one_reclaim_orphan_artifactshelper) — files are deleted + the Video row reset to CATALOGED only when no active ref remains, so shared downloads stay ref-counted and safe. Only touches videos the user actually downloaded.KEEP_ALL/ unknown → no-op.KEEP_WATCHEDis intentionally NOT enforced (its literal reading would delete unwatched videos — destructive; dispatch is structured to add a defined behavior later).enforce-retentioneveryRETENTION_INTERVAL_HOURS(default 6h).Verification (local)
pytest -q→ 195 passed (+5: KEEP_LAST_N drops beyond N + reclaims; KEEP_ALL no-op; shared video with another active ref not reclaimed; only downloaded videos count; no-count no-op).ruff+mypyclean · no migration.🤖 Generated with Claude Code
https://claude.ai/code/session_01RXMKM1rDWn8wNh93MMUtxY