Skip to content

Trim recents table via table swap and add cleanup worker cronjob#4613

Open
zwolf wants to merge 12 commits into
masterfrom
recents-refactor
Open

Trim recents table via table swap and add cleanup worker cronjob#4613
zwolf wants to merge 12 commits into
masterfrom
recents-refactor

Conversation

@zwolf
Copy link
Copy Markdown
Member

@zwolf zwolf commented May 6, 2026

The recents table is huge and getting huger. To make the name accurate, we should only retain records from the last 14 days, capped at a maximum of 20 recents per user. This PR executes the initial massive data purge and sets up a Sidekiq cron job to maintain these limits going forward.

Data Migration

Deleting ~400M rows via standard batched DELETE queries would take ages, thrash the database I/O, and leave behind massive index fragmentation (dead tuples). To make this happen with zero downtime and no index bloat, this migration uses a table swap approach:

  • Creates a clone of the table (recents_new)
  • Bulk copies the last 14 days worth of rows (around ~4.5 million) in the background without locking the table
  • Builds indexes non-concurrently from scratch on the new, unused table (including a new user_id/created_at composite index for the worker).
  • Opens a transaction to briefly lock the table, catch up any records inserted during the bulk load, rename the tables, and transfer sequence ownership.
  • The legacy 400M rows are parked in recents_old. I open a follow-up PR to drop that table once we confirm everything is stable.

Cleanup Worker

RecentsCleanupWorker runs two separate sweeps hourly:

Time sweep: Deletes anything older than 14 90 days using in_batches.
Volume sweep: Enforces the 20-record maximum per user, per project.

Deleting this way avoids the "mark now, remove later" antipattern that results in multiple dead tuples (one on update, one on delete) when a row is affected individually. These sweeps bulk delete them by created_at, then by user_id/created_at.

For the volume sweep, querying the entire table with a GROUP BY every hour isn't ideal, even once it reaches a steady state of new and pruned records. So it only queries users who have been active in the last 2 hours (created a new recent) and prunes their extras directly using the new compound index.

If records fall through the cracks (say, a user creates 25 recents and the worker fails for some reason, then they weren't active in the next run's window), those 5 extra records will simply age out and be caught by the 14 90-day time sweep. This isn't very likely, though, and keeps these queries super quick.

I tested the migration locally with a simulated delay to ensure the transaction lock correctly catches concurrent inserts. That is, I threw in a sleep(30) and created some classifications and recents on the console, and they were copied into the new table by the last step. I'm pretty sure that lock required to copy the few tens of thousands of recents during the swap should be nice and quick and shouldn't noticeably affect request time (especially since those queries are already abysmal).

There will be a follow up PR to drop the bloated recents_old table and that space'll get reclaimed by autovacuuming over time.

edit: I'll clean up some of the Hound suggestions, but I'm not using SQL.squish to maintain readability and avoid accidentally concatenating comments.

Review checklist

  • First, the most important one: is this PR small enough that you can actually review it? Feel free to just reject a branch if the changes are hard to review due to the length of the diff.
  • If there are any migrations, will they the previous version of the app work correctly after they've been run (e.g. the don't remove columns still known about by ActiveRecord).
  • If anything changed with regards to the public API, are those changes also documented in the apiary.apib file?
  • Are all the changes covered by tests? Think about any possible edge cases that might be left untested.

Comment thread db/migrate/20260428222525_delete_old_recents.rb
Comment thread db/migrate/20260428222525_delete_old_recents.rb
Comment thread db/migrate/20260428222525_delete_old_recents.rb
Comment thread db/migrate/20260428222525_delete_old_recents.rb
Comment thread db/migrate/20260428222525_delete_old_recents.rb
Comment thread db/migrate/20260428222525_delete_old_recents.rb
Comment thread db/migrate/20260428222525_delete_old_recents.rb
Comment thread app/workers/recents_cleanup_worker.rb
@zwolf
Copy link
Copy Markdown
Member Author

zwolf commented Jun 2, 2026

Rails/SquishedSQLHeredocs: Use <<-SQL.squish instead of <<-SQL.

squish would concatenate the SQL-style comments and cause the migration to break. Leaving unsquished rather than delete them or change their format.

I loosened up some of the other Rubocop rspec guidelines (# of lines & expects) as well as updated the expected Rails version.

@zwolf zwolf force-pushed the recents-refactor branch from a6c6abd to 39662f9 Compare June 2, 2026 18:35
@zwolf
Copy link
Copy Markdown
Member Author

zwolf commented Jun 2, 2026

This is ready for review! @yuenmichelle1 @Tooyosi @lcjohnso

@zwolf zwolf force-pushed the recents-refactor branch from b556911 to ef2682e Compare June 4, 2026 16:29
@zwolf
Copy link
Copy Markdown
Member Author

zwolf commented Jun 4, 2026

Found an error. I hadn't changed the cutoff date from 14 to 90 days in the migration. Fixed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants