Make upload-slice sending resilient to WebSocket reconnects #108
Co-Authored-By: Claude Sonnet 5 <noreply@anthropic.com>
Confidence Score: 4/5
Safe to merge once the missing Drizzle migration for the unique index is generated and committed. Without it, the upsert on reconnect silently inserts a duplicate row instead of no-oping.
The reconnect logic, idempotent upsert, Redis caching, and test-isolation fix are all correctly implemented and well-tested. The one outstanding gap is that schema.ts declares upload_requests_batchid_key_uidx but no migration file was generated and committed. MySQL's ON DUPLICATE KEY UPDATE path in createUploadRequestsForBatch relies on that constraint existing in the live database; without the DDL applied, each reconnect-triggered resend inserts a fresh row, leaving phantom queued rows that never get a BullMQ job and permanently block batch completion queries.
backend/src/db/schema.ts and backend/drizzle/: the unique index needs a generated migration before deploying.
Sequence Diagram
%%{init: {'theme': 'neutral'}}%%
sequenceDiagram
participant FE as Frontend (useCollections)
participant WS as WebSocket (useSocket)
participant BE as Backend (Handler)
participant DB as MySQL
participant BQ as BullMQ
Note over FE,WS: Initial batch upload
FE->>WS: open()
WS-->>FE: "onOpen → connected=true (sync watcher fires)"
FE->>WS: sendUnackedSlices() → UPLOAD_SLICE[0..N]
WS->>BE: "UPLOAD_SLICE (sliceid=0, items=[...])"
BE->>DB: SELECT pre-existing keys
BE->>DB: INSERT … ON DUPLICATE KEY UPDATE
BE->>DB: SELECT current rows (id, key, status)
DB-->>BE: rows with isNew flag
BE->>BQ: "enqueueUpload (only isNew=true rows)"
BE-->>WS: "UPLOAD_SLICE_ACK (sliceid=0, all row statuses)"
WS-->>FE: onUploadSliceAck → ackedSliceIds.add(0)
Note over FE,WS: WebSocket drop & reconnect
WS-->>FE: "onClose → connected=false"
Note over FE: Slices 1..N not yet acked
WS-->>FE: "onOpen → connected=true (sync watcher fires)"
FE->>FE: onSocketReconnect() → sendUnackedSlices()
FE->>WS: UPLOAD_SLICE[1..N] (unacked only)
WS->>BE: "UPLOAD_SLICE (sliceid=1, same items)"
BE->>DB: "SELECT pre-existing keys → all found (isNew=false)"
BE->>DB: INSERT … ON DUPLICATE KEY UPDATE (no-op)
DB-->>BE: "rows (isNew=false, real current status)"
Note over BE,BQ: Skips rate-limit and enqueue for resent rows
BE-->>WS: "UPLOAD_SLICE_ACK (sliceid=1, current statuses)"
WS-->>FE: onUploadSliceAck → ackedSliceIds.add(1)
Note over FE: All slices acked → sendSubscribeBatch
FE->>WS: SUBSCRIBE_BATCH(batchId)
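The frontend half of the diagram above can be sketched roughly as follows. This is a minimal illustration, not the code in useCollections.ts: the class `SliceSender`, the `Slice` and `Outgoing` types, and the injected `send` callback are all hypothetical; only `sendUnackedSlices`, `onUploadSliceAck`, `ackedSliceIds`, and the message names come from the diagram.

```typescript
// Hypothetical sketch: resend only unacked slices, subscribe once all are acked.
type Slice = { sliceid: number; items: string[] };

type Outgoing =
  | { type: "UPLOAD_SLICE"; batchId: string; sliceid: number; items: string[] }
  | { type: "SUBSCRIBE_BATCH"; batchId: string };

class SliceSender {
  private readonly ackedSliceIds = new Set<number>();
  private subscribed = false;
  private readonly batchId: string;
  private readonly slices: Slice[];
  private readonly send: (msg: Outgoing) => void;

  constructor(batchId: string, slices: Slice[], send: (msg: Outgoing) => void) {
    this.batchId = batchId;
    this.slices = slices;
    this.send = send;
  }

  // Called on the first connect and again on every reconnect.
  // Slices that were already acked are skipped.
  sendUnackedSlices(): void {
    for (const slice of this.slices) {
      if (this.ackedSliceIds.has(slice.sliceid)) continue;
      this.send({
        type: "UPLOAD_SLICE",
        batchId: this.batchId,
        sliceid: slice.sliceid,
        items: slice.items,
      });
    }
  }

  // Records the ack; once every slice is acked, subscribes exactly once.
  onUploadSliceAck(sliceid: number): void {
    this.ackedSliceIds.add(sliceid);
    const allAcked = this.slices.every((s) => this.ackedSliceIds.has(s.sliceid));
    if (allAcked && !this.subscribed) {
      this.subscribed = true;
      this.send({ type: "SUBSCRIBE_BATCH", batchId: this.batchId });
    }
  }
}
```

Because acks are tracked in a set, a duplicate ack (for example one that arrives for both the original send and the resend) is harmless.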
Reviews (4): Last reviewed commit: "chore: increase slice size to 100"
…ncurrent dispatch Co-Authored-By: Claude Sonnet 5 <noreply@anthropic.com>
…sStatusChecking Co-Authored-By: Claude Sonnet 5 <noreply@anthropic.com>
The frontend now sends all upload slices for a batch upfront instead of waiting for each ack before sending the next, and resends any slice whose ack wasn't received after a WebSocket reconnect (useCollections.ts; useSocket.ts now exposes a reactive `connected` ref).
Resending a slice requires the backend to be idempotent: createUploadRequestsForBatch now upserts against a new unique index on (batchid, key) instead of plain-inserting, and returns each row's real current status plus an `isNew` flag so the handler only runs rate-limiting/enqueueing for genuinely new rows (resent rows are skipped, avoiding duplicate BullMQ jobs and unnecessary rate-limit consumption).
Also bundled: per-user rate limits are now cached in Redis for an hour instead of being recomputed from the MediaWiki API on every upload-slice call.
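To make the idempotency contract concrete, here is a minimal in-memory sketch of an upsert that reports `isNew`, with a handler that enqueues only new rows. It is an illustration under assumptions, not the real DAL: `InMemoryUploadStore`, `upsertForBatch`, `setStatus`, `handleUploadSlice`, and the status strings are hypothetical, and a `Map` keyed on (batchid, key) stands in for the MySQL unique index.

```typescript
// Hypothetical sketch of an idempotent upsert keyed on (batchid, key).
interface UploadRow {
  id: number;
  batchid: string;
  key: string;
  status: string;
}

interface UpsertResult extends UploadRow {
  isNew: boolean;
}

class InMemoryUploadStore {
  // The Map key plays the role of the unique index on (batchid, key).
  private readonly rows = new Map<string, UploadRow>();
  private nextId = 1;

  private indexKey(batchid: string, key: string): string {
    return JSON.stringify([batchid, key]);
  }

  // Inserts rows that do not exist yet; existing rows are returned
  // unchanged with their current status and isNew=false.
  upsertForBatch(batchid: string, keys: string[]): UpsertResult[] {
    return keys.map((key) => {
      const existing = this.rows.get(this.indexKey(batchid, key));
      if (existing) return { ...existing, isNew: false };
      const row: UploadRow = { id: this.nextId++, batchid, key, status: "queued" };
      this.rows.set(this.indexKey(batchid, key), row);
      return { ...row, isNew: true };
    });
  }

  setStatus(batchid: string, key: string, status: string): void {
    const row = this.rows.get(this.indexKey(batchid, key));
    if (row) row.status = status;
  }
}

// Enqueues only genuinely new rows; the return value is the ack payload
// and always carries every row with its real current status.
function handleUploadSlice(
  store: InMemoryUploadStore,
  batchid: string,
  keys: string[],
  enqueue: (row: UploadRow) => void,
): UpsertResult[] {
  const rows = store.upsertForBatch(batchid, keys);
  for (const row of rows) {
    if (row.isNew) enqueue(row);
  }
  return rows;
}
```

Without the uniqueness guarantee, the second call would take the insert branch again, which is the duplicate-row failure mode the review flags for a database missing the index.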
Fixed a pre-existing test-isolation bug surfaced by this work: upload.worker.test.ts and upload.worker.duplicate.test.ts globally mocked `@backend/db/dal/uploads` via mock.module(), which silently broke uploads.dal.test.ts's real import of UploadService when the full suite ran. Replaced with dependency injection (WorkerDeps.uploads, typed via Pick<UploadService, ...>) so the worker tests no longer need to mock the whole module.
-- Claude Sonnet 5
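For readers unfamiliar with the pattern, the dependency-injection approach described above might look roughly like this. It is a sketch under assumptions: the `UploadService` class here is a stand-in, and its method names (`getById`, `markStatus`, `deleteBatch`) and the `processUpload` function are hypothetical; only the `WorkerDeps.uploads` field typed via `Pick<UploadService, ...>` comes from the description.

```typescript
// Hypothetical stand-in for the real service; method names are assumptions.
class UploadService {
  async getById(_id: number): Promise<{ id: number; status: string } | undefined> {
    throw new Error("would hit the database");
  }
  async markStatus(_id: number, _status: string): Promise<void> {
    throw new Error("would hit the database");
  }
  async deleteBatch(_batchid: string): Promise<void> {
    throw new Error("would hit the database");
  }
}

// The worker declares only the slice of the service it actually uses,
// so a test can pass a small fake without mocking the whole module.
interface WorkerDeps {
  uploads: Pick<UploadService, "getById" | "markStatus">;
}

async function processUpload(id: number, deps: WorkerDeps): Promise<string> {
  const row = await deps.uploads.getById(id);
  if (!row) return "missing";
  await deps.uploads.markStatus(id, "completed");
  return "completed";
}

// A test-side fake: plain object, no module-level mocking involved.
function makeFakeDeps(statuses: Map<number, string>): WorkerDeps {
  return {
    uploads: {
      async getById(id) {
        const status = statuses.get(id);
        return status === undefined ? undefined : { id, status };
      },
      async markStatus(id, status) {
        statuses.set(id, status);
      },
    },
  };
}
```

Since the fake is passed in per test, other test files that import the real module are unaffected, which is the isolation problem the global mock caused.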