fix: B2C↔B2B seam closeout — wallet-refund audit, cancel-pending-booking, payout consolidation, waitlist crons#854
Conversation
…edits The refund cascade's WALLET leg branch credited the org wallet with no audit artifact. Org-wallet-funded B2C bookings (Payment.organizationId null, wallet owned by an org) were fully invisible to the org audit surface; org-tagged payments only got the payment-level INVOICE row, which the wallet-filtered view misses. walletCredit now returns the billing account's ownerOrgId (already selected, zero extra queries) and the cascade writes a category:WALLET / action:WALLET_REFUND row per wallet leg, org resolved via payment.organizationId ?? ownerOrgId, with balanceAfterPaise in details. Step 8's INVOICE row is unchanged — org-tagged wallet payments intentionally get both the funding-level and payment-level rows. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…/pending/[paymentId] A user who abandoned checkout had no way to release their own tentative hold — they waited out the 24h cleanup window (#833 residue). New cancelPendingCheckout core (Serializable tx, withSerializableRetry): - CAS claim PENDING→EXPIRED via guarded updateMany — the cancel-vs-webhook race closer per the #836/#838 doctrine; loser gets 409 - referral credits restored (reverseCreditsForPayment, same step as the abandoned-payments cron) - tentative slots released with per-type scoping mirroring the webhook's confirm scope: class = caller's slots across all session appointments, webinar = caller's slot only, consultation/subscription = all tentative - parent → CANCELLED via the CAS transition map with a NARROW fromIn (PENDING/APPROVED_PENDING_PAYMENT) — never cancels a parent another payment already confirmed; IllegalTransitionError rolls everything back - post-commit best-effort gateway cancel (never un-cancels on failure) Route: DELETE with getSession + new cancelPendingLimiter (10/min, own bucket so cancelling never consumes checkout quota). Dashboard: Cancel button beside the Processing pill on gateway_pending items in PendingPaymentsWidget. Tests: 10 unit cases + chaos scenario test-cancel-pending-vs-webhook (cancel-vs-capture-webhook with consistent end-state assertions incl. the documented late-capture orphan, plus cancel-vs-cancel double-submit leg). Known residual (follow-up issue): a genuine late capture can resurrect an EXPIRED payment to SUCCEEDED (handlers.ts confirm path) — orphan is flagged by the reconciler for refund; never a half-confirmed booking. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…delete scripts copy Two implementations of the most money-critical path was drift waiting to happen. The weekly GH job (jobs/payouts/process-payouts.ts) now drives lib/payments/payouts directly; scripts/payouts/process-payouts.ts is deleted (no shim — zero external importers besides the jobs wrapper and the test, both re-pointed). Closes the TODO #620 migration note. Deliberate behavior changes for the weekly job (decided on #850 review — the issue's 'no behaviour change' claim was wrong): - TDS 194-O withholding now applies (canonical compliance engine, TDS_ENGINE flag semantics unchanged) - Redis lock:payout_processing + retryCount < MAX filter now apply - org payouts (PENDING→PROCESSING advancement) ride the weekly job via new org-payout-service.processPendingOrgPayouts() Hardening ported INTO the service during consolidation: - #776 CAS claim: updateMany {id, status: APPROVED} — the service's plain update() could double-submit when two runners raced with Redis down; losers return skipped:true and never touch the gateway - silent-skip guard in the jobs wrapper: empty result + APPROVED payouts waiting (lock held / Redis down) now exits 1 instead of a green no-op - workflow env fix: the service reads RAZORPAYX_KEY_SECRET || RAZORPAY_SECRET — process-payouts.yml only set RAZORPAY_KEY_SECRET, which would have mass-failed every payout post-migration; RAZORPAY_SECRET is now mapped from the same secret - package.json scripts:process-payouts pointed at a nonexistent file; re-pointed at the jobs wrapper Test retargeted to the service path: both #785 quarantine invariants (post-gateway fail → PROCESSING + earnings linked; pre-gateway fail → FAILED + unlinked) plus a new CAS-skip case (gateway never called). Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…pooler timeouts Both waitlist scripts imported CronLockHeldError but never called withCronLock — the catch blocks were dead code and concurrent runs could overlap. The scripts now wrap their bodies in withCronLock(<same names as the jobs/waitlist wrappers>, failMode 'open') so every entry point mutually excludes. The recurring scheduled failures (June 6/7/10) were Prisma ETIMEDOUTs on the FIRST query from GitHub runners against the Supabase pooler — Prisma 7 surfaces them as KnownRequestError code 'ETIMEDOUT'. New lib/db/connect-retry.ts (withDbConnectRetry, sibling of serializable-retry) retries only connectivity-shaped failures (ETIMEDOUT/ECONNRESET/ECONNREFUSED/P1001/P1002/init errors) with 2s/4s/8s backoff + jitter; business errors propagate immediately. Re-running the bodies is safe: reminderSentAt bounds re-sends, expiry processing is CAS-guarded. The other half of these issues — the module-load crash on missing UPSTASH_* env — was already fixed by 433091e (PR #851). Note: jobs/waitlist/* wrappers remain orphaned dead code (workflows call the scripts directly); flagged for a follow-up cleanup, not deleted here. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…ated by this train TESTING_PLAN.md: route matrix with complex params for the four fixes (#849 cancel-pending authz/CAS matrix, #835 wallet-refund audit asserts, #850 payout job incl. concurrent-dispatch and silent-skip checks, #814/#821 cron lock-held/retry behavior), a 5-agent parallel execution protocol with deliberately colliding pairs, and per-scenario fixture + verification snippets (cancel-vs-cancel, cancel-vs-webhook, last-seat storm, waitlist-expire-vs-book, idempotency replay). Doc syncs: tentative TTL 7d→24h (#833) + self-serve release pointer in the cron doc; payout script paths re-pointed at jobs/payouts + lib/payments/payouts after the #850 deletion. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
📝 WalkthroughWalkthroughMoves payout runner to canonical service with CAS-skip semantics and org-batch processing; adds atomic user-facing DELETE /api/checkout/pending endpoint with rate limiting and dashboard cancel UX; emits WALLET_REFUND org-audit logs during refund cascades; introduces DB-connect retry utilities and refactors waitlist scripts into exportable processors plus job wrappers; expands tests and adds TESTING_PLAN and workflow updates. ChangesPayout Service Consolidation
Cancel Pending Checkout Feature
Wallet Refund Audit Logging
Database Connection Resilience & Waitlist Refactor
Testing Plan and Documentation
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related issues
Poem
✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
|
There was a problem hiding this comment.
Code Review
This pull request introduces a self-serve path for users to cancel pending checkouts and release tentative holds immediately via a new DELETE /api/checkout/pending/[paymentId] endpoint, backed by serializable transaction logic and comprehensive race-condition tests. It also updates the refund cascade to log wallet refunds to the organization audit feed, migrates the weekly payout job to the canonical payout service with TDS withholding and Redis locking, and adds transient database connection retry logic. The review feedback suggests adding defensive checks to prevent potential null pointer or database constraint errors when canceling payment intents or creating organization audit logs.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
✅ Deploy Preview for familiarise ready!
To edit notification comments on pull requests, go to your Netlify project configuration. |
…s-core shape Convention recheck across all 54 cron workflows: every one calls a jobs/<domain>/<name>.ts entry except the two waitlist workflows, which invoked scripts/ directly while jobs/waitlist/* held full DUPLICATE implementations with zero callers — already drifted (the duplicate sent the event id where the script sends the plan id in reminder links). Restored the dominant shape (the cleanup-tentative-slots idiom): - scripts/waitlist/* are now import-only cores exporting sendExpirationReminders() / processWaitlistExpirations() — lock + connect-retry inside, structured results out, no side effects on import - jobs/waitlist/* are thin GH wrappers: abortIfMaintenance, GH outputs, CronLockHeldError clean-skip, exit codes - both workflows re-pointed at the jobs wrappers For the payout job (#850): rechecked and left as-is — jobs→lib-service with no scripts layer is the established NEWER convention (15 precedents: billing/*, compliance/*, contracts/*, stream/*, meetings/*) and the old wrapper's TODO #620 explicitly requested that migration; the scripts copy there was a divergent duplicate of lib, not the core. Also: master-runner mkdirs its gitignored results/reports dir instead of failing the whole run after every scenario passed. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
app/dashboard/consultee/[consulteeId]/(features)/home/PendingPaymentsWidget.tsx (1)
67-78:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winClear the stale fetch error on a successful refresh.
Once any poll fails,
errorstays non-null forever, so later successful refreshes still hit theif (error)early return and the widget never recovers until a full page reload.Suggested fix
const data = await response.json(); + setError(null); setPendingPayments(data.pendingPayments || []);🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@app/dashboard/consultee/`[consulteeId]/(features)/home/PendingPaymentsWidget.tsx around lines 67 - 78, The fetch error is never cleared so a past failure permanently triggers the early return; in the async fetch block inside PendingPaymentsWidget, call setError(null) when the request succeeds (e.g., after parsing response and before/after setPendingPayments(data.pendingPayments || [])) so the component can recover on later successful refreshes; ensure you still setLoading(false) in the finally block and only setError in the catch branch on real failures.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@lib/db/connect-retry.ts`:
- Around line 40-64: Update the inlined comment above the backoff calculation in
withDbConnectRetry to accurately describe the sleep progression given attempts
and baseMs: note that with the default attempts=3 there are only two backoff
sleeps (2s after first failure, 4s after second) and the third attempt is final
with no subsequent sleep; replace "2s/4s/8s + jitter" with a clearer phrase such
as "2s/4s with 3 attempts (exponential backoff: baseMs * 2^(attempt-1) +
jitter)" and reference the backoffMs calculation variable to make the link
obvious.
In `@lib/payments/operations/cancel-pending.ts`:
- Around line 166-173: The call to cancelPaymentIntent(paymentIntent, gateway)
is passing the PaymentGateway enum as the optional reason parameter; change it
to pass an explicit reason string (e.g., "post_commit_cancel" or
"canceled_by_system_post_commit") so upstream sees a proper cancellation reason.
Locate the call inside the gatewayCancel handling (variables paymentIntent and
gateway from gatewayCancel) and replace the second argument with the reason
string; keep the gateway enum available only if you need it for gateway-specific
logic but do not pass it as the reason.
In `@TESTING_PLAN.md`:
- Line 137: Update the env-regression assertion in TESTING_PLAN.md to match
actual credential usage: change the requirement so successful Razorpay calls
must have RAZORPAYX_KEY_SECRET or RAZORPAY_SECRET present (because
lib/payments/payouts/razorpay-payouts.ts uses RAZORPAYX_KEY_SECRET ||
RAZORPAY_SECRET || ""), and note that job-level guards in
jobs/payouts/handle-stuck-payouts.ts and jobs/payouts/reconcile-payout-status.ts
that check RAZORPAY_SECRET ?? RAZORPAY_KEY_SECRET are insufficient — the test
should not pass when only RAZORPAY_KEY_SECRET is set.
In
`@tests/typescript/race-conditions/scenarios/07-real-api-booking/test-cancel-pending-vs-webhook.ts`:
- Around line 78-79: The test mutates shared fixtures (consultation, slots,
synthetic payments) inside async function run(), but fixture restoration is only
after normal execution so failures leave the chaos fixture dirty; move the
restoration/cleanup logic into a finally block inside run() so it always runs
even if check() or a request throws, and apply the same change to the other
run() invocation later in the file (the similar block around lines referenced in
the review) to ensure both runs always revert consultation/slots/synthetic
payments and clean the shared chaos fixture.
---
Outside diff comments:
In
`@app/dashboard/consultee/`[consulteeId]/(features)/home/PendingPaymentsWidget.tsx:
- Around line 67-78: The fetch error is never cleared so a past failure
permanently triggers the early return; in the async fetch block inside
PendingPaymentsWidget, call setError(null) when the request succeeds (e.g.,
after parsing response and before/after setPendingPayments(data.pendingPayments
|| [])) so the component can recover on later successful refreshes; ensure you
still setLoading(false) in the finally block and only setError in the catch
branch on real failures.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro Plus
Run ID: 3af9dab5-9767-4b4f-b33a-5a9fcaacc590
📒 Files selected for processing (25)
.github/workflows/process-payouts.ymlTESTING_PLAN.md__tests__/enterprise/process-payouts-false-failed.test.ts__tests__/lib/connect-retry.test.ts__tests__/payments/cancel-pending-checkout.test.ts__tests__/payments/refund-operation.test.tsapp/api/checkout/pending/[paymentId]/route.tsapp/dashboard/consultee/[consulteeId]/(features)/home/PendingPaymentsWidget.tsxdocs/booking/13-cron-jobs-and-background-tasks.mddocs/payments/payouts/06-configuration.mddocs/prisma/prisma-7-migration.mdjobs/payouts/process-payouts.tslib/api/organizations/wallet.tslib/db/connect-retry.tslib/payments/operations/cancel-pending.tslib/payments/operations/refund.tslib/payments/payouts/index.tslib/payments/payouts/org-payout-service.tslib/payments/payouts/payout-service.tslib/rate-limit.tspackage.jsonscripts/payouts/process-payouts.tsscripts/waitlist/process-expired-notifications.tsscripts/waitlist/send-expiration-reminders.tstests/typescript/race-conditions/scenarios/07-real-api-booking/test-cancel-pending-vs-webhook.ts
💤 Files with no reviewable changes (1)
- scripts/payouts/process-payouts.ts
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@scripts/waitlist/process-expired-notifications.ts`:
- Around line 97-106: The callers of sendWaitlistExpiredEmail and
sendWaitlistExpiringEmail must honor their structured return ({ success:
boolean, error? }) instead of assuming success: await the call into a result
variable, check result.success — only increment emailed or set
reminderSentAt/sent when success is true; when false, do not update
counters/timestamps and push result.error (or a composed error object including
entry id/context) into the aggregated errors array so failures are recorded for
reporting. Ensure you reference the existing variables (emailed, reminderSentAt,
sent, errors) and keep the same call sites for
sendWaitlistExpiredEmail/sendWaitlistExpiringEmail while replacing the
unconditional increments/assignments with a success-branch.
- Around line 33-38: The current withDbConnectRetry call in
processWaitlistExpirations wraps processExpirationsCore so transient DB retries
can cause non-idempotent external sends to be replayed; update the flow so that
all DB reads/writes required to reserve/claim or mark a waitlist entry as "about
to notify" happen inside withDbConnectRetry (e.g., perform pre-send claim/update
in processExpirationsCore under withDbConnectRetry), and move the actual
external send/notification logic outside that retry boundary (or alternatively
implement an idempotent claim/write such as setting a notified_claim or sent_at
flag in the database before any external side effect). Keep references: modify
processWaitlistExpirations to call withDbConnectRetry only for the DB
reservation/claim steps inside processExpirationsCore and ensure external
notification sends occur after the successful DB commit so retries cannot resend
the same entry.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro Plus
Run ID: aafe7757-65f4-4d74-8671-18f769632567
📒 Files selected for processing (7)
.github/workflows/process-waitlist-expirations.yml.github/workflows/send-waitlist-reminders.ymljobs/waitlist/process-expired-notifications.tsjobs/waitlist/send-expiration-reminders.tsscripts/waitlist/process-expired-notifications.tsscripts/waitlist/send-expiration-reminders.tstests/typescript/race-conditions/master-runner.ts
… cancel-pending Found by the multi-agent race validation (TESTING_PLAN §4.1): 8+ truly concurrent DELETEs on one payment produced a single 500 — the Serializable loser exhausted withSerializableRetry's 3 attempts and the final write conflict (Prisma 7 client-engine 'TransactionWriteConflict' / P2034) escaped the route's catch as a server error. A concurrent writer on this payment is a conflict outcome by definition — the winner's verdict stands and the client refetches — so the route now maps the conflict shapes to 409 alongside IllegalTransitionError. Re-validated: 12 concurrent cancels → exactly one 200, nine 409, two 429 (per-user limiter), zero 5xx; payment EXPIRED once, slots released, parent CANCELLED. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Multi-agent validation round 1 — executed per TESTING_PLAN.md §3/§4DB verified already synced+seeded (83 users / 81 consultations / 106 payments / 205 waitlist rows — no reset needed). Dev server + 4 parallel Sonnet agents, with agents B and C deliberately hammering the SAME three PENDING payments. Agent A — authz/edge matrix (§2.1): 6/6 PASS. 401 unauth; 404 foreign-user (no existence leak, zero writes); 404 nonexistent; 200 happy path with slots released + parent CANCELLED + correct notes; 409 double-cancel with zero writes; second happy path clean. Agents B + C — cross-agent cancel race: PASS after one fix. 48 concurrent DELETEs across 3 shared payments, two rounds each: exactly ONE 200 per payment across both agents combined, everything else 409 (plus correct 429s when the per-user limiter engaged on round 2). DB end state: all 3 payments EXPIRED exactly once, zero tentative slots, all parents CANCELLED. Finding (fixed in 93e4cf9): 1/48 requests returned 500 — the Serializable loser exhausted withSerializableRetry and the Prisma 7 'TransactionWriteConflict' escaped as a server error; the route now maps retry-exhausted write conflicts to 409. Re-validated with a 12-way concurrent burst: Agent D — #835 refund audit + waitlist cron: ALL PASS. Full WALLET-leg refund emitted exactly one Chaos suite ( Round 2 (full re-run incl. staging-style webhook leg) happens after the terminology phases per the plan. |
…bution, send-result handling, cron retry boundary Legit findings from the Gemini/CodeRabbit review pass, each verified against current code before fixing (full triage table on the PR): - cancel-pending: gatewayCancel now returned FROM the Serializable tx — the outer-scope mutation could leak a stale value when an aborted attempt retried into an early-exit path, firing a gateway cancel for an order this run did not expire - refund step 9: the WALLET ledger posting now attributes to payment.organizationId ?? wallet-owner org (captured from walletCredit) — keying off the payment tag alone posted org-wallet-funded B2C refunds to a null-org WALLET account, guaranteeing WALLET_BALANCE_DRIFT at reconcile (the same attribution rule the #835 audit row already uses) - waitlist cores: the connect-retry now wraps ONLY a SELECT 1 warm-up — retrying the whole body could replay non-idempotent email sends; and both cores now branch on the senders' { success:false } contract instead of counting every awaited call as delivered (a failed reminder no longer stamps reminderSentAt and silently drops forever) - processOrgPayout: claimed flag surfaced through the public return so batch callers count real PENDING→PROCESSING claims, not another worker's status echo; jobs wrapper gains org_errors output so success=false is always explained (failed stays consultant-only) - jobs/waitlist wrappers: main().catch parity with the payouts wrapper - handle-stuck-payouts.yml + reconcile-payout-status.yml: RAZORPAY_SECRET mapping added — same env-name trap as #850's process-payouts fix (the payouts client never reads RAZORPAY_KEY_SECRET; the job-level gate passing alone left the client with an empty secret) - chaos scenario: fixture restoration moved into finally (early throws left the shared fixture dirty), cancellation fields restore from snapshot instead of hard-nulling, webhook-win leg asserts the parent was not cancelled; subscription-parent unit coverage added - connect-retry comment corrected (3 attempts = two sleeps); TESTING_PLAN env-regression row tightened; 'ACKed' → 'acknowledged' Declined as wrong (justifications on the threads): nullable-paymentIntent guard (schema: non-null @unique), null-targetOrgId guard (ownerOrgId is required + walletCredit throws P2025 first), cancelPaymentIntent 'reason' param claim (actual signature takes the gateway). tsc clean; 1357/1357 jest (incl. updated processOrgPayout shape assertions and 2 new subscription-parent cases). Part of #854 / #857. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@__tests__/payments/cancel-pending-checkout.test.ts`:
- Around line 294-354: Add a new test in the "cancelPendingCheckout —
subscription parent" suite that simulates the P2034 retry path: call
cancelPendingCheckout so the first attempt triggers a gatewayCancel mutation
(mock gatewayCancel to perform the gateway-side cancel and make the operation
abort/throw), then have the retry see the payment moved out of PENDING (e.g.,
update state.payments.get(...).paymentStatus to "EXPIRED" in the mock) so the
second attempt exits early with NOT_PENDING; assert that cancelPaymentIntent was
never called and that cancelPendingCheckout returns/throws the expected outcome
(no double cancel). Reference cancelPendingCheckout, gatewayCancel,
cancelPaymentIntent, IllegalTransitionError and manipulate state.payments in the
test to emulate the first-attempt mutation and the retry observation.
In `@lib/payments/payouts/org-payout-service.ts`:
- Around line 634-638: For permanent 4xx gateway rejections, stop marking the
payout as "claimed" so upstream counting doesn't treat it as advanced: in
processOrgPayout() adjust the failure return for the 4xx path (the object
currently { status: "FAILED", submittedToGateway: true, claimed: true }) to set
claimed: false (do the same for the other identical return around the second
occurrence). This ensures processPendingOrgPayouts() will not increment advanced
for permanent rejections; alternatively, if you prefer a clearer flag, add a
distinct permanentFailure boolean and update processPendingOrgPayouts() to treat
permanentFailure as not advanced—whichever approach you choose, update both
places mentioned in processOrgPayout().
In `@scripts/waitlist/process-expired-notifications.ts`:
- Around line 37-45: The job currently calls
withCronLock("process-expired-notifications", { failMode: "open" }, ...) but
that treats any lock acquisition failure (including Redis outages) as a
held-lock skip; change processWaitlistExpirations to call withCronLock inside a
try/catch that only swallows CronLockHeldError (treat as the intended skip) and
rethrows any other errors so Redis/unexpected acquisition failures don't
silently become "open" skips; reference the withCronLock call in
processWaitlistExpirations and catch the CronLockHeldError symbol specifically,
letting other errors bubble.
In
`@tests/typescript/race-conditions/scenarios/07-real-api-booking/test-cancel-pending-vs-webhook.ts`:
- Around line 259-267: The second assertion only verifies parent.requestStatus
!== "CANCELLED" and tentativeLeft === 0 elsewhere, but doesn't ensure the
webhook-confirm branch actually created confirmed slots; update the check that
asserts "leg1/B: webhook win left the parent un-cancelled" (the check call
referencing parent, tentativeLeft, confirmed) to also require confirmed > 0
(e.g., change the boolean expression to parent.requestStatus !== "CANCELLED" &&
confirmed > 0 or add an additional check) so the test fails if no confirmed
slots were created by the webhook-confirm path.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro Plus
Run ID: 6d9859e5-2290-4742-a7d1-65563c326bc5
📒 Files selected for processing (15)
.github/workflows/handle-stuck-payouts.yml.github/workflows/reconcile-payout-status.ymlTESTING_PLAN.md__tests__/enterprise/live-payout-submission.test.ts__tests__/payments/cancel-pending-checkout.test.tsjobs/payouts/process-payouts.tsjobs/waitlist/process-expired-notifications.tsjobs/waitlist/send-expiration-reminders.tslib/db/connect-retry.tslib/payments/operations/cancel-pending.tslib/payments/operations/refund.tslib/payments/payouts/org-payout-service.tsscripts/waitlist/process-expired-notifications.tsscripts/waitlist/send-expiration-reminders.tstests/typescript/race-conditions/scenarios/07-real-api-booking/test-cancel-pending-vs-webhook.ts
| describe("cancelPendingCheckout — subscription parent", () => { | ||
| it("cancels the subscription parent through the guarded transition", async () => { | ||
| state.payments.set("pay-s", { | ||
| id: "pay-s", | ||
| userId: "user-1", | ||
| paymentStatus: "PENDING", | ||
| paymentIntent: "order_s", | ||
| paymentGateway: "RAZORPAY", | ||
| isMockPayment: true, | ||
| appointmentId: "appt-s", | ||
| consultationId: null, | ||
| subscriptionId: "sub-1", | ||
| webinarId: null, | ||
| classId: null, | ||
| }); | ||
| state.subscriptions.set("sub-1", { | ||
| id: "sub-1", | ||
| requestStatus: "APPROVED_PENDING_PAYMENT", | ||
| }); | ||
| state.slots.push({ | ||
| id: "slot-s", | ||
| appointmentId: "appt-s", | ||
| classId: null, | ||
| isTentative: true, | ||
| userIds: ["user-1"], | ||
| }); | ||
|
|
||
| const result = await cancelPendingCheckout({ | ||
| paymentId: "pay-s", | ||
| userId: "user-1", | ||
| }); | ||
|
|
||
| expect(result).toEqual({ ok: true, slotsReleased: 1 }); | ||
| expect(state.payments.get("pay-s")?.paymentStatus).toBe("EXPIRED"); | ||
| const sub = state.subscriptions.get("sub-1"); | ||
| expect(sub?.requestStatus).toBe("CANCELLED"); | ||
| expect(sub?.cancellationNotes).toBe("Cancelled by user during checkout"); | ||
| expect(sub?.cancelledAt).toBeInstanceOf(Date); | ||
| }); | ||
|
|
||
| it("rolls back when the subscription parent is already SCHEDULED", async () => { | ||
| state.payments.set("pay-s2", { | ||
| id: "pay-s2", | ||
| userId: "user-1", | ||
| paymentStatus: "PENDING", | ||
| paymentIntent: "order_s2", | ||
| paymentGateway: "RAZORPAY", | ||
| isMockPayment: true, | ||
| appointmentId: "appt-s2", | ||
| consultationId: null, | ||
| subscriptionId: "sub-2", | ||
| webinarId: null, | ||
| classId: null, | ||
| }); | ||
| state.subscriptions.set("sub-2", { id: "sub-2", requestStatus: "SCHEDULED" }); | ||
|
|
||
| await expect( | ||
| cancelPendingCheckout({ paymentId: "pay-s2", userId: "user-1" }), | ||
| ).rejects.toThrow(IllegalTransitionError); | ||
| }); | ||
| }); |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial | 🏗️ Heavy lift
Add a retry-path regression for the stale gatewayCancel fix.
These new cases still drive a single successful transaction path, so they never exercise the P2034 retry sequence that this refactor is protecting. A future reintroduction of outer-scope gatewayCancel mutation would still pass here. Please add one case where the first attempt computes a gateway cancel and aborts, the retry exits NOT_PENDING, and cancelPaymentIntent stays untouched.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@__tests__/payments/cancel-pending-checkout.test.ts` around lines 294 - 354,
Add a new test in the "cancelPendingCheckout — subscription parent" suite that
simulates the P2034 retry path: call cancelPendingCheckout so the first attempt
triggers a gatewayCancel mutation (mock gatewayCancel to perform the
gateway-side cancel and make the operation abort/throw), then have the retry see
the payment moved out of PENDING (e.g., update
state.payments.get(...).paymentStatus to "EXPIRED" in the mock) so the second
attempt exits early with NOT_PENDING; assert that cancelPaymentIntent was never
called and that cancelPendingCheckout returns/throws the expected outcome (no
double cancel). Reference cancelPendingCheckout, gatewayCancel,
cancelPaymentIntent, IllegalTransitionError and manipulate state.payments in the
test to emulate the first-attempt mutation and the retry observation.
| return { | ||
| status: "FAILED" as PayoutStatus, | ||
| submittedToGateway: true, | ||
| claimed: true, | ||
| }; |
There was a problem hiding this comment.
Don't count permanent gateway rejections as successful org-payout advancement.
On the 4xx path, processOrgPayout() returns { status: "FAILED", claimed: true }, but processPendingOrgPayouts() increments advanced for any claimed result and only records thrown exceptions. That bubbles into jobs/payouts/process-payouts.ts (Lines 129-133), which only flips success=false when orgResult.errors is non-empty, so a permanently rejected org payout is currently reported as a green weekly run.
Suggested fix
for (const p of pending) {
try {
const out = await processOrgPayout(p.id);
- // Count only claims THIS run won — a status echo of a row another
- // worker already moved to PROCESSING is not our advancement.
- if (out.claimed) {
+ // Permanent submission failures must surface as job errors rather than
+ // being counted as successful advancement.
+ if (out.status === "FAILED") {
+ result.errors.push(`OrgPayout ${p.id}: permanent gateway rejection`);
+ continue;
+ }
+ // Count only claims THIS run won — a status echo of a row another
+ // worker already moved to PROCESSING is not our advancement.
+ if (out.claimed) {
result.advanced++;
}
} catch (err) {Also applies to: 821-824
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@lib/payments/payouts/org-payout-service.ts` around lines 634 - 638, For
permanent 4xx gateway rejections, stop marking the payout as "claimed" so
upstream counting doesn't treat it as advanced: in processOrgPayout() adjust the
failure return for the 4xx path (the object currently { status: "FAILED",
submittedToGateway: true, claimed: true }) to set claimed: false (do the same
for the other identical return around the second occurrence). This ensures
processPendingOrgPayouts() will not increment advanced for permanent rejections;
alternatively, if you prefer a clearer flag, add a distinct permanentFailure
boolean and update processPendingOrgPayouts() to treat permanentFailure as not
advanced—whichever approach you choose, update both places mentioned in
processOrgPayout().
| export async function processWaitlistExpirations(): Promise<ProcessExpirationsResult> { | ||
| return withCronLock( | ||
| "process-expired-notifications", | ||
| { failMode: "open" }, | ||
| async () => { | ||
| await withDbConnectRetry(() => prisma.$queryRaw`SELECT 1`); | ||
| return processExpirationsCore(); | ||
| }, | ||
| ); |
There was a problem hiding this comment.
failMode: "open" is not actually fail-open for these waitlist jobs.
Both entrypoints rely on withCronLock(..., { failMode: "open" }), but the current lock contract throws CronLockHeldError for both a real held lock and a Redis acquisition failure. Since the jobs/waitlist/* wrappers treat that error as a clean skip, Redis outages can still suppress both jobs with exit 0 instead of running unlocked or failing loudly as unavailable.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@scripts/waitlist/process-expired-notifications.ts` around lines 37 - 45, The
job currently calls withCronLock("process-expired-notifications", { failMode:
"open" }, ...) but that treats any lock acquisition failure (including Redis
outages) as a held-lock skip; change processWaitlistExpirations to call
withCronLock inside a try/catch that only swallows CronLockHeldError (treat as
the intended skip) and rethrows any other errors so Redis/unexpected acquisition
failures don't silently become "open" skips; reference the withCronLock call in
processWaitlistExpirations and catch the CronLockHeldError symbol specifically,
letting other errors bubble.
| check( | ||
| "leg1/B: webhook win confirmed the booking (no tentative residue)", | ||
| tentativeLeft === 0, | ||
| { tentativeLeft, confirmed, parent }, | ||
| ); | ||
| check( | ||
| "leg1/B: webhook win left the parent un-cancelled", | ||
| parent.requestStatus !== "CANCELLED", | ||
| parent, |
There was a problem hiding this comment.
Assert that the webhook-win branch actually creates confirmed slots.
This check only proves there is no tentative residue. A broken confirm path that deletes all slots would still pass as long as the parent is not CANCELLED. Require confirmed > 0 here as well.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In
`@tests/typescript/race-conditions/scenarios/07-real-api-booking/test-cancel-pending-vs-webhook.ts`
around lines 259 - 267, The second assertion only verifies parent.requestStatus
!== "CANCELLED" and tentativeLeft === 0 elsewhere, but doesn't ensure the
webhook-confirm branch actually created confirmed slots; update the check that
asserts "leg1/B: webhook win left the parent un-cancelled" (the check call
referencing parent, tentativeLeft, confirmed) to also require confirmed > 0
(e.g., change the boolean expression to parent.requestStatus !== "CANCELLED" &&
confirmed > 0 or add an additional check) so the test fails if no confirmed
slots were created by the webhook-confirm path.

What
Four validated fixes from the post-#837 investigation, one commit each:
[SEAM][AUDIT] Org-wallet-funded B2C refunds are invisible to the org audit surface #835 — org-wallet refunds now visible to the org audit surface. The refund cascade's WALLET leg branch writes a
category: WALLET / action: WALLET_REFUNDOrgAuditLog row, org resolved viapayment.organizationId ?? billingAccount.ownerOrgId— the second operand is the previously-invisible path (B2C booking funded by an org wallet with no org tag on the Payment row).walletCreditnow returnsownerOrgId(already selected — zero extra queries). Step 8's INVOICE row is unchanged; org-tagged wallet payments intentionally get both the funding-level and payment-level rows.[B2C][UX] User-facing cancel-pending-booking action for tentative holds #849 — user-facing cancel-pending-booking.
DELETE /api/checkout/pending/[paymentId]+ a Cancel button on the consultee dashboard's pending-payments widget. Serializable tx, CAS claim PENDING→EXPIRED, referral-credit restore, per-type slot release mirroring the webhook's confirm scope, parent CANCELLED via the transition map with a narrow fromIn (never cancels a parent another payment confirmed), post-commit best-effort gateway cancel. Own rate-limit bucket (10/min) so cancelling never consumes checkout quota.[PAYMENTS] Consolidate scripts/payouts/process-payouts.ts onto payout-service #850 — payout disbursement consolidated onto
lib/payments/payouts;scripts/payouts/process-payouts.tsdeleted (no shim). Deliberate behavior changes for the weekly job, decided on review: TDS 194-O withholding + Redis lock + retry-count filter now apply; org payouts ride the weekly job via newprocessPendingOrgPayouts(). Hardening ported INTO the service: the Enterprise v1 mega-audit — path to full enterprise completeness (SPONSOR/HOST/HYBRID + cross-cutting + pre-MVP architecture) #776 CAS claim it lacked (plainupdatecould double-submit when two runners raced with Redis down), a silent-skip guard (empty run + APPROVED payouts waiting → exit 1, never a green no-op on a money job), and the env fix — the service readsRAZORPAYX_KEY_SECRET || RAZORPAY_SECRET, neverRAZORPAY_KEY_SECRET; without the workflow mapping the migration would have mass-failed every payout.🚨 Waitlist Expiration Job Failed #814/🚨 Waitlist Reminder Job Failed #821 — waitlist crons. Both scripts imported
CronLockHeldErrorbut never took the lock (dead catch blocks) — now wrapped inwithCronLock(failMode open, same names as the jobs wrappers). Newlib/db/connect-retry.tsretries the GH-runner→Supabase-poolerETIMEDOUTs (Prisma 7 KnownRequestError shape) with 2s/4s/8s backoff; business errors propagate immediately. The module-load Upstash crash half was already fixed by433091ee(fix: RSC boundary serialization + chaos CI skip guard #851).Plus TESTING_PLAN.md at root: route matrix with complex params, 5-agent parallel execution protocol with deliberately colliding pairs (cancel-vs-cancel, cancel-vs-webhook, last-seat storm, waitlist-expire-vs-book, idempotency replay), fixture + SQL verification snippets. Doc syncs: tentative TTL 7d→24h, payout script paths.
Tests
07-real-api-booking/test-cancel-pending-vs-webhook(auto-discovered bynpm run test:chaos:api): cancel-vs-capture-webhook with consistent end-state assertions incl. the documented late-capture orphan, plus a cancel-vs-cancel legFollow-ups (filed after merge)
jobs/waitlist/*wrappers are orphaned dead code (workflows call the scripts directly)Closes #835. Closes #849. Closes #850. Closes #814. Closes #821. Closes #856 (waitlist crons restored to the standard workflow→jobs→scripts-core shape after a convention recheck; payout job confirmed to follow the established jobs→lib-service convention with 15 precedents + TODO #620).
Part of #837.
🤖 Generated with Claude Code
Summary by CodeRabbit
New Features
Bug Fixes
Documentation
Tests