15 changes: 15 additions & 0 deletions .env.example
@@ -27,6 +27,21 @@ TWILIO_VALIDATE_SIGNATURE=
# Optional debug / demo
DEBUG_ENV_ENDPOINT_TOKEN=
PORTFOLIO_DEMO_MODE=
# Break-glass only: allows demo mode in production when explicitly set
ALLOW_PRODUCTION_DEMO_MODE=

# Optional rate limiting (defaults are safe for provider webhooks)
# RATE_LIMIT_WINDOW_MS=60000
# RATE_LIMIT_TWILIO_AUTH_MAX=240
# RATE_LIMIT_TWILIO_UNAUTH_MAX=40
# RATE_LIMIT_STRIPE_AUTH_MAX=240
# RATE_LIMIT_STRIPE_UNAUTH_MAX=40
# RATE_LIMIT_PROTECTED_API_MAX=80

# Optional observability + alerting
# ALERT_WEBHOOK_URL=
# ALERT_WEBHOOK_TOKEN=
# ALERT_WEBHOOK_TIMEOUT_MS=4000

# Vercel system envs (auto-set on Vercel; optional locally for fallback testing only)
# VERCEL_ENV=
7 changes: 7 additions & 0 deletions README.md
@@ -35,6 +35,8 @@ When a customer calls a business's Twilio number and the forwarded call is misse
- SMS compliance commands (`STOP` / `START` / `HELP`) with DB-backed opt-out state
- Call recording enabled on forwarded calls + recording metadata captured on callbacks
- Twilio webhook protection: production-enforced `X-Twilio-Signature` validation, with shared-token fallback only in non-production
- Webhook observability baseline: correlation IDs (`X-Correlation-Id`), centralized `app.error` reporting, optional alert webhook dispatch
- Production guardrail: `PORTFOLIO_DEMO_MODE` is blocked in production unless `ALLOW_PRODUCTION_DEMO_MODE=true` is explicitly set

## Local Setup

@@ -64,6 +66,7 @@ Required categories:
- Stripe keys + price IDs + webhook secret
- Twilio credentials + webhook auth token
- Database URL
- Optional rate-limit tuning vars (defaults are built in)
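
The "defaults are built in" note could be implemented roughly as follows. This is a minimal sketch, not the PR's actual `@/lib/rate-limit-config` module: `readPositiveInt` and `loadRateLimitConfig` are hypothetical names, and the fallback numbers are assumed to match the commented values in `.env.example`.

```typescript
// Hypothetical helper: read a positive integer from an env-like object,
// falling back to a default when the variable is unset, blank, or invalid.
function readPositiveInt(
  env: Record<string, string | undefined>,
  name: string,
  fallback: number
): number {
  const raw = env[name];
  if (raw === undefined || raw.trim() === '') return fallback;
  const parsed = Number(raw);
  return Number.isInteger(parsed) && parsed > 0 ? parsed : fallback;
}

// Assumed defaults, copied from the commented examples in .env.example.
function loadRateLimitConfig(env: Record<string, string | undefined>) {
  return {
    windowMs: readPositiveInt(env, 'RATE_LIMIT_WINDOW_MS', 60_000),
    twilioAuthMax: readPositiveInt(env, 'RATE_LIMIT_TWILIO_AUTH_MAX', 240),
    twilioUnauthMax: readPositiveInt(env, 'RATE_LIMIT_TWILIO_UNAUTH_MAX', 40),
    stripeAuthMax: readPositiveInt(env, 'RATE_LIMIT_STRIPE_AUTH_MAX', 240),
    stripeUnauthMax: readPositiveInt(env, 'RATE_LIMIT_STRIPE_UNAUTH_MAX', 40),
    protectedApiMax: readPositiveInt(env, 'RATE_LIMIT_PROTECTED_API_MAX', 80),
  };
}
```

In a Node runtime this would typically be called as `loadRateLimitConfig(process.env)`. Falling back on invalid input, rather than throwing, keeps a mistyped value from taking webhook routes down.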

### 4. Run Prisma migrations / generate client

@@ -249,6 +252,7 @@ Compliance handling:
Security / idempotency notes:

- Invalid webhook token -> `401`
- Unauthorized webhook bursts are rate-limited with `429` (`Retry-After` + `X-RateLimit-*` headers)
- Duplicate inbound SMS retries with the same `MessageSid` are deduped via `Message.twilioSid` and ignored after persistence check
- Webhook handlers log structured events (`callSid` / `messageSid`, event type, decision)
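
The `429` behavior noted above can be illustrated with a fixed-window counter. The call shapes (`consumeRateLimit({ key, limit, windowMs })` returning an object with `allowed`, and `buildRateLimitHeaders(result)`) follow the usage in `app/api/stripe/webhook/route.ts`. Everything else here is an assumption for illustration: the in-memory `Map`, the extra result fields, the optional `now` parameter, and the exact `X-RateLimit-*` header names.

```typescript
interface RateLimitResult {
  allowed: boolean;
  limit: number;
  remaining: number;
  resetAtMs: number; // epoch ms at which the current window ends
  retryAfterSeconds: number;
}

// Assumption: per-process, in-memory state keyed by e.g. "stripe:webhook:unauth:<ip>".
const buckets = new Map<string, { count: number; resetAtMs: number }>();

function consumeRateLimit(opts: {
  key: string;
  limit: number;
  windowMs: number;
  now?: number; // hypothetical test hook; defaults to Date.now()
}): RateLimitResult {
  const now = opts.now ?? Date.now();
  let bucket = buckets.get(opts.key);
  // Start a fresh window when none exists or the previous one has expired.
  if (!bucket || now >= bucket.resetAtMs) {
    bucket = { count: 0, resetAtMs: now + opts.windowMs };
    buckets.set(opts.key, bucket);
  }
  bucket.count += 1;
  return {
    allowed: bucket.count <= opts.limit,
    limit: opts.limit,
    remaining: Math.max(0, opts.limit - bucket.count),
    resetAtMs: bucket.resetAtMs,
    retryAfterSeconds: Math.max(1, Math.ceil((bucket.resetAtMs - now) / 1000)),
  };
}

function buildRateLimitHeaders(result: RateLimitResult): Record<string, string> {
  const headers: Record<string, string> = {
    'X-RateLimit-Limit': String(result.limit),
    'X-RateLimit-Remaining': String(result.remaining),
    'X-RateLimit-Reset': String(Math.ceil(result.resetAtMs / 1000)),
  };
  // Retry-After is only meaningful on a rejected request.
  if (!result.allowed) headers['Retry-After'] = String(result.retryAfterSeconds);
  return headers;
}
```

A counter held in process memory is scoped to one running instance, so on a serverless platform the effective limit may be looser than configured. A shared store would be needed for a strict global limit.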

@@ -302,6 +306,9 @@ Prisma models included:
## Useful Routes

- `/` - landing page
- `/terms` - terms of service
- `/privacy` - privacy policy
- `/refund` - refund policy
- `/sign-in` - Clerk sign-in
- `/sign-up` - Clerk sign-up
- `/app/onboarding` - create business record
20 changes: 19 additions & 1 deletion RUNBOOK.md
@@ -19,6 +19,14 @@
7. Verify Stripe webhook endpoint still points to the correct production URL.
8. Run a live Twilio smoke test (call + missed call + SMS reply + STOP/START).

## Backup + Restore

- Canonical procedure: `docs/BACKUP_RESTORE_RUNBOOK.md`
- Minimum policy:
  - Neon PITR enabled for production.
  - Logical backup artifacts retained for 30+ days.
  - Restore drill executed monthly with recorded evidence.

## Rotate `TWILIO_WEBHOOK_AUTH_TOKEN` (shared webhook token)

1. Generate a new random token (do not reuse old values).
@@ -42,13 +50,24 @@
- Vercel runtime logs:
  - API route logs for `/api/twilio/voice`, `/api/twilio/status`, `/api/twilio/sms`
  - Look for structured prefixes: `twilio.voice`, `twilio.status`, `twilio.sms`, `twilio.messaging`, `twilio.webhook-auth`
  - Look for centralized error events: `app.error` (includes `correlationId`, `source`, `event`, and metadata)
- Twilio Console:
  - Phone Number webhook logs / Debugger
  - Call Logs and Recordings
  - Messaging logs
- Neon:
  - Query activity / connection issues (if DB errors occur)

## Observability + Alerts

- Every Twilio/Stripe webhook response now includes `X-Correlation-Id`.
- For incident triage, capture the correlation ID from provider delivery logs and search Vercel logs for that ID.
- Optional alert wiring:
  1. Set `ALERT_WEBHOOK_URL` in Vercel (Slack/PagerDuty/incident gateway endpoint).
  2. Optionally set `ALERT_WEBHOOK_TOKEN` if your endpoint requires bearer auth.
  3. Optionally set `ALERT_WEBHOOK_TIMEOUT_MS` (default `4000`).
  4. Redeploy and induce a safe synthetic webhook failure in non-production to confirm alert delivery.
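
The alert wiring steps above could map onto a dispatcher like the one below. It is a sketch under stated assumptions: `sendAlert`, `AlertConfig`, `FetchLike`, and the payload shape are hypothetical and not taken from `@/lib/observability`. Only the three settings (URL, optional bearer token, timeout defaulting to `4000` ms) come from the runbook. The HTTP client is injected so the function can be exercised without a network.

```typescript
// Minimal fetch-compatible signature so the sketch does not depend on a specific runtime.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string; signal: AbortSignal }
) => Promise<{ ok: boolean; status: number }>;

interface AlertConfig {
  url?: string; // ALERT_WEBHOOK_URL
  token?: string; // ALERT_WEBHOOK_TOKEN
  timeoutMs?: number; // ALERT_WEBHOOK_TIMEOUT_MS
}

async function sendAlert(
  payload: { source: string; event: string; correlationId: string; message: string },
  config: AlertConfig,
  fetchImpl: FetchLike
): Promise<'skipped' | 'sent' | 'failed'> {
  // Alerting is optional: with no URL configured, do nothing.
  if (!config.url) return 'skipped';

  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), config.timeoutMs ?? 4000);
  try {
    const headers: Record<string, string> = { 'Content-Type': 'application/json' };
    if (config.token) headers['Authorization'] = `Bearer ${config.token}`;
    const response = await fetchImpl(config.url, {
      method: 'POST',
      headers,
      body: JSON.stringify(payload),
      signal: controller.signal,
    });
    return response.ok ? 'sent' : 'failed';
  } catch {
    // Network errors and timeouts must never break the webhook response path.
    return 'failed';
  } finally {
    clearTimeout(timer);
  }
}
```

With a runtime that provides a global `fetch`, the third argument can be `(url, init) => fetch(url, init)`. Swallowing dispatch errors is deliberate in this sketch, so that a slow or broken alert endpoint cannot turn into a failed provider webhook.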

## Common Failure Modes

- Twilio webhooks return `401`
@@ -67,4 +86,3 @@
- `DATABASE_URL` / `DIRECT_DATABASE_URL` swapped
- Missing `sslmode=require`
- `DIRECT_DATABASE_URL` accidentally using Neon pooler host

71 changes: 66 additions & 5 deletions app/api/stripe/webhook/route.ts
@@ -2,6 +2,9 @@ import { NextResponse } from 'next/server';
 import Stripe from 'stripe';

 import { db } from '@/lib/db';
+import { getCorrelationIdFromRequest, reportApplicationError, withCorrelationIdHeader } from '@/lib/observability';
+import { RATE_LIMIT_STRIPE_AUTH_MAX, RATE_LIMIT_STRIPE_UNAUTH_MAX, RATE_LIMIT_WINDOW_MS } from '@/lib/rate-limit-config';
+import { buildRateLimitHeaders, consumeRateLimit, getClientIpAddress } from '@/lib/rate-limit';
 import { getStripe } from '@/lib/stripe';

 export const runtime = 'nodejs';
@@ -82,10 +85,13 @@ async function handleCheckoutCompleted(session: Stripe.Checkout.Session) {
 }

 export async function POST(request: Request) {
+  const clientIp = getClientIpAddress(request);
+  const correlationId = getCorrelationIdFromRequest(request);
+  const withCorrelation = (response: NextResponse) => withCorrelationIdHeader(response, correlationId);
   const signature = request.headers.get('stripe-signature');
   const webhookSecret = process.env.STRIPE_WEBHOOK_SECRET;
   if (!signature || !webhookSecret) {
-    return NextResponse.json({ error: 'Missing Stripe webhook configuration' }, { status: 400 });
+    return withCorrelation(NextResponse.json({ error: 'Missing Stripe webhook configuration' }, { status: 400 }));
   }

   const payload = await request.text();
@@ -95,8 +101,54 @@ export async function POST(request: Request) {
   try {
     event = stripe.webhooks.constructEvent(payload, signature, webhookSecret);
   } catch (error) {
+    const unauthRateLimit = consumeRateLimit({
+      key: `stripe:webhook:unauth:${clientIp}`,
+      limit: RATE_LIMIT_STRIPE_UNAUTH_MAX,
+      windowMs: RATE_LIMIT_WINDOW_MS,
+    });
+    if (!unauthRateLimit.allowed) {
+      console.warn('Stripe webhook rate-limited (invalid signature burst)', {
+        clientIp,
+        correlationId,
+        decision: 'reject_429',
+      });
+      return withCorrelation(
+        NextResponse.json(
+          { error: 'Too many invalid webhook attempts' },
+          { status: 429, headers: buildRateLimitHeaders(unauthRateLimit) }
+        )
+      );
+    }
+
     const message = error instanceof Error ? error.message : 'Invalid webhook signature';
-    return NextResponse.json({ error: message }, { status: 400 });
+    reportApplicationError({
+      source: 'stripe.webhook',
+      event: 'invalid_signature',
+      correlationId,
+      error,
+      alert: false,
+      metadata: {
+        clientIp,
+      },
+    });
+    return withCorrelation(NextResponse.json({ error: message }, { status: 400 }));
   }

+  const authRateLimit = consumeRateLimit({
+    key: `stripe:webhook:auth:${clientIp}`,
+    limit: RATE_LIMIT_STRIPE_AUTH_MAX,
+    windowMs: RATE_LIMIT_WINDOW_MS,
+  });
+  if (!authRateLimit.allowed) {
+    console.warn('Stripe webhook rate-limited', {
+      clientIp,
+      correlationId,
+      eventType: event.type,
+      decision: 'reject_429',
+    });
+    return withCorrelation(
+      NextResponse.json({ error: 'Rate limit exceeded' }, { status: 429, headers: buildRateLimitHeaders(authRateLimit) })
+    );
+  }
+
   try {
@@ -135,9 +187,18 @@ export async function POST(request: Request) {
         break;
     }
   } catch (error) {
-    console.error('Stripe webhook handler error', error);
-    return NextResponse.json({ error: 'Webhook processing failed' }, { status: 500 });
+    reportApplicationError({
+      source: 'stripe.webhook',
+      event: 'handler_error',
+      correlationId,
+      error,
+      metadata: {
+        clientIp,
+        eventType: event.type,
+      },
+    });
+    return withCorrelation(NextResponse.json({ error: 'Webhook processing failed' }, { status: 500 }));
   }

-  return NextResponse.json({ received: true });
+  return withCorrelation(NextResponse.json({ received: true }));
 }
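
The diff calls `getCorrelationIdFromRequest` and `withCorrelationIdHeader` but does not show their bodies. One plausible shape is sketched below, purely as an assumption: it reuses an inbound `X-Correlation-Id` when it looks safe and generates one otherwise. The validation pattern and the ID generator are placeholders, not the PR's implementation.

```typescript
const CORRELATION_HEADER = 'X-Correlation-Id';

// Assumption: only short, log-safe IDs from the caller are accepted.
const SAFE_ID = /^[A-Za-z0-9._-]{1,128}$/;

// Placeholder generator; a real implementation would more likely use a UUID.
function generateCorrelationId(): string {
  return `${Date.now().toString(36)}-${Math.random().toString(36).slice(2, 10)}`;
}

// Accepts anything with a Headers-like `get`, such as a Fetch API Request.
function getCorrelationIdFromRequest(request: {
  headers: { get(name: string): string | null };
}): string {
  const inbound = request.headers.get(CORRELATION_HEADER);
  return inbound !== null && SAFE_ID.test(inbound) ? inbound : generateCorrelationId();
}

// Sets the header on the response in place and returns the same object,
// so it can wrap any `NextResponse.json(...)` call as in the handler above.
function withCorrelationIdHeader<T extends { headers: { set(name: string, value: string): void } }>(
  response: T,
  correlationId: string
): T {
  response.headers.set(CORRELATION_HEADER, correlationId);
  return response;
}
```

Rejecting malformed inbound IDs keeps arbitrary caller-supplied strings out of structured logs while still letting a well-formed ID flow through for cross-system tracing.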