Allow AI crawlers in Cloudflare + Vercel firewall #13

@neonwatty

Problem

AI crawlers (GPTBot, ClaudeBot, ChatGPT-User, PerplexityBot) are likely being blocked by:

  1. Cloudflare AI Crawl Control — managed rules and/or managed robots.txt may be overriding origin robots.txt with Disallow directives
  2. Vercel AI Bots Managed Ruleset — if set to deny, returns 403 to all AI user agents (the response carries an x-vercel-mitigated: deny header)
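
To tell the two layers apart, the response headers can be inspected for the x-vercel-mitigated: deny marker mentioned above. The following is a minimal sketch, not an official tool: diagnose_block is a hypothetical helper name, and it assumes the headers arrive on stdin in the form `curl -sI` prints them. A 403 without the Vercel header only suggests another layer (such as Cloudflare); it does not prove which one.

```shell
# diagnose_block: hypothetical helper, reads raw HTTP response headers on stdin
# (for example the output of `curl -sI`) and prints a one-line verdict.
diagnose_block() {
  tr -d '\r' | awk '
    NR == 1 { status = $2 }   # status line, e.g. "HTTP/2 403" -> 403
    tolower($0) ~ /^x-vercel-mitigated:[ \t]*deny/ { vercel = 1 }
    END {
      if (vercel)             print "blocked-by-vercel-firewall"
      else if (status == 403) print "blocked-403-other-layer"
      else                    print "not-blocked status=" status
    }'
}

# Usage (placeholder domain, requires network access):
#   curl -sI -A "Mozilla/5.0 (compatible; GPTBot/1.0)" "https://<your-domain>/" | diagnose_block
```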

If so, the site can't appear in ChatGPT responses, Claude answers, Perplexity search results, or Google AI Overviews.

What to do

Cloudflare (dashboard → AI → AI Crawl Control, formerly AI Audit)

  • Set "Block AI training bots" to Do not block
  • Set "Manage your robots.txt" to Disabled
  • On the Crawlers page, selectively block low-value training scrapers (Bytespider, CCBot) but keep GPTBot, ClaudeBot, ChatGPT-User, PerplexityBot set to Allow

Vercel (Firewall → Managed Rulesets)

  • Change AI Bots Managed Ruleset from deny to log

Codebase

  • Add public/llms.txt (llmstxt.org) for AI-readable site description
  • Verify robots.txt has no AI-specific Disallow directives
  • Verify sitemap includes all public pages
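
The robots.txt check can be scripted. The sketch below is an assumption-laden helper, not a full robots.txt parser: find_ai_disallows is a hypothetical name, it reads a robots.txt body on stdin, and it only reports non-empty Disallow rules in groups that name one of the four crawlers listed in this issue. A blanket `User-agent: *` block is deliberately not reported, since the item above is about AI-specific directives.

```shell
# find_ai_disallows: hypothetical helper, reads robots.txt on stdin and prints
# "<agent> <path>" for each non-empty Disallow rule that applies to a named AI crawler.
find_ai_disallows() {
  tr -d '\r' | awk '
    BEGIN {
      n = split("gptbot claudebot chatgpt-user perplexitybot", names, " ")
      for (i = 1; i <= n; i++) ai[names[i]] = 1
    }
    {
      sub(/#.*/, "")                         # strip comments
      idx = index($0, ":")
      if (idx == 0) next                     # not a "key: value" line
      key = tolower(substr($0, 1, idx - 1))
      val = substr($0, idx + 1)
      gsub(/^[ \t]+|[ \t]+$/, "", key)
      gsub(/^[ \t]+|[ \t]+$/, "", val)
      if (key == "user-agent") {
        # A User-agent line after a rule starts a new group.
        if (saw_rule) { group = ""; saw_rule = 0 }
        group = group " " tolower(val)
      } else if (key == "allow" || key == "disallow") {
        saw_rule = 1
        if (key == "disallow" && val != "") {
          m = split(group, g, " ")
          for (j = 1; j <= m; j++)
            if (g[j] in ai) print g[j] " " val
        }
      }
    }'
}

# Usage (placeholder domain, requires network access):
#   curl -s "https://<your-domain>/robots.txt" | find_ai_disallows
```

Empty output means no AI-specific Disallow rule was found for those four agents.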

Verify

curl -sI -A "Mozilla/5.0 (compatible; GPTBot/1.0)" https://<your-domain>/
# Should return 200, not 403
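
The same check can be repeated for all four crawlers named in this issue. This is a rough sketch: check_ai_crawlers is a hypothetical helper, and the simplified user-agent strings are assumptions modeled on the command above rather than the exact strings the real crawlers send. A firewall that verifies crawler IP ranges may treat these spoofed requests differently from genuine bot traffic.

```shell
# check_ai_crawlers: hypothetical helper, prints "<bot> <http status>" for each crawler.
# $1 is the bare domain (no scheme); the user-agent strings are simplified assumptions.
check_ai_crawlers() {
  domain=$1
  for bot in GPTBot ClaudeBot ChatGPT-User PerplexityBot; do
    code=$(curl -s -o /dev/null -w '%{http_code}' \
      -A "Mozilla/5.0 (compatible; ${bot}/1.0)" "https://${domain}/")
    printf '%s %s\n' "$bot" "$code"
  done
}

# Usage (requires network access):
#   check_ai_crawlers <your-domain>
# Each line should end in 200, not 403.
```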

Source: mean-weasel/bleep-that-shit (audit done in #620)
