Add agent-readiness discovery metadata #4
Make the site discoverable to AI agents: publish an MCP server card at /.well-known/mcp.json pointing to getbased-mcp, declare Content Signals in robots.txt, and advertise llms.txt + the MCP card via a site-wide Link header.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Greptile Summary
This PR makes
Confidence Score: 5/5
Safe to merge: all three files are additive, static metadata with no logic changes and no effect on existing site behaviour.
The changes add a new static JSON discovery file, one line to robots.txt, and one header entry in vercel.json. None of these touch application logic, routing, or security headers in a way that could break existing functionality. The Link header correctly uses root-relative paths so it works on both preview and production. No files require special attention.
Important Files Changed
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[AI Agent or Crawler] -->|GET request| B[getbased.health]
B -->|Link response header| C{Agent discovers}
C -->|rel=alternate| D[llms.txt]
C -->|rel=service-desc| E[.well-known/mcp.json]
E --> F[Install getbased-mcp via PyPI]
F -->|stdio transport| G[Local MCP Server]
G -->|GETBASED_TOKEN auth| H[sync.getbased.health Gateway]
H --> I[8 MCP tools available]
A -->|GET robots.txt| J[Content-Signal declared]
Reviews (2): Last reviewed commit: "Use relative URLs in Link header"
# getbased.health robots.txt

# Content Signals — open-source project, all uses welcome
# search: appear in search results | ai-input: ground AI answers (RAG) | ai-train: train AI models
Content-Signal is placed inside the User-agent: * record (between the agent directive and the Allow/Disallow lines). Standard robots.txt parsers ignore unknown directives in a record, so crawlers are unaffected. However, if Cloudflare's Content-Signal parser treats this as a per-agent field rather than a global one, the signal may be scoped only to the wildcard agent block rather than applying site-wide. Consider moving it above the first User-agent: line or placing it after all records to make the global intent explicit and future-proof the parsing.
Leaving this as-is. Per Cloudflare's Content Signals spec, Content-Signal is a per-record directive — it belongs inside a User-agent group alongside Allow/Disallow, not above the first User-agent: line. Placing it in the User-agent: * record is the documented pattern and applies site-wide via the wildcard. The specific bot records below (OAI-SearchBot, etc.) don't inherit it, but since every signal is yes and those bots are already fully allowed, repeating it there would just be noise.
The vercel.json comment is addressed in 89fb86d (relative URLs).
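The per-record scoping discussed in this thread can be illustrated with a small parser. This is only a sketch: the sample robots.txt below is an assumed layout built from the Content-Signal line quoted in this PR, the function name is hypothetical, and real crawlers may group records differently.

```python
def content_signals_by_agent(robots_txt: str) -> dict[str, dict[str, str]]:
    """Map each User-agent token to the Content-Signal values in its record."""
    signals: dict[str, dict[str, str]] = {}
    current_agents: list[str] = []
    in_rules = False  # True once the current record has seen a non-User-agent line

    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if in_rules:  # a User-agent line after rules starts a new record
                current_agents, in_rules = [], False
            current_agents.append(value)
            signals.setdefault(value, {})
        else:
            in_rules = True
            if field == "content-signal":
                for pair in value.split(","):
                    key, _, val = pair.partition("=")
                    for agent in current_agents:
                        signals[agent][key.strip()] = val.strip()
    return signals


# Assumed sample: only the Content-Signal line is taken from the PR.
SAMPLE = """\
User-agent: *
Content-Signal: search=yes, ai-input=yes, ai-train=yes
Allow: /

User-agent: OAI-SearchBot
Allow: /
"""

result = content_signals_by_agent(SAMPLE)
print(result["*"])              # signals declared in the wildcard record
print(result["OAI-SearchBot"])  # empty: this record declares no signals of its own
```

Under this reading, the named bot record carries no Content-Signal of its own, which matches the reply's point that the specific records do not inherit from the wildcard.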
Absolute production URLs would advertise production metadata on Vercel preview deployments; relative paths resolve correctly on both.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
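Why root-relative targets behave well on both deployments can be sketched from the client side. The header value below is an assumption reconstructed from the rel values named in this PR (the actual vercel.json entry is not shown here), the preview hostname is made up, and the parser is deliberately simplified: it does not handle commas inside quoted parameters or multiple rel values.

```python
import re
from urllib.parse import urljoin

# Assumed header value; the exact string in vercel.json may differ.
LINK_VALUE = '</llms.txt>; rel="alternate", </.well-known/mcp.json>; rel="service-desc"'


def parse_link_header(value: str) -> dict[str, str]:
    """Return {rel: target} for each '<target>; rel="..."' entry."""
    links: dict[str, str] = {}
    for target, params in re.findall(r"<([^>]*)>([^,]*)", value):
        match = re.search(r'rel="?([^";]+)"?', params)
        if match:
            links[match.group(1).strip()] = target
    return links


def resolve(links: dict[str, str], request_url: str) -> dict[str, str]:
    """Resolve each target against the URL the response came from."""
    return {rel: urljoin(request_url, target) for rel, target in links.items()}


links = parse_link_header(LINK_VALUE)
production = resolve(links, "https://getbased.health/")
# Hypothetical preview hostname, for illustration only.
preview = resolve(links, "https://preview-example.vercel.app/some/page")

print(production["service-desc"])  # https://getbased.health/.well-known/mcp.json
print(preview["service-desc"])     # https://preview-example.vercel.app/.well-known/mcp.json
```

Because the client resolves each target against the request URL, the same header value points a preview deployment at its own metadata instead of production's.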
Summary
Makes getbased.health discoverable to AI agents, per the categories on isitagentready.com. The site already had robots.txt, sitemap.xml, and llms.txt; this fills the Protocol Discovery gap and makes the existing access stance machine-readable.
- .well-known/mcp.json (new): MCP server card for getbased-mcp, with stdio transport, local install (getbased-mcp on PyPI or the getbased-agent-stack bundle), GETBASED_TOKEN auth, the sync.getbased.health gateway, all 8 tools, and links to the monorepo (getbased-agents/packages/mcp) + agent-access docs.
- robots.txt: added Content-Signal: search=yes, ai-input=yes, ai-train=yes under User-agent: *, making the existing "allow everything" stance explicit in Cloudflare's emerging format.
- vercel.json: added a site-wide Link header advertising /llms.txt (rel="alternate") and /.well-known/mcp.json (rel="service-desc").

Notes
- The .well-known/mcp.json path isn't a finalized standard yet; it's the most reasonable convention for the "MCP Server Card" check.
- The Link header applies to the catch-all /(.*) rule, so it's sent on assets too. This is harmless and could be scoped to pages later.

Test plan
- curl -I https://getbased.health/ shows the Link header
- curl https://getbased.health/.well-known/mcp.json returns valid JSON with Content-Type: application/json
- curl https://getbased.health/robots.txt shows the Content-Signal line

🤖 Generated with Claude Code
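
The three checks in the test plan could also be scripted. The following is a rough sketch, not part of the PR: the function names are hypothetical, it issues GET requests where the test plan's first check uses `curl -I` (HEAD), and the fetcher is injectable so the logic can be exercised without network access.

```python
import json
import urllib.request
from typing import Callable, Mapping, Tuple

Response = Tuple[Mapping[str, str], bytes]  # (lowercased headers, body)


def http_get(url: str) -> Response:
    """Fetch a URL and return its headers (lowercased keys) and body."""
    with urllib.request.urlopen(url) as resp:
        headers = {key.lower(): value for key, value in resp.headers.items()}
        return headers, resp.read()


def run_checks(base: str, fetch: Callable[[str], Response] = http_get) -> dict[str, bool]:
    """Run the three test-plan checks against a base URL without a trailing slash."""
    results: dict[str, bool] = {}

    # 1. The root response carries a Link header.
    headers, _ = fetch(base + "/")
    results["link_header"] = "link" in headers

    # 2. The MCP card is valid JSON served as application/json.
    headers, body = fetch(base + "/.well-known/mcp.json")
    try:
        json.loads(body)
        is_json = True
    except ValueError:
        is_json = False
    content_type = headers.get("content-type", "")
    results["mcp_json"] = is_json and content_type.startswith("application/json")

    # 3. robots.txt contains a Content-Signal line.
    _, body = fetch(base + "/robots.txt")
    results["content_signal"] = b"Content-Signal:" in body

    return results
```

Calling `run_checks("https://getbased.health")` would perform the live requests; passing a stub as `fetch` keeps the checks testable offline.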