Add agent-readiness discovery metadata#4

Merged: elkimek merged 2 commits into main from feat/agent-readiness on May 14, 2026

Conversation

@elkimek (Owner) commented May 14, 2026

Summary

Makes getbased.health discoverable to AI agents, per the categories on isitagentready.com. The site already had robots.txt, sitemap.xml, and llms.txt; this fills the Protocol Discovery gap and makes the existing access stance machine-readable.

  • .well-known/mcp.json (new) — MCP server card for getbased-mcp: stdio transport, local install (getbased-mcp on PyPI or the getbased-agent-stack bundle), GETBASED_TOKEN auth, the sync.getbased.health gateway, all 8 tools, and links to the monorepo (getbased-agents/packages/mcp) + agent-access docs.
  • robots.txt — added Content-Signal: search=yes, ai-input=yes, ai-train=yes under User-agent: *, making the existing "allow everything" stance explicit in Cloudflare's emerging format.
  • vercel.json — added a site-wide Link header advertising /llms.txt (rel="alternate") and /.well-known/mcp.json (rel="service-desc").
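
The vercel.json change above can be sketched as follows. This is a minimal illustration, assuming Vercel's `headers` array layout (`source`, `headers`, `key`, `value`); the helper names `build_link_header` and `vercel_headers_entry` are hypothetical, and the exact header value in the merged file may differ.

```python
import json

# Discovery targets named in the summary: (root-relative path, rel value).
LINK_TARGETS = [
    ("/llms.txt", "alternate"),
    ("/.well-known/mcp.json", "service-desc"),
]


def build_link_header(targets):
    """Join (path, rel) pairs into a single Link header value."""
    return ", ".join(f'<{path}>; rel="{rel}"' for path, rel in targets)


def vercel_headers_entry(targets):
    """Return a vercel.json fragment that sends the Link header on every path."""
    return {
        "headers": [
            {
                "source": "/(.*)",  # catch-all rule, so assets get the header too
                "headers": [{"key": "Link", "value": build_link_header(targets)}],
            }
        ]
    }


print(json.dumps(vercel_headers_entry(LINK_TARGETS), indent=2))
```

Keeping the targets in one list means a future discovery document only needs one new `(path, rel)` pair.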

Notes

  • The .well-known/mcp.json path isn't a finalized standard yet — it's the most reasonable convention for the "MCP Server Card" check.
  • The Link header is attached to the catch-all /(.*) rule, so it is also sent on static assets. This is harmless, and it could be scoped to pages later.
  • Markdown content negotiation (isitagentready's "Content Accessibility") is intentionally out of scope here.

Test plan

  • After deploy, curl -I https://getbased.health/ shows the Link header
  • curl https://getbased.health/.well-known/mcp.json returns valid JSON with Content-Type: application/json
  • curl https://getbased.health/robots.txt shows the Content-Signal line
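
The three checks above could also be scripted. The sketch below is one possible approach, not part of the PR: the function names are hypothetical, the Link parsing is deliberately naive (it splits on commas, so it would misread URLs that contain commas), and `fetch` needs network access and is defined but not called here.

```python
import json
from urllib.request import Request, urlopen


def has_link_rel(link_header, rel):
    """True if any entry in a Link header value carries the given rel."""
    for entry in link_header.split(","):
        params = [p.strip() for p in entry.split(";")[1:]]
        if f'rel="{rel}"' in params or f"rel={rel}" in params:
            return True
    return False


def is_json_response(content_type, body):
    """True if the Content-Type is application/json and the body parses."""
    if content_type.split(";")[0].strip().lower() != "application/json":
        return False
    try:
        json.loads(body)
    except ValueError:
        return False
    return True


def has_content_signal(robots_txt):
    """True if any non-comment line is a Content-Signal directive."""
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0].strip()
        if line.lower().startswith("content-signal:"):
            return True
    return False


def fetch(url):
    """Return (headers, body_text) for a GET request; requires network access."""
    request = Request(url, headers={"User-Agent": "agent-readiness-check"})
    with urlopen(request) as resp:
        return resp.headers, resp.read().decode("utf-8")
```

A run after deploy would pass `fetch("https://getbased.health/")[0].get("Link", "")` to `has_link_rel`, and the other two responses to their matching checks.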

🤖 Generated with Claude Code

Make the site discoverable to AI agents: publish an MCP server card at
/.well-known/mcp.json pointing to getbased-mcp, declare Content Signals in
robots.txt, and advertise llms.txt + the MCP card via a site-wide Link header.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

vercel Bot commented May 14, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| get-based-site | Ready | Preview, Comment | May 14, 2026 10:51am |

greptile-apps Bot commented May 14, 2026

Greptile Summary

This PR makes getbased.health discoverable to AI agents by adding three pieces of metadata: a new .well-known/mcp.json MCP server card, an explicit Content-Signal directive in robots.txt, and a Link response header in vercel.json advertising both llms.txt and the MCP card.

  • .well-known/mcp.json — new static file describing the getbased-mcp stdio server: 8 tools, token-based auth, local install via PyPI, and links to the monorepo and docs.
  • robots.txt — adds Content-Signal: search=yes, ai-input=yes, ai-train=yes inside the wildcard agent block to make the existing open-access stance explicit.
  • vercel.json — appends a site-wide Link header using root-relative paths (</llms.txt> and </.well-known/mcp.json>), which resolves correctly on both preview and production deployments.

Confidence Score: 5/5

Safe to merge — all three files are additive, static metadata with no logic changes and no effect on existing site behaviour.

The changes add a new static JSON discovery file, one line to robots.txt, and one header entry in vercel.json. None of these touch application logic, routing, or security headers in a way that could break existing functionality. The Link header correctly uses root-relative paths so it works on both preview and production.
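
To illustrate why root-relative targets work on any host, here is a small sketch of how a client might resolve them against the URL it actually requested. The `resolve_links` helper and the preview hostname are made up for the example; only the two paths and rel values come from the PR.

```python
import re
from urllib.parse import urljoin

# Matches one Link entry: <target>; rel="value" (quotes optional).
LINK_ENTRY = re.compile(r'<([^>]*)>\s*;\s*rel="?([^";,]+)"?')


def resolve_links(request_url, link_header):
    """Map each rel in a Link header to an absolute URL for this request."""
    return {
        rel: urljoin(request_url, target)
        for target, rel in LINK_ENTRY.findall(link_header)
    }


header = '</llms.txt>; rel="alternate", </.well-known/mcp.json>; rel="service-desc"'

# The second hostname is a placeholder standing in for a preview deployment.
for origin in ("https://getbased.health/", "https://preview.example.test/docs/"):
    print(resolve_links(origin, header))
```

Because each target starts with `/`, `urljoin` keeps the scheme and host of the request and replaces only the path, so a preview deployment advertises its own copies of the files.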

No files require special attention.

Important Files Changed

| Filename | Overview |
| --- | --- |
| .well-known/mcp.json | New MCP server card advertising the getbased-mcp package: well-structured with 8 tools, auth docs, local-install metadata, and correct relative gateway/install references. No issues found. |
| robots.txt | Adds Content-Signal directive to the User-agent: * record making the open-access stance machine-readable; placement concern was already raised in a previous thread. |
| vercel.json | Adds a Link response header advertising /llms.txt and /.well-known/mcp.json using correct root-relative URLs on the site-wide /(.*) rule. No issues found. |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[AI Agent or Crawler] -->|GET request| B[getbased.health]
    B -->|Link response header| C{Agent discovers}
    C -->|rel=alternate| D[llms.txt]
    C -->|rel=service-desc| E[.well-known/mcp.json]
    E --> F[Install getbased-mcp via PyPI]
    F -->|stdio transport| G[Local MCP Server]
    G -->|GETBASED_TOKEN auth| H[sync.getbased.health Gateway]
    H --> I[8 MCP tools available]
    A -->|GET robots.txt| J[Content-Signal declared]

Reviews (2): Last reviewed commit: "Use relative URLs in Link header"

Comment thread robots.txt
# getbased.health robots.txt

# Content Signals — open-source project, all uses welcome
# search: appear in search results | ai-input: ground AI answers (RAG) | ai-train: train AI models

P2 Content-Signal is placed inside the User-agent: * record (between the agent directive and the Allow/Disallow lines). Standard robots.txt parsers ignore unknown directives in a record, so crawlers are unaffected. However, if Cloudflare's Content-Signal parser treats this as a per-agent field rather than a global one, the signal may be scoped only to the wildcard agent block rather than applying site-wide. Consider moving it above the first User-agent: line or placing it after all records to make the global intent explicit and future-proof the parsing.

elkimek (Owner, Author) replied:

Leaving this as-is. Per Cloudflare's Content Signals spec, Content-Signal is a per-record directive — it belongs inside a User-agent group alongside Allow/Disallow, not above the first User-agent: line. Placing it in the User-agent: * record is the documented pattern and applies site-wide via the wildcard. The specific bot records below (OAI-SearchBot, etc.) don't inherit it, but since every signal is yes and those bots are already fully allowed, repeating it there would just be noise.
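
The per-record scoping described in this reply can be illustrated with a small grouping sketch. It assumes groups are delimited the way standard robots.txt records are (a User-agent line after rule lines starts a new group); it is not Cloudflare's parser, and both the function name and the reduced sample file are hypothetical.

```python
def content_signals_by_agent(robots_txt):
    """Return {user_agent: Content-Signal value or None} for each group."""
    signals = {}
    current_agents = []
    in_rules = False  # True once the current group has seen a rule line
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if in_rules:  # a User-agent line after rules starts a new group
                current_agents = []
                in_rules = False
            current_agents.append(value)
            signals.setdefault(value, None)
        else:
            in_rules = True
            if field == "content-signal":
                for agent in current_agents:
                    signals[agent] = value
    return signals


# Reduced, hypothetical sample; the real file has more records.
SAMPLE = """\
# Content Signals
User-agent: *
Content-Signal: search=yes, ai-input=yes, ai-train=yes
Allow: /

User-agent: OAI-SearchBot
Allow: /
"""

print(content_signals_by_agent(SAMPLE))
```

Under this reading the wildcard record carries the signal and the named bot record does not, which matches the point that specific records do not inherit it.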

The vercel.json comment is addressed in 89fb86d (relative URLs).

Comment thread vercel.json
Absolute production URLs would advertise production metadata on Vercel
preview deployments; relative paths resolve correctly on both.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@elkimek elkimek merged commit 5e61c9e into main May 14, 2026
4 checks passed