Problem
backlinks is Common-Crawl-only today (per-project autoExtractBacklinks + cc_release_syncs, via @ainyc/canonry-integration-commoncrawl). Common Crawl publishes roughly monthly with additional processing lag, which is too low-frequency for timely backlink monitoring or link-acquisition verification.
Google Search Console exposes no links API (the Links report is UI-only). Bing Webmaster Tools is the only programmatic, faster-refreshing first-party link source, and integration-bing currently wires only GetUserSites / GetUrlInfo / GetQueryStats / GetCrawlStats / GetCrawlIssues / SubmitUrl*. There are no link endpoints.
Proposal
Add Bing Webmaster inbound links as a second backlinks source and make the backlinks surface source-aware (commoncrawl | bing-webmaster) end to end (API → CLI → MCP → UI), so an operator can monitor inbound links from whichever source they have set up.
Scope by layer
integration-bing (packages/integration-bing/src/bing-client.ts)
- Add
getUrlLinks() / getLinkCounts() calling Bing GetUrlLinks / GetLinkCounts; add BingInboundLink / BingLinkCount types.
backlinks model (packages/api-routes/src/backlinks.ts, packages/contracts/src/backlinks.ts, packages/db)
- Add a
source discriminator (commoncrawl | bing-webmaster) on backlink rows, plus a migration (per the schema-change rule).
- Ingest Bing inbound links into the same store; leave Common Crawl ingestion unchanged.
backlinks-sync schedule (packages/contracts/src/schedule.ts + scheduler)
- Sync from every source the project has set up (or accept a
--source filter). Common Crawl stays release-driven; Bing pulls live from the connected account.
API / CLI / MCP
source filter + per-source counts on the existing backlinks read endpoints; new response shapes get Zod schemas + jsonResponse (spec-driven typing). MCP parity per the default rule. Maintain UI/CLI parity (any metric shown in the UI is in the CLI/API response).
UI (apps/web/src/pages/BacklinksPage.tsx, apps/web/src/components/project/BacklinksSection.tsx)
- Source badge / filter on the links table; per-source freshness note (Common Crawl ~monthly vs Bing live).
- Instructive empty / onboarding state when no source is set up (per the empty-state rule).
Connection-awareness (important)
The two sources have different setup affordances; the UI/API must reflect availability and never error when a source is absent:
- Common Crawl: public dataset, no auth. "Available" =
autoExtractBacklinks enabled + a synced release. Affordance = enable the toggle / run a release sync.
- Bing Webmaster: requires a connected Bing Webmaster account (
bing_connections, per domain). Affordance = connect Bing.
Backlinks reads should report which sources are available / connected and degrade gracefully across CC-only / Bing-only / both / neither (neither → the onboarding empty state). Add a backlinks.source.connected doctor check (warn when a project has neither source set up).
Out of scope
- GSC URL Inspection
referringUrls (already stored per-URL from inspections; a separate per-page signal, not a backlink index).
- Paid third-party link APIs (Ahrefs / Majestic / Semrush).
Acceptance
- Bing inbound links appear in
backlinks via API, CLI (--format json), MCP, and the UI, tagged source=bing-webmaster, alongside Common Crawl.
- The backlinks section shows an instructive empty state when neither source is set up, and per-source availability when one is.
- Tests cover the Bing client methods, the source discriminator + ingestion, and the source-availability logic.
Problem
backlinksis Common-Crawl-only today (per-projectautoExtractBacklinks+cc_release_syncs, via@ainyc/canonry-integration-commoncrawl). Common Crawl publishes roughly monthly with additional processing lag, which is too low-frequency for timely backlink monitoring or link-acquisition verification.Google Search Console exposes no links API (the Links report is UI-only). Bing Webmaster Tools is the only programmatic, faster-refreshing first-party link source, and
integration-bingcurrently wires onlyGetUserSites/GetUrlInfo/GetQueryStats/GetCrawlStats/GetCrawlIssues/SubmitUrl*. There are no link endpoints.Proposal
Add Bing Webmaster inbound links as a second backlinks source and make the backlinks surface source-aware (
commoncrawl | bing-webmaster) end to end (API → CLI → MCP → UI), so an operator can monitor inbound links from whichever source they have set up.Scope by layer
integration-bing (
packages/integration-bing/src/bing-client.ts)getUrlLinks()/getLinkCounts()calling BingGetUrlLinks/GetLinkCounts; addBingInboundLink/BingLinkCounttypes.backlinks model (
packages/api-routes/src/backlinks.ts,packages/contracts/src/backlinks.ts,packages/db)sourcediscriminator (commoncrawl | bing-webmaster) on backlink rows, plus a migration (per the schema-change rule).backlinks-sync schedule (
packages/contracts/src/schedule.ts+ scheduler)--sourcefilter). Common Crawl stays release-driven; Bing pulls live from the connected account.API / CLI / MCP
sourcefilter + per-source counts on the existing backlinks read endpoints; new response shapes get Zod schemas +jsonResponse(spec-driven typing). MCP parity per the default rule. Maintain UI/CLI parity (any metric shown in the UI is in the CLI/API response).UI (
apps/web/src/pages/BacklinksPage.tsx,apps/web/src/components/project/BacklinksSection.tsx)Connection-awareness (important)
The two sources have different setup affordances; the UI/API must reflect availability and never error when a source is absent:
autoExtractBacklinksenabled + a synced release. Affordance = enable the toggle / run a release sync.bing_connections, per domain). Affordance = connect Bing.Backlinks reads should report which sources are available / connected and degrade gracefully across CC-only / Bing-only / both / neither (neither → the onboarding empty state). Add a
backlinks.source.connecteddoctor check (warn when a project has neither source set up).Out of scope
referringUrls(already stored per-URL from inspections; a separate per-page signal, not a backlink index).Acceptance
backlinksvia API, CLI (--format json), MCP, and the UI, taggedsource=bing-webmaster, alongside Common Crawl.