
Respect no index #8

Merged
fidalgo merged 2 commits into main from respect-no-index on Jun 1, 2026

fidalgo commented Jun 1, 2026

Contributor

Carry resolved page headers and parsed HTML into link validation so
Crawlscope can avoid reporting noindexed internal targets as indexable
pages missing from the sitemap. Reuse the existing robots directive
parsing from the indexability rule for both meta robots and
X-Robots-Tag.
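
A rough sketch of the idea above, not the actual Crawlscope code: the module name, method names, and the page hash shape (`:url`, `:headers`, `:meta_robots`) are all assumptions made for illustration. It treats `noindex` and `none` as blocking directives from either source and skips those targets before comparing against the sitemap. User-agent-scoped header values such as `googlebot: noindex` are not handled here.

```ruby
# Hypothetical helper; Crawlscope's real directive parser may look different.
module RobotsDirectives
  BLOCKING = %w[noindex none].freeze

  # Splits "noindex, nofollow" (or an array of such values) into lowercase tokens.
  def self.tokens(value)
    Array(value).join(",").downcase.split(",").map(&:strip).reject(&:empty?)
  end

  # True when the X-Robots-Tag header or the meta robots content blocks indexing.
  def self.noindex?(headers: {}, meta_robots: nil)
    header_value = headers.find { |name, _| name.to_s.casecmp?("x-robots-tag") }&.last
    (tokens(header_value) + tokens(meta_robots)).any? { |token| BLOCKING.include?(token) }
  end
end

# Returns URLs of internal targets that are indexable but absent from the sitemap.
# Each page is assumed to be a hash with :url, :headers, and :meta_robots keys.
def missing_from_sitemap(pages, sitemap_urls)
  pages
    .reject { |page| RobotsDirectives.noindex?(headers: page[:headers], meta_robots: page[:meta_robots]) }
    .map { |page| page[:url] }
    .reject { |url| sitemap_urls.include?(url) }
end
```

Sharing one `noindex?` check between the header and the meta tag is what keeps the link rule and the indexability rule from disagreeing about the same page.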

Also exclude form markup from the visible-text-ratio HTML denominator,
so form-heavy tool pages are not pushed toward filler copy just to
offset autocomplete payloads or hidden control markup.
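
To illustrate the denominator change, here is a minimal sketch that assumes the ratio is visible text length divided by HTML length. The method names are hypothetical, and regex stripping stands in for whatever parsed-HTML handling the rule really uses. The PR does not say whether form text still counts toward the numerator; this sketch keeps it.

```ruby
# Crude regexes for illustration only; a real implementation would walk parsed HTML.
FORM_BLOCK = %r{<form\b.*?</form>}mi
NON_VISIBLE = %r{<(script|style)\b.*?</\1>}mi
TAG = /<[^>]+>/

# Length of the text left after dropping script/style blocks and all tags.
def visible_text_length(html)
  html.gsub(NON_VISIBLE, " ").gsub(TAG, " ").gsub(/\s+/, " ").strip.length
end

# Ratio of visible text to HTML size, optionally ignoring <form> markup in the denominator.
def visible_text_ratio(html, exclude_forms: true)
  denominator_html = exclude_forms ? html.gsub(FORM_BLOCK, "") : html
  return 0.0 if denominator_html.empty?

  visible_text_length(html).to_f / denominator_html.length
end
```

On a page whose form carries a long hidden input or a large option list, the ratio with `exclude_forms: true` comes out higher than with `false`, so the page is no longer penalized for control markup.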

How to test

Release notes

Does this change need an explicit changelog or release-note callout beyond the PR title?

Checklist

  • PR title is a Conventional Commit and suitable for squash merge
  • Tests added/updated (if applicable)
  • Docs updated (if applicable)
  • I kept changes focused and easy to review

About the maintainers

Built by Ethos Link, the team behind Reviato.

fidalgo added 2 commits June 1, 2026 10:51
  Carry resolved page headers and parsed HTML into link validation so
  Crawlscope can avoid reporting noindexed internal targets as indexable
  pages missing from the sitemap. Reuse the existing robots directive
  parsing from the indexability rule for both meta robots and
  X-Robots-Tag.

  Also exclude form markup from the visible-text-ratio HTML denominator,
  so form-heavy tool pages are not pushed toward filler copy just to
  offset autocomplete payloads or hidden control markup.
Group validation issues by category and code so large reports are easier to scan while still printing every issue. Keep issue rows compact and document the grouped output shape.

Verification: bundle exec ruby -Itest test/crawlscope/reporter_test.rb; bundle exec rake test; bundle exec rake standard
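
One way the grouped output could be produced, as a sketch only: the `Issue` struct, the `grouped_report` name, and the exact line layout are assumptions, not the reporter's documented shape. Issues are grouped by category, then by code, and every issue still gets its own compact row.

```ruby
# Hypothetical issue shape; the real reporter may carry different fields.
Issue = Struct.new(:category, :code, :url, :message, keyword_init: true)

# Builds a report grouped by category, then code, with one row per issue.
def grouped_report(issues)
  lines = []
  issues.group_by(&:category).sort_by { |category, _| category.to_s }.each do |category, in_category|
    lines << "#{category} (#{in_category.size})"
    in_category.group_by(&:code).sort_by { |code, _| code.to_s }.each do |code, in_code|
      lines << "  #{code} (#{in_code.size})"
      in_code.each { |issue| lines << "    #{issue.url} #{issue.message}" }
    end
  end
  lines.join("\n")
end
```

The counts on the group headers give a quick overview of a large report, while the indented rows keep the full detail available.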
fidalgo merged commit 2bcbfdb into main Jun 1, 2026
4 checks passed
fidalgo deleted the respect-no-index branch June 1, 2026 15:36