fix: suppress unintended_html_in_doc_comment for spec-prose angle brackets #215

Open
nottmey wants to merge 2 commits into eseidel:main from nottmey:fix/unintended-html-in-doc-comment

nottmey commented Jun 26, 2026

Summary

Spec descriptions copy prose verbatim into dartdoc, where <x> is parsed as an HTML tag start even when it is not HTML. Backstage's catalog spec describes an entity ref as location:default/generated-<sha1hex>, and <sha1hex> reads as an (unclosed) tag start to dartdoc, tripping very_good_analysis's unintended_html_in_doc_comment on every consumer package.

Fix

Mirrors PR #138's comment_references handling: spec prose that trips dartdoc's parser should be handled at the generator boundary rather than forcing every consumer to hand-suppress. The prose itself is left untouched. Adds a maybeAddUnintendedHtmlIgnore emit-time helper (analogous to maybeAddCommentReferencesIgnore) that prepends // ignore_for_file: unintended_html_in_doc_comment to any file whose /// doc comments carry an angle-bracket token dartdoc would parse as an HTML tag start.

Wired into maybeAddIgnoreDirectives alongside the existing long-line and comment_references helpers, so it runs at file-emit time on every space_gen-emitted file. Hand-written templates bypass it (same design rationale as #138).

Detection matches < immediately followed by an ASCII letter or by /+letter — the shapes dartdoc's HTML parser treats as element starts. Plain comparison prose like a < b or i < 0 does not match because the character after < is a space or digit, not a letter.
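
The detection and prepend steps described above can be sketched roughly as follows. This is an illustrative stand-in, not the code in this PR: the tag-start regex is the one quoted in the follow-up commit and the directive text comes from the description, but the function body, the `_docCommentLine` pattern, and the single-line directive placement are assumptions made for the example.

```dart
/// Directive text as given in the PR description.
const ignoreDirective = '// ignore_for_file: unintended_html_in_doc_comment';

/// `<` immediately followed by a letter, or by `/` and a letter.
final _tagStart = RegExp(r'</?[A-Za-z]');

/// Assumption: a doc comment line is optional indentation followed by `///`.
/// Group 1 captures the comment text after the slashes.
final _docCommentLine = RegExp(r'^[ \t]*///(.*)$', multiLine: true);

/// Hypothetical sketch of the helper named in the PR.
String maybeAddUnintendedHtmlIgnore(String source) {
  // Idempotency: a file that already carries the directive is returned as-is.
  if (source.contains(ignoreDirective)) return source;
  // Only `///` lines are inspected; plain `//` comments and code are skipped.
  final hasTagLikeToken = _docCommentLine
      .allMatches(source)
      .any((m) => _tagStart.hasMatch(m.group(1)!));
  return hasTagLikeToken ? '$ignoreDirective\n$source' : source;
}

void main() {
  const flagged = '/// location:default/generated-<sha1hex>.\n'
      'final String entityRef;\n';
  const comparison = '/// True when a < b or i < 0.\nfinal bool ok = true;\n';
  const plainComment = '// not a doc comment: <sha1hex>\nfinal int x = 0;\n';

  print(maybeAddUnintendedHtmlIgnore(flagged).startsWith(ignoreDirective));
  print(maybeAddUnintendedHtmlIgnore(comparison) == comparison);
  print(maybeAddUnintendedHtmlIgnore(plainComment) == plainComment);
}
```

Restricting the scan to `///` lines keeps angle brackets in ordinary code (generics, comparisons) from triggering the directive, which is the behavior the test plan describes for `//` comments and comparison prose.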

Spec that surfaced this

Backstage's catalog OpenAPI spec:

https://github.com/backstage/backstage/blob/master/plugins/catalog-backend/src/schema/openapi.yaml

Schema excerpt (Location.entityRef):

entityRef:
  type: string
  description: The entity ref of the corresponding Location kind entity, e.g. location:default/generated-<sha1hex>.

Generated dartdoc (location.dart):

/// The entity ref of the corresponding Location kind entity, e.g.
/// location:default/generated-<sha1hex>.
final String entityRef;

Before this fix, dart analyze on the generated package reports:

info - lib/models/location.dart:45:34 - Angle brackets will be interpreted as HTML.

After this fix, it reports: No issues found!

Test plan

  • Added a maybeAddUnintendedHtmlIgnore group to test/render/formatting_test.dart — 7 tests covering: pass-through with no angle brackets, prepend on tag-like <sha1hex>, no match on // plain comments, no fire on comparison prose (a < b, i < 0), fire on closing-tag </foo>, idempotency, indented member docs.
  • dart test test/render/formatting_test.dart — all formatting tests pass.
  • dart test (full suite) — 554 tests pass, no regressions.
  • End-to-end: ran this branch's bin/space_gen.dart against the Backstage catalog spec; dart analyze on the generated package reports No issues found! (previously 1 unintended_html_in_doc_comment info).

Made with Cursor

nottmey and others added 2 commits June 26, 2026 16:41
…ckets

Spec descriptions copy prose verbatim into dartdoc, where `<x>` is parsed
as an HTML tag start even when it is not HTML. Backstage's catalog spec
describes an entity ref as `location:default/generated-<sha1hex>`, and
`<sha1hex>` reads as an (unclosed) tag start to dartdoc, tripping
`very_good_analysis`'s `unintended_html_in_doc_comment` on every consumer.

Mirrors PR eseidel#138's `comment_references` handling: spec prose that trips
dartdoc's parser should be sanitized at the generator boundary rather
than forcing every consumer to hand-suppress. Adds a
`maybeAddUnintendedHtmlIgnore` emit-time helper (analogous to
`maybeAddCommentReferencesIgnore`) that prepends
`// ignore_for_file: unintended_html_in_doc_comment` to any file whose
`///` doc comments carry an angle-bracket token dartdoc would parse as
an HTML tag start.

Detection matches `<` immediately followed by an ASCII letter or by
`/`+letter — the shapes dartdoc's HTML parser treats as element starts.
Plain comparison prose like `a < b` or `i < 0` does not match because
the character after `<` is a space or digit, not a letter.

Spec that surfaced this:
https://github.com/backstage/backstage/blob/master/plugins/catalog-backend/src/schema/openapi.yaml

Co-authored-by: Cursor <cursoragent@cursor.com>
Two follow-ups to 4a74e4e:

- Emitted `unintended_html_in_doc_comment` ignore block referenced
  Backstage-specific jargon (`entity-ref examples like
  \`location:default/generated-<sha1hex>\``), which leaks one
  consumer's vocabulary into every other consumer's generated
  files. Genericize to `placeholder tokens like \`<sha1hex>\``
  — same shape, no domain-specific wording. Mirrors the existing
  `commentReferencesIgnoreBlock` which already stays generic
  ("placeholder text, ALL_CAPS tokens, license templates"). The
  internal dartdoc on the const keeps the Backstage reference for
  maintainer provenance, matching `commentReferencesIgnoreBlock`'s
  citation of github's code-of-conduct.

- Detection regex `<(?:[A-Za-z]|/[A-Za-z])` is a verbose way to
  write `</?[A-Za-z]` — same match set, one fewer alternation.
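
A quick way to check the equivalence claim is to run both patterns over the same inputs and compare what they match. The two regexes are copied from the bullet above; the helper names and sample strings are illustrative assumptions, not taken from the PR's test suite.

```dart
// Both patterns as quoted in the commit message.
final verbose = RegExp(r'<(?:[A-Za-z]|/[A-Za-z])');
final compact = RegExp(r'</?[A-Za-z]');

/// Every substring of [input] matched by [re], in order.
List<String> matchesOf(RegExp re, String input) =>
    re.allMatches(input).map((m) => m.group(0)!).toList();

/// True when both patterns produce the same sequence of matches on [input].
bool agree(String input) =>
    matchesOf(verbose, input).join('\u0000') ==
    matchesOf(compact, input).join('\u0000');

void main() {
  // Illustrative samples: tag-like tokens, comparison prose, and near-misses.
  const samples = [
    'generated-<sha1hex>.',
    'closes </foo>',
    'a < b',
    'i < 0',
    '<1>',
    '</ x>',
    '< /a>',
  ];
  for (final s in samples) {
    print('${agree(s) ? 'same' : 'DIFFERENT'}: "$s" -> ${matchesOf(compact, s)}');
  }
}
```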

Co-authored-by: Cursor <cursoragent@cursor.com>