feat(mail): enforce HTML lint for compose paths#735
feat(mail): enforce HTML lint for compose paths#735LiLithium wants to merge 1 commit intolarksuite:mainfrom
Conversation
📝 WalkthroughWalkthroughThis PR introduces an HTML linting system for the mail package. It adds a new Changes
Sequence DiagramsequenceDiagram
participant Client
participant MailShortcut
participant HTMLLinter
participant MailBackend
participant Output
Client->>MailShortcut: Compose mail with HTML body
MailShortcut->>MailShortcut: Detect HTML content
MailShortcut->>HTMLLinter: lintHTMLBeforeWrite(htmlBody)
HTMLLinter->>HTMLLinter: Parse & traverse nodes
HTMLLinter->>HTMLLinter: Sanitize (tags, attrs, URLs, CSS)
HTMLLinter->>HTMLLinter: Downgrade legacy tags
HTMLLinter-->>MailShortcut: Return Result (warnings, errors, cleanedHTML)
alt Lint Errors Present
MailShortcut->>Output: Return error (abort execution)
else No Errors
MailShortcut->>MailShortcut: Update body with cleanedHTML
MailShortcut->>MailBackend: Write sanitized draft/mail
MailBackend-->>MailShortcut: Confirm write
MailShortcut->>Output: Include lint report (lint_applied, original_blocked)
Output-->>Client: Success with lint findings
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~35 minutes Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 3 | ❌ 2❌ Failed checks (2 warnings)
✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Review rate limit: 7/8 reviews remaining, refill in 7 minutes and 30 seconds.Comment |
There was a problem hiding this comment.
Pull request overview
This PR introduces an HTML linting/sanitization layer for the mail compose/edit shortcuts, plus a standalone +lint-html command and supporting reference docs, to improve Feishu editor compatibility and block unsafe HTML.
Changes:
- Add
+lint-htmlshortcut for local HTML linting/autofix and document its output contract. - Enforce HTML linting in compose/write paths (
+send,+reply,+reply-all,+forward,+draft-create,+draft-edit) and surface findings vialint_applied/original_blocked. - Add Feishu-compatible HTML guidance docs (native style templates + compatibility whitelist).
Reviewed changes
Copilot reviewed 14 out of 14 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
| skills/lark-mail/references/lark-mail-native-style.md | Adds Feishu-native HTML writing templates for common mail types. |
| skills/lark-mail/references/lark-mail-lint-html.md | Documents the new +lint-html shortcut usage and output contract. |
| skills/lark-mail/references/lark-mail-html-compat.md | Documents the HTML compatibility whitelist/blocklist and inline image path safety. |
| skills/lark-mail/SKILL.md | Updates skill rules and shortcut list to reference HTML linting + templates/whitelist. |
| shortcuts/mail/shortcuts.go | Registers the new MailLintHTML shortcut. |
| shortcuts/mail/mail_lint_html.go | Implements +lint-html shortcut. |
| shortcuts/mail/htmllint/lint.go | Introduces the HTML lint/sanitize engine (tag/attr/style/url handling). |
| shortcuts/mail/helpers.go | Adds lintHTMLBeforeWrite and addLintReport helpers and wires lint report into outputs. |
| shortcuts/mail/mail_send.go | Runs lint before writing HTML drafts/sends and includes lint report in outputs. |
| shortcuts/mail/mail_reply.go | Runs lint for reply HTML compose path and includes lint report in outputs. |
| shortcuts/mail/mail_reply_all.go | Runs lint for reply-all HTML compose path and includes lint report in outputs. |
| shortcuts/mail/mail_forward.go | Runs lint for forward HTML compose path and includes lint report in outputs. |
| shortcuts/mail/mail_draft_create.go | Runs lint for draft-create HTML bodies and includes lint report in outputs. |
| shortcuts/mail/mail_draft_edit.go | Runs lint across body-related patch ops and includes lint report in outputs. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| if strings.EqualFold(strings.TrimSpace(os.Getenv("LARK_CLI_MAIL_LINT_MODE")), "warn-only") { | ||
| return raw, result, nil | ||
| } | ||
| if result.CleanedHTML != "" || result.HasFindings() { |
There was a problem hiding this comment.
lintHTMLBeforeWrite currently returns result.CleanedHTML whenever it is non-empty. Since htmllint.Lint(autoFix=true) always renders HTML into CleanedHTML for any non-empty input, this will rewrite/canonicalize all HTML even when there are no findings. Consider only returning CleanedHTML when it differs from the input or when findings require a change, otherwise return the original raw string to avoid unexpected formatting/escaping changes.
| if result.CleanedHTML != "" || result.HasFindings() { | |
| if result.CleanedHTML != "" && (result.HasFindings() || result.CleanedHTML != raw) { |
| var lintReport htmllint.Result | ||
| if !input.PlainText && bodyIsHTML(input.Body) { | ||
| input.Body, lintReport, err = lintHTMLBeforeWrite(input.Body) | ||
| if err != nil { | ||
| return err |
There was a problem hiding this comment.
In +draft-create, linting is only applied when the incoming body is already HTML. However buildRawEMLForDraftCreate upgrades plain text to HTML when a signature is requested (sigResult != nil), which means that write path can bypass linting entirely. Consider moving linting to the point where the final HTML body is assembled (after the plain-text→HTML upgrade and signature injection), or broadening the condition to also lint when sigResult is non-nil.
| out.WriteByte('>') | ||
| renderChildren(out, n, result, autoFix) | ||
| out.WriteString("</") | ||
| out.WriteString(tag) | ||
| out.WriteByte('>') |
There was a problem hiding this comment.
htmllint.renderNode unconditionally emits a closing tag for every element. This will produce invalid HTML like <br></br> / <img ...></img> for void elements, which may reduce compatibility with mail clients / Feishu editor. Consider special-casing void elements (br/hr/img/...) to render without an end tag (or use x/net/html's renderer on a rewritten node).
| "ok": ok, | ||
| "data": result, | ||
| } | ||
| runtime.Out(out, nil) |
There was a problem hiding this comment.
This shortcut returns cleaned_html (HTML content) inside JSON, but uses runtime.Out which HTML-escapes JSON by default. That will turn </> into \u003c/\u003e, making cleaned_html hard to reuse. Consider using runtime.OutRaw / OutFormatRaw for this command so HTML is preserved verbatim in JSON output.
| runtime.Out(out, nil) | |
| runtime.OutRaw(out, nil) |
| } | ||
| runtime.Out(out, nil) | ||
| if !ok { | ||
| return output.ErrValidation("mail HTML lint found incompatible content") |
There was a problem hiding this comment.
The command writes a success JSON envelope to stdout via runtime.Out, then returns output.ErrValidation when findings are present. The framework will emit an additional JSON error envelope, resulting in double/contradictory output. If the intent is to keep the structured lint report on stdout while only changing the exit code, return output.ErrBare(output.ExitValidation) (similar to shortcuts/drive/+push), or avoid emitting stdout output on the error path.
| return output.ErrValidation("mail HTML lint found incompatible content") | |
| return output.ErrBare(output.ExitValidation) |
| func Lint(raw string, autoFix bool) (Result, error) { | ||
| result := Result{} | ||
| nodes, err := xhtml.ParseFragment(strings.NewReader(raw), &xhtml.Node{Type: xhtml.ElementNode, Data: "body"}) | ||
| if err != nil { | ||
| result.Errors = append(result.Errors, Finding{ |
There was a problem hiding this comment.
New HTML sanitization/linting logic is introduced here (tag conversions, blocked tags, URL/style filtering), but there are no accompanying unit tests in this PR. Since shortcuts/mail already has extensive test coverage, adding tests for key cases (void tags, blocked tags/attrs, javascript: URLs, style property filtering, parse failure) would help prevent regressions.
There was a problem hiding this comment.
Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@shortcuts/mail/htmllint/lint.go`:
- Around line 32-53: The PR adds a new lint engine (Lint, renderNode, Result,
Finding) with changed behavior for HTML parse handling, tag/attr stripping, CSS
sanitization and auto-fix output but no tests—add regression tests that cover:
1) parsing failure path (simulate invalid HTML -> Lint returns Finding with
RuleID "HTML_PARSE_FAILED", Severity "error", Excerpt/truncate and CleanedHTML
behavior when autoFix true/false); 2) normal parsing path (valid fragment ->
renderNode produces expected Result.Errors and Result.CleanedHTML for both
autoFix=true/false); 3) core rule behaviors (tag/attr stripping, CSS
sanitization) with representative inputs and expected cleaned outputs; and 4)
edge cases mentioned in the diff (see functions and logic between lines ~55-238)
to ensure stable behavior; implement these as unit tests asserting RuleID,
Severity, TagOrAttr, Excerpt, Hint, and exact CleanedHTML so future regressions
are caught.
- Around line 197-203: The current unsafeURL function incorrectly rejects safe
URLs by using strings.Contains for "javascript:" and "vbscript:"; update
unsafeURL to parse the input (use net/url.Parse on the trimmed, lowercase raw)
and only treat the URL as unsafe when its scheme equals "javascript" or
"vbscript" (or when the raw string has a leading "javascript:"/ "vbscript:"
prefix after trimming). Remove the strings.Contains checks and keep only an
explicit scheme/prefix check so links like
"https://example.com/docs/javascript:guide" are not rejected; refer to the
unsafeURL function and the raw parameter to locate and change the logic.
In `@shortcuts/mail/mail_draft_create.go`:
- Around line 167-173: The lint gate currently only lints when
bodyIsHTML(input.Body) is true, which misses cases where input.PlainText is true
but signature insertion (sigResult) produces an HTML compose; update the logic
to run lintHTMLBeforeWrite on the final composed body (i.e., after signature
insertion) or change the condition to also check for sigResult/signature HTML so
that lintHTMLBeforeWrite is executed when the resulting message is HTML; update
the use sites around bodyIsHTML, lintHTMLBeforeWrite, input.PlainText and
sigResult in mail_draft_create.go accordingly and add a regression test covering
a plain-text input with sigResult != nil that results in an HTML body to ensure
the path is covered.
In `@shortcuts/mail/mail_lint_html.go`:
- Line 22: Update the AuthTypes slice used in the shortcut registration: replace
the invalid identity "tenant" with the supported "bot" so AuthTypes reads
{"user","bot"}; locate the AuthTypes field in mail_lint_html.go (the struct
literal where AuthTypes is set) and change the value to use "bot" to match the
supported identities defined in shortcuts/common/types.go and be consistent with
other shortcuts.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 72bfb620-2349-467a-8120-bff1fad5eae2
📒 Files selected for processing (14)
shortcuts/mail/helpers.goshortcuts/mail/htmllint/lint.goshortcuts/mail/mail_draft_create.goshortcuts/mail/mail_draft_edit.goshortcuts/mail/mail_forward.goshortcuts/mail/mail_lint_html.goshortcuts/mail/mail_reply.goshortcuts/mail/mail_reply_all.goshortcuts/mail/mail_send.goshortcuts/mail/shortcuts.goskills/lark-mail/SKILL.mdskills/lark-mail/references/lark-mail-html-compat.mdskills/lark-mail/references/lark-mail-lint-html.mdskills/lark-mail/references/lark-mail-native-style.md
| func Lint(raw string, autoFix bool) (Result, error) { | ||
| result := Result{} | ||
| nodes, err := xhtml.ParseFragment(strings.NewReader(raw), &xhtml.Node{Type: xhtml.ElementNode, Data: "body"}) | ||
| if err != nil { | ||
| result.Errors = append(result.Errors, Finding{ | ||
| RuleID: "HTML_PARSE_FAILED", Severity: "error", TagOrAttr: "html", | ||
| Excerpt: truncate(raw), Hint: "HTML cannot be parsed reliably", | ||
| }) | ||
| if autoFix { | ||
| result.CleanedHTML = raw | ||
| } | ||
| return result, nil | ||
| } | ||
| var out bytes.Buffer | ||
| for _, n := range nodes { | ||
| renderNode(&out, n, &result, autoFix) | ||
| } | ||
| if autoFix { | ||
| result.CleanedHTML = out.String() | ||
| } | ||
| return result, nil | ||
| } |
There was a problem hiding this comment.
Add regression tests for the new lint engine before merge
This introduces substantial behavior changes (HTML parse handling, tag/attr stripping, CSS sanitization, auto-fix output) but no accompanying tests are present in the provided changes. Please add coverage for core rules and edge cases before release.
As per coding guidelines "Every behavior change must have an accompanying test".
Also applies to: 55-238
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@shortcuts/mail/htmllint/lint.go` around lines 32 - 53, The PR adds a new lint
engine (Lint, renderNode, Result, Finding) with changed behavior for HTML parse
handling, tag/attr stripping, CSS sanitization and auto-fix output but no
tests—add regression tests that cover: 1) parsing failure path (simulate invalid
HTML -> Lint returns Finding with RuleID "HTML_PARSE_FAILED", Severity "error",
Excerpt/truncate and CleanedHTML behavior when autoFix true/false); 2) normal
parsing path (valid fragment -> renderNode produces expected Result.Errors and
Result.CleanedHTML for both autoFix=true/false); 3) core rule behaviors
(tag/attr stripping, CSS sanitization) with representative inputs and expected
cleaned outputs; and 4) edge cases mentioned in the diff (see functions and
logic between lines ~55-238) to ensure stable behavior; implement these as unit
tests asserting RuleID, Severity, TagOrAttr, Excerpt, Hint, and exact
CleanedHTML so future regressions are caught.
| func unsafeURL(raw string) bool { | ||
| s := strings.ToLower(strings.TrimSpace(raw)) | ||
| return strings.HasPrefix(s, "javascript:") || | ||
| strings.HasPrefix(s, "vbscript:") || | ||
| strings.Contains(s, "javascript:") || | ||
| strings.Contains(s, "vbscript:") | ||
| } |
There was a problem hiding this comment.
unsafeURL is overbroad and blocks safe links
Using strings.Contains(..., "javascript:") and strings.Contains(..., "vbscript:") will reject safe URLs like https://example.com/docs/javascript:guide. This can strip legitimate href/src values.
Suggested fix
func unsafeURL(raw string) bool {
s := strings.ToLower(strings.TrimSpace(raw))
- return strings.HasPrefix(s, "javascript:") ||
- strings.HasPrefix(s, "vbscript:") ||
- strings.Contains(s, "javascript:") ||
- strings.Contains(s, "vbscript:")
+ if strings.HasPrefix(s, "javascript:") || strings.HasPrefix(s, "vbscript:") {
+ return true
+ }
+ compact := strings.ReplaceAll(s, " ", "")
+ return strings.Contains(compact, "url(javascript:") ||
+ strings.Contains(compact, "url(vbscript:")
}🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@shortcuts/mail/htmllint/lint.go` around lines 197 - 203, The current
unsafeURL function incorrectly rejects safe URLs by using strings.Contains for
"javascript:" and "vbscript:"; update unsafeURL to parse the input (use
net/url.Parse on the trimmed, lowercase raw) and only treat the URL as unsafe
when its scheme equals "javascript" or "vbscript" (or when the raw string has a
leading "javascript:"/ "vbscript:" prefix after trimming). Remove the
strings.Contains checks and keep only an explicit scheme/prefix check so links
like "https://example.com/docs/javascript:guide" are not rejected; refer to the
unsafeURL function and the raw parameter to locate and change the logic.
| var lintReport htmllint.Result | ||
| if !input.PlainText && bodyIsHTML(input.Body) { | ||
| input.Body, lintReport, err = lintHTMLBeforeWrite(input.Body) | ||
| if err != nil { | ||
| return err | ||
| } | ||
| } |
There was a problem hiding this comment.
Lint gate misses signature-triggered HTML compose path.
On Line 168, lint only runs when bodyIsHTML(input.Body) is true. If body is plain text but signature insertion causes HTML compose, this write path skips lint.
💡 Proposed fix
- var lintReport htmllint.Result
- if !input.PlainText && bodyIsHTML(input.Body) {
- input.Body, lintReport, err = lintHTMLBeforeWrite(input.Body)
- if err != nil {
- return err
- }
- }
sigResult, err := resolveSignature(ctx, runtime, mailboxID, runtime.Str("signature-id"), runtime.Str("from"))
if err != nil {
return err
}
+ var lintReport htmllint.Result
+ if !input.PlainText && (bodyIsHTML(input.Body) || sigResult != nil) {
+ input.Body, lintReport, err = lintHTMLBeforeWrite(input.Body)
+ if err != nil {
+ return err
+ }
+ }Please add a regression test for plain-text body + signature (sigResult != nil) to keep this path covered.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@shortcuts/mail/mail_draft_create.go` around lines 167 - 173, The lint gate
currently only lints when bodyIsHTML(input.Body) is true, which misses cases
where input.PlainText is true but signature insertion (sigResult) produces an
HTML compose; update the logic to run lintHTMLBeforeWrite on the final composed
body (i.e., after signature insertion) or change the condition to also check for
sigResult/signature HTML so that lintHTMLBeforeWrite is executed when the
resulting message is HTML; update the use sites around bodyIsHTML,
lintHTMLBeforeWrite, input.PlainText and sigResult in mail_draft_create.go
accordingly and add a regression test covering a plain-text input with sigResult
!= nil that results in an HTML body to ensure the path is covered.
| Command: "+lint-html", | ||
| Description: "Check and optionally autofix mail HTML against Feishu-compatible rules without writing mailbox state.", | ||
| Risk: "read", | ||
| AuthTypes: []string{"user", "tenant"}, |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "1) Locate AuthTypes contract and validation logic"
rg -n --type=go -C3 'type Shortcut struct|AuthTypes|unsupported auth|auth type|tenant|bot|user' shortcuts/common shortcuts internal cmd
echo
echo "2) Find all shortcuts currently using AuthTypes with tenant"
rg -n --type=go -C2 'AuthTypes:\s*\[\]string\{[^}]*"tenant"' shortcuts
echo
echo "3) Inspect command registration path for allowed AuthTypes enforcement"
rg -n --type=go -C4 'AuthTypes|Validate.*Auth|supported identities|tenant' shortcutsRepository: larksuite/cli
Length of output: 50371
🏁 Script executed:
# Find the Shortcut struct definition
fd -t f -name '*.go' shortcuts/common | head -5 | xargs -I {} grep -l "type Shortcut struct" {}
# Or directly search for Shortcut struct
rg -n "type Shortcut struct" shortcuts/common --type=go -A 20 | head -50Repository: larksuite/cli
Length of output: 2172
🏁 Script executed:
# Find any AuthTypes field usage with "tenant"
rg -n 'AuthTypes.*tenant' shortcuts/ --type=goRepository: larksuite/cli
Length of output: 135
🏁 Script executed:
# Check mail_lint_html.go directly for context
cat -n shortcuts/mail/mail_lint_html.go | sed -n '1,30p'Repository: larksuite/cli
Length of output: 1387
🏁 Script executed:
# Find all shortcuts using "tenant" in AuthTypes
rg -n 'AuthTypes.*\[.*"tenant"' shortcuts/ --type=goRepository: larksuite/cli
Length of output: 135
🏁 Script executed:
# Search for AuthTypes validation/enforcement logic
rg -n "AuthTypes|supported.*identit|tenant" shortcuts/common --type=go -B 2 -A 2 | head -80Repository: larksuite/cli
Length of output: 5453
🏁 Script executed:
# Check if there's any runtime validation of AuthTypes values
rg -n "AuthTypes" internal/ cmd/ --type=go -B 2 -A 2 | head -100Repository: larksuite/cli
Length of output: 6542
🏁 Script executed:
# Find mail_lint_html tests
fd -t f -name '*mail*lint*test*' shortcuts/mail/Repository: larksuite/cli
Length of output: 288
🏁 Script executed:
# Check if mail tests use --as bot
rg -n "mail.*lint|MailLintHTML" shortcuts/mail/ --type=go -lRepository: larksuite/cli
Length of output: 335
🏁 Script executed:
# Look for mail_lint_html in test files
rg -n "MailLintHTML\|mail.*lint" tests/ --type=go -B 2 -A 5 2>/dev/null | head -50Repository: larksuite/cli
Length of output: 39
🏁 Script executed:
# Check shortcuts.go to see how MailLintHTML is registered
cat shortcuts/mail/shortcuts.goRepository: larksuite/cli
Length of output: 617
🏁 Script executed:
# Look for any E2E tests for mail
fd -t f -name '*test.go' tests/cli_e2e/ | xargs grep -l "mail" | head -3Repository: larksuite/cli
Length of output: 288
🏁 Script executed:
# Search for how other shortcuts use dual user/bot AuthTypes for reference
rg -n 'AuthTypes.*\[\]string\{"user".*"bot"\}' shortcuts/ --type=go -B 3Repository: larksuite/cli
Length of output: 47910
Change AuthTypes from {"user", "tenant"} to {"user", "bot"}
Line 22 declares AuthTypes: []string{"user", "tenant"}, but the shortcut framework only supports "user" and "bot" (see shortcuts/common/types.go:41). "tenant" is not a recognized identity in this framework and will be silently ignored. Every other shortcut uses "bot" for bot-identity mode, not "tenant".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@shortcuts/mail/mail_lint_html.go` at line 22, Update the AuthTypes slice used
in the shortcut registration: replace the invalid identity "tenant" with the
supported "bot" so AuthTypes reads {"user","bot"}; locate the AuthTypes field in
mail_lint_html.go (the struct literal where AuthTypes is set) and change the
value to use "bot" to match the supported identities defined in
shortcuts/common/types.go and be consistent with other shortcuts.
|
|
Generated by the harness-coding skill.
Sprints
Notes
Registry metadata change was committed locally but could not be pushed because code.byted.org returned 403 for lark/larksuite-cli-registry.
This MR was created autonomously. Quality gates were limited to repository commit hooks per harness coding scope.
Summary by CodeRabbit
New Features
+lint-htmlcommand to check email HTML compatibility and auto-fix issues.LARK_CLI_MAIL_LINT_MODE=warn-onlyto report lint issues without blocking edits.Documentation