✨ feat: add DOCX output via pandoc#33
Add a -format flag (pdf|docx) and infer the format from the -o extension. For DOCX output, the assembled HTML is converted with the external pandoc CLI, reusing the existing Mermaid/image pipeline. PDF remains the default. Note: pandoc's HTML→DOCX path does not reliably embed Mermaid SVGs, so diagrams may be dropped in DOCX output. Entire-Checkpoint: 54049f687aa4
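A minimal sketch of how the format resolution described in this commit could work. The function name `resolveFormat` appears in the PR, but the body, the two-argument signature, and the rule that an explicit `-format` wins over the `-o` extension are assumptions for illustration, not the merged code.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// resolveFormat picks the output format (sketch, assumed behavior):
//  1. an explicit -format value wins,
//  2. otherwise a ".docx" extension on the -o path selects DOCX,
//  3. otherwise PDF stays the default.
func resolveFormat(formatFlag, outputPath string) (string, error) {
	if formatFlag != "" {
		f := strings.ToLower(formatFlag)
		if f != "pdf" && f != "docx" {
			return "", fmt.Errorf("unsupported format %q (want pdf or docx)", formatFlag)
		}
		return f, nil
	}
	if strings.EqualFold(filepath.Ext(outputPath), ".docx") {
		return "docx", nil
	}
	return "pdf", nil
}

func main() {
	// Each pair is (-format value, -o value).
	cases := [][2]string{{"docx", ""}, {"", "report.docx"}, {"", "report.pdf"}, {"", ""}}
	for _, c := range cases {
		f, err := resolveFormat(c[0], c[1])
		fmt.Printf("-format=%q -o=%q -> %q (err: %v)\n", c[0], c[1], f, err)
	}
}
```

Rejecting unknown `-format` values early keeps the later pipeline stages from having to handle anything other than `pdf` or `docx`.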
Word could not display the inline/standalone SVG that pandoc embeds from HTML, so Mermaid diagrams went missing. Render them to PNG via mmdc for DOCX output (PDF keeps inline SVG). Pandoc's default Table style has no borders, so GFM tables rendered as borderless text. Generate a reference document and patch the Table style with borders (best-effort, falls back to pandoc defaults). Entire-Checkpoint: ff0225ce25ea
📝 Walkthrough
Adds DOCX output format support to the md2pdf converter. Mermaid diagrams render to PNG files for DOCX (instead of inline SVG). HTML assembly conditionally emits
Changes
DOCX Output Format
Sequence Diagram(s)
sequenceDiagram
actor User
participant flags.go
participant Converter
participant renderMermaid
participant convertDOCX
participant pandoc
User->>flags.go: md2pdf -format docx input.md
flags.go->>flags.go: resolveFormat("-format docx", "")
flags.go->>Converter: Config{Format:"docx", PandocPath:...}
Converter->>renderMermaid: blocks, format="docx"
renderMermaid->>renderMermaid: renderSingleDiagramPNG (mmdc → .png)
renderMermaid-->>Converter: block.ImagePath set
Converter->>Converter: buildHTML (embeds img tags)
Converter->>convertDOCX: absHTML, absOut
convertDOCX->>pandoc: --print-default-data-file reference.docx
pandoc-->>convertDOCX: reference.docx bytes
convertDOCX->>convertDOCX: addTableBorders (patch word/styles.xml)
convertDOCX->>pandoc: -f html -o out.docx --reference-doc patched.docx
pandoc-->>convertDOCX: out.docx
convertDOCX-->>User: "DOCX saved to out.docx"
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
🚥 Pre-merge checks: ✅ 5 passed
Add -format/-pandoc flags, pandoc dependency, and DOCX examples to the README and Hugo site (usage, getting-started, architecture, homepage). Entire-Checkpoint: 63056eabcc75
Actionable comments posted: 2
🧹 Nitpick comments (1)
cmd/md2pdf/flags_test.go (1)
48-93: ⚡ Quick win
Add a -pandoc passthrough case in parseFlags tests.
The new CLI surface includes -pandoc, but this test block doesn’t verify Config.PandocPath mapping yet.
Suggested test addition
 func TestParseFlags_DefaultOutputExtensionFollowsFormat(t *testing.T) {
@@
 	t.Run("docx inferred from output extension", func(t *testing.T) {
@@
 	})
+
+	t.Run("pandoc path passthrough", func(t *testing.T) {
+		cfg, err := parseFlags([]string{"-format", "docx", "-pandoc", "/custom/pandoc", input})
+		if err != nil {
+			t.Fatalf("parseFlags: %v", err)
+		}
+		if cfg.PandocPath != "/custom/pandoc" {
+			t.Errorf("PandocPath = %q, want %q", cfg.PandocPath, "/custom/pandoc")
+		}
+	})
 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@cmd/md2pdf/flags_test.go` around lines 48 - 93, Add a new subtest within TestParseFlags_DefaultOutputExtensionFollowsFormat to verify that the -pandoc flag is properly handled by parseFlags. Create a test case similar to the existing subtests that calls parseFlags with the -pandoc argument followed by a path value, then verify that the returned config's PandocPath field is set to the expected path value.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@CLAUDE.md`:
- Line 46: In the CLAUDE.md architecture documentation, correct the description
of pandoc auto-detection ownership. The current text incorrectly attributes
pandoc auto-detection to flags.go, but the actual implementation shows that
findPandoc function in internal/converter/docx.go handles the auto-detection
logic. Update the line to clarify that flags.go handles argument parsing only,
while the actual pandoc path auto-detection is performed by the findPandoc
function in the docx converter module.
In `@internal/converter/mermaid.go`:
- Around line 61-63: To prevent filename collisions between Mermaid-generated
PNGs and user-supplied images, create a dedicated subdirectory for generated
diagrams. Modify the code where pngName is defined (line 62) to include a
subdirectory prefix like _md2pdf_mermaid, and ensure pngFile on line 63
constructs the full path within that subdirectory. Additionally, create the
subdirectory in the working directory before generating files, and ensure that
the pngName variable (or the returned path used later) includes the subdirectory
so that copyImages references the isolated location and cannot collide with user
images.
---
Nitpick comments:
In `@cmd/md2pdf/flags_test.go`:
- Around line 48-93: Add a new subtest within
TestParseFlags_DefaultOutputExtensionFollowsFormat to verify that the -pandoc
flag is properly handled by parseFlags. Create a test case similar to the
existing subtests that calls parseFlags with the -pandoc argument followed by a
path value, then verify that the returned config's PandocPath field is set to
the expected path value.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: f8b2bc4b-96fa-49c7-b33a-ab1b89523ab9
📒 Files selected for processing (11)
- CLAUDE.md
- cmd/md2pdf/flags.go
- cmd/md2pdf/flags_test.go
- cmd/md2pdf/main.go
- internal/converter/converter.go
- internal/converter/docx.go
- internal/converter/docx_test.go
- internal/converter/html.go
- internal/converter/html_test.go
- internal/converter/mermaid.go
- internal/converter/parser.go
Generated diagrams were written as diagram_N.png in the working directory, where copyImages could overwrite them with a same-named user image. Render them under a dedicated _md2pdf_mermaid/ subdir and skip that prefix in copyImages. Also add a -pandoc passthrough test and correct the CLAUDE.md note on where pandoc is resolved (docx.go, not flags.go). Entire-Checkpoint: 80a277ffcf1c
@coderabbitai review
✅ Action performed: Review finished.
Deduplicate the SVG and PNG diagram renderers into a single runMmdc helper, and name the mmdc background colour and PNG scale constants. Behavior-preserving: identical mmdc args, side-effect order, and error messages (verified by independent review). Entire-Checkpoint: 9c961b1218f1
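Overview
In addition to Markdown → PDF, md2pdf now supports DOCX output. The final stage branches on Config.Format; when docx is selected, the already-built HTML is converted with the external pandoc CLI (reusing the existing HTML assembly, image copying, and Mermaid processing). PDF remains the default.
Usage
Changes
Add DOCX output (e402e8c)
- Add Format (pdf|docx) and PandocPath to Config
- -format / -pandoc flags and resolveFormat (determines the format from -format or the -o extension)
- docx.go (new): convertDOCX / findPandoc
Improve figure and table formatting (f746ac0)
Fixes two defects found while converting real files:
- Switched to <img> embedding (PDF keeps inline SVG as before)
- Because the Table style has no borders, generate a reference document and inject borders into the Table style (respects OOXML element ordering; best-effort, falls back to the defaults on failure)
Verification
Checked against a real document (10 tables, 4 Mermaid diagrams):
Tests
- Added tests for findPandoc / convertDOCX / injectTableBorders / addTableBorders / buildHTML (img injection) / resolveFormat / parseFlags
- gofmt and go vet pass
Notes
DOCX output requires pandoc (the dependency list in CLAUDE.md has been updated).
Summary by CodeRabbit
Release Notes
New Features
- -format flag to explicitly select output format (pdf/docx)
Tests
Documentation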