EPUB export of AZW3/MOBI: resolve TOC fragments and write embedded font assets #23
Open
Imaclean74 wants to merge 1 commit into
Two narrowly-scoped improvements to `EpubExporter`:

## TOC fragment resolution

AZW3 and MOBI importers leave TOC entries with bare chapter hrefs (`partNNNN.html` or `content.html`) at open time; the `#fileposN` / `#aid-XXXX` fragment is populated only when `Book::resolve_toc` is called. Previously, the EPUB exporter generated the NCX before calling that, so every TOC entry within a single chapter collapsed onto the same `partNNNN.html` href and readers landed on chapter starts instead of the intended in-chapter target.

Both `export_raw` and `export_normalized` now call `book.resolve_toc()` before generating the NCX. EPUB importers don't need this (their TOC is already resolved by the importer), so the call is a no-op for that backend.

## Font writing in normalized export

`export_normalized` writes assets from `NormalizedContent::assets`, which only contains resources referenced from the IR DOM. Embedded fonts are typically referenced from CSS `@font-face` rules rather than DOM nodes, so they never made it into the normalized asset list, and the exported EPUB shipped `@font-face` declarations whose `src:` URLs pointed at files that were never written into the ZIP.

The exporter now snapshots `book.list_assets()` before normalization and, after writing the normalized assets, additionally enumerates every `fonts/*` path that wasn't already covered, adding both an OPF manifest entry and a ZIP entry for each. `export_raw` is unaffected here because it already writes the full `book.list_assets()` set verbatim.

## Tests

`tests/epub_exports_fonts_and_toc.rs`:

- `epub_export_writes_font_assets_from_kfx` uses the existing `tests/fixtures/fonts_only.kfx.gz` fixture (the one added by zacharydenton#13) to confirm that the three KFX font assets the importer surfaces are written into the exported EPUB's `OEBPS/fonts/` directory and referenced in the OPF manifest.
- `epub_export_resolves_azw3_toc_fragments` uses the existing `tests/fixtures/epictetus.azw3` fixture to confirm that the generated `toc.ncx` contains resolved `#aid-XXXX` fragments rather than the bare chapter hrefs the importer initially produced.

Both tests fail without this patch's changes; both pass with them.

## Related work

zacharydenton#13 surfaced KFX fonts; this PR makes them survive the EPUB export. For the AZW3 / MOBI side, font extraction is being added in a separate PR. The font-writing logic here is format-agnostic and benefits any importer that surfaces `fonts/*` paths.
Closes #22.
Summary
Two narrowly-scoped fixes to `EpubExporter` that turn currently-broken output into correct EPUBs for AZW3 / MOBI books with fine-grained NCX indexes and for any book with embedded fonts.
TOC fragment resolution (AZW3 / MOBI)
AZW3 / MOBI importers leave TOC entries with bare chapter hrefs at open time; the `#fileposN` / `#aid-XXXX` fragment is filled in by `Book::resolve_toc`. Before this change, the exporter generated the NCX before calling that, so every TOC entry within a single source chapter collapsed onto the same `partNNNN.html` href.

Both `export_raw` and `export_normalized` now call `book.resolve_toc()` before generating the NCX. EPUB importers don't need this (their TOC is already resolved by the importer), so the call is a no-op for that backend.
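
The ordering constraint can be illustrated with a small sketch. The `Book`, `TocEntry`, and `pending_fragments` definitions below are hypothetical stand-ins invented for illustration (the crate's real types are not shown in this description); only the idea of calling `resolve_toc()` before reading hrefs for the NCX comes from the PR.

```rust
// Hypothetical stand-ins: the real `Book` type is not shown in this PR.
#[derive(Debug, Clone, PartialEq)]
struct TocEntry {
    title: String,
    href: String,
}

struct Book {
    toc: Vec<TocEntry>,
    // Assumed for illustration: one lazily discovered fragment per TOC entry.
    pending_fragments: Vec<Option<String>>,
}

impl Book {
    /// Append each pending fragment to its href unless the href already
    /// carries one, so calling this twice changes nothing the second time.
    fn resolve_toc(&mut self) {
        for (entry, frag) in self.toc.iter_mut().zip(self.pending_fragments.iter()) {
            if let Some(frag) = frag {
                if !entry.href.contains('#') {
                    entry.href.push('#');
                    entry.href.push_str(frag);
                }
            }
        }
    }
}

/// Collect the hrefs an NCX writer would emit, resolving the TOC first.
fn ncx_targets(book: &mut Book) -> Vec<String> {
    book.resolve_toc(); // must happen before the NCX is generated
    book.toc.iter().map(|e| e.href.clone()).collect()
}

/// Two TOC entries that point into the same chapter file.
fn sample_book() -> Book {
    Book {
        toc: vec![
            TocEntry { title: "Section 1".into(), href: "part0001.html".into() },
            TocEntry { title: "Section 2".into(), href: "part0001.html".into() },
        ],
        pending_fragments: vec![Some("aid-0001".into()), Some("aid-0002".into())],
    }
}
```

Without the `resolve_toc()` call, both entries in `sample_book()` would share the bare `part0001.html` href, which is the collapse described above. An already-resolved TOC passes through unchanged, mirroring the no-op behaviour for EPUB sources.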
Font writing in normalized export
`export_normalized` writes assets from `NormalizedContent::assets`, which only contains resources referenced from the IR DOM. Embedded fonts are typically referenced from CSS `@font-face` rules and never make it into the normalized asset list. The exporter now snapshots `book.list_assets()` before normalization and, after writing the normalized assets, enumerates every `fonts/*` path that wasn't already covered, adding both an OPF manifest entry and a ZIP entry for each.
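
The snapshot-and-diff step might look roughly like the following. This is a sketch under stated assumptions, not the exporter's actual code: `missing_fonts` and `manifest_item` are hypothetical helper names, asset paths are modelled as plain strings, and the media types are a guess based on the file extension.

```rust
use std::collections::BTreeSet;

/// Return every `fonts/*` path from the pre-normalization snapshot that the
/// normalized asset list did not already cover, in snapshot order and
/// without duplicates.
fn missing_fonts(snapshot: &[String], normalized: &[String]) -> Vec<String> {
    let covered: BTreeSet<&str> = normalized.iter().map(String::as_str).collect();
    let mut seen: BTreeSet<&str> = BTreeSet::new();
    let mut out = Vec::new();
    for path in snapshot {
        let is_font = path.starts_with("fonts/");
        if is_font && !covered.contains(path.as_str()) && seen.insert(path.as_str()) {
            out.push(path.clone());
        }
    }
    out
}

/// Hypothetical OPF manifest line for one font. The media type is assumed
/// from the extension; the path is not XML-escaped in this sketch.
fn manifest_item(id: usize, path: &str) -> String {
    let media_type = match path.rsplit('.').next() {
        Some("ttf") => "font/ttf",
        Some("otf") => "font/otf",
        Some("woff") => "font/woff",
        Some("woff2") => "font/woff2",
        _ => "application/octet-stream",
    };
    format!(r#"<item id="font{id}" href="{path}" media-type="{media_type}"/>"#)
}
```

Each path returned by `missing_fonts` would then need both a manifest line and a matching ZIP entry, so that the `@font-face` URLs in the exported CSS resolve to real files.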
`export_raw` is unaffected here because it already writes the full `book.list_assets()` set verbatim.

Related work

#13 surfaced KFX fonts via `bcRawFont` entity discovery; this PR makes them survive the EPUB export. For AZW3 / MOBI font extraction, a separate PR (#21) plumbs the Kindle `FONT` container decoder through to `list_assets`. The font-writing logic here is format-agnostic and benefits any importer that surfaces `fonts/*` paths.

Changes
`src/export/epub.rs`:

- `export_raw`: call `book.resolve_toc()` before NCX generation.
- `export_normalized`: call `book.resolve_toc()` before NCX generation; snapshot `book.list_assets()` and, after writing the normalized-content assets, emit OPF manifest items + ZIP entries for any `fonts/*` path not already in `content.assets`.

Tests
`tests/epub_exports_fonts_and_toc.rs`:

- `epub_export_writes_font_assets_from_kfx`: uses the existing `tests/fixtures/fonts_only.kfx.gz` fixture, exports to EPUB, asserts the three KFX font assets are written into `OEBPS/fonts/` and referenced in the OPF manifest.
- `epub_export_resolves_azw3_toc_fragments`: uses the existing `tests/fixtures/epictetus.azw3` fixture, exports to EPUB, asserts the generated `toc.ncx` carries resolved `#aid-XXXX` fragments rather than bare chapter hrefs.

Both tests fail when this patch is reverted and pass with it applied, verified locally by `git stash`-ing the exporter changes and re-running.

Verification
```
cargo fmt -- --check
cargo clippy --lib --tests
cargo test --lib # 548 passed
cargo test --test epub_exports_fonts_and_toc # 2 passed
```
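
For readers who want to spot-check an exported `toc.ncx` by hand, the kind of assertion the AZW3 test makes could be approximated as below. This is a sketch, not the test's actual code: `ncx_content_srcs` and `count_resolved` are hypothetical helpers, and the scan is a naive string search rather than a real XML parse.

```rust
/// Extract the `src` value of each `<content src="..."/>` element from NCX
/// text. Naive scan: assumes double-quoted attributes and that every
/// `<content` element carries a `src` attribute.
fn ncx_content_srcs(ncx: &str) -> Vec<&str> {
    let mut out = Vec::new();
    let mut rest = ncx;
    while let Some(start) = rest.find("<content") {
        rest = &rest[start + "<content".len()..];
        let Some(attr) = rest.find("src=\"") else { break };
        let value = &rest[attr + "src=\"".len()..];
        let Some(end) = value.find('"') else { break };
        out.push(&value[..end]);
        rest = &value[end..];
    }
    out
}

/// Count the NCX targets that carry a resolved `#aid-` fragment.
fn count_resolved(ncx: &str) -> usize {
    ncx_content_srcs(ncx)
        .iter()
        .filter(|src| src.contains("#aid-"))
        .count()
}
```

A count of zero on an AZW3-derived export would indicate the bare-href collapse that this PR fixes.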