Release v1.1.0: PDF/A Latin embedding, BiDi isolates, Arabic harakat, emoji #40
Merged
Carries forward alpha.1 work: watermark auto-fit, Unicode ellipsis, ColumnDef min/max, ASCII85/LZW/ASCIIHex/RunLength decoders, live version widget. 1665/1665 tests, ready for v1.1.0 stable expansion.
Phase 5 - Cell clipping (ISO 32000-1 §8.5.4):
- TableBlock.clipCells (default true) wraps each cell in 'q <rect> re W n ... Q'
- TableBlock.autoFitColumns option scaffolded (Phase 4 wiring next)
- 2 new tests in pdf-document.test.ts (1667/1667 total)
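As a rough illustration of the wrapper named in Phase 5, the sketch below assembles the `q <rect> re W n ... Q` sequence for a single cell. The names `clipCellOps`, `Rect`, and `fmt` are hypothetical and are not taken from pdfnative's source; only the operator sequence comes from the commit message.

```typescript
// Hypothetical helper, not pdfnative's implementation: wraps already-built
// cell operators in a rectangular clipping path.
interface Rect { x: number; y: number; width: number; height: number }

// Round to 3 decimals and drop trailing zeros to keep the stream compact.
function fmt(n: number): string {
  return Number(n.toFixed(3)).toString();
}

function clipCellOps(rect: Rect, cellOps: string): string {
  return [
    'q', // save the graphics state
    `${fmt(rect.x)} ${fmt(rect.y)} ${fmt(rect.width)} ${fmt(rect.height)} re`, // append the cell rectangle to the path
    'W n', // use the path as the clip (nonzero winding), end the path without painting
    cellOps, // cell content, drawn inside the clip
    'Q', // restore the graphics state, which discards the clip
  ].join('\n');
}
```

Because the clip lives inside a `q` / `Q` pair, it cannot leak into the next cell.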
Phase 7 - UX docs polish:
- Added pdfnative-mcp badge in hero badge strip
- New compact version strip mounted directly under <nav> (.pn-version-strip)
- versions.js refactored to dual-mode (compact|detailed) supporting multiple mounts
- Strip propagated to docs/playgrounds/{cli,mcp}.html
Phase 4 - Auto-fit column widths:
- src/core/pdf-column-fit.ts (NEW): content-derived f-fraction computation
- TableBlock.autoFitColumns option (opt-in, default false)
- Wired into renderTable via computeAutoFitColumns()
- 7 new tests in tests/core/pdf-column-fit.test.ts
Total: 1674/1674 tests green (alpha.1 baseline 1665 + 9 new)
Version bump: 1.1.0-alpha.1 -> 1.1.0-alpha.2
Deferred to v1.1.0 stable: PDF/A Latin embedding (#28), full UAX #9 (#25), emoji
Deferred to v1.2.0: true page-by-page constant-memory streaming
Bake NotoSans-VF.ttf (OFL-1.1) as fonts/noto-sans-data.js (4515 glyphs,
3094 cmap entries, unitsPerEm=1000) so consumers can embed a PDF/A-conforming
Latin font through the existing fontEntries pipeline:
import * as notoSans from 'pdfnative/fonts/noto-sans-data.js';
buildDocumentPDF({ ..., fontEntries: [{ fontRef: '/F1', fontData: notoSans }] },
{ tagged: 'pdfa2b' });
Closes #28. Adds 4 tests (1678 total). No public API change.
…B/GPOS drivers (#25)
Phase 2 of v1.1.0. Three coordinated improvements to text shaping:
1. UAX #9 isolate support (LRI U+2066, RLI U+2067, FSI U+2068, PDI U+2069) in src/shaping/bidi.ts. Inner content of matched isolate pairs is resolved as a sealed sub-paragraph with its own forced (LRI/RLI) or auto-detected (FSI) paragraph level; nested isolates recurse. Texts without isolate codepoints behave byte-identically to v1.0.
2. Arabic GPOS MarkBasePos: harakat (U+064B-U+0652) and other transparent marks are now anchored on the preceding base glyph using the GPOS anchor data already extracted by tools/build-font-data.cjs. Falls back to (0,0) when the font lacks anchors, preserving v1.0 behaviour.
3. New shared modules src/shaping/gsub-driver.ts and gpos-positioner.ts centralise tryLigature() (was duplicated 3x in Bengali/Tamil/Devanagari) and getBaseAnchor/getMarkAnchor/positionMarkOnBase (was duplicated in Devanagari and missing from Arabic). Pure refactor, zero behaviour change for the Indic shapers.
Adds 24 new tests (1702 total), full typecheck + lint clean.
Closes the BiDi/GPOS portion of #25; remaining work (full embeddings LRE/RLE/LRO/RLO and embedding levels > 2) deferred to v1.2.
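To make the "matched isolate pairs" wording concrete, here is a minimal sketch of the pairing step only, following the counter rule in UAX #9 (BD9) via an equivalent stack. It is not the code in src/shaping/bidi.ts; `matchIsolates` and `IsolatePair` are hypothetical names, and level resolution of the inner content is left out.

```typescript
// Hypothetical sketch: pair isolate initiators with their matching PDI.
const LRI = 0x2066;
const RLI = 0x2067;
const FSI = 0x2068;
const PDI = 0x2069;

type IsolateKind = 'LRI' | 'RLI' | 'FSI';
interface IsolatePair { open: number; close: number; kind: IsolateKind }

function matchIsolates(text: string): IsolatePair[] {
  const pairs: IsolatePair[] = [];
  const stack: { index: number; kind: IsolateKind }[] = [];
  for (let i = 0; i < text.length; i++) {
    // All four controls are BMP code points, so UTF-16 units suffice.
    const cu = text.charCodeAt(i);
    if (cu === LRI) stack.push({ index: i, kind: 'LRI' });
    else if (cu === RLI) stack.push({ index: i, kind: 'RLI' });
    else if (cu === FSI) stack.push({ index: i, kind: 'FSI' });
    else if (cu === PDI) {
      // A PDI closes the most recent open initiator; a stray PDI is ignored.
      const top = stack.pop();
      if (top) pairs.push({ open: top.index, close: i, kind: top.kind });
    }
  }
  // Initiators still on the stack have no matching PDI and are not reported.
  return pairs.sort((a, b) => a.open - b.open);
}
```

Each reported pair delimits a span that a resolver could then treat as its own sub-paragraph, recursing for nested pairs.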
Phase 3 of v1.1.0. Adds first-class emoji rendering through the existing
multi-font fallback pipeline:
- Bundle fonts/noto-emoji-data.js (Noto Emoji OFL-1.1, 1891 glyphs,
1489 cmap entries, 2.6 MB module)
- Add EMOJI_RANGES + isEmojiCodepoint() + containsEmoji() in
src/shaping/script-registry.ts; export ZWJ / VS-15 / VS-16 / Fitzpatrick
range constants
- Update src/shaping/script-detect.ts: detectCharLang() returns 'emoji',
needsUnicodeFont('emoji') === true, detectFallbackLangs() picks up
emoji codepoints
- Wire NotoEmoji-Regular.ttf into scripts/download-fonts.ts manifest
Usage:
registerFont('emoji', () => import('pdfnative/fonts/noto-emoji-data.js'));
Adds 15 tests (1717 total), full typecheck + lint clean.
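For readers unfamiliar with range-based script detection, the sketch below shows the general shape such a check can take. It is an assumption-laden illustration: `SAMPLE_EMOJI_RANGES` is a small subset of Unicode blocks chosen for the example, not the library's EMOJI_RANGES, and the `...Sketch` function names are hypothetical stand-ins for isEmojiCodepoint() / containsEmoji().

```typescript
// Illustrative subset of emoji-bearing Unicode blocks (not pdfnative's table).
const SAMPLE_EMOJI_RANGES: ReadonlyArray<readonly [number, number]> = [
  [0x2600, 0x27bf],   // Miscellaneous Symbols, Dingbats
  [0x1f300, 0x1f5ff], // Miscellaneous Symbols and Pictographs (includes Fitzpatrick U+1F3FB-U+1F3FF)
  [0x1f600, 0x1f64f], // Emoticons
  [0x1f680, 0x1f6ff], // Transport and Map Symbols
  [0x1f900, 0x1f9ff], // Supplemental Symbols and Pictographs
];

const ZWJ = 0x200d;  // zero width joiner, glues emoji sequences together
const VS15 = 0xfe0e; // variation selector: text presentation
const VS16 = 0xfe0f; // variation selector: emoji presentation

function isEmojiCodepointSketch(cp: number): boolean {
  return SAMPLE_EMOJI_RANGES.some(([lo, hi]) => cp >= lo && cp <= hi);
}

function containsEmojiSketch(text: string): boolean {
  for (let i = 0; i < text.length; ) {
    const cp = text.codePointAt(i) ?? 0;
    if (isEmojiCodepointSketch(cp)) return true;
    i += cp > 0xffff ? 2 : 1; // skip both halves of a surrogate pair
  }
  return false;
}
```

ZWJ and the variation selectors are not emoji on their own; a run splitter would normally keep them attached to the neighbouring emoji so a sequence is not broken across fonts.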
- Bump package.json version 1.1.0-alpha.2 -> 1.1.0 stable
- Add 'emoji' to package.json keywords
- New release-notes/v1.1.0.md (full feature list, upgrade notes, deferred items)
- CHANGELOG: new [1.1.0] section above alpha sections
- ROADMAP: move #28, #25, emoji, auto-fit, clipping, UX docs to Released
- ROADMAP: new 'Planned v1.2.0' section (streaming, embeddings, COLRv1, USE-lite)
- README: register 'latin' / 'emoji' fonts in example + supported languages table
- copilot-instructions: shared GSUB/GPOS drivers, BiDi isolates, emoji, Latin VF
- 1717 / 1717 tests green across 48 files; build + lint + typecheck:all clean
Summary
Maximalist minor release. Closes the two largest open epics, issue #28 (PDF/A Latin font embedding) and issue #25 (full UAX #9 BiDi isolates + GPOS MarkBasePos for Arabic harakat), and adds first-class monochrome emoji support, auto-fit table columns, and per-cell clipping. Folds the alpha.1 / alpha.2 medium-term items into a single stable cut.
100% backward-compatible. All new features are opt-in and gated on
font registration or explicit table flags. Pre-existing PDFs are
byte-identical when the new font modules are not registered.
- … .d.ts, tree-shakeable
- … continue-on-error
- … pre-v1.0.5)
- … pdfa-latin/ and emoji/)
What's new
PDF/A — full veraPDF conformance ✅
- Latin font embedding via Noto Sans VF (OFL-1.1). Opt-in with one line: registerFont('latin', …).
- /F1 / /F2 under PDF/A: Object 3 and Object 4 now reference the embedded CIDFontType2 chain instead of standard-14 Helvetica (ISO 19005-1 §6.3.4 / ISO 19005-2 §6.2.11.4.1).
- <dc:title>, <dc:description>, <pdf:Keywords> mirror /Info /Title, /Subject, /Keywords byte-for-byte (ISO 19005-1 §6.7.3 t1 / t4 / t5). Achieved via the utf8EncodeBinaryString() helper, which preserves chars > U+00FF through toBytes()'s & 0xFF masking, and buildXMPMetadata() with optional subject / keywords parameters.
- scripts/validate-pdfa.ts already auto-detects PDF/A claims via XMP pdfaid:part, so non-PDF/A samples never trigger CI failures.
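The `& 0xFF` masking issue can be illustrated with a small sketch. It assumes, as the description suggests, that `toBytes()` keeps only the low byte of each UTF-16 code unit; both functions below are hypothetical stand-ins (`toBytesSketch`, `utf8BinaryStringSketch`), not the library's utf8EncodeBinaryString() or toBytes().

```typescript
// Stand-in for a byte writer that masks each code unit to its low 8 bits.
function toBytesSketch(s: string): Uint8Array {
  const bytes = new Uint8Array(s.length);
  for (let i = 0; i < s.length; i++) bytes[i] = s.charCodeAt(i) & 0xff;
  return bytes;
}

// Re-express a string as a "binary string": one char per UTF-8 byte, so every
// char code is <= 0xFF and survives the masking above unchanged.
function utf8BinaryStringSketch(s: string): string {
  let out = '';
  for (let i = 0; i < s.length; i++) {
    let cp = s.codePointAt(i) ?? 0;
    if (cp > 0xffff) i++; // consumed a surrogate pair
    if (cp >= 0xd800 && cp <= 0xdfff) cp = 0xfffd; // lone surrogate -> U+FFFD
    if (cp < 0x80) {
      out += String.fromCharCode(cp);
    } else if (cp < 0x800) {
      out += String.fromCharCode(0xc0 | (cp >> 6), 0x80 | (cp & 0x3f));
    } else if (cp < 0x10000) {
      out += String.fromCharCode(0xe0 | (cp >> 12), 0x80 | ((cp >> 6) & 0x3f), 0x80 | (cp & 0x3f));
    } else {
      out += String.fromCharCode(
        0xf0 | (cp >> 18), 0x80 | ((cp >> 12) & 0x3f),
        0x80 | ((cp >> 6) & 0x3f), 0x80 | (cp & 0x3f),
      );
    }
  }
  return out;
}
```

Without the pre-encoding step, a character such as U+20AC collapses to the single byte 0xAC; with it, the three UTF-8 bytes E2 82 AC reach the output intact, which is what lets the /Info strings and the XMP packet agree.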
BiDi & shaping (#25)
- UAX #9 isolates (LRI / RLI / FSI / PDI) via a three-tier dispatcher: resolveBidiRuns → resolveBidiRunsForced → resolveBidiCore. Nested and unmatched isolates supported.
- Arabic harakat (kasra, damma, sukun, shadda, …) now anchor on the preceding base glyph using font-provided GPOS anchor data. Tracks lastBaseGid through the shaping pipeline, including lam-alef ligatures.
- Shared modules src/shaping/gsub-driver.ts (tryLigature) and src/shaping/gpos-positioner.ts (positionMarkOnBase). Bengali / Tamil / Devanagari / Arabic shapers now route through a single lookup helper instead of four duplicated implementations.
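The anchoring idea can be sketched in a few lines: remember the last base glyph, and shift each following mark so that its anchor lands on the base's anchor. This is a simplified illustration, not positionMarkOnBase(): it assumes one anchor per glyph (real GPOS MarkBasePos selects a base anchor per mark class), and `placeMarks`, `Glyph`, and `Placed` are hypothetical names. Offsets are relative to the base glyph's origin.

```typescript
interface Anchor { x: number; y: number }
interface Glyph { gid: number; isMark: boolean }
interface Placed { gid: number; dx: number; dy: number }

function placeMarks(
  glyphs: Glyph[],
  baseAnchors: Map<number, Anchor>, // hypothetical lookup: base gid -> anchor
  markAnchors: Map<number, Anchor>, // hypothetical lookup: mark gid -> anchor
): Placed[] {
  let lastBaseGid: number | undefined;
  return glyphs.map((g): Placed => {
    if (!g.isMark) {
      lastBaseGid = g.gid; // marks that follow attach to this glyph
      return { gid: g.gid, dx: 0, dy: 0 };
    }
    const base = lastBaseGid === undefined ? undefined : baseAnchors.get(lastBaseGid);
    const mark = markAnchors.get(g.gid);
    // No anchor data: fall back to (0,0), i.e. leave the mark where it was.
    if (!base || !mark) return { gid: g.gid, dx: 0, dy: 0 };
    // Shift the mark so its anchor coincides with the base anchor.
    return { gid: g.gid, dx: base.x - mark.x, dy: base.y - mark.y };
  });
}
```

When a ligature such as lam-alef replaces two glyphs with one, the tracked base would be the ligature glyph, which is presumably why the description calls out lastBaseGid there.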
Emoji
- … and VS-15 / VS-16. Multi-font run splitting routes emoji codepoints to the registered 'emoji' font automatically.
Tables
- TableBlock.autoFitColumns: true derives column widths from measured content and honours per-column minWidth / maxWidth clamping.
- TableBlock.clipCells: true (default) wraps every header and data cell in q <rect> re W n … Q so variable-width content cannot escape its column rectangle.
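One plausible reading of "column widths derived from measured content" with minWidth / maxWidth clamping is sketched below. It is an assumption, not computeAutoFitColumns(): the function name `autoFitFractions`, the input shape, and the choice to clamp in the same unit as the measurements before normalising are all hypothetical.

```typescript
interface ColumnLimits { minWidth?: number; maxWidth?: number }

// measured[row][col] is the natural width of a cell's content (e.g. in points).
function autoFitFractions(measured: number[][], limits: ColumnLimits[] = []): number[] {
  const colCount = measured.reduce((n, row) => Math.max(n, row.length), 0);
  if (colCount === 0) return [];
  const widths: number[] = [];
  for (let c = 0; c < colCount; c++) {
    // A column is as wide as its widest cell...
    let w = 0;
    for (const row of measured) w = Math.max(w, row[c] ?? 0);
    // ...then clamped to the per-column limits, if any.
    const { minWidth = 0, maxWidth = Infinity } = limits[c] ?? {};
    widths.push(Math.min(Math.max(w, minWidth), maxWidth));
  }
  const total = widths.reduce((a, b) => a + b, 0);
  // All columns empty: fall back to equal shares.
  if (total === 0) return widths.map(() => 1 / colCount);
  // Normalise to fractions of the table width.
  return widths.map((w) => w / total);
}
```

For example, measured widths of 30 and 90 yield fractions 0.25 and 0.75, so the table's available width is split one to three.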
Backward compatibility — no breaking changes
- registerFont('latin', …), registerFont('emoji', …), TableBlock.autoFitColumns, TableBlock.clipCells, utf8EncodeBinaryString are new exports
- buildPDFBytes: … 'latin' / 'emoji' font is registered
- buildDocumentPDF: … 'latin' / 'emoji' font is registered
- buildXMPMetadata: subject / keywords params default to undefined; pre-existing call sites unchanged
- createEncodingContext: pdfA param defaults to false; pre-existing call sites unchanged
- … readonly number[]); no narrowing changes
Migration
No code changes required. To opt into stricter PDF/A or emoji rendering, register the 'latin' or 'emoji' font via registerFont('latin', …) or registerFont('emoji', …).
Validation
- npm run typecheck:all: 3 tsconfigs, clean
- npm run lint: clean (ESLint 9 + typescript-eslint strict)
- npm test: 1726 / 1726 green across 48 files
- npm run build: ESM + CJS + d.ts, tree-shakeable
- npm run test:generate: 154 sample PDFs across 26 categories
- npm run validate:pdfa: every PDF/A-claiming sample passes veraPDF (1b / 2b / 2u / 3b)
Documentation refreshed
- … Fixed and Notes subsections
- … Fixed block
- … continue-on-error removed; veraPDF is now blocking
Issues closed
Deferred to v1.2.0
- … require deeper level-stack refactor)
- … (buildDocumentPDFStreamPageByPage())
Credits
Checklist
- npm run typecheck:all clean
- npm run lint clean
- npm run build clean
- npm run test:generate regenerates 154 PDFs without errors
- [1.1.0]
- release-notes/v1.1.0.md
Closes #28, closes #25