Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
18 changes: 11 additions & 7 deletions .github/copilot-instructions.md
Original file line number Diff line number Diff line change
Expand Up @@ -47,9 +47,9 @@ src/
└── worker/ # Web Worker dispatch + self-contained worker entry
fonts/ # Pre-built font data modules (.js/.d.ts) — 16 scripts + TTF source files
tools/ # CLI tool (build-font-data.cjs) for converting TTF → importable data modules
scripts/ # Modular sample PDF generation (24 generators, 140+ PDFs; extreme-shaping.ts added in v1.0.3)
test-output/extreme/ # Visual regression baselines for extreme scripts (extreme-bidi.pdf, extreme-tamil.pdf, extreme-bengali-devanagari.pdf, extreme-arabic-harakat.pdf)
tests/ # 1606+ tests (41 files: unit/integration/fuzz/parser) mirroring src/ structure
scripts/ # Modular sample PDF generation (26 generators, 150+ PDFs; emoji-showcase.ts and pdfa-latin-embedding.ts added in v1.1.0)
test-output/extreme/ # Visual regression baselines for extreme scripts (extreme-bidi.pdf, extreme-tamil.pdf, extreme-bengali-devanagari.pdf, extreme-arabic-harakat.pdf, extreme-bidi-isolates.pdf)
tests/ # 1726+ tests (48 files: unit/integration/fuzz/parser) mirroring src/ structure
bench/ # Performance benchmarks (vitest bench)
docs/ # GitHub Pages landing site (pdfnative.dev) — pure HTML/CSS/JS, zero build deps
└── playgrounds/ # Interactive browser playgrounds (extreme-scripts.html, medical-800.html)
Expand Down Expand Up @@ -78,7 +78,7 @@ npm run build # tsup → dist/ (ESM + CJS + .d.ts)
npm run test # vitest run (1588+ tests, 40 files)
npm run test:watch # vitest (watch mode)
npm run test:coverage # vitest with v8 coverage (thresholds: 90/80/85/90)
npm run test:generate # Generate 140+ sample PDFs → test-output/ (incl. extreme/ baselines)
npm run test:generate # Generate 150+ sample PDFs → test-output/ (incl. extreme/, emoji/, pdfa-latin/ baselines)
npm run typecheck # tsc --noEmit
npm run typecheck:tests # tsc --project tsconfig.test.json --noEmit
npm run typecheck:scripts # tsc --project tsconfig.scripts.json --noEmit
Expand All @@ -90,7 +90,7 @@ npm run lint # eslint src/ (ESLint 9 + typescript-eslint strict)
- Test runner: **vitest** (fast, native ESM, watch mode, v8 coverage)
- CI: GitHub Actions — lint/typecheck/test/build on Node 22/24
- Publish: GitHub Actions OIDC with `npm publish --provenance`
- All new code must have tests. Current: ~95% statement coverage, 1588+ tests (40 files)
- All new code must have tests. Current: ~95% statement coverage, 1726+ tests (48 files)

## Conventions

Expand Down Expand Up @@ -133,6 +133,7 @@ npm run lint # eslint src/ (ESLint 9 + typescript-eslint strict)
- URL validation: only `http:`, `https:`, `mailto:` schemes allowed; `javascript:`, `file:`, `data:` blocked; control characters (U+0000–U+001F, U+007F–U+009F) rejected
- Color safety: `parseColor()` validates/normalizes hex, tuple, PDF string → safe `"R G B"` output; `normalizeColors()` at layout boundary
- Color types: `PdfColor = PdfRgbString | PdfRgbTuple | (string & {})` — union preserves autocomplete for template literals
- BiDi: UAX #9 isolates (LRI U+2066 / RLI U+2067 / FSI U+2068 / PDI U+2069) classified as `BN` and recursed via three-tier dispatcher: public `resolveBidiRuns(text)` finds outermost isolate pairs, internal `resolveBidiRunsForced(text, forcedLevel)` recurses, internal `resolveBidiCore(text, codePoints, cpToStr, forcedLevel?)` runs the W1–W7 / N1–N2 / L2 pipeline. Embeddings (LRE/RLE/LRO/RLO/PDF) deferred to v1.2.
- BiDi: simplified UAX #9 — paragraph level detection, weak/neutral type resolution, level assignment, L2 paragraph-level run reordering
- BiDi: General Punctuation (U+2010–U+2027, U+2030–U+205E) classified as ON — covers dashes, quotes, ellipsis, primes
- BiDi: `resolveBidiRuns()` returns runs in visual order — for RTL paragraphs (paraLevel=1), runs are reversed so LTR text comes first (leftmost) and RTL text last (rightmost)
Expand Down Expand Up @@ -197,7 +198,10 @@ npm run lint # eslint src/ (ESLint 9 + typescript-eslint strict)
- Bengali shaping: `shapeBengaliText()` — GSUB conjunct formation + GPOS mark positioning via `bengali-shaper.ts`
- Tamil shaping: `shapeTamilText()` — GSUB substitution + split vowel decomposition via `tamil-shaper.ts`
- Devanagari shaping: `shapeDevanagariText()` — cluster building, reph detection, matra reordering, split vowels, GSUB ligature conjuncts, GPOS mark positioning via `devanagari-shaper.ts`
- GSUB LookupType 4 (LigatureSubst): `fontData.ligatures` — `Record<number, number[][]>` mapping first-glyph GID → arrays of `[resultGID, ...componentGIDs]`; `tryLigature()` pattern used by Bengali, Tamil, and Devanagari shapers
- GSUB LookupType 4 (LigatureSubst): `fontData.ligatures` — `Record<number, number[][]>` mapping first-glyph GID → arrays of `[resultGID, ...componentsAfterKey]` (the first GID is the implicit lookup key, NOT included in the components array). Shared `tryLigature(gids, ligatures)` lives in `src/shaping/gsub-driver.ts` and is used by Bengali, Tamil, Devanagari, and Arabic shapers. Each shaper exposes a thin `tryLig(gids)` closure that forwards to the shared driver.
- GPOS MarkBasePos: shared helpers in `src/shaping/gpos-positioner.ts` (`getBaseAnchor`, `getMarkAnchor`, `getMark2MarkAnchor`, `positionMarkOnBase(markAnchors, markGid, baseGid, baseAdv)`). Used by Devanagari and Arabic shapers. Arabic tracks `lastBaseGid` through the shaping pipeline (including lam-alef ligatures) and applies the anchor offset to transparent (joining type 'T') marks; falls back to (0, 0) when font lacks anchors.
- Emoji: monochrome via Noto Emoji (OFL-1.1) under lang `'emoji'`. Detection in `src/shaping/script-registry.ts` (`EMOJI_RANGES`, `isEmojiCodepoint`, `containsEmoji`, `FITZPATRICK_START/END`, `ZWJ`, `VS15`, `VS16`). `detectCharLang(cp)` returns `'emoji'` for emoji codepoints; `splitTextByFont()` routes them to the registered `'emoji'` font automatically. Opt-in via `registerFont('emoji', () => import('pdfnative/fonts/noto-emoji-data.js'))`. COLRv1 colour emoji deferred to v1.2.
- Latin VF (PDF/A): Noto Sans VF (OFL-1.1) bundled as `fonts/noto-sans-data.{js,d.ts}` under lang `'latin'`. Activates automatically for PDF/A documents containing non-WinAnsi Latin (curly quotes, em-dash, ellipsis…). Opt-in via `registerFont('latin', () => import('pdfnative/fonts/noto-sans-data.js'))`.

### API Design

Expand Down Expand Up @@ -231,7 +235,7 @@ npm run lint # eslint src/ (ESLint 9 + typescript-eslint strict)
- **PDF /Info metadata** — Title, Producer (pdfnative), CreationDate in D:YYYYMMDDHHmmss format
- **Input validation** — at `buildPDF()` boundary: null/undefined/type checks, 100K row limit
- **URL validation** — at `validateURL()`: blocks javascript:, file:, data: schemes
- **95%+ test coverage** — 1606+ tests (41 files), 48 fuzz edge-cases (including recursion/zip-bomb/xref-chain hardening), performance benchmarks
- **95%+ test coverage** — 1726+ tests (48 files), 48 fuzz edge-cases (including recursion/zip-bomb/xref-chain hardening), performance benchmarks
- **NPM provenance** — signed builds via GitHub Actions OIDC
- Security: no `eval()`, no `Function()`, no dynamic code execution
- No `console.log` in library code (only in tools/ and scripts/)
10 changes: 5 additions & 5 deletions .github/workflows/verapdf.yml
Original file line number Diff line number Diff line change
Expand Up @@ -106,9 +106,9 @@ jobs:
- name: Generate sample PDFs
run: npm run test:generate

# Pre-release mode for v1.0.4: keep veraPDF visible in CI but do not block merges yet.
# Reason: full PDF/A conformance is completed in v1.0.5 (Latin font embedding work).
# Switch this back to blocking in v1.0.5 by removing `continue-on-error`.
- name: Validate PDF/A claims (non-blocking pre-v1.0.5)
continue-on-error: true
# v1.1.0+: veraPDF is blocking. Every PDF/A-claiming sample (detected
# automatically via pdfaid:part in XMP) must validate against the
# corresponding PDF/A profile (1b, 2b, 2u, 3b). Non-PDF/A files are
# skipped by scripts/validate-pdfa.ts and never trigger failures.
- name: Validate PDF/A claims (blocking)
run: npm run validate:pdfa
Loading
Loading