WASM Acceleration

ABCrimson edited this page Mar 1, 2026 · 5 revisions

modern-pdf-lib v0.15.1

All heavy computation in modern-pdf-lib has a pure-JavaScript fallback that produces bit-identical output to the WASM path. WASM acceleration is strictly optional and is never required for correctness — it only affects performance.


Overview

| Module | Rust crate | Purpose | Typical speedup |
|---|---|---|---|
| libdeflate | libdeflate-rs | Zlib/deflate stream compression for PDF streams | ~2× |
| png | png (image crate) | PNG image decoding (filter reconstruction + deinterlace) | ~5× |
| ttf | ttf-parser | Font metric extraction (cmap, hmtx, head, hhea) | ~3× |
| shaping | rustybuzz | Full OpenType text shaping (ligatures, RTL, mark positioning) | ~10× |
| jbig2 | jbig2-rs | JBIG2 bilevel image decoding | ~3× |
| jpeg | jpeg-encoder + jpeg-decoder | JPEG encoding/decoding for image optimization | ~4× |

All six modules are compiled with wasm-pack and expose their exports via wasm-bindgen. The generated .wasm binaries are found at:

src/wasm/libdeflate/pkg/modern_pdf_deflate_bg.wasm
src/wasm/png/pkg/modern_pdf_png_bg.wasm
src/wasm/ttf/pkg/modern_pdf_ttf_bg.wasm
src/wasm/shaping/pkg/modern_pdf_shaping_bg.wasm
src/wasm/jbig2/pkg/modern_pdf_jbig2_bg.wasm
src/wasm/jpeg/pkg/modern_pdf_jpeg_bg.wasm

Module details

libdeflate

The JS fallback uses fflate for deflate compression. The WASM module wraps the libdeflate C library (via the libdeflate-rs Rust crate) for substantially faster compression of large content streams and font programs.

Activated by passing deflate: true to initWasm().

png

PNG decoding in JavaScript requires reconstructing filtered scanlines and, for Adam7-interlaced images, deinterlacing. The WASM module uses Rust's image crate (png feature), whose decoder compiles to SIMD-friendly WASM code. The speedup is most noticeable for large images or batches of many images.

Activated by passing png: true to initWasm().

ttf

The font metric extractor reads the cmap, hmtx, head, hhea, and OS/2 tables from a raw TTF/OTF binary to produce a FontMetrics object. The WASM module (ttf-parser) parses these tables in Rust and returns a flat FontInfo struct whose fields are accessed through wasm-bindgen getters.

The FontInfo wire format uses flat Uint8Array buffers in little-endian layout to minimise the number of JS/WASM boundary crossings:

| Field | Format | Description |
|---|---|---|
| glyph_widths | u16[] LE, 2 bytes/glyph | Horizontal advance per glyph |
| cmap_entries | [u32 cp, u16 gid] LE, 6 bytes/entry | Unicode codepoint → glyph ID |
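
The byte layouts in the table above can be decoded with a DataView. The following is a minimal sketch, not the library's own decoder: the helper names decodeGlyphWidths and decodeCmapEntries are hypothetical, and it assumes the getters hand back plain Uint8Array values laid out exactly as the table describes.

```typescript
// Hypothetical helpers; only the byte layouts come from the table above.

/** Decode `glyph_widths`: one little-endian u16 advance per glyph. */
function decodeGlyphWidths(bytes: Uint8Array): number[] {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const widths: number[] = [];
  for (let off = 0; off + 2 <= bytes.byteLength; off += 2) {
    widths.push(view.getUint16(off, true)); // true = little-endian
  }
  return widths;
}

/** Decode `cmap_entries`: 6-byte records of [u32 codepoint, u16 glyph ID]. */
function decodeCmapEntries(bytes: Uint8Array): Map<number, number> {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const cmap = new Map<number, number>();
  for (let off = 0; off + 6 <= bytes.byteLength; off += 6) {
    cmap.set(view.getUint32(off, true), view.getUint16(off + 4, true));
  }
  return cmap;
}

// Two glyph advances (500 and 600) and one mapping (U+0041 to glyph 1).
const widths = decodeGlyphWidths(new Uint8Array([0xf4, 0x01, 0x58, 0x02]));
const cmap = decodeCmapEntries(new Uint8Array([0x41, 0, 0, 0, 0x01, 0]));
```

A DataView is used rather than a Uint16Array or Uint32Array because the 6-byte cmap records are not 4-byte aligned, and because DataView lets the byte order be stated explicitly.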

The pure-JS fallback (src/assets/font/fontMetrics.ts) parses the same tables and produces an identical FontMetrics object.

Activated by passing fonts: true to initWasm().

Note: The ttf WASM module is a parser only. Font subsetting is performed entirely in pure JS (src/assets/font/ttfSubset.ts). There is no WASM subsetter in the current release; the initSubsetWasm() function is a no-op placeholder reserved for a future dedicated subsetting module.

shaping

The shaping module wraps rustybuzz, a pure-Rust port of HarfBuzz. It performs full OpenType layout:

  • GSUB — glyph substitution (ligatures, contextual alternates, Arabic joining forms, Indic conjunct formation)
  • GPOS — glyph positioning (kerning, mark-to-base, mark-to-mark, cursive attachment)
  • Bidi — Unicode bidirectional algorithm for RTL scripts

The WASM exports two entry points:

// Auto-detects script/language from the font
shape_text(font_data: &[u8], text: &str, direction: u8) -> ShapingResult

// Explicit script + language tags
shape_text_with_features(
    font_data: &[u8], text: &str, direction: u8,
    script: &str, language: &str,
) -> ShapingResult

ShapingResult contains parallel flat Uint8Array buffers (LE encoded) that are decoded in src/assets/font/textShaper.ts:

| Buffer | Type | Content |
|---|---|---|
| glyph_ids | u16[] | Output glyph IDs |
| x_advances | i32[] | Horizontal advance per glyph (design units) |
| y_advances | i32[] | Vertical advance (usually 0) |
| x_offsets | i32[] | Horizontal pen offset (kerning / mark) |
| y_offsets | i32[] | Vertical pen offset |
| clusters | u32[] | Source string byte index for each glyph |
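
As an illustration of how the parallel buffers above could be zipped into per-glyph records, here is a hedged sketch. The ShapingBuffers and ShapedGlyph interfaces and both function names are hypothetical; only the field names and element types are taken from the table. Because the Rust side receives the text as a UTF-8 &str, a cluster value is a UTF-8 byte offset, so the sketch also shows one way to map it back to a JavaScript (UTF-16) string index.

```typescript
// Field names mirror the table above; the interfaces themselves are hypothetical.
interface ShapingBuffers {
  glyph_ids: Uint8Array;  // u16[] LE
  x_advances: Uint8Array; // i32[] LE
  y_advances: Uint8Array; // i32[] LE
  x_offsets: Uint8Array;  // i32[] LE
  y_offsets: Uint8Array;  // i32[] LE
  clusters: Uint8Array;   // u32[] LE
}

interface ShapedGlyph {
  glyphId: number;
  xAdvance: number;
  yAdvance: number;
  xOffset: number;
  yOffset: number;
  cluster: number; // UTF-8 byte offset into the source text
}

function viewOf(b: Uint8Array): DataView {
  return new DataView(b.buffer, b.byteOffset, b.byteLength);
}

/** Zip the six parallel buffers into one record per output glyph. */
function decodeShapingResult(r: ShapingBuffers): ShapedGlyph[] {
  const ids = viewOf(r.glyph_ids);
  const xa = viewOf(r.x_advances);
  const ya = viewOf(r.y_advances);
  const xo = viewOf(r.x_offsets);
  const yo = viewOf(r.y_offsets);
  const cl = viewOf(r.clusters);
  const count = r.glyph_ids.byteLength >> 1; // 2 bytes per glyph ID
  const glyphs: ShapedGlyph[] = [];
  for (let i = 0; i < count; i++) {
    glyphs.push({
      glyphId: ids.getUint16(i * 2, true),
      xAdvance: xa.getInt32(i * 4, true),
      yAdvance: ya.getInt32(i * 4, true),
      xOffset: xo.getInt32(i * 4, true),
      yOffset: yo.getInt32(i * 4, true),
      cluster: cl.getUint32(i * 4, true),
    });
  }
  return glyphs;
}

/** Convert a UTF-8 byte offset into an index into the UTF-16 JS string. */
function utf8OffsetToUtf16Index(text: string, byteOffset: number): number {
  let bytes = 0;
  let i = 0;
  while (i < text.length && bytes < byteOffset) {
    const cp = text.codePointAt(i) as number;
    bytes += cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
    i += cp > 0xffff ? 2 : 1; // astral code points occupy two UTF-16 units
  }
  return i;
}
```

The conversion matters only for non-ASCII text; for pure ASCII input the byte offset and the string index coincide.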

Activated by calling initShapingWasm() from src/assets/font/textShaper.ts.


initWasm() API

import { initWasm } from 'modern-pdf-lib';

// Load the deflate, PNG, font-metrics (ttf), and JPEG modules
await initWasm({
  deflate: true,
  png:     true,
  fonts:   true,
  jpeg:    true,
});

initWasm() is idempotent — subsequent calls after the first successful initialization are no-ops.
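
For readers wondering how this kind of idempotence is typically achieved, the sketch below caches the in-flight promise so the loader runs at most once per successful initialization. It is an illustration only: makeIdempotentInit and the loader callback are hypothetical names, and the library's actual implementation is not shown in this page.

```typescript
// Hypothetical sketch of an idempotent async initializer.
type Loader = () => Promise<void>;

function makeIdempotentInit(load: Loader): () => Promise<void> {
  let pending: Promise<void> | null = null;
  return () => {
    if (pending === null) {
      pending = load().catch((err) => {
        pending = null; // a failed attempt does not count; allow a retry
        throw err;
      });
    }
    return pending; // later calls reuse the same promise
  };
}

// Example: the loader body runs once no matter how often init() is called.
let loadCount = 0;
const init = makeIdempotentInit(async () => {
  loadCount += 1;
});
```

Resetting the cached promise on failure matches the wording above, where only calls after the first successful initialization are no-ops.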

Selective initialization

Load only the modules you need. Each module is lazily imported with a dynamic import(), so unused modules add zero bytes to your bundle:

// Minimal setup: only faster compression
await initWasm({ deflate: true });

// Minimal setup: only faster PNG decoding
await initWasm({ png: true });

Providing pre-loaded bytes

On runtimes that cannot auto-load the binaries (Cloudflare Workers, edge environments, bundled applications), the WASM bytes must be provided inline:

import deflateWasm from './modern_pdf_deflate_bg.wasm'; // bundler import; assumes the bundler yields an ArrayBuffer
import { initWasm } from 'modern-pdf-lib';

await initWasm({
  deflate:     true,
  deflateWasm: new Uint8Array(deflateWasm),
});

For more control, use configureWasmLoader() directly:

import { configureWasmLoader } from 'modern-pdf-lib';

configureWasmLoader({
  moduleBytes: {
    libdeflate: myDeflateBytes,
    shaping:    myShaperBytes,
  },
});

Runtime support matrix

| Runtime | Auto-load | fetch() | Filesystem |
|---|---|---|---|
| Node.js >=25.7 | fs/promises | Yes | Yes |
| Bun | fetch() | Yes | Yes |
| Deno | fetch() | Yes | Yes |
| Browser | fetch() | Yes | No |
| Cloudflare Workers | Must provide bytes | No | No |
| Unknown | fetch() (best-effort) | Depends | No |
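
The matrix above implies some form of runtime detection. The sketch below shows one plausible way to do it with common global-object heuristics; it is an assumption for illustration, not the library's detection code. The names detectRuntime, GlobalsLike, and autoLoadStrategy are hypothetical, and the strategy strings are copied from the Auto-load column.

```typescript
type Runtime = 'node' | 'bun' | 'deno' | 'browser' | 'cloudflare-workers' | 'unknown';

// Minimal shape of the globals the heuristics inspect (hypothetical).
interface GlobalsLike {
  Bun?: unknown;
  Deno?: unknown;
  process?: { versions?: { node?: string } };
  navigator?: { userAgent?: string };
  document?: unknown;
}

function detectRuntime(g: GlobalsLike): Runtime {
  // Bun and Deno can also expose a Node-compatible `process`, so check them first.
  if (g.Bun !== undefined) return 'bun';
  if (g.Deno !== undefined) return 'deno';
  // Workers report this user agent and may expose `process` in Node-compat mode.
  if (g.navigator?.userAgent === 'Cloudflare-Workers') return 'cloudflare-workers';
  if (g.process?.versions?.node !== undefined) return 'node';
  if (g.document !== undefined) return 'browser';
  return 'unknown';
}

// Auto-load column of the matrix, keyed by detected runtime.
const autoLoadStrategy: Record<Runtime, string> = {
  node: 'fs/promises',
  bun: 'fetch()',
  deno: 'fetch()',
  browser: 'fetch()',
  'cloudflare-workers': 'must provide bytes',
  unknown: 'fetch() (best-effort)',
};

// Usage against the real global object.
const current = detectRuntime(globalThis as unknown as GlobalsLike);
const strategy = autoLoadStrategy[current];
```

Taking the globals as a parameter keeps the function testable with plain objects instead of requiring each runtime to be present.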

Building the WASM modules

Requirements: Rust (stable toolchain), wasm-pack.

# Install wasm-pack
cargo install wasm-pack

# Build all six modules
npm run build:wasm

# Build a single module
cd src/wasm/shaping && wasm-pack build --target web --release

The build:wasm npm script runs wasm-pack build --target web --release in each subdirectory and copies the resulting .wasm files to the pkg/ directories.


Architecture: the _impl pattern

Each WASM-accelerated module exposes an *_impl variant of its core function for testing in a native Rust environment (without WASM overhead):

// In src/wasm/shaping/src/lib.rs
pub fn shape_text_impl(font: &[u8], text: &str, dir: u8) -> RawShapingResult { ... }

#[wasm_bindgen]
pub fn shape_text(font: &[u8], text: &str, dir: u8) -> ShapingResult {
    let raw = shape_text_impl(font, text, dir);
    ShapingResult::from(raw)
}

shape_text_impl can be unit-tested directly with cargo test at native speed, while the #[wasm_bindgen] wrapper is tested end-to-end via the Playwright integration suite.


Fallback guarantee

If a WASM module fails to load (network error, missing binary, unsupported runtime), the library silently falls back to the pure-JS implementation. No exception is thrown. The output is always correct; only performance differs.
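
The behaviour described above follows a familiar try-then-fall-back shape. The sketch below illustrates it under stated assumptions: resolveWithFallback, the loader callback, and the stand-in byte operation are all hypothetical, and the library's internal wiring may differ.

```typescript
// Hypothetical sketch: prefer the WASM implementation, fall back to JS on any failure.
interface Resolved<T> {
  impl: T;
  accelerated: boolean;
}

async function resolveWithFallback<T>(
  loadWasm: () => Promise<T>,
  jsImpl: T,
): Promise<Resolved<T>> {
  try {
    return { impl: await loadWasm(), accelerated: true };
  } catch {
    // Swallow the error: callers get a working implementation either way.
    return { impl: jsImpl, accelerated: false };
  }
}

// Example with a stand-in operation (byte reversal) and a loader that always fails.
type ByteOp = (data: Uint8Array) => Uint8Array;
const jsReverse: ByteOp = (d) => Uint8Array.from(d).reverse();
const failingLoader = async (): Promise<ByteOp> => {
  throw new Error('missing binary');
};
```

Returning an accelerated flag alongside the implementation is one way to let callers log or measure which path was taken without turning a load failure into an exception.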
