Accessibility
modern-pdf-lib v0.15.1 — Tagged PDF, structure trees, and PDF/UA compliance.
Tagged PDF (ISO 32000-1 §14.8) is the foundation of accessible PDFs. It builds on the logical structure facilities of §14.7, connecting visual content to a logical structure tree that screen readers and other assistive technology traverse. modern-pdf-lib provides a complete implementation covering:
- The `PdfStructureTree` / `PdfStructureElement` model
- Marked-content operators (`BDC`, `BMC`, `EMC`) that link page content to the tree
- Artifact marking for decorative content that should be ignored by assistive technology
- A multi-check accessibility validator (`checkAccessibility`)
- A quick pass/fail guard (`isAccessible`)
import { createPdf } from 'modern-pdf-lib';
import {
checkAccessibility,
isAccessible,
summarizeIssues,
} from 'modern-pdf-lib';
import {
PdfStructureTree,
PdfStructureElement,
} from 'modern-pdf-lib';
import {
beginMarkedContentSequence,
endMarkedContent,
beginArtifact,
endArtifact,
wrapInMarkedContent,
createMarkedContentScope,
} from 'modern-pdf-lib';

`PdfStructureTree` is the root container for a tagged PDF's logical structure. Create one with `doc.createStructureTree()` or instantiate it directly.
const tree = doc.createStructureTree();
// Add a top-level heading
const h1 = tree.addElement(null, 'H1', {
altText: 'Introduction',
language: 'en-US',
});
// Assign a marked-content ID linking it to page 0
const mcid = tree.assignMcid(h1, 0); // returns 0
// Add a paragraph under the heading
const para = tree.addElement(h1, 'P');
tree.assignMcid(para, 0); // returns 1

| Method | Description |
|---|---|
| `addElement(parent, type, options?)` | Add a child element. Pass `null` for `parent` to add under the root `Document`. Returns the new `PdfStructureElement`. |
| `removeElement(element)` | Remove an element from the tree (the root cannot be removed). |
| `assignMcid(element, pageIndex)` | Assign the next available MCID and associate it with a page index. Returns the MCID number. |
| `getAllElements()` | Depth-first traversal of the entire tree. |
| `getNextMcid()` | Returns the current MCID counter (useful for testing). |
| `validate()` | Run built-in structural checks. Returns `AccessibilityIssue[]`. |
| `toDict(registry, pageRefs)` | Serialize to a `/StructTreeRoot` dictionary. |
| `PdfStructureTree.fromDict(dict, resolver)` | Parse a structure tree from an existing PDF dictionary. |
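The MCID bookkeeping described for `assignMcid` and `getNextMcid` can be pictured with a small stand-alone model. This is only a sketch: `McidCounterSketch` and `TaggedNode` are hypothetical names, not part of modern-pdf-lib, and the single document-wide counter is an assumption inferred from the example above, where two elements on page 0 receive MCIDs 0 and 1.

```typescript
// Hypothetical stand-in for a tagged element; only the MCID fields are modelled.
interface TaggedNode {
  type: string;
  mcid?: number;
  pageIndex?: number;
}

// Assumption: one counter shared by the whole tree, not one per page.
class McidCounterSketch {
  private nextMcid = 0;

  // Hand out the next unused MCID and record which page the content is on.
  assignMcid(node: TaggedNode, pageIndex: number): number {
    const mcid = this.nextMcid;
    node.mcid = mcid;
    node.pageIndex = pageIndex;
    this.nextMcid += 1;
    return mcid;
  }

  // The value the next assignMcid call would return.
  getNextMcid(): number {
    return this.nextMcid;
  }
}

const counter = new McidCounterSketch();
const heading: TaggedNode = { type: 'H1' };
const paragraph: TaggedNode = { type: 'P' };

const headingMcid = counter.assignMcid(heading, 0);
const paragraphMcid = counter.assignMcid(paragraph, 0);
console.log(headingMcid, paragraphMcid, counter.getNextMcid()); // 0 1 2
```

The number returned here is the same number that must appear in the page's `BDC` operator, which is why the real API returns it instead of only storing it on the element.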
`PdfStructureElement` is a single node in the structure tree.
const section = tree.addElement(null, 'Sect');
const figure = section.addChild('Figure', {
altText: 'Company logo',
id: 'fig-logo',
});
tree.assignMcid(figure, 0);
console.log(figure.depth()); // 2 (Document > Sect > Figure)
console.log(figure.type); // 'Figure'
console.log(figure.options.altText); // 'Company logo'

| Property / Method | Type | Description |
|---|---|---|
| `type` | `StructureType` | The element's structure type. |
| `children` | `PdfStructureElement[]` | Direct child elements. |
| `options` | `StructureElementOptions` | Alt text, language, title, actual text, id. |
| `mcid` | `number \| undefined` | Marked-content ID assigned by `assignMcid`. |
| `pageIndex` | `number \| undefined` | Zero-based page the content appears on. |
| `parent` | `PdfStructureElement \| undefined` | Parent element. |
| `addChild(type, options?)` | `PdfStructureElement` | Create and attach a child node. |
| `removeChild(element)` | `void` | Detach a direct child. |
| `walk()` | `PdfStructureElement[]` | Depth-first subtree including self. |
| `find(type)` | `PdfStructureElement \| undefined` | First matching descendant. |
| `findAll(type)` | `PdfStructureElement[]` | All matching descendants. |
| `depth()` | `number` | Distance from the root (root = 0). |
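To make the traversal semantics in the table concrete, here is a minimal stand-alone node class with the same method names. `NodeSketch` is a hypothetical illustration, not the library's implementation, and it assumes that `walk()` visits nodes in pre-order (parent before children) and that `find()` excludes the node it is called on.

```typescript
// Hypothetical miniature of a structure element; not modern-pdf-lib's class.
class NodeSketch {
  readonly type: string;
  readonly children: NodeSketch[] = [];
  parent?: NodeSketch;

  constructor(type: string) {
    this.type = type;
  }

  // Create a child node, link it to this node, and return it.
  addChild(type: string): NodeSketch {
    const child = new NodeSketch(type);
    child.parent = this;
    this.children.push(child);
    return child;
  }

  // Number of parent links between this node and the root (root = 0).
  depth(): number {
    let levels = 0;
    for (let node = this.parent; node !== undefined; node = node.parent) {
      levels += 1;
    }
    return levels;
  }

  // Pre-order depth-first traversal: self first, then each child's subtree.
  walk(): NodeSketch[] {
    const result: NodeSketch[] = [this];
    for (const child of this.children) {
      result.push(...child.walk());
    }
    return result;
  }

  // First descendant (self excluded) of the given type, in walk order.
  find(type: string): NodeSketch | undefined {
    const nodes = this.walk();
    for (let i = 1; i < nodes.length; i += 1) {
      if (nodes[i].type === type) return nodes[i];
    }
    return undefined;
  }
}

const root = new NodeSketch('Document');
const sect = root.addChild('Sect');
const logo = sect.addChild('Figure');
const body = sect.addChild('P');

const walkOrder = root.walk().map(node => node.type).join(' > ');
console.log(walkOrder);    // Document > Sect > Figure > P
console.log(logo.depth()); // 2
```

Because the traversal is depth-first, the order of `walk()` follows the logical reading order of the tree, which is the order assistive technology is expected to follow.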
All standard ISO 32000-1 structure types are accepted as the StructureType string union. Custom types are also allowed.
| Category | Types |
|---|---|
| Grouping | `Document`, `Part`, `Art`, `Sect`, `Div`, `BlockQuote`, `Caption`, `TOC`, `TOCI`, `Index`, `NonStruct`, `Private` |
| Block-level | `P`, `H`, `H1`, `H2`, `H3`, `H4`, `H5`, `H6` |
| List | `L`, `LI`, `Lbl`, `LBody` |
| Table | `Table`, `TR`, `TH`, `TD`, `THead`, `TBody`, `TFoot` |
| Inline | `Span`, `Quote`, `Note`, `Reference`, `BibEntry`, `Code`, `Link`, `Annot` |
| Ruby / Warichu | `Ruby`, `RB`, `RT`, `RP`, `Warichu`, `WT`, `WP` |
| Illustration | `Figure`, `Formula`, `Form` |
interface StructureElementOptions {
title?: string; // Human-readable label
altText?: string; // Required for Figure, Formula, Form (PDF/UA)
actualText?: string; // Replacement text override
language?: string; // BCP 47 tag, e.g. "en-US"
id?: string; // Unique element identifier
}

Marked-content operators appear in the page content stream and reference the structure tree via MCID numbers.
// Begin a marked-content sequence for a paragraph (MCID 3)
const bdc = beginMarkedContentSequence('P', 3);
// produces: /P <</MCID 3>> BDC\n
const emc = endMarkedContent();
// produces: EMC\n
// Convenience wrapper
const wrapped = wrapInMarkedContent(myOperators, 'Span', 5);

const scope = createMarkedContentScope('H1', 0);
page.pushOperators(scope.begin());
page.drawText('Introduction', { x: 72, y: 700, size: 24 });
page.pushOperators(scope.end());

Artifacts are page elements that carry no document meaning (page numbers, decorative rules, headers, footers). Assistive technology must skip them.
// Simple artifact
page.pushOperators(beginArtifact());
page.drawText('Page 1', { x: 250, y: 20, size: 10 });
page.pushOperators(endArtifact());
// Typed artifact with subtype
import { beginArtifactWithType } from 'modern-pdf-lib';
page.pushOperators(beginArtifactWithType('Pagination', 'Footer'));
page.drawText('Confidential', { x: 72, y: 20, size: 8 });
page.pushOperators(endArtifact());

| Function | Produces | Description |
|---|---|---|
| `beginMarkedContent(tag)` | `/<tag> BMC\n` | Simple tag with no properties. |
| `beginMarkedContentSequence(tag, mcid)` | `/<tag> <</MCID n>> BDC\n` | Tag with MCID linking to the structure tree. |
| `beginMarkedContentWithProperties(tag, props)` | `/<tag> <<...>> BDC\n` | Tag with arbitrary inline properties. |
| `endMarkedContent()` | `EMC\n` | Closes any open marked-content scope. |
| `wrapInMarkedContent(ops, tag, mcid)` | `BDC` + ops + `EMC` | Wraps an existing operator string. |
| `createMarkedContentScope(tag, mcid)` | `MarkedContentScope` | Returns a `{ begin(), end(), mcid, tag }` object. |
| `beginArtifact()` | `/Artifact BMC\n` | Marks decorative content. |
| `beginArtifactWithType(type, subtype?)` | `/Artifact <<...>> BDC\n` | Typed artifact (`Pagination`, `Layout`, `Background`). |
| `endArtifact()` | `EMC\n` | Alias for `endMarkedContent()`. |
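The "Produces" column can be reproduced with a few string helpers, which may help when inspecting a content stream by hand. The functions below are hypothetical re-implementations written for illustration, not the library's source. The MCID guard and the assumption that the wrapped operator string already ends in a newline are choices made for this sketch.

```typescript
// Hypothetical sketch of the operator strings listed in the table above.
function beginSequenceSketch(tag: string, mcid: number): string {
  // Assumption: MCIDs are non-negative integers, so reject anything else.
  if (!isFinite(mcid) || mcid < 0 || Math.floor(mcid) !== mcid) {
    throw new RangeError('MCID must be a non-negative integer');
  }
  return `/${tag} <</MCID ${mcid}>> BDC\n`;
}

function endSketch(): string {
  return 'EMC\n';
}

function beginArtifactSketch(): string {
  return '/Artifact BMC\n';
}

// Assumes `ops` already ends with a newline, as content-stream lines usually do.
function wrapSketch(ops: string, tag: string, mcid: number): string {
  return beginSequenceSketch(tag, mcid) + ops + endSketch();
}

const taggedText = wrapSketch('BT /F1 12 Tf (Hi) Tj ET\n', 'Span', 5);
console.log(taggedText);
// /Span <</MCID 5>> BDC
// BT /F1 12 Tf (Hi) Tj ET
// EMC

const footer = beginArtifactSketch() + 'BT /F1 9 Tf (Page 1) Tj ET\n' + endSketch();
```

Every `BDC` or `BMC` needs exactly one matching `EMC`, so building the begin and end strings through a single wrapper makes an unbalanced scope harder to write by accident.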
import { createPdf } from 'modern-pdf-lib';
import {
beginMarkedContentSequence,
endMarkedContent,
beginArtifact,
endArtifact,
} from 'modern-pdf-lib';
const doc = createPdf();
doc.setTitle('Accessible Report');
doc.setLanguage('en-US');
const page = doc.addPage([595, 842]);
const tree = doc.createStructureTree();
// --- Heading ---
const h1 = tree.addElement(null, 'H1');
const h1Mcid = tree.assignMcid(h1, 0);
page.pushOperators(beginMarkedContentSequence('H1', h1Mcid));
page.drawText('Quarterly Report', { x: 72, y: 750, size: 28 });
page.pushOperators(endMarkedContent());
// --- Body paragraph ---
const para = tree.addElement(null, 'P');
const paraMcid = tree.assignMcid(para, 0);
page.pushOperators(beginMarkedContentSequence('P', paraMcid));
page.drawText('This report summarises Q4 results.', { x: 72, y: 700, size: 12 });
page.pushOperators(endMarkedContent());
// --- Figure ---
const fig = tree.addElement(null, 'Figure', { altText: 'Revenue bar chart' });
const figMcid = tree.assignMcid(fig, 0);
page.pushOperators(beginMarkedContentSequence('Figure', figMcid));
// ... draw chart ...
page.pushOperators(endMarkedContent());
// --- Footer (artifact) ---
page.pushOperators(beginArtifact());
page.drawText('Page 1 of 1', { x: 250, y: 20, size: 9 });
page.pushOperators(endArtifact());
const pdfBytes = await doc.save();

`checkAccessibility` runs a suite of PDF/UA (ISO 14289-1) checks against a `PdfDocument`.
import { checkAccessibility, summarizeIssues, isAccessible } from 'modern-pdf-lib';
const issues = checkAccessibility(doc);
// Summarise by severity
const { errors, warnings, infos, total } = summarizeIssues(issues);
console.log(`${errors} errors, ${warnings} warnings, ${infos} infos`);
// Quick boolean check (true = no errors)
if (!isAccessible(issues)) {
for (const issue of issues.filter(i => i.severity === 'error')) {
console.error(`[${issue.code}] ${issue.message}`);
}
}

| Check | Code | Severity | Description |
|---|---|---|---|
| Structure tree present | `NO_STRUCT_TREE` | error | `/StructTreeRoot` required by PDF/UA |
| Document language | `NO_LANG` | error | `/Lang` required for screen readers |
| Language format | `INVALID_LANG` | warning | Must be a valid BCP 47 tag |
| Document title | `NO_TITLE` | warning | Descriptive title improves usability |
| Tagged content | `NO_TAGGED_CONTENT` | warning | Structure tree exists but no MCIDs assigned |
| Untagged page | `UNTAGGED_PAGE` | info | Page has no tagged elements |
| Reading order | `READING_ORDER` | info | MCIDs not in ascending order on a page |
| Heading hierarchy | `HEADING_SKIP` | warning | Heading levels must not skip (e.g. H1 → H3) |
| First heading | `HEADING_START` | warning | First heading should be H1 |
| Table rows | `TABLE_NO_ROWS` | error | Table has no TR children |
| Table headers | `TABLE_NO_HEADERS` | warning | Table has no TH cells |
| TR cells | `TR_NO_CELLS` | error | TR has no TH or TD children |
| Cell placement | `CELL_NOT_IN_TR` | error | TH/TD must be direct children of TR |
| Figure alt text | `FIGURE_NO_ALT` | error | Figure / Formula / Form requires alt text |
| List items | `LIST_NO_ITEMS` | error | L (list) has no LI children |
| LI body | `LI_NO_BODY` | warning | LI should contain an LBody child |
| Empty document | `NO_PAGES` | info | Document has no pages |
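As an illustration of what the two heading rules in the table involve, the following stand-alone sketch scans structure types in document order and reports `HEADING_START` and `HEADING_SKIP`. It is a guess at the logic, not the validator's actual code: `checkHeadingOrderSketch` and `IssueSketch` are hypothetical names, and the sketch assumes that moving back up the hierarchy (for example H3 to H2) is always allowed.

```typescript
type SeveritySketch = 'error' | 'warning' | 'info';

interface IssueSketch {
  severity: SeveritySketch;
  code: string;
  message: string;
}

// `types` is the list of structure types in reading order, e.g. ['H1', 'P', 'H2'].
function checkHeadingOrderSketch(types: string[]): IssueSketch[] {
  const issues: IssueSketch[] = [];
  let previousLevel = 0; // 0 means no heading has been seen yet

  for (const type of types) {
    const match = /^H([1-6])$/.exec(type);
    if (match === null) continue; // not a numbered heading

    const level = Number(match[1]);
    if (previousLevel === 0 && level !== 1) {
      issues.push({
        severity: 'warning',
        code: 'HEADING_START',
        message: `First heading is H${level}, expected H1`,
      });
    } else if (previousLevel !== 0 && level > previousLevel + 1) {
      issues.push({
        severity: 'warning',
        code: 'HEADING_SKIP',
        message: `Heading level jumps from H${previousLevel} to H${level}`,
      });
    }
    previousLevel = level;
  }
  return issues;
}

const skipped = checkHeadingOrderSketch(['H1', 'P', 'H3', 'H2', 'H4']);
console.log(skipped.map(issue => issue.code)); // [ 'HEADING_SKIP', 'HEADING_SKIP' ]

const badStart = checkHeadingOrderSketch(['H2', 'P']);
console.log(badStart.map(issue => issue.code)); // [ 'HEADING_START' ]
```

Both rules are reported as warnings here to match the severities in the table, so under the "true = no errors" rule for `isAccessible` they would not fail a document by themselves.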
interface AccessibilityIssue {
severity: 'error' | 'warning' | 'info';
code: string; // e.g. 'FIGURE_NO_ALT'
message: string;
element?: PdfStructureElement; // Related structure element
pageIndex?: number; // Zero-based page index
}

PDF/UA-1 (ISO 14289-1) imposes additional requirements beyond a structure tree. The checks listed above map directly to PDF/UA rules. For full PDF/UA compliance you should also:
- Set the document language: `doc.setLanguage('en-US')`
- Set a meaningful title: `doc.setTitle('...')`
- Ensure every `Figure`, `Formula`, and `Form` element has `altText`
- Maintain heading hierarchy (no skipped levels)
- Mark all non-structural content as artifacts
- Embed all fonts (no standard 14 shortcuts)
- Set `Marked: true` in the document's `/MarkInfo` dictionary (done automatically by `createStructureTree`)
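For the first item in the list, a language tag can be sanity-checked before it is passed to `doc.setLanguage`. The helper below is a hypothetical sketch that relies on the standard `Intl` API, not on modern-pdf-lib: `Intl.NumberFormat.supportedLocalesOf` canonicalizes its argument and throws a `RangeError` for a structurally invalid BCP 47 tag. It only checks the shape of the tag, so a well-formed tag with unregistered subtags still passes, and the library's own `INVALID_LANG` check may apply different rules.

```typescript
// Hypothetical pre-flight check; only tests that the tag is well-formed BCP 47.
function isWellFormedLanguageTag(tag: string): boolean {
  try {
    // The returned list is ignored; only the RangeError on malformed tags matters.
    Intl.NumberFormat.supportedLocalesOf(tag);
    return true;
  } catch (error) {
    if (error instanceof RangeError) return false;
    throw error; // anything else is unexpected, so let it surface
  }
}

console.log(isWellFormedLanguageTag('en-US')); // true
console.log(isWellFormedLanguageTag('en_US')); // false (underscore is not a valid separator)
console.log(isWellFormedLanguageTag(''));      // false
```

A common mistake this catches is the POSIX-style `en_US` spelling, which looks plausible but is not a valid BCP 47 tag.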
See also: PDF-A Compliance for archival requirements that often accompany PDF/UA.