Accessibility

ABCrimson edited this page Mar 1, 2026 · 5 revisions

modern-pdf-lib v0.15.1 — Tagged PDF, structure trees, and PDF/UA compliance.


Overview

Tagged PDF (ISO 32000-1 §14.8), which builds on the logical structure facilities of §14.7, is the foundation of accessible PDFs. It connects visual content to a logical structure tree that screen readers and other assistive technology traverse. modern-pdf-lib provides a complete implementation covering:

  • The PdfStructureTree / PdfStructureElement model
  • Marked-content operators (BDC, BMC, EMC) that link page content to the tree
  • Artifact marking for decorative content that should be ignored by assistive technology
  • A multi-check accessibility validator (checkAccessibility)
  • A quick pass/fail guard (isAccessible)

Imports

import { createPdf } from 'modern-pdf-lib';
import {
  checkAccessibility,
  isAccessible,
  summarizeIssues,
} from 'modern-pdf-lib';
import {
  PdfStructureTree,
  PdfStructureElement,
} from 'modern-pdf-lib';
import {
  beginMarkedContentSequence,
  endMarkedContent,
  beginArtifact,
  endArtifact,
  wrapInMarkedContent,
  createMarkedContentScope,
} from 'modern-pdf-lib';

Structure Tree

PdfStructureTree

The root container for a tagged PDF's logical structure. Created via doc.createStructureTree() or instantiated directly.

const tree = doc.createStructureTree();

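// Group the heading and its paragraph in a section under the root Document
const sect = tree.addElement(null, 'Sect');

// Add a heading inside the section
const h1 = tree.addElement(sect, 'H1', {
  altText: 'Introduction',
  language: 'en-US',
});
// Assign a marked-content ID linking it to page 0
const mcid = tree.assignMcid(h1, 0);   // returns 0

// Add a paragraph after the heading, as its sibling (a P should not be nested inside a heading)
const para = tree.addElement(sect, 'P');
tree.assignMcid(para, 0);              // returns 1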

| Method | Description |
| --- | --- |
| addElement(parent, type, options?) | Add a child element. Pass null for parent to add under the root Document. Returns the new PdfStructureElement. |
| removeElement(element) | Remove an element from the tree (cannot remove root). |
| assignMcid(element, pageIndex) | Assign the next available MCID and associate it with a page index. Returns the MCID number. |
| getAllElements() | Depth-first traversal of the entire tree. |
| getNextMcid() | Returns the current MCID counter (useful for testing). |
| validate() | Run built-in structural checks. Returns AccessibilityIssue[]. |
| toDict(registry, pageRefs) | Serialize to a /StructTreeRoot dictionary. |
| PdfStructureTree.fromDict(dict, resolver) | Parse a structure tree from an existing PDF dictionary. |
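
To make the MCID bookkeeping described for assignMcid and getNextMcid concrete, here is a minimal, self-contained sketch of a counter that hands out the next free MCID and remembers the page it belongs to. McidAllocator and its methods are hypothetical names used only for illustration; they are not part of modern-pdf-lib, and the single document-wide counter is an assumption inferred from the example above (MCID 0, then 1).

```typescript
// Hypothetical stand-in for the tree's MCID bookkeeping; not the library's code.
class McidAllocator {
  private next = 0;
  private readonly pageOf = new Map<number, number>(); // mcid -> pageIndex

  // Hand out the next free MCID and record the page it is drawn on.
  assign(pageIndex: number): number {
    const mcid = this.next++;
    this.pageOf.set(mcid, pageIndex);
    return mcid;
  }

  // Current counter value, i.e. the MCID the next assign() call will return.
  peekNext(): number {
    return this.next;
  }

  // All MCIDs recorded for one page, in assignment order.
  mcidsOnPage(pageIndex: number): number[] {
    return Array.from(this.pageOf.entries())
      .filter(([, page]) => page === pageIndex)
      .map(([mcid]) => mcid);
  }
}

const allocator = new McidAllocator();
const headingMcid = allocator.assign(0);   // 0
const paragraphMcid = allocator.assign(0); // 1
const figureMcid = allocator.assign(1);    // 2
```

Keeping the page index next to each MCID is what lets a serializer group marked content per page later on; how the library stores this internally is not stated in this page.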

PdfStructureElement

A single node in the structure tree.

const section = tree.addElement(null, 'Sect');

const figure = section.addChild('Figure', {
  altText: 'Company logo',
  id: 'fig-logo',
});
tree.assignMcid(figure, 0);

console.log(figure.depth());      // 2  (Document > Sect > Figure)
console.log(figure.type);         // 'Figure'
console.log(figure.options.altText); // 'Company logo'

| Property / Method | Type | Description |
| --- | --- | --- |
| type | StructureType | The element's structure type. |
| children | PdfStructureElement[] | Direct child elements. |
| options | StructureElementOptions | Alt text, language, title, actual text, id. |
| mcid | number \| undefined | Marked-content ID assigned by assignMcid. |
| pageIndex | number \| undefined | Zero-based page the content appears on. |
| parent | PdfStructureElement \| undefined | Parent element. |
| addChild(type, options?) | PdfStructureElement | Create and attach a child node. |
| removeChild(element) | void | Detach a direct child. |
| walk() | PdfStructureElement[] | Depth-first subtree including self. |
| find(type) | PdfStructureElement \| undefined | First matching descendant. |
| findAll(type) | PdfStructureElement[] | All matching descendants. |
| depth() | number | Distance from the root (root = 0). |
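
The traversal semantics in the table (walk includes the node itself, findAll returns descendants, depth counts from the root) can be illustrated with a small stand-alone model. ToyNode below is a hypothetical miniature written for this page, not the library's PdfStructureElement; in particular, pre-order traversal and the exclusion of the starting node from findAll are assumptions based on the wording above.

```typescript
// Hypothetical miniature of a structure node; not the library's class.
class ToyNode {
  readonly type: string;
  readonly children: ToyNode[] = [];
  parent?: ToyNode;

  constructor(type: string) {
    this.type = type;
  }

  // Create a child, link it to this node, and return it.
  addChild(type: string): ToyNode {
    const child = new ToyNode(type);
    child.parent = this;
    this.children.push(child);
    return child;
  }

  // Distance from the root: the root itself is 0.
  depth(): number {
    let d = 0;
    for (let n = this.parent; n !== undefined; n = n.parent) d++;
    return d;
  }

  // Pre-order depth-first traversal, starting with this node.
  walk(): ToyNode[] {
    const out: ToyNode[] = [this];
    for (const child of this.children) out.push(...child.walk());
    return out;
  }

  // All descendants (excluding this node) with the given type.
  findAll(type: string): ToyNode[] {
    return this.walk().slice(1).filter((n) => n.type === type);
  }
}

const root = new ToyNode('Document');
const sectNode = root.addChild('Sect');
sectNode.addChild('H1');
const figNode = sectNode.addChild('Figure');

const order = root.walk().map((n) => n.type); // ['Document', 'Sect', 'H1', 'Figure']
const figDepth = figNode.depth();             // 2
```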

Structure Types

All standard ISO 32000-1 structure types are accepted as the StructureType string union. Custom types are also allowed.

| Category | Types |
| --- | --- |
| Grouping | Document, Part, Art, Sect, Div, BlockQuote, Caption, TOC, TOCI, Index, NonStruct, Private |
| Block-level | P, H, H1, H2, H3, H4, H5, H6 |
| List | L, LI, Lbl, LBody |
| Table | Table, TR, TH, TD, THead, TBody, TFoot |
| Inline | Span, Quote, Note, Reference, BibEntry, Code, Link, Annot |
| Ruby / Warichu | Ruby, RB, RT, RP, Warichu, WT, WP |
| Illustration | Figure, Formula, Form |

StructureElementOptions

interface StructureElementOptions {
  title?:      string;   // Human-readable label
  altText?:    string;   // Required for Figure, Formula, Form (PDF/UA)
  actualText?: string;   // Replacement text override
  language?:   string;   // BCP 47 tag, e.g. "en-US"
  id?:         string;   // Unique element identifier
}
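
Since altText is required for Figure, Formula, and Form, it can help to catch a missing value before the element is ever added. The following is a minimal sketch under stated assumptions: missingAltText is a hypothetical helper, the interface is copied locally so the snippet runs on its own, and treating whitespace-only text as missing is a choice made here, not documented library behaviour.

```typescript
// Local copy of the options shape shown above, so the sketch is self-contained.
interface StructureElementOptions {
  title?: string;
  altText?: string;
  actualText?: string;
  language?: string;
  id?: string;
}

// Structure types for which the interface comment marks altText as required.
const NEEDS_ALT_TEXT = ['Figure', 'Formula', 'Form'];

// Hypothetical helper: true when an element of this type lacks usable alt text.
function missingAltText(type: string, options: StructureElementOptions = {}): boolean {
  if (NEEDS_ALT_TEXT.indexOf(type) === -1) return false;
  const alt = options.altText ?? '';
  return alt.trim().length === 0; // assumption: blank text counts as missing
}

const logoOk = missingAltText('Figure', { altText: 'Company logo' }); // false
const chartBad = missingAltText('Figure');                            // true
const paraOk = missingAltText('P');                                   // false
```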

Marked-Content Operators

Marked-content operators appear in the page content stream and reference the structure tree via MCID numbers.

Tagging Structured Content

// Begin a marked-content sequence for a paragraph (MCID 3)
const bdc = beginMarkedContentSequence('P', 3);
// produces: /P <</MCID 3>> BDC\n

const emc = endMarkedContent();
// produces: EMC\n

// Convenience wrapper: myOperators is an operator string you have already built
const wrapped = wrapInMarkedContent(myOperators, 'Span', 5);
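
The strings these helpers produce are short enough to rebuild by hand, which makes the BDC ... EMC bracketing easier to see. The functions below are illustrative re-implementations based only on the "produces" comments above; beginSequence, endSequence, and wrapSequence are hypothetical names, the MCID range check is an added assumption, and the real functions should be imported from modern-pdf-lib.

```typescript
// Illustrative re-implementations; not the library's source.
function beginSequence(tag: string, mcid: number): string {
  if (!Number.isInteger(mcid) || mcid < 0) {
    throw new RangeError(`MCID must be a non-negative integer, got ${mcid}`);
  }
  return `/${tag} <</MCID ${mcid}>> BDC\n`;
}

function endSequence(): string {
  return 'EMC\n';
}

// Bracket an existing operator string with BDC and EMC.
function wrapSequence(ops: string, tag: string, mcid: number): string {
  return beginSequence(tag, mcid) + ops + endSequence();
}

// A text-showing operator sequence standing in for real page content.
const textOps = 'BT /F1 12 Tf 72 700 Td (Hello) Tj ET\n';
const taggedOps = wrapSequence(textOps, 'Span', 5);
// taggedOps === '/Span <</MCID 5>> BDC\nBT /F1 12 Tf 72 700 Td (Hello) Tj ET\nEMC\n'
```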

Using a Scope Object

const scope = createMarkedContentScope('H1', 0);

page.pushOperators(scope.begin());
page.drawText('Introduction', { x: 72, y: 700, size: 24 });
page.pushOperators(scope.end());
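
A begin() without its matching end() leaves the content stream unbalanced, so one option is to route every tagged draw through a helper that always closes the scope. This is a sketch only: withScope, OperatorSink, and Scope are hypothetical names, the two interfaces list just the members used in the example above, and fakePage and fakeScope are stand-ins so the snippet runs without the library.

```typescript
// Minimal shapes assumed from the example above; only the members used here.
interface OperatorSink {
  pushOperators(ops: string): void;
}
interface Scope {
  begin(): string;
  end(): string;
}

// Hypothetical helper: emits begin(), runs draw, and always emits end().
function withScope(sink: OperatorSink, scope: Scope, draw: () => void): void {
  sink.pushOperators(scope.begin());
  try {
    draw();
  } finally {
    sink.pushOperators(scope.end());
  }
}

// Stand-ins that record operators instead of writing a PDF.
const recorded: string[] = [];
const fakePage: OperatorSink = {
  pushOperators: (ops) => {
    recorded.push(ops);
  },
};
const fakeScope: Scope = {
  begin: () => '/H1 <</MCID 0>> BDC\n',
  end: () => 'EMC\n',
};

withScope(fakePage, fakeScope, () => fakePage.pushOperators('(Introduction) Tj\n'));
// recorded.join('') === '/H1 <</MCID 0>> BDC\n(Introduction) Tj\nEMC\n'
```

With a real page and a scope from createMarkedContentScope, the draw callback would contain the page.drawText call.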

Marking Artifacts

Artifacts are page elements that carry no document meaning (page numbers, decorative rules, headers, footers). Assistive technology must skip them.

import { beginArtifact, beginArtifactWithType, endArtifact } from 'modern-pdf-lib';

// Simple artifact
page.pushOperators(beginArtifact());
page.drawText('Page 1', { x: 250, y: 20, size: 10 });
page.pushOperators(endArtifact());

// Typed artifact with subtype
page.pushOperators(beginArtifactWithType('Pagination', 'Footer'));
page.drawText('Confidential', { x: 72, y: 20, size: 8 });
page.pushOperators(endArtifact());

| Function | Produces | Description |
| --- | --- | --- |
| beginMarkedContent(tag) | /<tag> BMC\n | Simple tag with no properties. |
| beginMarkedContentSequence(tag, mcid) | /<tag> <</MCID n>> BDC\n | Tag with MCID linking to structure tree. |
| beginMarkedContentWithProperties(tag, props) | /<tag> <<...>> BDC\n | Tag with arbitrary inline properties. |
| endMarkedContent() | EMC\n | Closes any open marked-content scope. |
| wrapInMarkedContent(ops, tag, mcid) | BDC + ops + EMC | Wraps existing operator string. |
| createMarkedContentScope(tag, mcid) | MarkedContentScope | Returns { begin(), end(), mcid, tag } object. |
| beginArtifact() | /Artifact BMC\n | Marks decorative content. |
| beginArtifactWithType(type, subtype?) | /Artifact <<...>> BDC\n | Typed artifact (Pagination, Layout, Background). |
| endArtifact() | EMC\n | Alias for endMarkedContent(). |
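
The `<<...>>` in the typed-artifact row is a property list carrying the artifact's /Type and, optionally, /Subtype. The sketch below shows one plausible way to build that string; artifactBegin is a hypothetical function, and the exact spacing and key order the library emits may differ.

```typescript
// Illustrative builder for the artifact opening operator; not the library's code.
function artifactBegin(type?: string, subtype?: string): string {
  // Untyped artifacts need no property list, so the simpler BMC operator suffices.
  if (type === undefined) return '/Artifact BMC\n';

  const entries = [`/Type /${type}`];
  if (subtype !== undefined) entries.push(`/Subtype /${subtype}`);
  return `/Artifact <<${entries.join(' ')}>> BDC\n`;
}

const plainArtifact = artifactBegin();
// '/Artifact BMC\n'
const footerArtifact = artifactBegin('Pagination', 'Footer');
// '/Artifact <</Type /Pagination /Subtype /Footer>> BDC\n'
```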

Complete Tagged-PDF Example

import { createPdf } from 'modern-pdf-lib';
import {
  beginMarkedContentSequence,
  endMarkedContent,
  beginArtifact,
  endArtifact,
} from 'modern-pdf-lib';

const doc = createPdf();
doc.setTitle('Accessible Report');
doc.setLanguage('en-US');

const page = doc.addPage([595, 842]);
const tree = doc.createStructureTree();

// --- Heading ---
const h1 = tree.addElement(null, 'H1');
const h1Mcid = tree.assignMcid(h1, 0);

page.pushOperators(beginMarkedContentSequence('H1', h1Mcid));
page.drawText('Quarterly Report', { x: 72, y: 750, size: 28 });
page.pushOperators(endMarkedContent());

// --- Body paragraph ---
const para = tree.addElement(null, 'P');
const paraMcid = tree.assignMcid(para, 0);

page.pushOperators(beginMarkedContentSequence('P', paraMcid));
page.drawText('This report summarises Q4 results.', { x: 72, y: 700, size: 12 });
page.pushOperators(endMarkedContent());

// --- Figure ---
const fig = tree.addElement(null, 'Figure', { altText: 'Revenue bar chart' });
const figMcid = tree.assignMcid(fig, 0);
page.pushOperators(beginMarkedContentSequence('Figure', figMcid));
// ... draw chart ...
page.pushOperators(endMarkedContent());

// --- Footer (artifact) ---
page.pushOperators(beginArtifact());
page.drawText('Page 1 of 1', { x: 250, y: 20, size: 9 });
page.pushOperators(endArtifact());

const pdfBytes = await doc.save();

Accessibility Checker

checkAccessibility

Runs a suite of PDF/UA (ISO 14289-1) checks against a PdfDocument.

import { checkAccessibility, summarizeIssues, isAccessible } from 'modern-pdf-lib';

const issues = checkAccessibility(doc);

// Summarise by severity
const { errors, warnings, infos, total } = summarizeIssues(issues);
console.log(`${errors} errors, ${warnings} warnings, ${infos} infos`);

// Quick boolean check (true = no errors)
if (!isAccessible(issues)) {
  for (const issue of issues.filter(i => i.severity === 'error')) {
    console.error(`[${issue.code}] ${issue.message}`);
  }
}

Checks Performed

| Check | Code | Severity | Description |
| --- | --- | --- | --- |
| Structure tree present | NO_STRUCT_TREE | error | /StructTreeRoot required by PDF/UA |
| Document language | NO_LANG | error | /Lang required for screen readers |
| Language format | INVALID_LANG | warning | Must be a valid BCP 47 tag |
| Document title | NO_TITLE | warning | Descriptive title improves usability |
| Tagged content | NO_TAGGED_CONTENT | warning | Structure tree exists but no MCIDs assigned |
| Untagged page | UNTAGGED_PAGE | info | Page has no tagged elements |
| Reading order | READING_ORDER | info | MCIDs not in ascending order on a page |
| Heading hierarchy | HEADING_SKIP | warning | Heading levels must not skip (e.g. H1 → H3) |
| First heading | HEADING_START | warning | First heading should be H1 |
| Table rows | TABLE_NO_ROWS | error | Table has no TR children |
| Table headers | TABLE_NO_HEADERS | warning | Table has no TH cells |
| TR cells | TR_NO_CELLS | error | TR has no TH or TD children |
| Cell placement | CELL_NOT_IN_TR | error | TH/TD must be direct children of TR |
| Figure alt text | FIGURE_NO_ALT | error | Figure / Formula / Form requires alt text |
| List items | LIST_NO_ITEMS | error | L (list) has no LI children |
| LI body | LI_NO_BODY | warning | LI should contain an LBody child |
| Empty document | NO_PAGES | info | Document has no pages |
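
As an illustration of what two of these rules look for, the sketch below re-creates HEADING_START and HEADING_SKIP over a plain list of structure types in document order. checkHeadings and HeadingIssue are hypothetical and are not the library's validator; the sketch assumes that only H1 to H6 participate, that moving back up to a higher-level heading is always allowed, and that the message wording is free to choose.

```typescript
interface HeadingIssue {
  code: 'HEADING_START' | 'HEADING_SKIP';
  message: string;
}

// Hypothetical re-creation of the two heading rules from the table above.
function checkHeadings(types: string[]): HeadingIssue[] {
  const issues: HeadingIssue[] = [];
  let previous = 0; // 0 means "no heading seen yet"

  for (const type of types) {
    const match = /^H([1-6])$/.exec(type);
    if (match === null) continue; // not a numbered heading

    const level = Number(match[1]);
    if (previous === 0 && level !== 1) {
      issues.push({ code: 'HEADING_START', message: `First heading is ${type}, expected H1` });
    } else if (previous !== 0 && level > previous + 1) {
      issues.push({ code: 'HEADING_SKIP', message: `H${previous} is followed by ${type}` });
    }
    previous = level;
  }
  return issues;
}

const skipped = checkHeadings(['H1', 'P', 'H3']);
// one issue: { code: 'HEADING_SKIP', message: 'H1 is followed by H3' }
const clean = checkHeadings(['H1', 'H2', 'H3', 'H2', 'H1']);
// no issues
```

With a real tree, the input list could come from mapping getAllElements() to each element's type.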

AccessibilityIssue

interface AccessibilityIssue {
  severity:  'error' | 'warning' | 'info';
  code:      string;                          // e.g. 'FIGURE_NO_ALT'
  message:   string;
  element?:  PdfStructureElement;             // Related structure element
  pageIndex?: number;                          // Zero-based page index
}
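
Because every issue carries a severity, a code, and an optional page index, turning a result list into a readable report takes only a few lines. The sketch below is one possible formatter; formatReport, Issue, and RANK are hypothetical names, the interface is a trimmed local copy so the snippet runs without the library, and the sample issues are made up for illustration rather than produced by checkAccessibility.

```typescript
type Severity = 'error' | 'warning' | 'info';

// Trimmed local copy of AccessibilityIssue (without the element reference).
interface Issue {
  severity: Severity;
  code: string;
  message: string;
  pageIndex?: number;
}

const RANK: Record<Severity, number> = { error: 0, warning: 1, info: 2 };

// Sort a copy by severity and render one line per issue, with 1-based page numbers.
function formatReport(issues: Issue[]): string[] {
  return issues
    .slice()
    .sort((a, b) => RANK[a.severity] - RANK[b.severity])
    .map((issue) => {
      const where = issue.pageIndex === undefined ? '' : ` (page ${issue.pageIndex + 1})`;
      return `${issue.severity.toUpperCase()} [${issue.code}] ${issue.message}${where}`;
    });
}

// Made-up sample data for illustration.
const sampleIssues: Issue[] = [
  { severity: 'info', code: 'UNTAGGED_PAGE', message: 'Page has no tagged elements', pageIndex: 1 },
  { severity: 'error', code: 'FIGURE_NO_ALT', message: 'Figure requires alt text', pageIndex: 0 },
  { severity: 'warning', code: 'NO_TITLE', message: 'Document has no title' },
];

const report = formatReport(sampleIssues);
// report[0] === 'ERROR [FIGURE_NO_ALT] Figure requires alt text (page 1)'
```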

PDF/UA Integration

PDF/UA-1 (ISO 14289-1) imposes requirements beyond the mere presence of a structure tree. The checks listed above correspond to PDF/UA rules, but passing them does not by itself establish full conformance. When targeting PDF/UA, make sure to:

  1. Set the document language with doc.setLanguage('en-US')
  2. Set a meaningful title with doc.setTitle('...')
  3. Give every Figure, Formula, and Form element an altText
  4. Maintain the heading hierarchy (start at H1, no skipped levels)
  5. Mark all content that is not part of the structure tree as an artifact
  6. Embed all fonts, including the standard 14 fonts
  7. Set /Marked true in the document's /MarkInfo dictionary (done automatically by createStructureTree)
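
For the first item, the language tag should be well-formed BCP 47 before it is passed to setLanguage. One way to check this without extra dependencies is the standard Intl.getCanonicalLocales function, which throws a RangeError for malformed tags. canonicalLanguageTag below is a hypothetical helper, and the check covers syntax only; it says nothing about how the library's INVALID_LANG rule is implemented or whether a subtag is actually registered.

```typescript
// Returns the canonical form of a BCP 47 tag, or undefined if it is malformed.
function canonicalLanguageTag(tag: string): string | undefined {
  try {
    return Intl.getCanonicalLocales(tag)[0];
  } catch {
    return undefined; // RangeError: not a structurally valid language tag
  }
}

const fixedCase = canonicalLanguageTag('en-us'); // 'en-US'
const malformed = canonicalLanguageTag('en_US'); // undefined (underscore is not allowed)
```

Passing the canonical form on, rather than the raw input, also normalises casing such as en-us to en-US.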

See also: PDF-A Compliance for archival requirements that often accompany PDF/UA.
