pdfa

Command-line tool that converts regular PDF documents into PDF/A files using OCRmyPDF with built-in OCR.

Features

Wraps OCRmyPDF to generate PDF/A-2 compliant files with OCR enforced.
Accepts input/output paths along with configurable OCR language and PDF/A level.
Ships with a test suite plus black and ruff configurations for streamlined development.

Requirements

Python 3.11+
OCRmyPDF runtime dependencies (Tesseract, Ghostscript, etc.) installed on your system. Refer to the OCRmyPDF installation guide.

Ubuntu 24.04

Install the system dependencies with APT before setting up the virtual environment:

sudo apt update
sudo apt install python3-venv python3-pip tesseract-ocr tesseract-ocr-eng tesseract-ocr-deu ghostscript qpdf

Add extra tesseract-ocr-<lang> packages if you need OCR support for additional languages.
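To confirm the system packages are actually reachable before creating the virtual environment, a quick check along these lines may help. This is a minimal sketch and not part of the project: the helper name missing_tools is hypothetical, and the executable names (tesseract, gs, qpdf) are the ones the APT packages above are assumed to install.

```python
import shutil

# Executable names assumed to be provided by the APT packages listed above
# (tesseract-ocr -> tesseract, ghostscript -> gs, qpdf -> qpdf).
REQUIRED_TOOLS = ("tesseract", "gs", "qpdf")


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the names in `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing system dependencies: " + ", ".join(missing))
    else:
        print("All required system tools were found on PATH.")
```

The check only looks for executables on PATH; it does not verify versions or installed Tesseract language data.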

Getting Started

python3 -m venv .venv
source .venv/bin/activate
pip install -e ".[dev]"
pdfa-cli --help

Tip: Activating the virtual environment adds .venv/bin to your PATH, so pdfa-cli is available directly.

Usage

pdfa-cli input.pdf output.pdf --language deu+eng --pdfa-level 3

This command converts input.pdf into a PDF/A file written to output.pdf, enforcing OCR with the specified Tesseract languages.
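For orientation, the options above could map onto an OCRmyPDF invocation roughly as follows. This is an illustrative sketch only, not the code in src/pdfa/cli.py: the function names, the default values, and the choice to build an ocrmypdf command line (rather than call OCRmyPDF's Python API) are all assumptions. The --language, --output-type, and --force-ocr flags are OCRmyPDF's own command-line options.

```python
import argparse


def parse_args(argv):
    """Parse arguments shaped like the pdfa-cli usage example (hypothetical)."""
    parser = argparse.ArgumentParser(prog="pdfa-cli")
    parser.add_argument("input_pdf")
    parser.add_argument("output_pdf")
    # Defaults below are assumptions; the real tool may choose differently.
    parser.add_argument("--language", default="eng")
    parser.add_argument("--pdfa-level", choices=("1", "2", "3"), default="2")
    return parser.parse_args(argv)


def build_ocrmypdf_command(args):
    """Translate parsed arguments into an ocrmypdf command line."""
    return [
        "ocrmypdf",
        "--language", args.language,                 # e.g. "deu+eng"
        "--output-type", f"pdfa-{args.pdfa_level}",  # e.g. "pdfa-3"
        "--force-ocr",                               # assumed meaning of "OCR enforced"
        args.input_pdf,
        args.output_pdf,
    ]


if __name__ == "__main__":
    example = ["input.pdf", "output.pdf", "--language", "deu+eng", "--pdfa-level", "3"]
    print(" ".join(build_ocrmypdf_command(parse_args(example))))
```

The resulting list can be passed to subprocess.run on a system where OCRmyPDF is installed; the sketch itself only builds the command and does not run it.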

Testing

pytest

Project Layout

.
├── pyproject.toml
├── README.md
├── src
│   └── pdfa
│       ├── __init__.py
│       └── cli.py
└── tests
    ├── __init__.py
    └── test_cli.py
