Construction drawing set manager — automates sheet identification (OCR), file renaming, discipline organization, revision management, and compiled PDF set generation. Built for managing addenda and bulletins in multi-discipline construction projects.
When you receive a new addendum or bulletin as a batch of PDFs:
- OCR title blocks to identify sheet numbers and titles
- Rename files from generic names to `SHEET# TITLE.pdf` format
- Organize into discipline folders (Mechanical, Plumbing, Civil, etc.)
- Build a new revision by copying the previous rev and replacing updated sheets
- Compile each discipline into a single merged PDF set
- Python 3.10+
- PyMuPDF — PDF rendering and merging
- Tesseract OCR + pytesseract — title block text recognition
- Pillow — image processing
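
Install the Python packages (the Tesseract OCR engine itself must be installed separately):

    pip install PyMuPDF pytesseract Pillow

To process a new addendum end-to-end in one command:

    python dwg_manager.py full-pipeline "20260401 Addendum C" "20260226 Rev 2 (Rev 1 + Addendum B)" "20260401 Rev 3 (Rev 2 + Addendum C)" --label "Rev 3"

This will: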
- OCR each PDF's title block and output `title_mapping.csv`
- Pause for you to review/fix OCR results in the CSV
- Rename, organize, build the new revision, and compile discipline sets
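
The review pause matters because the rename is driven entirely by the CSV. A minimal sketch of turning a reviewed mapping into rename targets is shown here. The column names `original`, `sheet_number`, and `title` are assumptions for illustration; the columns actually written by `read-titles` may differ, and `plan_renames` is a hypothetical helper, not part of `dwg_manager.py`.

```python
import csv
import io
import re


def plan_renames(csv_text):
    """Return (old_name, new_name) pairs from a reviewed mapping CSV.

    The column names used here are assumptions for illustration only.
    """
    plan = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        sheet = row["sheet_number"].strip()
        title = row["title"].strip()
        if not sheet:
            # Skip rows where OCR found nothing and the reviewer left it blank.
            continue
        # Replace characters that are not allowed in Windows filenames.
        safe_title = re.sub(r'[\\/:*?"<>|]+', "-", title)
        new_name = f"{sheet} {safe_title}".strip() + ".pdf"
        plan.append((row["original"], new_name))
    return plan


sample = (
    "original,sheet_number,title\n"
    "Scan_001.pdf,M100,MECHANICAL FLOOR PLAN\n"
    "Scan_002.pdf,P109.3,PLUMBING RISER / DETAILS\n"
    "Scan_003.pdf,,\n"
)
for old, new in plan_renames(sample):
    print(f"{old} -> {new}")
```

Building the plan before touching any file makes it easy to print a dry run and catch a bad OCR row before it becomes a bad filename.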

To run the same workflow step by step:

    # 1. OCR title blocks -> produces title_mapping.csv for review
    python dwg_manager.py read-titles "20260401 Addendum C"

    # 2. Open title_mapping.csv in Excel, fix any OCR errors, save

    # 3. Rename files from the corrected CSV
    python dwg_manager.py rename "20260401 Addendum C" "20260401 Addendum C/title_mapping.csv"

    # 4. Sort into discipline folders by sheet prefix
    python dwg_manager.py organize "20260401 Addendum C"

    # 5. Create new revision + compile discipline sets
    python dwg_manager.py new-revision "20260226 Rev 2 (Rev 1 + Addendum B)" "20260401 Addendum C" "20260401 Rev 3 (Rev 2 + Addendum C)" --compile --label "Rev 3"

To compile an existing revision folder on its own:

    python dwg_manager.py compile "20260226 Rev 2 (Rev 1 + Addendum B)" --label "Rev 2"

| Command | Description |
|---|---|
| `read-titles` | Crop title blocks from PDFs, OCR to identify sheet numbers and titles. Outputs a CSV for review. |
| `rename` | Rename PDFs based on a reviewed CSV mapping file. |
| `organize` | Sort PDFs into discipline subfolders by sheet prefix. |
| `compile` | Create compiled PDF sets per discipline folder. Handles overlays too. |
| `new-revision` | Copy a base revision folder, replace sheets with addendum updates. |
| `full-pipeline` | Run everything end-to-end with a pause for CSV review. |

Sheet prefixes map to discipline folders as follows:

| Prefix | Discipline | Prefix | Discipline |
|---|---|---|---|
| A | Architectural Set | F | Fire Sprinkler Set |
| C | Civil Set | FS | Food Service Set |
| E | Electrical Set | ID | Interior Design Set |
| L | Landscaping Set | S | Structural Set |
| LT | Lighting Set | T | Technology Set |
| M | Mechanical Set | | |
| P | Plumbing Set | | |
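
Because some prefixes share a first letter (`L` and `LT`, `F` and `FS`), a lookup has to try the longest prefix first. The sketch below is one way to do that using the table above. `discipline_for` is a hypothetical helper, and falling back to a shorter prefix when the full letter run is unknown is an assumption, not confirmed behavior of `organize`.

```python
import re

# Prefix -> folder name, copied from the prefix table.
DISCIPLINES = {
    "A": "Architectural Set",
    "C": "Civil Set",
    "E": "Electrical Set",
    "F": "Fire Sprinkler Set",
    "FS": "Food Service Set",
    "ID": "Interior Design Set",
    "L": "Landscaping Set",
    "LT": "Lighting Set",
    "M": "Mechanical Set",
    "P": "Plumbing Set",
    "S": "Structural Set",
    "T": "Technology Set",
}


def discipline_for(sheet_number):
    """Return the discipline folder for a sheet number, or None if unknown."""
    match = re.match(r"[A-Za-z]+", sheet_number.strip())
    if not match:
        return None
    letters = match.group(0).upper()
    # Try the longest prefix first so "LT" is not mistaken for "L".
    # Falling back to shorter prefixes is an assumption of this sketch.
    for length in range(len(letters), 0, -1):
        folder = DISCIPLINES.get(letters[:length])
        if folder:
            return folder
    return None


for sheet in ["M100", "P109.3", "LT-0-020.1", "L101", "FS201"]:
    print(sheet, "->", discipline_for(sheet))
```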

The `read-titles` command crops the bottom-right corner of each PDF page (default: 28% width x 35% height), where construction drawing title blocks are typically located. It renders the crop at 2x resolution and runs Tesseract OCR to extract the sheet number and title. Results are saved to a CSV for human review before any renaming happens.
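
The crop-and-OCR step can be sketched roughly as follows. This is not the tool's actual code: `title_block_box` and `ocr_title_block` are hypothetical names, the sketch reads only the first page, and using `fitz.Matrix(2, 2)` for the 2x render is an assumption about how the scaling is done.

```python
import io


def title_block_box(page_width, page_height, crop_width=0.28, crop_height=0.35):
    """Return (x0, y0, x1, y1) for the bottom-right corner of a page.

    PyMuPDF page coordinates start at the top-left, so the bottom-right
    region begins (1 - crop_width) across and (1 - crop_height) down.
    """
    x0 = page_width * (1.0 - crop_width)
    y0 = page_height * (1.0 - crop_height)
    return (x0, y0, page_width, page_height)


def ocr_title_block(pdf_path, crop_width=0.28, crop_height=0.35, zoom=2.0):
    """OCR the title-block region of the first page and return the raw text."""
    # Imported lazily so the geometry helper works without these packages.
    import fitz  # PyMuPDF
    import pytesseract
    from PIL import Image

    with fitz.open(pdf_path) as doc:
        page = doc[0]
        box = title_block_box(page.rect.width, page.rect.height,
                              crop_width, crop_height)
        # Render only the clipped region, scaled up by `zoom` for sharper text.
        pix = page.get_pixmap(matrix=fitz.Matrix(zoom, zoom),
                              clip=fitz.Rect(*box))
        png_bytes = pix.tobytes("png")
    image = Image.open(io.BytesIO(png_bytes))
    return pytesseract.image_to_string(image)


# A 36 x 24 inch sheet is 2592 x 1728 points (72 points per inch).
print(title_block_box(2592, 1728))
```

Keeping the geometry in its own function makes it straightforward to check a different `--crop-width` or `--crop-height` against a sheet size before running OCR on a whole batch.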
The `new-revision` command:
- Copies the entire base revision folder (preserving all unchanged sheets)
- Matches addendum sheets to base sheets by sheet number (e.g., `M100`, `P109.3`, `LT-0-020.1`)
- Replaces matched sheets with the updated addendum version
- Adds any new sheets that didn't exist in the base revision
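
The matching logic in the list above can be sketched as a dictionary merge keyed by sheet number. This is a simplified illustration that works on flat lists of filenames and ignores discipline folders. `sheet_number` and `plan_revision` are hypothetical helpers, and treating the text before the first space as the sheet number is an assumption based on the `SHEET# TITLE.pdf` naming format.

```python
def sheet_number(filename):
    """Take the text before the first space in 'SHEET# TITLE.pdf' as the key."""
    stem = filename[:-4] if filename.lower().endswith(".pdf") else filename
    return stem.split(" ", 1)[0].upper()


def plan_revision(base_files, addendum_files):
    """Return the file list for the new revision plus what changed."""
    # Start from every base sheet so unchanged sheets carry over.
    merged = {sheet_number(name): name for name in base_files}
    replaced, added = [], []
    for name in addendum_files:
        key = sheet_number(name)
        (replaced if key in merged else added).append(key)
        merged[key] = name  # the addendum version wins
    return sorted(merged.values()), replaced, added


base = ["M100 FLOOR PLAN.pdf", "M101 ROOF PLAN.pdf", "P109.3 RISER.pdf"]
addendum = ["M101 ROOF PLAN - REVISED.pdf", "LT-0-020.1 SITE LIGHTING.pdf"]
files, replaced, added = plan_revision(base, addendum)
print("replaced:", replaced)
print("added:", added)
```

Matching on the sheet number rather than the full filename means a sheet whose title changed in the addendum still replaces its earlier version instead of sitting beside it.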

The `organize` and `compile` commands automatically detect and handle an `Overlays` subfolder, creating per-discipline overlay sets and a complete overlay compilation.
- OCR works best on clean, standard title blocks. Always review the CSV before renaming.
- Adjust crop area with `--crop-width` and `--crop-height` flags on `read-titles`.
- The `new-revision` command never modifies the base revision folder; it copies first.
- Compiled sets are sorted alphabetically by filename within each discipline.
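
Alphabetical ordering has a practical consequence for sheet numbering, illustrated below. The sketch assumes a plain string sort (Python's `sorted`) and uses PyMuPDF's `insert_pdf` for merging. `compile_order` and `merge_pdfs` are hypothetical helpers, not the tool's actual functions.

```python
def compile_order(filenames):
    """Order sheets the way a plain alphabetical (string) sort would."""
    return sorted(filenames)


def merge_pdfs(paths, output_path):
    """Concatenate PDFs in the given order into one file."""
    import fitz  # PyMuPDF, imported lazily

    merged = fitz.open()  # new, empty PDF
    for path in paths:
        with fitz.open(path) as src:
            merged.insert_pdf(src)  # append all pages of src
    merged.save(output_path)
    merged.close()


sheets = ["M20 DETAILS.pdf", "M100 FLOOR PLAN.pdf", "M3 SCHEDULES.pdf"]
print(compile_order(sheets))
```

Under a string sort, `M100` comes before `M20`, which comes before `M3`, so sheet numbers with a consistent digit count (for example `M003`, `M020`, `M100`) are what keep a compiled set in drawing order.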
MIT