Document Layout Analysis

A compact document-layout analysis program that detects and normalizes visual structure in scanned or photographed documents. It targets noisy real-world inputs (blurred, skewed, low-resolution) and supports many languages.

Features

Detects page elements (text blocks, tables, figures, headings) using a YOLO model trained on the DocLayNet dataset.
Robust to blurred, skewed, and low-resolution inputs.
Multi-language support for downstream OCR and analysis.
Input preprocessing with OpenCV: Hough transforms and common rotation/deskew operations to normalize orientation.
Produces structured layout outputs (bounding boxes + labels) and enhanced images for OCR.
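
The Hough-based orientation normalization listed above can be sketched roughly as follows. This is a minimal illustration, not the code in hough_deskew.py: the function names, the Canny and Hough thresholds, and the choice of the median angle are all assumptions made for the example. OpenCV and NumPy are imported inside `deskew` so that the angle logic can be tried on its own.

```python
import math
from statistics import median


def estimate_skew_angle(segments, max_abs_deg=45.0):
    """Return the median angle (degrees) of near-horizontal line segments.

    `segments` is an iterable of (x1, y1, x2, y2) tuples, the shape that
    cv2.HoughLinesP produces once each row is unpacked. Hypothetical helper.
    """
    angles = []
    for x1, y1, x2, y2 in segments:
        if x1 == x2 and y1 == y2:
            continue  # degenerate segment, no direction
        angle = math.degrees(math.atan2(y2 - y1, x2 - x1))
        # Fold into (-90, 90] so the drawing direction of a segment is irrelevant.
        if angle > 90:
            angle -= 180
        elif angle <= -90:
            angle += 180
        # Ignore steep segments (table rules, page borders, column separators).
        if abs(angle) <= max_abs_deg:
            angles.append(angle)
    return median(angles) if angles else 0.0


def deskew(image):
    """Rotate a BGR image so that its dominant text lines become horizontal."""
    import cv2  # imported lazily: only needed for the image operations
    import numpy as np

    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(gray, 50, 150)  # assumed thresholds
    lines = cv2.HoughLinesP(
        edges, 1, np.pi / 180, threshold=100,
        minLineLength=gray.shape[1] // 4, maxLineGap=20,
    )
    segments = [] if lines is None else [tuple(line[0]) for line in lines]
    angle = estimate_skew_angle(segments)

    h, w = gray.shape
    # A positive angle rotates counter-clockwise, which undoes a clockwise tilt.
    matrix = cv2.getRotationMatrix2D((w / 2, h / 2), angle, 1.0)
    return cv2.warpAffine(
        image, matrix, (w, h),
        flags=cv2.INTER_CUBIC, borderMode=cv2.BORDER_REPLICATE,
    )
```

Using the median rather than the mean keeps a few stray segments (for example, a diagonal stamp or a figure edge) from pulling the estimate away from the text baseline.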

Implementation (brief)

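Detection model: YOLO trained on DocLayNet, which annotates 11 layout classes (including text, title, section header, table, picture, caption, and list item), so the model can identify most common document element types.
Preprocessing: edge/Hough-based line detection and rotation correction; basic denoising and resizing to improve detection/OCR quality.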
Typical outputs: annotated image, normalized crops per element, layout JSON for downstream processing.
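
As an illustration of the layout JSON mentioned above, the sketch below serializes a list of detections into a simple structure. The schema is hypothetical: the field names (`label`, `confidence`, `bbox`, `page`), the `LayoutElement` class, the confidence cutoff, and the top-to-bottom ordering are assumptions for the example and may not match what inference.py actually writes.

```python
import json
from dataclasses import dataclass


@dataclass
class LayoutElement:
    """One detected region. Hypothetical container, not the repository's type."""
    label: str         # e.g. "Table", "Title", "Picture"
    confidence: float  # detector score in [0, 1]
    bbox: tuple        # (x1, y1, x2, y2) in pixels


def _clamp(value, upper):
    """Round a coordinate to an integer and clamp it to [0, upper]."""
    return max(0, min(int(round(value)), upper))


def to_layout_json(elements, page_width, page_height, min_confidence=0.25):
    """Serialize detections to JSON, dropping weak or empty boxes."""
    kept = []
    for el in elements:
        if el.confidence < min_confidence:
            continue
        x1, y1, x2, y2 = el.bbox
        x1, x2 = sorted((_clamp(x1, page_width), _clamp(x2, page_width)))
        y1, y2 = sorted((_clamp(y1, page_height), _clamp(y2, page_height)))
        if x2 <= x1 or y2 <= y1:
            continue  # box collapsed after clamping to the page
        kept.append({
            "label": el.label,
            "confidence": round(el.confidence, 4),
            "bbox": [x1, y1, x2, y2],
        })
    # Rough reading order: top-to-bottom, then left-to-right.
    kept.sort(key=lambda e: (e["bbox"][1], e["bbox"][0]))
    return json.dumps(
        {"page": {"width": page_width, "height": page_height}, "elements": kept},
        indent=2,
    )
```

Sorting by the top-left corner is only a crude stand-in for reading order; multi-column pages would need a column-aware ordering before the output is passed to OCR or summarization.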

Next steps

Integrate a dedicated OCR model to extract text from identified regions.
Add an image- and text-level summarization pipeline to summarize content found across detected elements.
Implement a reconstruction step to produce a cleaner, higher-resolution version of hard-to-identify documents (super-resolution + layout-aware compositing).

Goal

Reconstruct hard-to-identify documents into cleaner, higher-resolution versions while producing a structured layout and text output suitable for search, archival, and downstream NLP/summarization.

Minimal dependencies

OpenCV (preprocessing)
PyTorch/TensorFlow + YOLO implementation and weights (DocLayNet-trained)
Optional: OCR engine (Tesseract / OCR model), super-resolution model
