A set of tools to allow PDF to XML conversion, utilising Apache Beam and other tools. The aim of this project is to bring multiple tools together to generate a full XML document.
Updated Dec 24, 2025 - Python
Airflow pipeline for ScienceBeam-related training and evaluation
A set of tools to allow PDF to XML conversion, utilising Apache Beam and other tools. The aim of this project is to bring multiple tools together to generate a full XML document. It is now mainly used to evaluate external tools.
Orchestrate ScienceBeam tasks for multiple datasets and tools (mostly for evaluation purposes)