Releases: Datashs/galica-postocr
Added methodological working paper and documentation.
galica-postocr v1.2
This release introduces a stabilized methodological working paper describing the design, epistemological rationale, and validation strategy of the galica-postocr pipeline for nineteenth-century French historical corpora from Gallica.
Added
- Working paper on explicit computational philology and post-OCR correction workflows
- Repository cleanup and improved project organization
Repository contents
- OCR post-correction scripts
- Development corpus
- Documentation
- Working paper sources and export formats
Citation
Please cite the Zenodo DOI associated with this release when referencing the software or accompanying paper.
Archival metadata update
This release finalizes archival and authorship metadata prior to long-term preservation and HAL dissemination.
Updates include:
- AUTHORS file integration;
- repository archival preparation for Software Heritage;
- metadata refinements and documentation updates.
Add JOSS draft PDF workflow
Adds JOSS submission infrastructure, codemeta metadata, and automated draft PDF generation.
Submission for JOSS
v1.0.4 Merge branch 'main' of https://github.com/Datashs/galica-postocr
Add codemeta metadata
v1.0.2
Metadata and Zenodo integration update
Minor release after Zenodo integration.
No substantial pipeline modifications.
Gallica Post-OCR Pipeline v1.0
First stable public release of the Gallica post-OCR normalization pipeline.
Includes:
- modular correction scripts
- orchestrator pipeline
- corpus audit tools
- bilingual documentation
- reproducible test corpus