Releases: Datashs/galica-postocr

Added methodological working paper and documentation.

13 May 07:07

galica-postocr v1.2

This release introduces a stabilized methodological working paper describing the design, epistemological rationale, and validation strategy of the galica-postocr pipeline for nineteenth-century French historical corpora from Gallica.

Added

  • Working paper on explicit computational philology and post-OCR correction workflows
  • Repository cleanup and improved project organization

Repository contents

  • OCR post-correction scripts
  • Development corpus
  • Documentation
  • Working paper sources and export formats

Citation

Please cite the Zenodo DOI associated with this release when referencing the software or accompanying paper.

Archival metadata update

11 May 15:46
a047732

This release finalizes archival and authorship metadata prior to long-term preservation and HAL dissemination.

Updates include:

  • AUTHORS file integration;
  • repository archival preparation for Software Heritage;
  • metadata refinements and documentation updates.

Add JOSS draft PDF workflow

10 May 21:00
ede3eb2

Adds JOSS submission infrastructure, codemeta metadata, and automated draft PDF generation.

Submission for JOSS

10 May 20:40

v1.0.4

Merge branch 'main' of https://github.com/Datashs/galica-postocr

Add codemeta metadata

10 May 19:28

Add codemeta metadata.

v1.0.2

10 May 18:32

Add Zenodo DOI references to French and English README files.

Metadata and Zenodo integration update

10 May 18:23

Minor release after Zenodo integration.

No substantial pipeline modifications.

Gallica Post-OCR Pipeline v1.0

10 May 17:29

First stable public release of the Gallica post-OCR normalization pipeline.

Includes:

  • modular correction scripts
  • orchestrator pipeline
  • corpus audit tools
  • bilingual documentation
  • reproducible test corpus