Proof-of-concept for extensible evaluation of (intermediate) results of an OCR workflow
make deps install
All evaluation functionality is provided by backends.
Every backend inherits from EvalBackend and must
implement a compare_files method, that accepts paths to and media types of
the Ground Truth and detection results, does the actual evaluation and returns
an EvalReport.
An EvalReport is a map of metrics to their resp. value and can be serialized
as JSON or CSV for further processing/analysis.
The glue code for running the backends is in
ocrmultieval.runner.py.
The ocrmultieval compare command line tool allows evaluating individual pages of GT
and detection with any of the available backends.
Usage: ocrmultieval compare [OPTIONS] {dinglehopper|ocrevalUAtion|PrimaTextEva
l|CorAsvAnnEval|CorAsvAnnCompare|OcrdSegmentEvalua
te|IsriOcreval} GT_FILE OCR_FILE
Options:
--gt-mediatype TEXT
--ocr-mediatype TEXT
--format [csv|json|yaml|xml]
-g, --pageId TEXT pageId to uniquely identify pages in a work
--help Show this message and exit.
The ocrd-ocrmultieval command line tool implments the OCR-D processor
API and can be used to process complete
workspaces.
Usage: ocrd-ocrmultieval [OPTIONS]
Evaluate
> Eval processor
Options:
-I, --input-file-grp USE File group(s) used as input
-O, --output-file-grp USE File group(s) used as output
-g, --page-id ID Physical page ID(s) to process
--overwrite Remove existing output pages/images
(with --page-id, remove only those)
-p, --parameter JSON-PATH Parameters, either verbatim JSON string
or JSON file path
-P, --param-override KEY VAL Override a single JSON object key-value pair,
taking precedence over --parameter
-m, --mets URL-PATH URL or file path of METS to process
-w, --working-dir PATH Working directory of local workspace
-l, --log-level [OFF|ERROR|WARN|INFO|DEBUG|TRACE]
Log level
-C, --show-resource RESNAME Dump the content of processor resource RESNAME
-L, --list-resources List names of processor resources
-J, --dump-json Dump tool description as JSON and exit
-h, --help This help message
-V, --version Show version
Parameters:
"backend" [string - "PrimaTextEval"]
Backend to use
Possible values: ["PrimaTextEval", "ocrevalUAtion", "dinglehopper",
"OcrdSegmentEvaluate", "IsriOcreval", "CorAsvAnnCompare"]
"format" [string - "csv"]
Output format
Possible values: ["csv", "json", "yaml", "xml"]
"config" [object]
Configuration to override default
Default Wiring:
['GT,OCR1'] -> ['GT_VS_OCR1']