ocrd-cis-align --dump-json does not produce valid JSON
Calling ocrd-cis-align --dump-json with the Docker image ocrd/all:2020-12-28 gives the following standard output (notice the last three lines):
{
  "executable": "ocrd-cis-align",
  "categories": [
    "Text recognition and optimization"
  ],
  "steps": [
    "recognition/post-correction"
  ],
  "input_file_grp": [
    "OCR-D-OCR-1",
    "OCR-D-OCR-2",
    "OCR-D-OCR-N"
  ],
  "output_file_grp": [
    "OCR-D-ALIGNED"
  ],
  "description": "Align multiple OCRs and/or GTs"
}
11:13:38.440 CRITICAL root - getLogger was called before initLogging. Source of the call:
11:13:38.441 CRITICAL root - File "/build/ocrd_cis/ocrd_cis/align/cli.py", line 35, in __init__
11:13:38.441 CRITICAL root - self.log = getLogger('cis.Processor.Aligner')
This crashes OCR-D when calling ocrd process "cis-align …" …:
Traceback (most recent call last):
  File "/usr/bin/ocrd", line 33, in <module>
    sys.exit(load_entry_point('ocrd', 'console_scripts', 'ocrd')())
  …
  File "/build/core/ocrd/ocrd/task_sequence.py", line 72, in validate
    param_validator = ParameterValidator(self.ocrd_tool_json)
  File "/build/core/ocrd/ocrd/task_sequence.py", line 53, in ocrd_tool_json
    self._ocrd_tool_json = json.loads(result.stdout)
  File "/usr/lib/python3.6/json/__init__.py", line 354, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/python3.6/json/decoder.py", line 342, in decode
    raise JSONDecodeError("Extra data", s, end)
json.decoder.JSONDecodeError: Extra data: line 19 column 1 (char 312)
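The failure mode in the traceback can be reproduced with the standard library alone. The following is a minimal sketch, not the fix applied in ocrd_cis or core: `polluted_stdout` is a shortened, hypothetical stand-in for the output shown above, and `raw_decode` is used only to illustrate that the JSON part itself is intact and that the trailing log lines are what break `json.loads`.

```python
import json

# Hypothetical, shortened stand-in for the stdout shown above:
# a JSON object followed by a log line on the same stream.
polluted_stdout = (
    '{\n'
    '  "executable": "ocrd-cis-align",\n'
    '  "description": "Align multiple OCRs and/or GTs"\n'
    '}\n'
    '11:13:38.440 CRITICAL root - getLogger was called before initLogging. '
    'Source of the call:\n'
)

# json.loads expects the whole string to be exactly one JSON document,
# so anything after the closing brace raises JSONDecodeError.
try:
    json.loads(polluted_stdout)
    error_message = None
except json.JSONDecodeError as err:
    error_message = err.msg

# JSONDecoder.raw_decode parses only the first JSON value and returns
# the index where it stopped, so the trailing text can be inspected.
tool_json, end = json.JSONDecoder().raw_decode(polluted_stdout)
trailing = polluted_stdout[end:].strip()

print(error_message)
print(tool_json["executable"])
print(trailing)
```

Parsing leniently like this would only mask the problem on the caller's side; the cleaner remedy is for the processor to keep log messages off standard output.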
@bertsky No problem. I will likely not be able to test this until there is a Docker image. For the moment, I am using ocrd-cis-align instead of ocrd process "cis-align …" ….
> I will likely not be able to test this until there is a Docker image.
You can pull from git repos even in the Docker images. In this case (where the Docker image already contains the PR branch #77, just not the current head):
docker run -it ocrd/all bash
cd /build
git -C ocrd_cis pull origin pull/77/head
That's it! (No need to re-install via pip, because modules are installed in editable mode now, and the recent changes did not affect anything other than source files.)
You can also make these changes permanent (to your local image) by using docker commit ...
> For the moment, I am using ocrd-cis-align instead of ocrd process "cis-align …" ….
Yes, that would work, but I also added a fix that makes ocrd-cis-align produce valid PAGE-XML again. (Without it, you won't be able to open output files down the pipeline with PageViewer.)