OCR-D/ocrd_olena

ocrd-olena-binarize silently ignores non-existent input file group


In any workspace, ocrd-olena-binarize silently ignores a non-existent input file group and exits without reporting an error:

% ocrd-olena-binarize -I DOES-NOT-EXIST -O OCR-D-IMG-BIN 
% 

Other processors report a much more useful error, e.g.:

Exception: Invalid input/output file grps:
        Input fileGrp[@USE='OCR-D-IMG-BIN'] not in METS!

#75 is somewhat related because the processor still creates the (empty) output directory.

I believe bashlib's ocrd__parse_argv needs to do something along the lines of ocrd.decorators.ocrd_cli_wrap_processor here (which calls WorkspaceValidator.check_file_grp on the workspace and filegrps).
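
To illustrate the kind of check meant here, the following is a minimal sketch in plain Python. It is not the actual code of bashlib or of WorkspaceValidator.check_file_grp: the function names missing_input_file_grps and check_input_file_grps and the SAMPLE_METS string are hypothetical, and the sketch assumes that file groups are mets:fileGrp elements identified by their USE attribute. The error text mirrors the message quoted above.

```python
import xml.etree.ElementTree as ET

# METS namespace, needed to find mets:fileGrp elements
METS_NS = {"mets": "http://www.loc.gov/METS/"}

# Hypothetical minimal METS with a single file group, for illustration only
SAMPLE_METS = """<mets:mets xmlns:mets="http://www.loc.gov/METS/">
  <mets:fileSec>
    <mets:fileGrp USE="OCR-D-IMG"/>
  </mets:fileSec>
</mets:mets>"""


def missing_input_file_grps(mets_xml, input_file_grps):
    """Return the requested input file groups that are absent from the METS."""
    root = ET.fromstring(mets_xml)
    present = {grp.get("USE") for grp in root.iterfind(".//mets:fileGrp", METS_NS)}
    return [grp for grp in input_file_grps if grp not in present]


def check_input_file_grps(mets_xml, input_file_grps):
    """Raise instead of silently continuing when an input file group is missing."""
    missing = missing_input_file_grps(mets_xml, input_file_grps)
    if missing:
        lines = ["Invalid input/output file grps:"]
        lines += [f"\tInput fileGrp[@USE='{grp}'] not in METS!" for grp in missing]
        raise ValueError("\n".join(lines))


if __name__ == "__main__":
    check_input_file_grps(SAMPLE_METS, ["OCR-D-IMG"])  # passes silently
    try:
        check_input_file_grps(SAMPLE_METS, ["DOES-NOT-EXIST"])
    except ValueError as err:
        print(err)
```

Failing before any processing starts would also avoid leaving an empty output directory behind, since nothing needs to be created until the input has been validated.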

Some reasoning on why this is important: we have a "workflow script" that runs binarization and then layout segmentation. Because the user specified the wrong input file group, no binarization was produced, and, confusingly, it was the layout segmentation step that failed.

With the fix now available in core (and hopefully also ocrd_all) I'd say we can close. Let me know @mikegerber if the problem persists.