bertsky/workflow-configuration

Download images first

Closed this issue · 3 comments

wrznr commented

When using workflow-configuration, the image files must be physically present, i.e. you cannot rely on ocrd's ability to download images ad hoc from the corresponding entries in a METS file group. Downloading is best done via

$ ocrd workspace find -G USE --download

Here, USE is the USE attribute of the file group you want to use as input. Maybe this should be added to the documentation?
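If several file groups are needed as input, the same command can be wrapped in a small loop. The following is only a sketch: the function name `download_groups` and the `OCRD` override variable are made up for illustration and are not part of ocrd or workflow-configuration, and the file group names in the usage line are examples. It assumes it is run from inside the workspace directory.

```shell
#!/bin/sh
# Sketch: download the files of one or more file groups before running a workflow.
# OCRD is a hypothetical override used only by this sketch (e.g. OCRD=echo for a
# dry run); by default the real ocrd command is called.
OCRD="${OCRD:-ocrd}"

download_groups() {
    for grp in "$@"; do
        # -G selects the file group by its USE attribute; --download fetches its files
        "$OCRD" workspace find -G "$grp" --download || return 1
    done
}
```

For example, `download_groups OCR-D-IMG` would fetch the images of the OCR-D-IMG file group, after which the workflow can be started as usual.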

Absolutely! This should be the first step for every workflow that does not start with ocrd-import. I'll make sure this is also prominent for custom/new configurations.

On the other hand, there is already gt.mk, which does exactly this for all known GT file groups (including OCR-D-IMG). So you could always do make -f gt.mk before anything else. But that's a different strategy, of course...

Anyway, fixed via 1dfe678.