bertsky/workflow-configuration

Produces mets:file/@ID values that are not valid xs:ID with certain configurations

Closed this issue · 5 comments

kba commented

In particular, + and / are not allowed in mets:file/@ID.
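For reference, an xs:ID has to match the XML NCName production: it starts with a letter or underscore and continues with letters, digits, `.`, `-` or `_`. A minimal sketch of such a check follows, assuming only ASCII identifiers matter here (the real production also admits many non-ASCII letters). The function name and the sample IDs are hypothetical and not taken from any OCR-D code.

```python
import re

# ASCII-only approximation of the NCName production that an xs:ID must match.
# Assumption: non-ASCII name characters, which XML also allows, are ignored here.
NCNAME_ASCII = re.compile(r"[A-Za-z_][A-Za-z0-9._-]*")


def is_valid_xs_id(value: str) -> bool:
    """Return True if value is a valid xs:ID within the ASCII subset."""
    return NCNAME_ASCII.fullmatch(value) is not None


# Hypothetical sample IDs, for illustration only.
for candidate in ("OCR-D-OCR_0001", "OCR-D-OCR-Fraktur+Latin_0001", "OCR-D-IMG/0001"):
    print(candidate, is_valid_xs_id(candidate))
```

Only the first sample passes; the other two are rejected because of `+` and `/`.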

What part of this repo do you mean in particular? The workflows makefilization itself, or ocrd-import?

kba commented

The makefilization, which can lead to fileGrps that contain e.g. Fraktur+Latin or file IDs concatenating fileGrp and ID with /. For example, https://github.com/OCR-D/assets/tree/master/data/kant_aufklaerung_1784-complex/data
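One way a tool could avoid such IDs is sketched below, assuming it builds the file ID by joining the fileGrp name and a page ID. The helper `make_file_id` and its inputs are hypothetical; this is not how any particular OCR-D processor or the makefilization actually derives IDs.

```python
import re


def make_file_id(file_grp: str, page_id: str) -> str:
    """Join fileGrp and page ID into a string that is safe as an ASCII xs:ID.

    Hypothetical helper for illustration only.
    """
    # Join with "_" rather than "/", which xs:ID does not allow.
    raw = f"{file_grp}_{page_id}"
    # Replace every character outside the ASCII NCName subset (e.g. "+", "/").
    safe = re.sub(r"[^A-Za-z0-9._-]", "_", raw)
    # An xs:ID must start with a letter or "_", so prefix one if needed.
    if not re.match(r"[A-Za-z_]", safe):
        safe = "_" + safe
    return safe


print(make_file_id("OCR-D-OCR-Fraktur+Latin", "0001"))
```

Note that replacing characters can map two different inputs to the same ID (`a+b` and `a_b`, for instance), so renaming the fileGrps themselves is the more robust fix.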

> The makefilization, which can lead to fileGrps that contain e.g. Fraktur+Latin

ah, I see, you mean the configuration examples. Yes, now that I know + is forbidden, I should rename these target fileGrps. (But I don't think the makefilization itself should do anything to check output fileGrps, just as ocrd process or the single-CLI decorator don't.)

> or file IDs concatenating fileGrp and ID with /.

oh, how is that? You surely mean there are certain processors which are behaving that way, not the makefilization, right?

kba commented

> oh, how is that? You surely mean there are certain processors which are behaving that way, not the makefilization, right?

tbh, I was just noting the issue with the "complex" sample while fixing it in OCR-D/assets. I'll check what exactly went awry with that particular workflow next week.

> ah, I see, you mean the configuration examples. Yes, now that I know + is forbidden, I should rename these target fileGrps.

see 7edcb90