NatLibFi/bib-rdf-pipeline

Merging breaks when ISSN contains space

osma opened this issue · 1 comment

osma commented

One record (000045005) contains a space character in its ISSN. When the ISSN is converted into a URI in reconcile.rq (after #86), the result is an invalid URI, which breaks the subsequent merge step:

sparql --data slices/fennica-00004-reconciled.nt --data refdata/fennica-work-transformations.nt --query sparql/merge.rq --out=NT >slices/fennica-00004-merged.nt
18:27:31 ERROR riot                 :: [line: 228784, col: 114] Bad character in IRI (space): <https://issn.org/resource/issn/0781-6[space]...>
Failed to load data
Makefile:87: recipe for target 'slices/fennica-00004-merged.nt' failed

We should check that the ISSN is syntactically valid before linking it to an issn.org URI.
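
A rough sketch of what such a check could look like, written in Python purely for illustration (the actual conversion happens in SPARQL in reconcile.rq, so this is not the pipeline's code). It assumes the usual ISSN shape of four digits, a hyphen, three digits and a final check character (a digit or X), and does not verify the check digit. The names `is_syntactically_valid_issn` and `issn_uri` are hypothetical, and the sample values in the usage note are made up rather than taken from the records above.

```python
import re

# Assumed ISSN shape: NNNN-NNNC, where C is a digit or an uppercase 'X'.
ISSN_RE = re.compile(r"[0-9]{4}-[0-9]{3}[0-9X]")


def is_syntactically_valid_issn(value):
    """Return True only if the whole string has the NNNN-NNNC shape."""
    # fullmatch rejects any extra character, including spaces
    # inside or around the ISSN.
    return ISSN_RE.fullmatch(value) is not None


def issn_uri(value):
    """Hypothetical helper: build an issn.org URI, or None if the ISSN is malformed."""
    if not is_syntactically_valid_issn(value):
        return None
    return "https://issn.org/resource/issn/" + value
```

With this, `issn_uri("0378-59 55")` yields `None` instead of a URI containing a space, so no invalid IRI would reach the merge step.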

osma commented

This is still a problem when the space comes after the ISSN, e.g. in record 000046711.
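
For the trailing-space case, one option would be to trim surrounding whitespace before validating, so that the ISSN can still be linked. The following is only a sketch in Python for illustration, not the SPARQL used in reconcile.rq. The function name `normalize_issn` is hypothetical, the ISSN pattern (four digits, hyphen, three digits, then a digit or X) is an assumption, and the sample values in the usage note are made up.

```python
import re

# Assumed ISSN shape: NNNN-NNNC, where C is a digit or an uppercase 'X'.
ISSN_RE = re.compile(r"[0-9]{4}-[0-9]{3}[0-9X]")


def normalize_issn(raw):
    """Strip surrounding whitespace; return the ISSN if well-formed, else None."""
    candidate = raw.strip()
    # A space inside the ISSN is not repaired here: it is ambiguous
    # what the intended value was, so the value is rejected.
    if ISSN_RE.fullmatch(candidate):
        return candidate
    return None
```

Here `normalize_issn("0378-5955 ")` returns `"0378-5955"`, while `normalize_issn("0378-59 55")` returns `None`.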