Configurable IdentifierIssuer for to_rdf
aucampia opened this issue · 0 comments
aucampia commented
I'm processing many JSON files as JSON-LD, currently going via N-Quads into RDFLib. The problem I'm facing is that blank node identifiers get reused for every document, so when I ingest the documents into one RDFLib graph, distinct blank nodes appear to be the same node.
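To illustrate the collision, here is a minimal sketch that uses hand-written N-Quads strings rather than real pyld output; the `_:b0` labels and the example.org IRIs are assumptions for illustration only. Each document is labelled independently, so both end up using `_:b0`, and once the lines are pooled the two distinct nodes can no longer be told apart by label:

```python
import re

# Hand-written stand-ins for the N-Quads of two separate documents (assumed data).
doc_a = '_:b0 <http://example.org/name> "Alice" .\n'
doc_b = '_:b0 <http://example.org/name> "Bob" .\n'

# Simplified pattern for blank node labels; not the full N-Quads grammar.
BNODE = re.compile(r"_:[A-Za-z0-9]+")

labels_a = set(BNODE.findall(doc_a))
labels_b = set(BNODE.findall(doc_b))

# The same label appears in both documents, although the nodes are unrelated.
shared = labels_a & labels_b
print(shared)  # {'_:b0'}
```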
As a workaround I am doing this:
import typing as typ
import uuid

from pyld import jsonld


class IdentifierIssuer:
    # Class-level state, shared by all issuer instances
    existing: typ.ClassVar[typ.Dict[str, str]] = {}
    order: typ.ClassVar[typ.List[str]] = []
    prefix = '_:bu'

    def __init__(self, prefix: str):
        # The prefix argument is accepted but ignored; the class-level prefix is used
        pass

    def get_id(self, old: typ.Optional[str] = None) -> str:
        cls = self.__class__
        # return existing old identifier
        if old and old in cls.existing:
            return cls.existing[old]
        # get next identifier
        id_ = cls.prefix + uuid.uuid4().hex
        # save mapping
        if old is not None:
            cls.existing[old] = id_
            cls.order.append(old)
        return id_

    def has_id(self, old: str) -> bool:
        cls = self.__class__
        return old in cls.existing


# Monkey-patch pyld so it uses the UUID-based issuer
jsonld.IdentifierIssuer = IdentifierIssuer
It would be nice if I could just pass an issuer in as an argument to to_rdf instead.
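Another option that avoids patching `jsonld.IdentifierIssuer` is to relabel the blank nodes in each document's N-Quads output before loading it into the shared graph. The following is only a sketch under stated assumptions: `relabel_bnodes` is a hypothetical helper, not part of pyld or RDFLib, the input strings are hand-written, and the regex is deliberately naive (it would also rewrite text inside a literal that happens to look like ` _:b0`).

```python
import re
import uuid

# Blank node labels such as _:b0, at the start of the text or after whitespace.
# Simplified pattern; not the full N-Quads grammar.
BNODE = re.compile(r"(?<!\S)_:([A-Za-z0-9]+)")


def relabel_bnodes(nquads: str) -> str:
    """Prefix every blank node label in one document with a fresh UUID."""
    doc_id = uuid.uuid4().hex  # one identifier per call, i.e. per document
    return BNODE.sub(lambda m: f"_:d{doc_id}{m.group(1)}", nquads)


# Hand-written stand-ins for the N-Quads of two documents (assumed data).
doc_a = '_:b0 <http://example.org/knows> _:b1 .\n'
doc_b = '_:b0 <http://example.org/name> "Bob" .\n'

out_a = relabel_bnodes(doc_a)
out_b = relabel_bnodes(doc_b)

# Labels stay distinct within a document and no longer overlap across documents.
labels_a = set(BNODE.findall(out_a))
labels_b = set(BNODE.findall(out_b))
print(labels_a.isdisjoint(labels_b))
```

Because the UUID is chosen once per call, references to the same blank node within a document still agree with each other after relabelling.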