Setting/Accessing nested properties on entities without @id
RO-Crate metadata files are meant to be written as flattened and compacted JSON-LD. I have noticed some inconsistencies in how this library enforces that behavior.
Currently, it is possible to assign a nested property to an entity even if the value is a dictionary that does not contain an @id key. Here is an example:
from rocrate.rocrate import ROCrate
crate = ROCrate()
# setting a nested property throws no error
crate.root_dataset["license"] = {
    "@type": "CreativeWork",
    "name": "CC-O",
    "url": "https://spdx.org/licenses/CC0-1.0"
}
crate.write("mycrate")
While the crate is written to disk without errors, accessing such a nested property on the ROCrate object afterwards raises an exception:
crate.root_dataset["license"]
raises:
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
File ~/projects/playground/venv/lib/python3.12/site-packages/rocrate/model/entity.py:91, in Entity.__getitem__(self, key)
90 try:
---> 91 id_ = entry["@id"]
92 except KeyError:
KeyError: '@id'
During handling of the above exception, another exception occurred:
ValueError Traceback (most recent call last)
Cell In[32], line 1
----> 1 crate.root_dataset["license"]
File ~/projects/playground/venv/lib/python3.12/site-packages/rocrate/model/entity.py:93, in Entity.__getitem__(self, key)
91 id_ = entry["@id"]
92 except KeyError:
---> 93 raise ValueError(f"no @id in {entry}")
94 else:
95 deref_values.append(self.crate.get(id_, id_))
ValueError: no @id in {'@type': 'CreativeWork', 'name': 'CC-O', 'url': 'https://spdx.org/licenses/CC0-1.0'}
To enforce consistency and adhere to the intended behaviour, I suggest implementing a check when assigning a property that makes sure that the @id key is present when the property value is a dictionary. It could even be considered to check if the id corresponds to an entity present in the crate.
When accessing values from an RO-Crate, it might make sense to be more forgiving and return the property if it has no id or the id does not correspond to an entity.
These changes would prevent the library from creating RO-Crates that cannot be accessed with the very same library afterwards.
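A minimal sketch of the suggested assignment check, using a hypothetical SketchEntity class as a stand-in. It is not the library's Entity class, and the actual implementation may look different:

```python
class SketchEntity:
    """Hypothetical stand-in for an entity; not the library's actual class."""

    def __init__(self):
        self._jsonld = {}

    def __setitem__(self, key, value):
        # Treat a single value and a list of values the same way.
        values = value if isinstance(value, list) else [value]
        for v in values:
            # A dictionary value is only acceptable as a reference, i.e. with an @id.
            if isinstance(v, dict) and "@id" not in v:
                raise ValueError(f"no @id in {v}")
        self._jsonld[key] = value

    def __getitem__(self, key):
        return self._jsonld[key]


e = SketchEntity()
e["license"] = {"@id": "https://spdx.org/licenses/CC0-1.0"}  # accepted
try:
    e["license"] = {"@type": "CreativeWork", "name": "CC-O"}  # rejected
except ValueError as err:
    print(err)
```

With a check like this, the nested dictionary from the example above would be rejected at assignment time instead of failing later on access.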
To enforce consistency and adhere to the intended behaviour, I suggest implementing a check when assigning a property that makes sure that the @id key is present when the property value is a dictionary
Done in #193.
It could even be considered to check if the id corresponds to an entity present in the crate
This cannot be done. "a": {"@id": "b"} properties are allowed even if there's no "b" entity in the crate: for instance, "conformsTo": {"@id": "https://w3id.org/ro/crate/1.1"} in the "ro-crate-metadata.json" entity. Moreover, after #192, __setitem__ is called in Entity.__init__, and the "b" entity could be added to the crate after the "a" one has been initialized.
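To illustrate why a reference to a missing entity is harmless on access: the traceback above shows the getter resolving ids with self.crate.get(id_, id_), which falls back to the bare id. A rough sketch of that fallback, with a plain dict standing in for the crate (the deref helper and the entities dict are hypothetical, not part of the library):

```python
def deref(entities, value):
    """Resolve {"@id": ...} references against `entities`, a plain dict
    standing in for the crate (a simplification, not the library's API)."""
    values = value if isinstance(value, list) else [value]
    resolved = []
    for v in values:
        if isinstance(v, dict):
            id_ = v["@id"]
            # Fall back to the bare id when no entity with that id exists.
            resolved.append(entities.get(id_, id_))
        else:
            resolved.append(v)
    return resolved if isinstance(value, list) else resolved[0]


entities = {"b": {"@id": "b", "name": "B"}}
print(deref(entities, {"@id": "b"}))
print(deref(entities, {"@id": "https://w3id.org/ro/crate/1.1"}))
```

The first call returns the stored entity, while the second returns the id string itself, since no such entity is in the dict.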
When accessing values from an RO-Crate, it might make sense to be more forgiving and return the property if it has no id or the id does not correspond to an entity
In theory we could generate a random id and add the entity to the crate, but this would need to work recursively, we'd have to detect cycles, etc. The code would become much more complex, essentially to support something that's not meant to be supported. An input RO-Crate must be flattened, and if it's not, the behavior is undefined.
These changes would prevent the library from creating RO-Crates that cannot be accessed with the very same library afterwards
Well, even after #193, one could do tricks like e._jsonld["p"] = {"k": "v"} (in fact, I used this trick in the unit test in #193). But if you modify the _jsonld attribute, you should know what you're doing.