IMPROVEMENT: add detached-header datalink semantic to the oai adapter
Add the #detached-header semantic to the oai adapter so that it is picked up in the oai metadata. The implementation is ultimately the same as for #documentation, but it allows datalink to be used with a wider range of semantics without being confined by the capabilities of the oai adapter.
https://www.ivoa.net/rdf/datalink/core/2022-01-27/datalink.html#detached-header
daiquiri/daiquiri/oai/adapter.py
Lines 243 to 276 in f6f352a
An example of the added code:
elif semantics == '#detached-header':
    datalink['alternate_identifiers'].append({
        'alternate_identifier': access_url,
        'alternate_identifier_type': 'URL'
    })
The alternative solution would be to put the #detached-header into the relatedIdentifiers instead:
elif semantics == '#detached-header':
    datalink['formats'].append(content_type)
    datalink['related_identifiers'].append({
        'related_identifier': access_url,
        'related_identifier_type': 'URL',
        'relation_type': 'IsSupplementedBy'
    })
@kimakan That is a good question, I am not quite sure what the best option should be. You have a better understanding of datacite than me, what do you think would be more relevant?
After looking into the issue in more detail, I think that an alternateIdentifier is more appropriate, since it essentially points to the same resource. AFAIK, the relatedIdentifier should point to a different, related resource.
However, I would like to put the content_type into the formats to keep track of the alternative formats (currently, only the format of #this is tracked).
elif semantics == '#detached-header':
    datalink['formats'].append(content_type)
    # appended to alternate_identifiers, matching the alternateIdentifier argument above
    datalink['alternate_identifiers'].append({
        'alternate_identifier': access_url,
        'alternate_identifier_type': 'URL'
    })
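For illustration, here is a minimal, self-contained sketch of this variant. It assumes the adapter collects its output in a dict with `formats` and `alternate_identifiers` lists, as in the snippet above. The helper name `add_detached_header`, the sample URL, and the choice to skip empty content types are hypothetical and not taken from daiquiri.

```python
def add_detached_header(datalink, access_url, content_type):
    """Record a #detached-header link as an alternate identifier.

    Hypothetical helper for illustration only; not part of daiquiri.
    """
    # Track the format of the header file next to the format of #this.
    # Skipping empty content types is an assumption made for this sketch.
    if content_type:
        datalink['formats'].append(content_type)

    # The header describes the same resource, so it is stored as an
    # alternate identifier rather than as a related identifier.
    datalink['alternate_identifiers'].append({
        'alternate_identifier': access_url,
        'alternate_identifier_type': 'URL'
    })
    return datalink


# Sample values are made up for the example.
datalink = {'formats': ['application/fits'], 'alternate_identifiers': []}
add_detached_header(datalink, 'https://example.org/data/table1.hdr', 'text/plain')
```

After the call, `datalink['formats']` holds both content types and `datalink['alternate_identifiers']` holds a single URL entry.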
Sounds sensible, please make a PR. I like the idea of keeping track of the format, and I agree with the arguments on alternate vs. related.
The alternate identifier is supposed to be an ID.
Suggestion: declare the datalink ID there, like
<alternateIdentifier alternateIdentifierType="datalink">datalinkID</alternateIdentifier>
Related identifiers link to related resources, like:
#preview (viewer): Describes
#preview-image (related image): IsSupplementedBy
#documentation (URL to docs): IsDocumentedBy
#auxiliary (URL to related resources): leave out of OAI
#detached-header (URL to header file): IsSupplementedBy
#this (URL of the resource): IsDescribedBy
#progenitor (URL (datalink) of resources used): IsDerivedFrom
potential extra semantics:
#auxiliary-table (table with further data): References
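The mapping suggested in the list above could be written as a simple lookup table. The following is only a sketch: the names `RELATION_TYPES` and `related_identifier` are hypothetical, the relation types use the DataCite relationType spellings, and #auxiliary is deliberately absent because the suggestion is to leave it out of OAI.

```python
# Datalink semantics mapped to DataCite relationType values, following the
# suggestion above. '#auxiliary' is intentionally missing (left out of OAI).
RELATION_TYPES = {
    '#preview': 'Describes',
    '#preview-image': 'IsSupplementedBy',
    '#documentation': 'IsDocumentedBy',
    '#detached-header': 'IsSupplementedBy',
    '#this': 'IsDescribedBy',
    '#progenitor': 'IsDerivedFrom',
}


def related_identifier(semantics, access_url):
    """Build a related-identifier entry, or return None for unmapped semantics.

    Hypothetical helper for illustration only; not part of daiquiri.
    """
    relation_type = RELATION_TYPES.get(semantics)
    if relation_type is None:
        return None
    return {
        'related_identifier': access_url,
        'related_identifier_type': 'URL',
        'relation_type': relation_type
    }


# Sample URLs are made up for the example.
entry = related_identifier('#progenitor', 'https://example.org/datalink/raw1')
skipped = related_identifier('#auxiliary', 'https://example.org/extra')
```

Here `entry` carries the relation type `IsDerivedFrom`, while `skipped` is `None` because #auxiliary has no mapping.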
Additional note:
Currently, the title of the oai record generated from the datalink tables is rendered from the description of the datalink entry with #doi. This is sensible, but the description must then describe the object and not the DOI itself.
Incorrect description: Digital object identifier (DOI) for the Table 1 from the Data Release 1
Correct description: Table 1 from the Data Release 1
I found a bug in the creation routine of the tap_schema.datalink. The content_length values adopted from the custom datalink tables, e.g., datalink_doi, are set to 0 if the value is None, which is incorrect. The value is allowed to be None. In some cases it must be None, namely if the content_length attribute doesn't make any sense.
daiquiri/daiquiri/datalink/adapter.py
Lines 84 to 94 in 8501f84
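The referenced daiquiri lines are not reproduced here. As a generic sketch of the distinction, a hypothetical helper (`normalize_content_length` is not a daiquiri function) can keep None as "unknown or not applicable" instead of collapsing it to 0, which is what a fallback such as `value or 0` would do:

```python
def normalize_content_length(value):
    """Return content_length as an int, keeping None distinct from 0.

    Hypothetical helper for illustration only; not part of daiquiri.
    """
    if value is None:
        # Unknown or not applicable: do not turn this into a length of 0.
        return None
    return int(value)


unknown = normalize_content_length(None)
empty = normalize_content_length(0)
sized = normalize_content_length('2048')
```

With this, a missing length stays None, a genuinely empty resource stays 0, and a stored value such as '2048' becomes the integer 2048.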
By contrast, the content_length of the datalinks created automatically for all schemas and tables is correctly set to None.
daiquiri/daiquiri/datalink/adapter.py
Lines 133 to 143 in 8501f84