FAIRMetrics/Metrics

Content type for schema.org metadata

mfenner opened this issue · 11 comments

DataCite content negotiation is using application/vnd.schemaorg.ld+json to ask for the DOI metadata expressed as schema.org JSON-LD. For example:

curl -LH "Accept: application/vnd.schemaorg.ld+json" https://doi.org/10.17864/1947.175

Crossref does not (yet) support content negotiation for schema.org.

The reason for not supporting application/ld+json is that other content besides schema.org might use that content type, and we have seen this in the DataCite content negotiation.

Interesting.... tricky problem... Do you know what CrossRef does in these cases? I want to think if there is a generic solution that will not require hard-coding to this case. If not, I'll just make an exception in the code.

So, to "get through", I would need to call once with the vnd.schemaorg accept headers, and then call the same URI again with a JSON-LD header, is that correct?

Use application/ld+json; profile="http://schema.org/"

(or whitespace-separated list of profile URIs)

https://tools.ietf.org/html/rfc6906
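In practice the suggestion amounts to putting the vocabulary URI in a `profile` parameter on the generic media type rather than minting a vendor-specific one. A minimal sketch of building and reading such a header (the helper names here are illustrative, not part of any DataCite or Crossref API):

```python
# Sketch: building and inspecting an Accept header that uses the RFC 6906
# "profile" parameter to ask for schema.org-flavoured JSON-LD.
# Helper names are hypothetical, for illustration only.

def accept_header_for(media_type, profiles):
    """Return an Accept header value with a whitespace-separated profile list."""
    return '%s; profile="%s"' % (media_type, " ".join(profiles))

def profiles_from_accept(value):
    """Extract the profile URIs from an Accept header value, if present."""
    for part in value.split(";"):
        part = part.strip()
        if part.startswith("profile="):
            return part[len("profile="):].strip('"').split()
    return []

header = accept_header_for("application/ld+json", ["http://schema.org/"])
# header == 'application/ld+json; profile="http://schema.org/"'
```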

Not sure how that relates to the current problem... we're not defining a specific "kind" of json-ld, we're really dealing with a routing problem... unless I misunderstand your suggestion.

@markwilkinson , not sure what you mean with "get through".

If you use the "application/vnd.schemaorg.ld+json" accept header with a Crossref DOI, you will receive a 406 error, as they are not yet supporting metadata in schema.org format, and they handle unsupported formats differently (DataCite forwards the request to the URL registered for the DOI in the handle system).
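The client-side consequence is a simple fallback: try the vendor-specific type first, and retry with a generic type on a 406. A minimal sketch, assuming an injected `fetch` callable (e.g. wrapping `requests.get`) so the logic can be exercised without a network; this is not the actual FAIRMetrics implementation:

```python
# Sketch of a client-side fallback: ask for DataCite's vendor-specific
# schema.org type first; on a 406 (Crossref's behaviour for unsupported
# formats) retry with plain JSON-LD. `fetch(url, headers)` is an injected
# callable returning (status, body), so this is testable offline.

SCHEMAORG = "application/vnd.schemaorg.ld+json"
GENERIC_LD = "application/ld+json"

def negotiate(url, fetch):
    """Return (status, body), falling back to generic JSON-LD on a 406."""
    status, body = fetch(url, {"Accept": SCHEMAORG})
    if status == 406:  # registrar does not support the vendor-specific type
        status, body = fetch(url, {"Accept": GENERIC_LD})
    return status, body
```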

OK, so in summary: You pass application/ld+json through to the end-provider, while CrossRef catches it and provides their own metadata. You provide DataCite metadata in response to that vendor-specific mime type.

I'll modify the code to deal with that. No problem!

Thanks for the heads-up! Sorry again for the delay!

M

Almost. The schema.org is generated by DataCite from DataCite metadata. Both Crossref and DataCite have a list of specific content types they support, with some overlap, e.g. BibTex.

Ummmm... I'm not able to map your correction onto my statement. Which part of what I said was incorrect?

Alright, I'll step away as I'm not sure of all of the background/context. What I wanted to mention was to use the profile parameter if you want to negotiate based on vocab(s). Perhaps that's not what's desired here.

Hey Sarven! I was very interested to learn about the profile parameter - and in fact, it probably IS useful in this scenario, to distinguish between the different core data elements provided by the different repositories! However, it wasn't useful for the current problem, which is that my call on a URL gets "trapped" by an intermediary proxy (i.e. during the redirects of a DOI resolution), which responds with its own metadata before the request ever reaches the final destination. That prevents me from requesting metadata directly from the end data provider (at least, for that same MIME type).

Martin: I've just done a bit of "playing" with your content negotiation. now I understand :-) It was not at all what I thought you were saying! (unfortunately... because I thought what you were saying was a really good solution to a problem! LOL!)

OK, so I need to code a more generic solution to this problem of both DataCite and Crossref "hijacking" the HTTP response before it ever gets to the final data provider. This actually won't change my code very much (more by good luck than good planning!) and it helps ensure that we are being as FAIR as possible to all participants in the DOI resolution process.

Thanks for bringing this to my attention!

The issue is now solved, as far as I am aware. Content negotiation on a doi.org/XXX URL happens first with a "linked-data" set of headers, then again with a star/star set of headers until it reaches the final redirect (i.e. the URL of the provider themselves). It then returns to a linked-data set of headers for content negotiation on THAT URL, and carries on through JSON, XML, and finally star/star content negotiation at the final endpoint.
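The cascade described above can be sketched as an ordered plan of (URL, Accept header) attempts. The exact header contents below are illustrative placeholders, not the real FAIRMetrics values:

```python
# Sketch of the cascading content negotiation described above: try a
# linked-data Accept header against the DOI, fall back to */*, then repeat
# the cycle (adding JSON and XML) against the final provider URL.
# Header sets are assumptions for illustration.

LINKED_DATA = "application/ld+json, text/turtle, application/rdf+xml"
ANYTHING = "*/*"

DOI_PHASE = [LINKED_DATA, ANYTHING]                        # against doi.org
PROVIDER_PHASE = [LINKED_DATA, "application/json",
                  "application/xml", ANYTHING]             # against final URL

def negotiation_plan(doi_url, provider_url):
    """Yield (url, accept) pairs in the order they would be attempted."""
    for accept in DOI_PHASE:
        yield doi_url, accept
    for accept in PROVIDER_PHASE:
        yield provider_url, accept
```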

Closing this issue now. Cheers all!