Implement id changes for GEOME
In looking over the examples for GEOME, I saw the following two differences where we need to pick the right thing to use:
For "sampleidentifier":
"sampleidentifier": "ark:/21547/Car2PIRE_0334",
"sampleidentifier": "http://n2t.net/ark:/21547/R2INDO119289"
"sampleidentifier": "LACM:DISCO:16924"
Do we want the n2t prefix, or the third variant which appears to have no scheme prefix?
For "@id", it looks like we have the following variants:
"@id": "metadata/21547/Car2PIRE_0334"
"@id": "metadata:21547-eg2AB4OQ34"
"@id": "metadata:21547/CgZ2PEER_7055"
"@id": "ark:/21547/Cgx2MGH18_1_E4"
Which one should we use?
Adding @datadavev to the ticket since he expressed interest in the morning standup.
One other thing it would be good to standardize on: for the fields where we synthesize several GEOME fields into one iSamples field, we should be consistent about how we separate them. Right now, it looks like we use either ; or |.
e.g.
'description': 'samplingProtocol: ARMS | expeditionCode: '
'INDO_PIRE | taxonomy team: MINV | taxonomy '
'team: 80'
vs.
'description': 'sampling protocol:Dead Coral Head; '
'projectId:78 ; expeditionCode: PEER_2016',
Personally, I feel like | is easier to read.
Sure, let's go with | (pipe char with white space either side).
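A minimal sketch of how the pipe separator could be applied in one place so every synthesized field comes out the same way. The helper name join_description and the choice to skip empty values are assumptions for illustration, not part of the existing transformer code.

```python
# Separator agreed on in this thread: pipe with white space on either side.
SEPARATOR = " | "


def join_description(fields):
    """Join (label, value) pairs into one string, skipping empty values.

    Hypothetical helper; skipping empty values is an assumption.
    """
    parts = []
    for label, value in fields:
        text = "" if value is None else str(value).strip()
        if not text:
            continue
        parts.append(f"{label}: {text}")
    return SEPARATOR.join(parts)


description = join_description([
    ("samplingProtocol", "ARMS"),
    ("expeditionCode", "INDO_PIRE"),
    ("projectId", 78),
    ("locality", ""),  # empty, so it is left out
])
print(description)
```

Keeping the separator in a single constant means a later change of mind only touches one line.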
We discussed the philosophy about the @id property on the metadata record in one of the past tech meetings, and I think we agreed that the identifier for the metadata record about a sample should be different than the identifier for the physical sample itself. This follows the pattern that the TDWG MIDS group is following. It would suggest something like one of
"@id": "metadata/21547/Car2PIRE_0334"
"@id": "metadata:21547-eg2AB4OQ34"
"@id": "metadata:21547/CgZ2PEER_7055"
I don't think we're planning on registering ARKs for the metadata records; the idea would be that if you dereference the identifier for the sample, what you get is the metadata record 'about' the sample. Based on some other comments I've seen, using 'metadata:' as a prefix is not a great idea: it suggests that 'metadata' is a URI scheme (following RFC 3986 syntax), so something like metadata/21547/Car2PIRE_0334 seems like the best option.
As far as the sample identifier, my take would be that 'ark:/21547/Car2PIRE_0334' is the best option. 'http://n2t.net/ark:/21547/R2INDO119289' is a concatenation of 'http://n2t.net/' (a URL path to a resolver service) and the 'ark:...' part which is the actual identifier. That is the purist view, and assumes people will know how to resolve 'ark:' URIs....
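If the bare 'ark:' form is adopted, the normalization could look roughly like the sketch below. Both function names are hypothetical, and leaving non-ARK values such as LACM:DISCO:16924 untouched is an assumption, since that variant has not been settled here.

```python
ARK_MARKER = "ark:/"


def normalize_sample_identifier(raw):
    """Return the bare 'ark:/...' form when raw contains an ARK.

    Anything before the 'ark:/' marker (for example a resolver URL such
    as http://n2t.net/) is dropped. Values without an ARK are returned
    unchanged, which is an assumption.
    """
    value = raw.strip()
    index = value.find(ARK_MARKER)
    if index == -1:
        return value
    return value[index:]


def metadata_id_for(sample_identifier):
    """Derive a relative metadata @id such as 'metadata/21547/Car2PIRE_0334'."""
    if not sample_identifier.startswith(ARK_MARKER):
        raise ValueError(f"not a bare ARK: {sample_identifier!r}")
    return "metadata/" + sample_identifier[len(ARK_MARKER):]


sample_id = normalize_sample_identifier("http://n2t.net/ark:/21547/R2INDO119289")
print(sample_id)
print(metadata_id_for("ark:/21547/Car2PIRE_0334"))
```

This keeps the sample identifier and the metadata record identifier distinct while making one derivable from the other.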
pipes are fine too
The value of @id must be a relative or absolute IRI that can be used to retrieve the graph to which it is assigned. Only the entry "@id": "metadata/21547/Car2PIRE_0334" of those three examples is a valid relative IRI.
A JSON-LD processor will prepend the base of the document to create an absolute IRI from that value.
There is a pattern of interaction between the document identifiers and the resource provider (i.e. web server) that must be considered. Examples:
If the record is retrieved from the address:
https://isamples.org/metadata/21547/Car2PIRE_0334
and it has a relative @id value of metadata/21547/Car2PIRE_0334, then the computed absolute IRI will be:
https://isamples.org/metadata/21547/metadata/21547/Car2PIRE_0334
Retrieved from:
https://isamples.org/metadata/21547/Car2PIRE_0334/
The computed absolute IRI will be:
https://isamples.org/metadata/21547/Car2PIRE_0334/metadata/21547/Car2PIRE_0334
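With "@id": "" (an empty relative reference) and retrieved from:
https://isamples.org/metadata/21547/Car2PIRE_0334
The computed absolute IRI will be:
https://isamples.org/metadata/21547/Car2PIRE_0334
(A value of "." would instead resolve to https://isamples.org/metadata/21547/, because RFC 3986 resolution drops the last path segment of the base before applying the reference.)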
Hence, depending on the way we want to access this information, the value of @id may well change.
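The resolution behaviour can be checked locally with Python's urllib.parse.urljoin, which implements RFC 3986 reference resolution. This is only an approximation of what a JSON-LD processor does: it assumes the document URL is the base IRI and that no @base is set in the context. The function name absolute_id is hypothetical.

```python
from urllib.parse import urljoin


def absolute_id(document_url, id_value):
    """Resolve a relative @id against the URL the document was retrieved from."""
    return urljoin(document_url, id_value)


relative_id = "metadata/21547/Car2PIRE_0334"

# Base without a trailing slash: the last path segment is replaced.
no_slash = absolute_id(
    "https://isamples.org/metadata/21547/Car2PIRE_0334", relative_id
)

# Base with a trailing slash: the relative path is appended.
with_slash = absolute_id(
    "https://isamples.org/metadata/21547/Car2PIRE_0334/", relative_id
)

# An empty relative reference resolves to the base itself.
empty_ref = absolute_id("https://isamples.org/metadata/21547/Car2PIRE_0334", "")

for result in (no_slash, with_slash, empty_ref):
    print(result)
```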
What if we use an IRI like 'isam:metadata/21547/Car2PIRE_0334' and map isam to whatever resolver host we decide to use in the production system?
Oh, right, OK. Like:
{
  "@context": {
    "isam": "https://isamples.org/service/",
    "is": "https://isamples.org/vocab/",
    "name": {
      "@id": "is:name"
    }
  },
  "@id": "isam:metadata/21547/Car2PIRE_0334",
  "name": "Some test record"
}
Which would expand to:
[
  {
    "@id": "https://isamples.org/service/metadata/21547/Car2PIRE_0334",
    "https://isamples.org/vocab/name": [
      {
        "@value": "Some test record"
      }
    ]
  }
]
And lets us update the resolver location by adjusting the context. Nice.
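For anyone who wants to see the prefix mechanism without a JSON-LD library, here is a rough sketch of compact IRI expansion in plain Python. It handles only simple string prefixes and is not a substitute for a real JSON-LD processor; the function name expand_compact_iri and the https://example.org/resolver/ host are made up for illustration.

```python
def expand_compact_iri(value, context):
    """Expand 'prefix:suffix' using a prefix map.

    Simplified: values with no colon, values that look like absolute
    IRIs ('scheme://...'), and unknown prefixes are returned unchanged.
    """
    prefix, sep, suffix = value.partition(":")
    if not sep or suffix.startswith("//"):
        return value
    base = context.get(prefix)
    if not isinstance(base, str):
        return value
    return base + suffix


context = {
    "isam": "https://isamples.org/service/",
    "is": "https://isamples.org/vocab/",
}
record_id = "isam:metadata/21547/Car2PIRE_0334"

expanded = expand_compact_iri(record_id, context)
print(expanded)

# Pointing 'isam' at a different (hypothetical) host changes the
# expanded IRI without touching the record itself.
moved_context = {**context, "isam": "https://example.org/resolver/"}
moved = expand_compact_iri(record_id, moved_context)
print(moved)
```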
Yea, that's what I was thinking. Will it work?
Indeed it does: https://tinyurl.com/yzuh6qn4
And here it is with a remote context:
https://tinyurl.com/b89twkvb
and a different version of the remote context with the target for isam adjusted:
https://tinyurl.com/bzzf62n6
The context docs are in a gist at: https://gist.github.com/datadavev/8c93a9551ac38473e53c8bc1c04b7c60
I like this as a solution since it provides a nice mechanism for adjusting the resolver location without having to touch the records, just update the context.
can we go with this solution and close this issue?
Thanks for the explanation, gentlemen. I'll move this one over to me for any implementation that's required.