Implement id changes for GEOME

Question

Opened this issue 3 years ago · 15 comments

In looking over the examples for GEOME, I saw the following two inconsistencies where we need to pick one form to use:

For "sampleidentifier":

"sampleidentifier": "ark:/21547/Car2PIRE_0334",
"sampleidentifier": "http://n2t.net/ark:/21547/R2INDO119289"
"sampleidentifier": "LACM:DISCO:16924"

Do we want the n2t prefix, or the third variant which appears to have no scheme prefix?

Answer 1 · 2021-06-30T15:36:44.000Z

For "@id", it looks like we have the following variants:

"@id": "metadata/21547/Car2PIRE_0334"
"@id": "metadata:21547-eg2AB4OQ34"
"@id": "metadata:21547/CgZ2PEER_7055"
"@id": "ark:/21547/Cgx2MGH18_1_E4"

Which one should we use?

Answer 2 · 2021-06-30T15:51:27.000Z

Adding @datadavev to the ticket since he expressed interest in the morning standup.

Answer 3 · 2021-06-30T16:20:28.000Z

The @id value needs to conform to the JSON-LD spec ¹, which identifies it as a relative or absolute IRI.

This relates to #27

This also suggests that some required functionality of iSamples will be to resolve these @id values to the requested representation of the referenced resource.

¹ https://www.w3.org/TR/json-ld/#syntax-tokens-and-keywords

Answer 4 · 2021-06-30T17:27:00.000Z

One other thing it would be good to standardize on -- on some of the fields where we synthesize a bunch of fields from GEOME into one iSamples field, it would be good to be consistent on how we separate them. Right now, it looks like we choose between ; and |.

e.g.

'description': 'samplingProtocol: ARMS | expeditionCode: '
                               'INDO_PIRE | taxonomy team: MINV | taxonomy '
                               'team: 80'

vs.

 'description': 'sampling protocol:Dead Coral Head; '
                               'projectId:78 ; expeditionCode: PEER_2016',

Personally, I feel like | is easier to read.

Answer 5 · 2021-06-30T19:16:18.000Z

Sure, let's go with | (pipe char with white space either side).
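
For illustration, the agreed separator could be applied along these lines. This is only a sketch: the function name `join_description` and the list-of-pairs input are hypothetical and not taken from the iSamples code. Pairs are used instead of a dict so that a repeated label such as "taxonomy team" survives.

```python
def join_description(pairs):
    """Join (label, value) pairs into one string, separated by ' | '."""
    # Each pair becomes "label: value"; the pipe has white space on either side.
    return " | ".join(f"{label}: {value}" for label, value in pairs)


description = join_description([
    ("samplingProtocol", "ARMS"),
    ("expeditionCode", "INDO_PIRE"),
    ("taxonomy team", "MINV"),
    ("taxonomy team", "80"),
])
print(description)
```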

Answer 6 · 2021-07-01T17:20:41.000Z

We discussed the philosophy about the @id property on the metadata record in one of the past tech meetings, and I think we agreed that the identifier for the metadata record about a sample should be different than the identifier for the physical sample itself. This follows the pattern that the TDWG MIDS group is following. It would suggest something like one of

"@id": "metadata/21547/Car2PIRE_0334"
"@id": "metadata:21547-eg2AB4OQ34"
"@id": "metadata:21547/CgZ2PEER_7055"

I don't think we're planning on registering ARKs for the metadata records; the idea is that if you dereference the identifier for the sample, what you get is the metadata record 'about' the sample. Based on some other comments I've seen, using 'metadata:' as a prefix is not a great idea: it suggests that 'metadata' is a URI scheme (following RFC 3986 syntax), so something like metadata/21547/Car2PIRE_0334 seems like the best option.

As for the sample identifier, my take is that 'ark:/21547/Car2PIRE_0334' is the best option. 'http://n2t.net/ark:/21547/R2INDO119289' is a concatenation of 'http://n2t.net/' (the URL of a resolver service) and the 'ark:...' part, which is the actual identifier. That is the purist view, and it assumes people will know how to resolve 'ark:' URIs.
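
If we did settle on the bare 'ark:' form, stripping a resolver prefix might look roughly like the following. The helper name `bare_ark` is hypothetical, and the rule (keep everything from 'ark:' onward, leave anything else untouched) is an assumption for illustration rather than an agreed normalization.

```python
def bare_ark(identifier):
    """Return the 'ark:...' part of an identifier, dropping a resolver prefix.

    Identifiers that do not contain an ARK are returned unchanged.
    """
    if identifier.startswith("ark:"):
        return identifier  # already in the bare form
    position = identifier.find("/ark:")
    if position >= 0:
        # Skip the "/" that separates the resolver URL from the ARK.
        return identifier[position + 1:]
    return identifier  # e.g. "LACM:DISCO:16924" is not an ARK


for value in (
    "ark:/21547/Car2PIRE_0334",
    "http://n2t.net/ark:/21547/R2INDO119289",
    "LACM:DISCO:16924",
):
    print(value, "->", bare_ark(value))
```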

Answer 7 · 2021-07-01T17:21:21.000Z

pipes are fine too

Answer 8 · 2021-07-01T17:48:35.000Z

The value of @id must be a relative or absolute IRI that can be used to retrieve the graph to which it is assigned. Only the entry "@id": "metadata/21547/Car2PIRE_0334" of those three examples is a valid relative IRI.

A JSON-LD processor will prepend the base of the document to create an absolute IRI from that value.

There is a pattern of interaction between the document identifiers and the resource provider (i.e. web server) that must be considered. Examples:

If the record is retrieved from the address:

https://isamples.org/metadata/21547/Car2PIRE_0334

And it has a relative @id value of metadata/21547/Car2PIRE_0334 then the computed absolute IRI will be:

https://isamples.org/metadata/21547/metadata/21547/Car2PIRE_0334

Retrieved from:

https://isamples.org/metadata/21547/Car2PIRE_0334/

The computed absolute IRI will be:

https://isamples.org/metadata/21547/Car2PIRE_0334/metadata/21547/Car2PIRE_0334

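With "@id": "" (an empty relative reference; note that "." would instead resolve to https://isamples.org/metadata/21547/) and retrieved from: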

https://isamples.org/metadata/21547/Car2PIRE_0334

The computed absolute IRI will be:

https://isamples.org/metadata/21547/Car2PIRE_0334

Hence, depending on the way we want to access this information, the value of @id may well change.
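
The first two resolutions above can be reproduced with Python's `urllib.parse.urljoin`, which follows the RFC 3986 reference-resolution rules. This is a quick check, not a JSON-LD processor, and it assumes the document base is simply the retrieval URL. The empty-reference line is an extra case added here for comparison.

```python
from urllib.parse import urljoin

relative_id = "metadata/21547/Car2PIRE_0334"

# Base without a trailing slash: the last path segment is replaced.
without_slash = urljoin(
    "https://isamples.org/metadata/21547/Car2PIRE_0334", relative_id
)

# Base with a trailing slash: the relative value is appended.
with_slash = urljoin(
    "https://isamples.org/metadata/21547/Car2PIRE_0334/", relative_id
)

# An empty reference resolves to the base itself.
same_document = urljoin("https://isamples.org/metadata/21547/Car2PIRE_0334", "")

print(without_slash)
print(with_slash)
print(same_document)
```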

Answer 9 · 2021-07-01T23:21:50.000Z

what if we use an IRI like 'isam:metadata/21547/Car2PIRE_0334' and map isam to whatever the resolver host is that we decide on using in the production system?

Answer 10 · 2021-07-01T23:45:10.000Z

Oh, right, OK. Like:

{
  "@context":{
    "isam":"https://isamples.org/service/",
    "is": "https://isamples.org/vocab/",
    "name":{
      "@id": "is:name"
    }
  },
  "@id":"isam:metadata/21547/Car2PIRE_0334",
  "name":"Some test record"
}

Which would expand to:

[
  {
    "@id": "https://isamples.org/service/metadata/21547/Car2PIRE_0334",
    "https://isamples.org/vocab/name": [
      {
        "@value": "Some test record"
      }
    ]
  }
]

And lets us update the resolver location by adjusting the context. Nice.
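
To make the mechanism concrete, here is a hand-rolled sketch of the prefix expansion in Python. It covers only the simple case of a string-valued prefix in the context and is not a substitute for a real JSON-LD processor; the function name `expand_compact_iri` and the `https://example.org/resolve/` target are hypothetical.

```python
def expand_compact_iri(value, context):
    """Expand 'prefix:suffix' using string-valued prefixes from the context."""
    prefix, separator, suffix = value.partition(":")
    if not separator or suffix.startswith("//"):
        # No prefix at all, or an absolute IRI such as https://...
        return value
    base = context.get(prefix)
    if isinstance(base, str):
        return base + suffix
    return value  # unknown prefix: leave the value as it is


context = {
    "isam": "https://isamples.org/service/",
    "is": "https://isamples.org/vocab/",
    "name": {"@id": "is:name"},
}
record_id = "isam:metadata/21547/Car2PIRE_0334"
print(expand_compact_iri(record_id, context))

# Pointing "isam" at a different (hypothetical) resolver changes the
# expanded IRI without touching the record itself.
moved = dict(context, isam="https://example.org/resolve/")
print(expand_compact_iri(record_id, moved))
```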

Answer 11 · 2021-07-01T23:46:52.000Z

Yea, that's what I was thinking. Will it work?

Answer 12 · 2021-07-01T23:47:32.000Z

Indeed it does: https://tinyurl.com/yzuh6qn4

Answer 13 · 2021-07-02T00:03:47.000Z

And here it is with a remote context:
https://tinyurl.com/b89twkvb

and a different version of the remote context with the target for isam adjusted:
https://tinyurl.com/bzzf62n6

The context docs are in a gist at: https://gist.github.com/datadavev/8c93a9551ac38473e53c8bc1c04b7c60

I like this as a solution since it provides a nice mechanism for adjusting the resolver location without having to touch the records, just update the context.

Answer 14 · 2021-07-02T21:29:25.000Z

can we go with this solution and close this issue?

Answer 15 · 2021-07-09T21:27:49.000Z

Thanks for the explanation, gentlemen. I'll move this one over to me for any implementation that's required.
