nfdi4plants/isa-ro-crate-profile

What does `sameAs` point to in `LabProtocol`?

Closed this issue · 10 comments

The URI property of the ISA protocol is already covered by the url field from schema.org Thing.

I'm not sure about this and can't remember why we added both. Could be that one is meant to point to a file within an ARC that describes the protocol whereas the other points to a true external resource like a website? Do you remember, @stuzart? Is it even a realistic scenario that both exist?

@floWetzels yes, that's how I remember it too: it was to point to an actual file documenting the protocol. I don't think we were too sure about it at the time, but left it as it is. Either way, the description is wrong and just copied from schema.org. I also checked the original Google Doc and the Bioschemas LabProtocol proposed changes and it's the same there, which is probably where it was copied from.

An alternative to @sameAs might be hasPart, from CreativeWork via HowTo, which could point to a dataset (or File in RO-Crate) representing the file, just as we do for Assays?

So are you saying that sameAs is necessary and something else than url? Because your description "an actual file documenting the protocol" basically matches what the description of url in the profile says. I'm a bit confused.
@stuzart

The problem with url is that in an RO-Crate, if you are including an actual file, it needs to be referenced as a File (an alias for MediaObject), which wouldn't be compatible with url but would be with hasPart. url could be used if pointing to something elsewhere, e.g. protocols.io, but not if you just want to include a doc or PDF file.

It may even be clearer if we just remove url and use hasPart for both cases. The RO-Crate docs describe using File for both files and web-based entities: https://www.researchobject.org/ro-crate/1.1/data-entities.html

I don't think that we need to consider every file a data entity. As far as I can tell from the docs, we are free to choose whether a file or a directory is a data entity. It basically becomes one by connecting it to the root data entity through hasPart. So it should be perfectly fine to link to other files or external sources via other properties. Am I wrong here?

Actually I've looked into it a bit more and yes, you're right, we just need url. If it is a file, we just need to give the filename as the @id, and use a @type of both File and LabProtocol, e.g.

{
      "@id": "my-lovely-protocol.pdf",
      "@type": [
        "File",
        "LabProtocol"
      ],
      ....
}

and then reference my-lovely-protocol.pdf from the root entity with hasPart.

So from the profile we can drop @sameAs and don't need anything to replace it.

Sorry, I keep forgetting this RO-Crate convention of using the @id as the file reference rather than a more explicit property.

So we would have 2 distinct ways to reference a digital protocol resource:

  • If it is a file in the RO-Crate/ARC, we use hasPart (or does @id suffice?)
  • If it is an external web resource, we use uri

Or did I get this wrong?

It's a bit confusing and I'm trying to get some clarification. The spec itself is known to be confusing, and this is being improved for RO-Crate 1.2. But the gist of it is that:

  • if it's a file, then in the RO-Crate the @id should be the path to the file, relative to the root
  • if it's an external web resource, then the @id should be the URL
  • if it's mixed, both a file and available online, then the @id should be the file path and url a download link (in 1.2 this changes to contentUrl for the download link)

In all cases there must be a hasPart referencing the @id.
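As a rough sketch of the three cases above (not part of the profile), the @id choice could be expressed in Python as follows. The function names `protocol_entity` and `has_part_reference` and their parameters are hypothetical, made up for illustration only; the output follows the RO-Crate 1.1 convention of using url for the download link.

```python
def protocol_entity(local_path=None, web_url=None):
    """Build a LabProtocol data entity for the three cases discussed above.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    if local_path is None and web_url is None:
        raise ValueError("need a local path, a web URL, or both")
    entity = {
        # A local file (path relative to the crate root) takes precedence
        # as @id; otherwise the URL itself is the identifier.
        "@id": local_path if local_path is not None else web_url,
        "@type": ["File", "LabProtocol"],
    }
    if local_path is not None and web_url is not None:
        # Mixed case: the file stays the @id, url carries the download link.
        entity["url"] = web_url
    return entity


def has_part_reference(entity):
    """Return the reference the root Dataset lists under hasPart."""
    return {"@id": entity["@id"]}
```

Calling `protocol_entity(local_path="my-lovely-protocol.pdf")` gives the file-only form, and passing both arguments gives the mixed form with url set.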

In our case, where it is an external web resource, I don't see any harm in also including url to point to it, which I think would make it fit with the Bioschemas profile and, I'd imagine, make it easier for parsers.

My suggestion is that we include url but make it non-mandatory, though recommended for an external web resource. At the same time, the Bioschemas profile for LabProtocol should replace sameAs with url (which is where it originally came from).

So in the RO-Crate, for a file it would appear as:

{
      "@id": "my-lovely-protocol.pdf",
      "@type": [
        "File",
        "LabProtocol"
      ],
      ....
}

external web resource:

{
      "@id": "https://somewhere.com/my-lovely-protocol.pdf",
      "@type": [
        "File",
        "LabProtocol"
      ],
      "url": "https://somewhere.com/my-lovely-protocol.pdf",
      ....
}

and if mixed, then

{
      "@id": "my-lovely-protocol.pdf",
      "@type": [
        "File",
        "LabProtocol"
      ],
      "url": "https://somewhere.com/my-lovely-protocol.pdf",
      ....
}

The root RO-Crate Dataset would include the following; this isn't necessarily part of the profile, just part of the RO-Crate spec:

"hasPart": [
        {
          "@id": "my-lovely-protocol.pdf"
        }
      ]
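To make the hasPart requirement concrete, here is a small, hedged Python sketch that lists LabProtocol entities not referenced from the root Dataset. It assumes the flattened `@graph` list from `ro-crate-metadata.json` has already been loaded, and that the root Dataset has the `@id` `./` (the usual RO-Crate convention, but an assumption here). The function name `unreferenced_protocols` is hypothetical.

```python
def unreferenced_protocols(graph, root_id="./"):
    """Return the @ids of LabProtocol entities missing from the root's hasPart.

    Hypothetical helper; `graph` is the list found under "@graph" in the
    crate metadata, and `root_id` is assumed to identify the root Dataset.
    """
    root = next(e for e in graph if e.get("@id") == root_id)
    parts = root.get("hasPart", [])
    if isinstance(parts, dict):  # JSON-LD allows a single value instead of a list
        parts = [parts]
    referenced = {p["@id"] for p in parts}

    missing = []
    for entity in graph:
        types = entity.get("@type", [])
        if isinstance(types, str):  # @type may be a string or a list
            types = [types]
        if "LabProtocol" in types and entity["@id"] not in referenced:
            missing.append(entity["@id"])
    return missing
```

An empty result means every protocol entity in the graph is linked from the root via hasPart.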

Fixed by PR #18