BioSchemas/specifications

Update MolecularEntity profile wrt identification options

ljgarcia opened this issue · 3 comments

All minimum properties from the released version https://bioschemas.org/profiles/MolecularEntity/0.5-RELEASE (identifier, name, url) have been either removed (name and url) or moved to optional (identifier). These changes are reflected in the draft version 0.6 https://bioschemas.org/profiles/MolecularEntity/0.6-DRAFT

However, the changelog for DRAFT 0.6 does not match the actual changes:

  • anyOf name, inChI, SMILES required: not accurate, inChI was removed and inChIKey remains recommended; name and SMILES were removed
  • sameAs removed from recommended: not true, all specifications in Bioschemas have sameAs as recommended
  • identifier moved from required to optional: this is correct

It is necessary to create a DRAFT 0.7 with a more accurate changelog. If in doubt about how to update a profile, please follow the tutorial https://bioschemas.org/tutorials/dde/update_profile. Some of the expected changes in the changelog, and the corresponding changes in the profile, are:

  • Add a clarification to the Bioschemas description of all the "identity" properties stating that at least one of inChIKey, SMILES or name (or should it be iupacName? are any others needed?) is minimum.
  • Keep sameAs as it is and remove any mention of it from the changelog
  • Keep identifier as it is and remove any mention of it from the changelog
  • Document why url was removed (or bring it back and document that it had been removed by error)
  • If name is indeed no longer part of the profile, document that change (or bring it back and document that it had been removed by error)
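
The "at least one of" rule in the first bullet could be checked along these lines. This is only a minimal sketch: the tuple `IDENTITY_PROPERTIES` and the function `has_minimum_identity` are hypothetical names, and the property list (inChIKey, smiles, name) is an assumption, since whether name or iupacName belongs in it is still an open question here.

```python
# Hypothetical list of "identity" properties; the final set is still under
# discussion in this issue (name vs. iupacName, others?).
IDENTITY_PROPERTIES = ("inChIKey", "smiles", "name")


def has_minimum_identity(entity):
    """Return True if at least one identity property holds a non-empty string."""
    for prop in IDENTITY_PROPERTIES:
        value = entity.get(prop)
        if isinstance(value, str) and value.strip():
            return True
    return False


# Example markup for ethanol, identified by its InChIKey only.
ethanol = {
    "@type": "MolecularEntity",
    "inChIKey": "LFQSCWFLJHTTHZ-UHFFFAOYSA-N",
}

# Markup with none of the identity properties filled in.
anonymous = {
    "@type": "MolecularEntity",
    "molecularFormula": "C2H6O",
}

print(has_minimum_identity(ethanol))    # True
print(has_minimum_identity(anonymous))  # False
```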

Hi @egonw, a gentle reminder. Also, please have a look at #583 and let us know if the property hierarchy there makes sense to you. Thanks

I just wanted to note that even if we decided to nest the properties, the Bioschemas site doesn't actually have pages for individual properties, so there's nowhere this would be displayed until the classes (and properties) are taken up by schema.org (where such a hierarchy would be shown).

Suggestion:
Make identifier minimum and recommend PropertyValue as its range, so that value holds the actual identifier and some other property indicates whether it is an inChI, inChIKey or SMILES. That other property should be agreed by the broad community, as the same approach can be used outside the chem types, for instance to clarify the nature of an ID (ORCID, PMID, DOI).
@egonw @sneumann any thoughts?
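
To make the suggestion concrete, here is a rough sketch of what such markup might look like, built in Python and serialized as JSON-LD. It assumes schema.org's existing `propertyID` property is used to carry the identifier kind; that choice is exactly the part still to be agreed, and the helper `make_identifier` is a hypothetical name used only for illustration.

```python
import json


def make_identifier(kind, value):
    """Wrap an identifier in a PropertyValue.

    Assumption: propertyID states the kind of identifier; the property
    actually used for this would have to be agreed by the community.
    """
    return {"@type": "PropertyValue", "propertyID": kind, "value": value}


# Example entity for ethanol carrying two typed identifiers.
entity = {
    "@context": "https://schema.org",
    "@type": "MolecularEntity",
    "identifier": [
        make_identifier("inChIKey", "LFQSCWFLJHTTHZ-UHFFFAOYSA-N"),
        make_identifier("smiles", "CCO"),
    ],
}

print(json.dumps(entity, indent=2))
```

The same pattern would apply unchanged outside the chem types, for example with "ORCID" or "DOI" as the kind.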