bridgedb/datasources

Curate remaining entries to the Bioregistry

cthoyt opened this issue · 7 comments

After lots of careful curation, there are only four resources listed in this repository that I can't quite figure out:

| datasource_name | system_code | website_url | linkout_pattern | example_identifier | entity_identified | single_species | identifier_type | uri | regex | official_name | wikidata_property | bioregistry |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Gramene Arabidopsis | EnAt | http://www.gramene.org/ | `http://www.gramene.org/Arabidopsis_thaliana/Gene/Summary?g=$id` | ATMG01360-TAIR-G | gene | Arabidopsis thaliana | 1 | EnAt | `AT[\dCM]G\d{5}-TAIR-G` | Gramene Arabidopsis | nan | nan |
| Gramene Maize | EnZm | http://www.ensembl.org | `http://www.maizesequence.org/Zea_mays/Gene/Summary?g=$id` | GRMZM2G174107 | gene | nan | 1 | EnZm | nan | Gramene Maize | nan | nan |
| Gramene Rice | EnOj | http://www.gramene.org/ | `http://www.gramene.org/Oryza_sativa/Gene/Summary?db=core;g=$id` | osa-MIR171a | gene | nan | 1 | EnOj | nan | Gramene Rice | nan | nan |
| Rice Ensembl Gene | Os | http://www.gramene.org/Oryza_sativa | `http://www.gramene.org/Oryza_sativa/geneview?gene=$id` | LOC_Os04g54800 | gene | Oryza sativa | 1 | Os | nan | Rice Ensembl Gene | nan | nan |
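
As a quick sanity check, the only regex in the table (the Gramene Arabidopsis row) can be tested against its own example identifier. This is a minimal sketch: `is_valid` is a hypothetical helper, and matching against the full string is an assumption about how the pattern is meant to be applied.

```python
import re

# Regex and example identifier copied from the Gramene Arabidopsis row.
PATTERN = re.compile(r"AT[\dCM]G\d{5}-TAIR-G")
EXAMPLE = "ATMG01360-TAIR-G"


def is_valid(identifier: str) -> bool:
    """Return True if the whole identifier matches the pattern (assumed full match)."""
    return PATTERN.fullmatch(identifier) is not None


print(is_valid(EXAMPLE))  # prints True
```

The other three rows have no regex (`nan`), so the same check cannot be run for them without first curating a pattern.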

Example URLs:
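
Assuming `$id` in each linkout pattern is a literal placeholder for the identifier, the example URL for each row can be derived from the table above. A minimal sketch follows; the `build_url` helper is hypothetical (not part of BridgeDb or Bioregistry), and nothing here checks whether the resulting URLs still resolve.

```python
# (datasource_name, linkout_pattern, example_identifier), copied from the table.
ROWS = [
    ("Gramene Arabidopsis",
     "http://www.gramene.org/Arabidopsis_thaliana/Gene/Summary?g=$id",
     "ATMG01360-TAIR-G"),
    ("Gramene Maize",
     "http://www.maizesequence.org/Zea_mays/Gene/Summary?g=$id",
     "GRMZM2G174107"),
    ("Gramene Rice",
     "http://www.gramene.org/Oryza_sativa/Gene/Summary?db=core;g=$id",
     "osa-MIR171a"),
    ("Rice Ensembl Gene",
     "http://www.gramene.org/Oryza_sativa/geneview?gene=$id",
     "LOC_Os04g54800"),
]


def build_url(linkout_pattern: str, identifier: str) -> str:
    """Substitute the identifier for the literal `$id` placeholder (assumed convention)."""
    return linkout_pattern.replace("$id", identifier)


for name, pattern, example in ROWS:
    print(f"{name}: {build_url(pattern, example)}")
```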

So the question for the first two is: what should we call these in Bioregistry? Should they really get their own prefixes, or is there a more general Gramene resolver for all of these IDs?

For the last two, can these be fixed? Maybe they just need a new example identifier that follows the same pattern.

@egonw these are the ones that aren’t complete

@mkutmon @Finterly: could you please check whether these databases are used in any GPML?

And @tabbassidaloii : could you check if the databases above are part of our new BridgeDb mapping files?

We have mapping files for Arabidopsis thaliana (At), Zea mays (Zm), Oryza sativa japonica (Oj), and Oryza sativa indica (Oi).

egonw commented

> We have mapping files for Arabidopsis thaliana (At), Zea mays (Zm), Oryza sativa japonica (Oj), and Oryza sativa indica (Oi).

@tabbassidaloii, but do we have mappings from those to Gramene?

FWIW I think these pathways were created by the Gramene team at the time.

Gramene

@egonw No, we don't. Not sure if BioMart provides it.