biopragmatics/oquat

Keep prefixes in original capitalisation and allow filter by ontology

Closed this issue · 5 comments

This is cool:

https://cthoyt.com/oquat/unknowns/

Is it updated from time to time?

What would be cool:

  1. Keep the prefixes in the table in their original capitalization (e.g., MONDORULE, not mondorule)
  2. Provide some way to filter the table by the source ontology in which the prefixes are used

Yes, this is updated periodically, and I almost have an automated nightly update working with GitHub Actions.

  1. Won't fix, since prefixes with different capitalization would then not get grouped together. All variants are normalized using the bioregistry.utils.norm() function in the bioregistry Python package.
  2. Is that not the same as the links in the second column, which go to a source-specific view? E.g., if you want to see all of the issues in UBERON, you can go to https://cthoyt.com/oquat/unknowns/source/uberon

Note that the source-specific page does not group different variants. E.g., UBERON uses both Bgee and BGEE, and these are listed separately.
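
To illustrate why the summary table cannot keep the original capitalization, here is a minimal sketch of grouping by a normalized key. The `normalize_prefix` helper is a hypothetical stand-in written for this example (it case-folds and drops non-alphanumeric characters); it is not the actual Bioregistry implementation, whose exact rules may differ.

```python
from collections import defaultdict


def normalize_prefix(prefix: str) -> str:
    """Hypothetical normalizer: case-fold and keep only letters and digits.

    This is an assumption for illustration, not the Bioregistry code.
    """
    return "".join(ch for ch in prefix.casefold() if ch.isalnum())


def group_variants(prefixes):
    """Map each normalized key to the set of spellings seen for it."""
    groups = defaultdict(set)
    for prefix in prefixes:
        groups[normalize_prefix(prefix)].add(prefix)
    return dict(groups)


# Spellings mentioned in this thread
observed = ["Bgee", "BGEE", "MONDORULE", "mondorule"]
groups = group_variants(observed)

for key, variants in sorted(groups.items()):
    print(key, sorted(variants))
```

Each row of the table corresponds to one normalized key, so a single row can stand for several original spellings, and there is no single "original" capitalization to display.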

Oh great, sorry, I missed that :) This is great, thank you very much. I will advise my teams to look at these pages to normalise prefixes.

One thing I am not sure about though: (Not part of this issue, so I will close)

Is the goal to either purge or register all "unknown" prefix usages? Like, is it your vision to basically make all the oquat tables empty?

Yes, exactly. My vision is to either register or purge all unknown usages and ultimately make this table empty. The same goes for the "invalid" usages tables as well (where "invalid" refers to local identifiers that do not match the expected regular expressions).
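
As a rough sketch of the distinction between "unknown" and "invalid" usages, the check could look something like the following. The `PATTERNS` registry and the `classify` helper are hypothetical names for this example, and the seven-digit UBERON pattern is only an illustrative assumption; this is not how oquat is actually implemented.

```python
import re

# Hypothetical mini-registry: normalized prefix -> expected local identifier pattern.
# The pattern below is an illustrative assumption.
PATTERNS = {
    "uberon": re.compile(r"^\d{7}$"),
}


def classify(prefix: str, identifier: str) -> str:
    """Return 'unknown', 'invalid', or 'valid' for a prefix/identifier pair."""
    pattern = PATTERNS.get(prefix.casefold())
    if pattern is None:
        # The prefix is not registered at all
        return "unknown"
    if pattern.fullmatch(identifier) is None:
        # The prefix is registered, but the local identifier has the wrong shape
        return "invalid"
    return "valid"


print(classify("UBERON", "0000001"))
print(classify("UBERON", "1"))
print(classify("MONDORULE", "1"))
```

Under this sketch, emptying the tables means every usage either becomes "valid" (by registering the prefix or fixing the identifier) or is removed from the source ontology.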

Alright! Let's do this then :D