Show source/evidence for features on mouse over
Closed this issue · 18 comments
Some of the features from UniProt have an evidence qualifier that includes a PubMed ID:
/evidence="ECO:0000269|PubMed:18257517"
In that case we should use that as the reference instead of the UniProt paper.
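A minimal sketch of that rule, assuming the qualifier value always has the form `ECO:nnnnnnn|PubMed:nnn`. The function name `reference_for` and the `DEFAULT_REFERENCE` placeholder are hypothetical and are not taken from the PomBase loading code:

```python
import re

# Hypothetical placeholder for whatever reference is currently used
# for the UniProt paper; the real identifier is not given in this thread.
DEFAULT_REFERENCE = "UNIPROT_PAPER_PLACEHOLDER"

# An ECO code, optionally followed by "|PubMed:<digits>".
EVIDENCE_RE = re.compile(r"(ECO:\d{7})(?:\|PubMed:(\d+))?")


def reference_for(evidence, default=DEFAULT_REFERENCE):
    """Return (eco_code, reference) for the first entry of an evidence value.

    Falls back to `default` when the entry carries no PubMed ID.
    """
    match = EVIDENCE_RE.search(evidence or "")
    if match is None:
        return None, default
    eco, pmid = match.groups()
    return eco, (f"PMID:{pmid}" if pmid else default)


print(reference_for("ECO:0000269|PubMed:18257517"))
```

Entries whose source is not PubMed keep the default reference under this sketch.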
Yes, definitely, but that links to the comment in this ticket:
"First, I think we need to extend our MOD data format file to include an "assigned by" column so that all of these have "assigned_by" UniProt"
so that we can capture the experimental source, and the source of the curation.
> First, I think we need to extend our MOD data format file to include an "assigned by" column so that all of these have "assigned_by" UniProt
I've implemented that for the annotations created directly from the UniProt data file. That's the lipidation sites, glycosylation sites, disulfide bonds and modifications from pombe-embl/external_data/uniprot_data_from_api.tsv.
Those annotations will have the assigned_by "UniProt" property in Chado from Wednesday.
There are also some UniProt annotations that were put in pombe-embl/supporting_files/manual_so_term_annotations.tsv. I need to do a bit of investigation with SVN to work out which came from UniProt. I'll do that next.
After that I'll work on showing assigned_by in the display.
> There are also some UniProt annotations that were put in pombe-embl/supporting_files/manual_so_term_annotations.tsv. I need to do a bit of investigation with SVN to work out which came from UniProt. I'll do that next.
I've now added an "Assigned_by" column to manual_so_term_annotations.tsv and filled it in as best I can. And I've changed the loading code to store that value in Chado.
> Some of the features from UniProt have an evidence qualifier that includes a PubMed ID:
> /evidence="ECO:0000269|PubMed:18257517"
I'm working on storing and displaying the PubMed ID.
Some features from UniProt have more than one evidence code and reference. For example csx1 / SPAC17A2.09c:
MOD_RES 42; /note="Phosphoserine; by MAPK sty1"; /evidence="ECO:0000269|PubMed:14633985, ECO:0000269|PubMed:18257517";
For now the code is just using the first evidence code and reference. Most of the processing and display code can only handle one evidence code and one reference per annotation so we would need to split these cases into multiple annotations if we want to capture them.
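A rough sketch of what splitting these cases could look like, assuming the evidence value is a comma-separated list of `ECO:...|PubMed:...` entries as in the csx1 example. The `Annotation` class and `split_by_evidence` function are hypothetical illustrations, not the existing processing code:

```python
from dataclasses import dataclass, replace
from typing import List, Optional


@dataclass(frozen=True)
class Annotation:
    # Hypothetical record: one evidence code and one reference per annotation.
    gene: str
    feature_type: str
    position: str
    note: str
    evidence_code: Optional[str] = None
    reference: Optional[str] = None


def split_by_evidence(base: Annotation, evidence: str) -> List[Annotation]:
    """Return one copy of `base` per entry in a comma-separated evidence value."""
    annotations = []
    for item in evidence.split(","):
        item = item.strip()
        if not item:
            continue
        eco, _, source = item.partition("|")
        # Only PubMed sources become references in this sketch.
        ref = "PMID:" + source[len("PubMed:"):] if source.startswith("PubMed:") else None
        annotations.append(replace(base, evidence_code=eco, reference=ref))
    # With no usable evidence, keep the annotation unchanged.
    return annotations or [base]


base = Annotation("SPAC17A2.09c", "MOD_RES", "42", "Phosphoserine; by MAPK sty1")
evidence = "ECO:0000269|PubMed:14633985, ECO:0000269|PubMed:18257517"
for annotation in split_by_evidence(base, evidence):
    print(annotation.evidence_code, annotation.reference)
```

Each resulting annotation carries a single evidence code and reference, so it fits the one-per-annotation constraint described above.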
This is great. Will we be able to see the PubMed ID in the pop-up on the protein feature viewer? This would be very useful.
> For now the code is just using the first evidence code and reference. Most of the processing and display code can only handle one evidence code and one reference per annotation so we would need to split these cases into multiple annotations if we want to capture them.
Ideally we would split them into separate annotations. That would probably mean that more were filtered anyway because the annotations already existed?
> Ideally we would split them into separate annotations. That would probably mean that more were filtered anyway because the annotations already existed?
ignore! you have already done it!
Actually, I'm a bit confused. We already had the data from the Wilson-Grady paper, so these annotations should be filtered as duplicates, shouldn't they?
> Will we be able to see the PubMed ID in the pop-up on the protein feature viewer?
I'll add it. It will be a little annoying that you can't click on it though.
> Actually, I'm a bit confused. We already had the data from the Wilson-Grady paper, so these annotations should be filtered as duplicates, shouldn't they?
There is no filtering yet.
OK!
but which gene is the example?
> but which gene is the example?
Do you mean the example in the screenshots? It was https://www.pombase.org/gene/SPAC26H5.10c (I think). Sorry, I should have included a link.
> Will we be able to see the PubMed ID in the pop-up on the protein feature viewer?
Is the PubMed ID best? We could put the citation as well or instead?
Maybe the citation would be better than the PMID
> I'll add it. It will be a little annoying that you can't click on it though.
We can chat about the longer-term options on the next call. I have some vague ideas...