geneontology/gocamgen

Merge same-GP/same-term annotations to use one triple


This can happen when there are multiple pieces of evidence for the same GP-to-term pair.

We will need to find the existing triple, if one exists, matched by GP and term classes (and by relation, once we get that fancy), and add the evidence from each identical annotation to that same triple.
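A rough sketch of that lookup-and-merge, assuming annotations arrive as plain dicts. The function name `merge_annotations` and the dict keys (`gp`, `relation`, `term`, `evidence`) are hypothetical and are not gocamgen's actual API; the evidence strings are placeholders:

```python
def merge_annotations(annotations):
    """Group annotations by (gp, relation, term) so each triple is created once.

    Returns a dict mapping each triple key to the list of evidence items
    collected from every annotation that shares that key.
    """
    triples = {}
    for ann in annotations:
        key = (ann["gp"], ann["relation"], ann["term"])
        # Reuse the existing triple if present, otherwise start a new one.
        triples.setdefault(key, []).append(ann["evidence"])
    return triples


# Two annotations for the same GP and term, with placeholder evidence values.
annotations = [
    {"gp": "WB:WBGene00000903", "relation": "involved_in",
     "term": "GO:0040024", "evidence": "evidence-1"},
    {"gp": "WB:WBGene00000903", "relation": "involved_in",
     "term": "GO:0040024", "evidence": "evidence-2"},
]
merged = merge_annotations(annotations)
```

Here `merged` holds a single triple key carrying both evidence items, rather than two separate triples.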

A WB example to work with is WB:WBGene00000903 GO:0040024, which has 8 lines in the current wb.gpad file.

Triple finding and merging now appears to be working in translate(). However, some evidence appears to be duplicated on edges, even though the source GPAD lines for that evidence differ in their with/from columns:

WB	WBGene00000903	involved_in	GO:0040024	PMID:11677050|WB_REF:WBPaper00004963	ECO:0000316	WB:WBGene00000907		20060927	WB
WB	WBGene00000903	involved_in	GO:0040024	PMID:11677050|WB_REF:WBPaper00004963	ECO:0000316	WB:WBGene00000915		20060927	WB
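To illustrate why these two lines can look like duplicates, here is a small sketch that compares them with and without the with/from column. It assumes the tab-separated column order seen in the two lines above (reference in column 5, evidence code in column 6, with/from in column 7); `evidence_key` is a hypothetical helper, not a gocamgen function:

```python
def evidence_key(gpad_line):
    """Return (reference, evidence code, with/from) for one GPAD line."""
    cols = gpad_line.rstrip("\n").split("\t")
    # Assumed 0-based positions: 4 = reference(s), 5 = ECO code, 6 = with/from.
    return (cols[4], cols[5], cols[6])


lines = [
    "WB\tWBGene00000903\tinvolved_in\tGO:0040024\t"
    "PMID:11677050|WB_REF:WBPaper00004963\tECO:0000316\t"
    "WB:WBGene00000907\t\t20060927\tWB",
    "WB\tWBGene00000903\tinvolved_in\tGO:0040024\t"
    "PMID:11677050|WB_REF:WBPaper00004963\tECO:0000316\t"
    "WB:WBGene00000915\t\t20060927\tWB",
]

# Keys that include with/from keep the two evidence items distinct.
keys_with = {evidence_key(line) for line in lines}
# Keys built from reference and evidence code alone collapse to one entry.
keys_without = {evidence_key(line)[:2] for line in lines}
```

With with/from included there are two distinct keys; with only the reference and evidence code there is one. So if the edge evidence does not carry or display with/from, the two items would look identical.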