geneontology/pathways2GO

The direction of a transport reaction is backwards in the new import

Closed this issue · 12 comments

ukemi commented

In this model: gomodel:R-HSA-70268 (Pyruvate metabolism)
There is a transport reaction: R-HSA-9012374 (VDAC1 transports PYR from cytosol to mitochondrial intermembrane space)
There reaction transports pyruvate from the [GO:0005829] cytosol to the [GO:0005758] mitochondrial intermembrane space. I looked at Reactome and the locations are correct wrt inputs and outputs, but in the latest import, the direction is backwards.

@ukemi It looks like this is due to the switch to ChEBI IDs for small molecules, away from the location-specific REACTO IDs.

Before the CHEBI switch, start location cytosol ID R-ALL-29398 and end location mitochondrial intermembrane space ID R-ALL-9012375 the Reactome IDs were being incorporated into the GO-CAM IRIs for PYR like:

<http://model.geneontology.org/R-ALL-29398_R-HSA-9012374>
<http://model.geneontology.org/R-ALL-9012375_R-HSA-9012374>

The CHEBI switch automatically meant that both of the above IRIs would change to this identical CHEBI-derived IRI <http://model.geneontology.org/CHEBI_15361_R-HSA-9012374>, causing an ID collision and confusing the conversion on which location was the input (start) and which was the output (end).

I'll need to play with how the IRIs are constructed to make them unique again.

ukemi commented

Argh. I see what you mean. Or perhaps deconstruct the IRI's to include a location in the OWL expression?

ukemi commented

Yay! Thanks @dustine32
Should I double check before I close? Are the models on production or development?

@ukemi They'll be in noctua-dev (very most likely) by the end of today

ukemi commented

Perfect! I will check and then give the thumbs up to push to production. if that's ok. @deustp01 do you mind if we push this bug fix between releases?

do you mind if we push this bug fix between releases?

If it works for Dustin, it's fine for me.

@ukemi @deustp01 The updated models are now in noctua-dev.

ukemi commented

It looks like in the above model, all of the transport reactions have been corrected. However, to be comprehensive I checked a few other transport pathways. For R-HSA-432040,R-HSA-432047 and R-HSA-917937all looks good. In R-HSA-432030 it looks like we are not representing transport as we are in the other cases. I wonder if this is because it is a channel activity rather than a transporter activity??? I will open a new ticket for this. I think for this ticket we can merge into production and close. Nice work @dustine32

@ukemi Reaction R-HSA-432074 in pathway R-HSA-432030 isn't detected as a transport reaction due to a much more boring reason than suspected: case sensitivity of "Glycerol" vs "glycerol" in the Reactome (and thus BioPAX) labels:
image
It turns out that exact matching on labels is used by the transport inference SPARQL query for confirming two individuals have the same type:


This used to be necessary when we used REACTO class types for small molecules since the types were location-specific. But, after we switched to using ChEBI classes we can probably get rid of this "labels must match" criteria. Does this sound good?

Formally a tangent, but even if GO-CAM can tolerate these mismatches, users at Reactome (both human and machine) would probably prefer that we cleaned them up. @dustine32 if it's easy can you generate a list of things that fail the "labels must match" test, even one with some false positives and false negatives?

ukemi commented

@dustine32 Yes, let's use the ChEBI identifiers. I suspected the reason was something like this. @deustp01 I agree clean-up even for minor things like case mis-matches is probably a good goal.

ukemi commented

I still think we can merge this and then close and move it over to the appropriate column in the project.