saezlab/OmnipathR

integrating STRING or custom PPI

anu-bioinfo opened this issue · 2 comments

Hello,

To start, I want to say that OmnipathR has been incredibly helpful in my research. Thanks for creating it.

I have an issue with extracting PPIs for certain proteins, such as Vcam1 in mouse. In my current study, I am using CARNIVAL to predict causal networks. We found Vcam1 to be strongly downregulated in our RNA-seq data, but it is missing from the final CARNIVAL network. After some digging, I realized that it is also missing from the PPI network that I downloaded via OmnipathR using the import_all_interactions function.

I was able to extract the interacting partners of Vcam1 from the STRING PPI database. How can I integrate these interactions with the ones downloaded using OmnipathR? OmniPath interactions include activation/inhibition information, while the STRING database doesn't provide it.

Could you please help me out?

Thanks in advance for any help...

Best,

Anupam

Hi Anupam,

Thank you for your kind words! :)

The mouse network of OmniPath is translated from human by orthology. In the human network, VCAM1 has 81 interactions: 20 of them are transcriptional regulators of the gene, and the remaining 61 are PPI (import_all_interactions retrieves both). Of those 61, most come only from Edmund Wang's signaling network, while 14 are from other sources, and 11 of those are supported by literature references. Many of these have +/- effect signs.

Indeed, the same query for mouse returns no interactions. I checked how the ID translation happens in pypath, and found that VCAM1 is not translated to mouse. NCBI Homologene, the database where we get the orthology information from, correctly translates VCAM1 to Vcam1. However, Homologene does not contain UniProt IDs, so we 1) translate from human UniProt to human RefSeq NP and Entrez Gene IDs; 2) translate these IDs by orthology; and 3) translate the mouse RefSeq and Entrez IDs to mouse UniProt. This last step fails because Homologene gives us the Entrez ID 22329, and UniProt translates this to the TrEMBL ID Q3UPN1 instead of the SwissProt ID P29533. This means our software works as intended, but the database content is not suitable for this translation. In the future we will add Ensembl BioMart as another orthology translation database besides Homologene, and hopefully this will improve the translation. Until then, I suggest that you either:

  1. Translate your data to human, and work with the human IDs in OmniPath and CARNIVAL

or:

  1. Take the human VCAM1 interactions from OmniPath, translate them with BioMart (I am sure there is a Bioconductor package for that), and add them to the mouse network.
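To make the second option more concrete, here is a minimal sketch of the idea in plain Python. Everything in it is an assumption for illustration: the partner names (GENEA, GENEB, GENEC), the effect signs, and the orthology table are toy placeholders, not real OmniPath or BioMart content, and the helper names `translate_edges` and `add_edges` are hypothetical. In practice the orthology table would come from BioMart, and the same logic can be written in R.

```python
# Toy data only: partner names, signs and the orthology table are placeholders,
# not real OmniPath or BioMart content.
human_edges = [
    {"source": "GENEA", "target": "VCAM1", "sign": 1},
    {"source": "VCAM1", "target": "GENEB", "sign": -1},
    {"source": "VCAM1", "target": "GENEC", "sign": 1},  # GENEC has no ortholog below
]

# Hypothetical human -> mouse gene symbol mapping (would come from BioMart).
human_to_mouse = {"VCAM1": "Vcam1", "GENEA": "Genea", "GENEB": "Geneb"}

# Existing mouse network, which lacks Vcam1.
mouse_network = [{"source": "Genea", "target": "Geneb", "sign": 1}]


def translate_edges(edges, orthologs):
    """Translate both endpoints of each edge; skip edges with a missing ortholog."""
    translated, skipped = [], []
    for edge in edges:
        source = orthologs.get(edge["source"])
        target = orthologs.get(edge["target"])
        if source is None or target is None:
            skipped.append(edge)
            continue
        translated.append({"source": source, "target": target, "sign": edge["sign"]})
    return translated, skipped


def add_edges(network, new_edges):
    """Append new edges to a copy of the network, ignoring exact duplicates."""
    seen = {(e["source"], e["target"], e["sign"]) for e in network}
    merged = list(network)
    for edge in new_edges:
        key = (edge["source"], edge["target"], edge["sign"])
        if key not in seen:
            seen.add(key)
            merged.append(edge)
    return merged


translated, skipped = translate_edges(human_edges, human_to_mouse)
extended_network = add_edges(mouse_network, translated)

print(len(extended_network), "edges in the extended mouse network")
print(len(skipped), "human edges skipped (no ortholog found)")
```

The effect signs are carried over unchanged from the human edges, and edges without an ortholog for either endpoint are reported rather than silently dropped, so you can check what was lost in translation.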

I hope this helps.

Best,

Denes

Hello Denes,

Thanks a lot for your detailed and prompt response