masterPATH

MasterPATH is an exploratory network analysis method that uses a shortest-path approach and a centrality measure to uncover members of active molecular pathways leading to the studied phenotype, based on the results of functional genomics screening data.
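The general idea of combining shortest paths with a centrality count can be illustrated with a small, self-contained sketch. This is not masterPATH's implementation: the class name, the toy network, and the simple "number of paths passing through a node" score are assumptions made only for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative sketch only; none of these names come from masterPATH.
class ShortestPathSketch {

    /** Breadth-first search for one shortest path; returns an empty list if the target is unreachable. */
    static List<String> shortestPath(Map<String, List<String>> graph, String source, String target) {
        Map<String, String> parent = new HashMap<>();
        ArrayDeque<String> queue = new ArrayDeque<>();
        parent.put(source, null); // the source has no predecessor
        queue.add(source);
        while (!queue.isEmpty()) {
            String node = queue.poll();
            if (node.equals(target)) {
                break;
            }
            for (String next : graph.getOrDefault(node, Collections.<String>emptyList())) {
                if (!parent.containsKey(next)) {
                    parent.put(next, node);
                    queue.add(next);
                }
            }
        }
        List<String> path = new ArrayList<>();
        if (!parent.containsKey(target)) {
            return path;
        }
        // Walk back from the target to the source, then reverse.
        for (String n = target; n != null; n = parent.get(n)) {
            path.add(n);
        }
        Collections.reverse(path);
        return path;
    }

    /** Assumed toy score: for each node, the number of given paths that contain it. */
    static Map<String, Integer> pathCountCentrality(List<List<String>> paths) {
        Map<String, Integer> counts = new TreeMap<>();
        for (List<String> path : paths) {
            for (String node : path) {
                counts.merge(node, 1, Integer::sum);
            }
        }
        return counts;
    }

    /** Hypothetical directed toy network: A and B play the role of hit genes, E of a final implementer. */
    static Map<String, List<String>> toyNetwork() {
        Map<String, List<String>> g = new HashMap<>();
        g.put("A", Arrays.asList("C", "F"));
        g.put("B", Arrays.asList("C"));
        g.put("C", Arrays.asList("D"));
        g.put("D", Arrays.asList("E"));
        return g;
    }

    public static void main(String[] args) {
        Map<String, List<String>> g = toyNetwork();
        List<List<String>> paths = new ArrayList<>();
        for (String hit : Arrays.asList("A", "B")) {
            paths.add(shortestPath(g, hit, "E"));
        }
        System.out.println(paths);
        System.out.println(pathCountCentrality(paths));
    }
}
```

In this toy network, nodes C and D lie on both hit-to-implementer paths, so they receive the highest count.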

Databases

The method works with an integrated network that consists of interactions from the following databases:

Human Integrated Protein-Protein Interaction rEference database (HIPPIE):

http://cbdm-01.zdv.uni-mainz.de/~mschaefer/hippie/download.php

Human Protein Reference Database (HPRD):

http://hprd.org/download

SignaLink 2.0 database:

http://signalink.org/download

SIGnaling Network Open Resource database (SIGNOR):

http://signor.uniroma2.it/downloads.php

tFactS database:

http://www.tfacts.org/TFactS-new/TFactS-v2/index1.html

TransmiR database:

http://www.cuilab.cn/transmir

miRTarBase database:

http://mirtarbase.mbc.nctu.edu.tw/php/download.php

Installation

The easiest way to use masterPATH is to download masterPATH.jar from the JAR/ folder.

Alternatively, clone the repository into a new IDE project (e.g. NetBeans) and build it.

The source files are available in the src/masterPATH folder.

Dependency libraries: commons-lang3-3.3.2

Usage

First, import the Wrapper and Network classes:

import masterPATH.Wrapper;
import masterPATH.Network;

Next, declare a Network variable and create a Wrapper object:

Network nw;
Wrapper wr = new Wrapper();

Next, load the network:

nw = wr.load_network(file_nodes, file_interactions);

where String file_nodes is a full path to a file with network nodes,
String file_interactions is a full path to a file with network interactions. The prebuilt network files are available in the Networks/ folder.

Finally, perform the computation for a mixed directed and undirected network:

wr.find_shortest_paths_and_calculate_centrality(
        nw,
        file_hitlist,
        file_fimplementers,
        file_output,
        prefix,
        max_length_for_shortest_path,
        min_len_for_path,
        max_len_for_path,
        folder_for_random_paths,
        prefix_for_random_paths,
        number_of_permutations);

or for an undirected network:

wr.find_shortest_paths_and_calculate_centrality_ppi(
        nw,
        file_hitlist,
        file_fimplementers,
        file_output,
        prefix,
        max_length_for_shortest_path,
        min_len_for_path,
        max_len_for_path,
        folder_for_random_paths,
        prefix_for_random_paths,
        number_of_permutations);

where:

Network nw is a network object,

String file_hitlist is a full path to a file with hit genes,

String file_fimplementers is a full path to a file with "final implementers",

String file_output is a full path to an output file,

String prefix is a prefix for the shortest paths ids,

int max_length_for_shortest_path is the maximum path length explored by the breadth-first search algorithm,

int min_len_for_path is the minimum length of the paths for which centrality will be calculated,

int max_len_for_path is the maximum length of the paths for which centrality will be calculated,

String folder_for_random_paths is a full path to folder where files for permutation analysis will be stored,

String prefix_for_random_paths is a prefix for the shortest paths ids for permuted hit lists,

int number_of_permutations is the number of permutations.
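This README does not state how the permutation results are turned into p-values. As a hedged illustration of what a permutation analysis commonly computes, the sketch below derives an empirical p-value by comparing an observed score with scores obtained from permuted hit lists. The class name, the method name, and the +1 correction are assumptions and may differ from what masterPATH does internally.

```java
// Illustrative sketch only; not part of the masterPATH API.
class PermutationPValueSketch {

    /**
     * Empirical p-value with the common +1 correction:
     * (number of random scores >= observed + 1) / (number of permutations + 1).
     */
    static double empiricalPValue(double observed, double[] randomScores) {
        int atLeastAsExtreme = 0;
        for (double score : randomScores) {
            if (score >= observed) {
                atLeastAsExtreme++;
            }
        }
        return (atLeastAsExtreme + 1.0) / (randomScores.length + 1.0);
    }

    public static void main(String[] args) {
        // Made-up centrality scores from four permuted hit lists.
        double[] random = {1.0, 2.0, 6.0, 3.0};
        System.out.println(empiricalPValue(5.0, random));
    }
}
```

With this correction the p-value can never be exactly zero, so a larger number_of_permutations allows smaller p-values to be resolved.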

Output files

The method produces the following output files:

_file_output + paths_centrality file with path information is a tab-separated text file. File format: path id, path as a list of interaction ids (each interaction is separated by a tab), reserved field, centrality, hit gene-final implementer pairs that yield this path as a list of HGNC ids separated by semicolons, list of ids of the shortest paths that yield this path separated by semicolons, reserved field, reserved field, HGNC id of the node in the path with maximum centrality, official symbol of the node in the path with maximum centrality, hit gene-final implementer pairs that yield this path as a list of official symbols separated by semicolons, path as a list of interactor1-interactor2 pairs (each interaction is separated by a tab), centrality, p-value.

_file_output + nodes_centrality file with node information is a tab-separated text file. File format: node id, node official symbol, centrality, p-value.
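The four-column node file described above can be read with a few lines of plain Java. The following is a minimal sketch, assuming that the file has no header line and that the centrality and p-value columns are parseable as decimal numbers; the class and field names are hypothetical and not part of masterPATH.

```java
// Illustrative sketch only; not part of the masterPATH API.
class NodeCentralityRow {
    final String nodeId;
    final String officialSymbol;
    final double centrality;
    final double pValue;

    NodeCentralityRow(String nodeId, String officialSymbol, double centrality, double pValue) {
        this.nodeId = nodeId;
        this.officialSymbol = officialSymbol;
        this.centrality = centrality;
        this.pValue = pValue;
    }

    /** Parses one tab-separated line: node id, official symbol, centrality, p-value. */
    static NodeCentralityRow parse(String line) {
        String[] fields = line.split("\t", -1); // -1 keeps trailing empty fields
        if (fields.length < 4) {
            throw new IllegalArgumentException("Expected 4 tab-separated fields, got " + fields.length);
        }
        return new NodeCentralityRow(
                fields[0].trim(),
                fields[1].trim(),
                Double.parseDouble(fields[2].trim()),
                Double.parseDouble(fields[3].trim()));
    }

    public static void main(String[] args) {
        // Made-up example line; real ids and values depend on the network files used.
        NodeCentralityRow row = parse("11998\tTP53\t42\t0.003");
        System.out.println(row.officialSymbol + " " + row.centrality + " " + row.pValue);
    }
}
```

To process a whole file, each line read with java.nio.file.Files.readAllLines could be passed to parse, skipping blank lines first.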