/corrnet

Scripts to convert correlation and p-value matrices to edge list and network

Primary LanguagePythonMIT LicenseMIT

GitHub BuildStatus

corrnet

The provided scripts can be used to convert correlation and p-value matrices to an condensed edgelist. Following, the resulting edge list will be used to generate networks using the python package NetworkX. The edge list can be optional restricted to a given correlation treshold (-clim) and p-value threshold (-plim). Great tools for calculating correlation matrices (identifying co-occurence patterns), especially of microbiome communities are SparCC and FastSpar. Created networks are only a basic representation, advanced representations require adjustments of the code. The 'data' folder contains artificially generated correlation and p-value matrices for testing purposes.

Requirements

Prerequisites

The additional python packages below are required:

  • numpy
  • pandas
  • matplotlib
  • networkx

Argument description

Following input is required to run the script:

Argument Type Comment
-cm float tabular separated (.tsv)
-pm float tabular separated (.tsv)

Optional:

Argument Type Default
-clim numeric 0.7
-plim numeric 0.05
-dlim numeric 5
-out character current working directory
--prefix character firstproject

Usage

Parameter

-cm     correlation matrix
-pm     p-value matrix
-clim   Threshold of correlation matrix (Default: 0.7)
-plim   Threshold of p-value matrix (Default: 0.05)
-dlim   Threshold of outgoing connections (degree) of source nodes (Default: 5)
-out    Path for generated output. If not provided, current working directory is used.
-pre    Prefix used to name output folder and files (Default: corrnet)

Run

python3 corrnet.py -cm ./data/random_correlation.tsv -pm ./data/random_p_value.tsv

Output

  • adjancy list (columns: source, target, weight, pval, direction, color)
  • network plot