scNET

Jupyter notebook comparing the reproducibility of single-cell network inference algorithms

Evaluating the reproducibility of single-cell gene regulatory network inference algorithms

We benchmark three single-cell network inference algorithms on their reproducibility, i.e. their ability to infer similar networks when applied to two independent datasets from the same biological condition.
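
To make the notion of reproducibility concrete, the Python sketch below scores how similar two inferred networks are using the Jaccard index of their edge sets. It is only an illustration: the choice of the Jaccard index, the function names, the toy gene pairs and the decision to treat edges as undirected are all assumptions, and the metrics actually computed in the notebook may differ.

```python
def edge_set(edges):
    """Collect edges as unordered gene pairs, ignoring weights (assumption: undirected)."""
    return {frozenset((gene1, gene2)) for gene1, gene2, _weight in edges}


def jaccard_index(net_a, net_b):
    """Shared edges divided by all edges found in either network."""
    a, b = edge_set(net_a), edge_set(net_b)
    union = a | b
    if not union:
        return 0.0  # two empty networks: define the score as 0
    return len(a & b) / len(union)


# Toy networks standing in for the output of one algorithm on two datasets.
net_a = [("GATA1", "TAL1", 0.9), ("SPI1", "CEBPA", 0.7), ("GATA1", "KLF1", 0.4)]
net_b = [("TAL1", "GATA1", 0.8), ("SPI1", "CEBPA", 0.6), ("RUNX1", "SPI1", 0.5)]

score = jaccard_index(net_a, net_b)
print(f"Jaccard index: {score:.2f}")
```

A score of 1 means the two networks contain exactly the same edges, and a score of 0 means they share none.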

The benchmarked methods are:

The methods are tested in three biological contexts:

  1. human retina
  2. colorectal cancer (CRC) T-cells
  3. different cell types from hematopoiesis

Note that by default the notebook runs the comparison on human retina. To change the starting dataset, uncomment the lines for the biological context of interest in the data-loading cell.

Input data

As detailed in the paper, the data used for this benchmark are the following:

The preprocessed input data are available at https://cloud.biologie.ens.fr/index.php/s/JuJgrIL1jC6yZh4/download. Details on the preprocessing steps are provided in the methods of the paper.

To access all data:

  • Clone or download the scNET repository
  • From an R terminal or RStudio, run the following lines:
setwd('../scNET/')
dataURL= 'https://cloud.biologie.ens.fr/index.php/s/JuJgrIL1jC6yZh4/download'
download.file(dataURL, 'scNET_data.zip')
unzip('scNET_data.zip')
  • On macOS, unzipping the data file from the terminal may be more efficient:
cd ~/scNET/
unzip scNET_data.zip

Install the software environment

cd scNET
conda env create -f scNET.yml

Run the notebook

  • Enter the conda environment: conda activate scNET.
  • Launch the notebook with jupyter-notebook.

Cite the work

The preprint describing this work is available on bioRxiv: https://www.biorxiv.org/content/10.1101/2020.11.10.375923v1

Additional Networks

Users can analyze the reproducibility of networks produced by other algorithms using this workflow. To do so, save two networks inferred from independent datasets into the scNET Results folder. Networks must be formatted into 3 columns (column 1: gene1, column 2: gene2, column 3: interaction weight), in .tsv or tab-separated file format. Then, run the notebook section Algorithm reproducibility evaluation to calculate metrics.
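
As a rough guide to the expected layout, the following Python sketch writes a network as a three-column tab-separated file and reads it back, checking the column count. The helper names, the file name and the toy gene names are hypothetical, and the sketch assumes the file has no header row, which is not stated above. The demo writes to a temporary directory; in practice the file would go into the Results folder.

```python
import csv
import os
import tempfile


def write_network(path, edges):
    """Write (gene1, gene2, weight) rows as tab-separated text, no header (assumption)."""
    with open(path, "w", newline="") as handle:
        csv.writer(handle, delimiter="\t").writerows(edges)


def read_network(path):
    """Read the file back, failing early if a row does not have exactly 3 columns."""
    edges = []
    with open(path, newline="") as handle:
        for line_no, row in enumerate(csv.reader(handle, delimiter="\t"), start=1):
            if len(row) != 3:
                raise ValueError(f"line {line_no}: expected 3 columns, got {len(row)}")
            gene1, gene2, weight = row
            edges.append((gene1, gene2, float(weight)))
    return edges


# Toy edges and a hypothetical file name, for illustration only.
toy_edges = [("GENE_A", "GENE_B", 0.82), ("GENE_A", "GENE_C", 0.35)]
out_path = os.path.join(tempfile.mkdtemp(), "my_algorithm_dataset1.tsv")

write_network(out_path, toy_edges)
loaded = read_network(out_path)
print(loaded)
```

Running the same check on both network files before launching the evaluation section can catch formatting problems early.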