COVID-19

The R/Python scripts for the analysis of single-cell RNA-seq data from COVID-19 patients

Primary language: Jupyter Notebook. License: BSD 2-Clause "Simplified" License (BSD-2-Clause).

1. Requirements

We analyzed the scRNA-seq data on a Linux system with R (version 3.6.1) and Python (version 3.6.8). The following software and packages are also required:

| Software | Version | Environment |
| --- | --- | --- |
| Cellranger | 3.1.0 | Linux |
| Seurat | 3.1.4 | R |
| dplyr | 0.8.4 | R |
| patchwork | 1.0.0 | R |
| scrublet | 0.2.1 | Python |
| scipy | 1.0.0 | Python |
| pandas | 0.24.2 | Python |
| matplotlib | 3.0.3 | Python |
| seaborn | 0.9.0 | Python |
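
The Python entries in the list above can be compared against the active environment with a short script. This is a minimal sketch and not part of the repository: `REQUIRED` simply mirrors the versions listed above, and `installed_version` and `compare_versions` are hypothetical helper names. It relies on `importlib.metadata`, which is in the standard library only from Python 3.8 onward; on Python 3.6.8 it assumes the `importlib_metadata` backport is installed, and otherwise reports every package as missing.

```python
# Versions copied from the requirements list (Python packages only).
REQUIRED = {
    "scrublet": "0.2.1",
    "scipy": "1.0.0",
    "pandas": "0.24.2",
    "matplotlib": "3.0.3",
    "seaborn": "0.9.0",
}

try:
    from importlib import metadata as _metadata  # Python >= 3.8
except ImportError:
    try:
        import importlib_metadata as _metadata  # backport, assumed installed on 3.6/3.7
    except ImportError:
        _metadata = None


def installed_version(name):
    """Return the installed version string of a package, or None if not found."""
    if _metadata is None:
        return None
    try:
        return _metadata.version(name)
    except _metadata.PackageNotFoundError:
        return None


def compare_versions(required, lookup=installed_version):
    """Map each package name to (required, installed, status)."""
    report = {}
    for name, wanted in required.items():
        found = lookup(name)
        if found is None:
            status = "missing"
        elif found == wanted:
            status = "ok"
        else:
            status = "differs"
        report[name] = (wanted, found, status)
    return report


if __name__ == "__main__":
    for name, (wanted, found, status) in compare_versions(REQUIRED).items():
        print(f"{name}: required {wanted}, installed {found} ({status})")
```

A status of "differs" is not necessarily a problem; it only flags that the installed version is not the one the analysis was run with.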

2. Installation

Copy the scripts into the directory that contains the raw data folders (i.e. the "P1-1r1/", "P1-1r2/", "P1-2r1/", "P1-2r2/", "P2-1/", "P2-2/", and "P2-3/" folders), then run them in RStudio or Jupyter Notebook.
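
Before running anything, it may help to confirm that the scripts really sit next to all seven raw data folders. The following is a minimal sketch, not part of the repository; `missing_sample_dirs` is a hypothetical helper, and the folder names are the ones listed above.

```python
from pathlib import Path

# Raw data folders named in this README.
SAMPLE_DIRS = ["P1-1r1", "P1-1r2", "P1-2r1", "P1-2r2", "P2-1", "P2-2", "P2-3"]


def missing_sample_dirs(base="."):
    """Return the expected sample folders that are absent from `base`."""
    base = Path(base)
    return [name for name in SAMPLE_DIRS if not (base / name).is_dir()]


if __name__ == "__main__":
    missing = missing_sample_dirs()
    if missing:
        print("Missing raw data folders:", ", ".join(missing))
    else:
        print("All raw data folders found.")
```

Run it from the directory that holds the scripts; an empty result means the layout matches what the scripts expect.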

3. Step by step analysis

3.1 Map the raw sequencing data to the genome reference

bash step0_map_raw_data_using_cellranger.sh
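
The contents of the shell script are not reproduced here. As a rough illustration of what a per-sample mapping call could look like, the sketch below builds `cellranger count` command lines without executing them. It assumes that each sample folder holds the FASTQ files for that sample; the reference path and the `_out` suffix on the run ID are placeholders, and `cellranger_count_command` is a hypothetical helper, not a function from the repository.

```python
import shlex

# Raw data folders named in this README.
SAMPLES = ["P1-1r1", "P1-1r2", "P1-2r1", "P1-2r2", "P2-1", "P2-2", "P2-3"]


def cellranger_count_command(sample, transcriptome):
    """Build a `cellranger count` argument list for one sample folder."""
    return [
        "cellranger",
        "count",
        f"--id={sample}_out",                # output folder name (placeholder suffix)
        f"--transcriptome={transcriptome}",  # path to a Cell Ranger reference
        f"--fastqs={sample}",                # assumed: FASTQ files live in the sample folder
    ]


if __name__ == "__main__":
    # Placeholder path; replace with the location of your own reference.
    reference = "/path/to/reference"
    for sample in SAMPLES:
        command = cellranger_count_command(sample, reference)
        print(" ".join(shlex.quote(arg) for arg in command))
```

Printing the commands first makes it easy to inspect them before passing any of them to a shell.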

3.2 Detect doublets using Scrublet

Run step1_detect_doublets_using_scrublet.ipynb in Jupyter Notebook. This script detects doublets separately for the cells of patients and of healthy controls.
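
The idea of running doublet detection separately per group can be sketched as follows. This is an illustrative outline under stated assumptions, not the notebook's code: the group labels, the helper names `scrublet_detector` and `detect_doublets_by_group`, and the `expected_doublet_rate` of 0.06 are all assumptions. The Scrublet wrapper imports `scrublet` and `scipy` lazily, so the grouping helper can also be used with any other detector function.

```python
def scrublet_detector(rows, expected_doublet_rate=0.06):
    """Call doublets with Scrublet on a cells-by-genes count matrix.

    Assumes scrublet and scipy are installed; the rate is an assumed value.
    """
    import scipy.sparse as sp
    import scrublet as scr

    counts = sp.csc_matrix(rows)
    scrub = scr.Scrublet(counts, expected_doublet_rate=expected_doublet_rate)
    _scores, calls = scrub.scrub_doublets()
    if calls is None:
        # Scrublet could not pick a score threshold automatically.
        raise RuntimeError("No automatic threshold; set one with call_doublets().")
    return [bool(call) for call in calls]


def detect_doublets_by_group(rows, groups, detector=scrublet_detector):
    """Run `detector` on each group separately and return one call per cell."""
    calls = [False] * len(rows)
    for group in sorted(set(groups)):
        # Indices of the cells that belong to this group.
        idx = [i for i, g in enumerate(groups) if g == group]
        group_calls = detector([rows[i] for i in idx])
        # Write the per-group calls back in the original cell order.
        for i, call in zip(idx, group_calls):
            calls[i] = call
    return calls
```

Here `rows` is a list of per-cell count vectors and `groups` holds one label per cell (for example "patient" or "control", both hypothetical labels).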

3.3 Integrate patients with healthy controls

Run step2_integrate_patients_and_healthy_controls.R in RStudio. This script also clusters the cells and performs UMAP analysis on the scRNA-seq data.

3.4 Plot UMAP diagram and marker gene expression violin plots

Run step3_plot_umap_and_marker_gene_expression.ipynb in Jupyter Notebook. This script draws a UMAP diagram of the cells and violin plots of marker gene expression. All diagrams are displayed automatically in the Jupyter interface.

3.5 Generate DEGs between different disease stages

Run step4_find_DEGs_between_disease_stages.ipynb in Jupyter Notebook. This script identifies differentially expressed genes (DEGs) between different disease stages, separately for CD14 monocytes and effector CD8 T cells. It also outputs the expression heatmap of the DEGs as a PNG file.
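
This README does not state which statistical test the notebook uses. As a simplified illustration of one ingredient of a DEG comparison, the sketch below ranks genes by the log2 fold change of mean expression between two stages within one cell type. It is only an effect-size filter under assumed inputs: a full analysis would add a statistical test and multiple-testing correction. The function names, the pseudocount of 1, and the cutoff of 1 are assumptions.

```python
import math


def log2_fold_change(stage_a, stage_b, pseudocount=1.0):
    """Log2 ratio of mean expression in stage B over stage A."""
    mean_a = sum(stage_a) / len(stage_a)
    mean_b = sum(stage_b) / len(stage_b)
    # The pseudocount avoids division by zero for unexpressed genes.
    return math.log2((mean_b + pseudocount) / (mean_a + pseudocount))


def rank_genes(expr_a, expr_b, min_abs_lfc=1.0):
    """Return (gene, log2FC) pairs passing the cutoff, largest change first.

    expr_a and expr_b map gene names to per-cell expression values
    for the same cell type in two disease stages.
    """
    results = []
    for gene in expr_a:
        lfc = log2_fold_change(expr_a[gene], expr_b[gene])
        if abs(lfc) >= min_abs_lfc:
            results.append((gene, lfc))
    return sorted(results, key=lambda item: abs(item[1]), reverse=True)
```

Running the same function once per cell type mirrors the per-cell-type comparison described above.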