GSK_analysis

Xiaonan Wang

26May2020

Summary: Data analysis of the GSK project

Introduction

The analysis was done following the strategy below:

Two reference datasets were used for data projection:

Cord blood dataset (Unpublished)
Kuffman dataset (Unpublished)

Notebooks

This folder contains all Jupyter notebooks used for the data analysis. To run the notebooks, first install the Smart-Seq2 preprocessing package:

pip install smqpp
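Before opening the notebooks, it can help to confirm that the package is visible to the Python environment the Jupyter kernel uses. The following is a minimal sketch using only the standard library. It assumes the package installed by `pip install smqpp` is importable under the module name `smqpp`; the helper name `is_installed` is hypothetical and not part of any package.

```python
import importlib.util


def is_installed(package_name: str) -> bool:
    """Return True if a top-level module can be located, without importing it."""
    return importlib.util.find_spec(package_name) is not None


if __name__ == "__main__":
    # Assumption: the import name matches the pip name, i.e. `smqpp`.
    name = "smqpp"
    if is_installed(name):
        print(f"{name} is available in this environment")
    else:
        print(f"{name} not found; run `pip install {name}` in the kernel's environment")
```

Running this in a notebook cell, rather than in a separate terminal, checks the same environment the notebooks will execute in.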

The main analyses include:

CBdata_all: Analysis of Cord blood data to generate visualisation layouts as references for projection.
Kuffman_all: Analysis of Kuffman data to generate visualisation layouts as references for projection of 0hr data.
MPB1234_all: Analysis of all MPB cells together. For better batch correction, the data were split by day.
MPB_scGen: Batch correction and prediction of perturbation using scGen for MPB 0hr and 62hr NT and GFP+ cells.
MPB1234_Day0: Analysis of MPB 0hr cells
MPB1234_Day3: Analysis of MPB 62hr cells
BM789_all: Analysis of all BM cells together. For better batch correction, the data were split by day.
BM_scGen: Batch correction and prediction of perturbation using scGen for BM 0hr and 62hr NT and GFP+ cells.
BM789_Day0: Analysis of BM 0hr cells
BM789_Day3: Analysis of BM 62hr cells

DPT analysis: