This repository contains the code to reproduce the results shown in "Biological representation disentanglement of single-cell data" (2023).
- perturbation data (sci-Plex 3, Srivatsan et al. 2020):
  - Anndata file used for the initial pre-processing notebook: `sciplex_complete_middle_subset.h5ad` (provided by Hetzel et al.).
  - Processed anndata file used for the analysis: `sciplex3_biolord.h5ad`.
- perturbation data (Perturb-seq (1-gene), Adamson et al. 2016):
  - Data used for the initial pre-processing notebook: downloaded from here using GEARS within the notebook.
  - Processed anndata files used for training a biolord model and for analysis: `adamson_biolord.h5ad` and `adamson_single_biolord.h5ad`.
- perturbation data (Perturb-seq (2-gene), Norman et al. 2019):
  - Data used for the initial pre-processing notebook: Download the data from here. Uncompress `norman2019.tar.gz` and move the resulting folder to `./data/perturbations/norman`. It should contain the subdirectory `data_pyg`. Move `essential_norman.pkl` and `go_essential_norman.csv` to `./norman`, so that `./data/perturbations/norman` contains `essential_norman.pkl`, `go_essential_norman.csv`, and `norman2019/data_pyg/`.
  - Processed anndata files used for training a biolord model and for analysis: `norman_biolord.h5ad` and `norman_single_biolord.h5ad`.
- fetal chromatin accessibility atlas (Domcke et al.):
  - Data used for the initial pre-processing notebook: `Domcke-2020.h5ad` (provided by Cao et al.).
  - Processed anndata file used for training a biolord model and for analysis: `atac_biolord.h5ad`.
- spatio-temporal infection dataset (Afriat et al.):
  - Data used for the initial pre-processing notebook: 10.5281/zenodo.7081862 (provided by Afriat et al.).
  - Processed anndata file used for the infection analysis: `adata_infected.h5ad`.
  - Processed anndata file used for the abortive classification analysis: `adata_abortive.h5ad`.
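
The Norman et al. (2019) entry above requires a specific directory layout, which is easy to get wrong when moving files by hand. The following is a minimal sketch of a layout check, assuming the paths are relative to the repository root as listed above. The helper name `missing_norman_entries` is hypothetical and not part of this repository or of biolord.

```python
from pathlib import Path

# Entries that ./data/perturbations/norman should contain, as listed in the
# Norman et al. instructions. This tuple is an illustration, not repository code.
EXPECTED_NORMAN_ENTRIES = (
    "essential_norman.pkl",
    "go_essential_norman.csv",
    "norman2019/data_pyg",
)


def missing_norman_entries(root="./data/perturbations/norman"):
    """Return the expected entries that do not exist under `root`."""
    root = Path(root)
    return [entry for entry in EXPECTED_NORMAN_ENTRIES if not (root / entry).exists()]


if __name__ == "__main__":
    missing = missing_norman_entries()
    if missing:
        print("Missing from ./data/perturbations/norman:", ", ".join(missing))
    else:
        print("Norman data layout looks complete.")
```

Running the script from the repository root before opening the pre-processing notebook reports which of the three expected entries, if any, are not yet in place.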