Benchmarking of doublet detection methods in single-cell RNA-seq data
-
Model parameters:
- r: default 2. The ratio of simulated cells to input cells
- K: default sqrt(n_input_cell) / 2. The number of neighbours in KNN
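
As a rough illustration of how these two defaults scale with the input size, the sketch below derives the number of simulated cells and K from the cell count. The helper name `default_model_params` and the rounding of K to the nearest integer (with a minimum of 1) are assumptions made for this example, not taken from the wrapper functions.

```python
import math


def default_model_params(n_input_cell, r=2.0):
    """Return (n_simulated, K) for a dataset with n_input_cell cells.

    Hypothetical helper: the rounding rules are assumptions of this sketch.
    """
    if n_input_cell <= 0:
        raise ValueError("n_input_cell must be positive")
    # r simulated cells are generated per input cell
    n_simulated = int(round(r * n_input_cell))
    # K = sqrt(n_input_cell) / 2, kept at 1 or above
    k = max(1, int(round(math.sqrt(n_input_cell) / 2)))
    return n_simulated, k


if __name__ == "__main__":
    print(default_model_params(10000))
```

For 10,000 input cells this gives 20,000 simulated cells and K = 50.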
Threshold parameters:
- doublet score: the threshold has to be set. It can be optimised with a Bayesian Gaussian mixture model, which is added in the wrapper function here.
-
Model parameters:
- pN: default 0.25. The proportion of simulated cells in the merged data, so the ratio of simulated cells is r = pN / (1 - pN). The paper shows that results are largely insensitive to pN
- pK: proportion of neighbours in KNN. Can be optimised with the built-in function
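
The relation r = pN / (1 - pN) makes the two simulation settings directly comparable. The sketch below converts in both directions; the function names `pn_to_ratio` and `ratio_to_pn` are hypothetical and introduced only for this example.

```python
def pn_to_ratio(pN):
    """Convert pN (simulated fraction of the merged data) to r (simulated / input)."""
    if not 0 <= pN < 1:
        raise ValueError("pN must be in [0, 1)")
    return pN / (1 - pN)


def ratio_to_pn(r):
    """Inverse conversion: r (simulated / input) back to pN."""
    if r < 0:
        raise ValueError("r must be non-negative")
    return r / (1 + r)


if __name__ == "__main__":
    print(round(pn_to_ratio(0.25), 3))  # default pN expressed as a ratio
    print(round(ratio_to_pn(2.0), 3))   # r = 2 expressed as pN
```

The default pN = 0.25 corresponds to r = 1/3 (one simulated cell per three input cells), whereas r = 2 corresponds to pN = 2/3.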
Threshold parameters:
- nExp: expected number of heterotypic doublets, used to set the threshold
- pANN: fraction of a cell's neighbours that are simulated doublets; thresholded with nExp
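
To make the interplay of the two threshold parameters concrete, here is a minimal sketch that flags the nExp cells with the highest pANN as doublets. The helper names, the example doublet rate of 7.5% and the homotypic proportion of 0.2 are illustrative assumptions, not values taken from this repository.

```python
def expected_heterotypic(n_cells, doublet_rate, homotypic_prop=0.0):
    """Hypothetical helper: expected number of heterotypic doublets.

    Assumes nExp = n_cells * doublet_rate, reduced by the proportion of
    doublets expected to be homotypic.
    """
    return int(round(n_cells * doublet_rate * (1 - homotypic_prop)))


def call_doublets(pann, n_exp):
    """Flag the n_exp cells with the highest pANN as doublets."""
    if not 0 <= n_exp <= len(pann):
        raise ValueError("n_exp must be between 0 and the number of cells")
    # Cell indices ordered from highest to lowest pANN
    order = sorted(range(len(pann)), key=lambda i: pann[i], reverse=True)
    calls = [False] * len(pann)
    for i in order[:n_exp]:
        calls[i] = True
    return calls


if __name__ == "__main__":
    pann = [0.10, 0.80, 0.30, 0.90]
    print(call_doublets(pann, n_exp=2))
    print(expected_heterotypic(1000, 0.075, homotypic_prop=0.2))
```

Ranking rather than using a fixed pANN cutoff means the number of called doublets is always exactly nExp, so the choice of nExp drives the result.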
See wrapper functions in the bin folder
- data from the Demuxlet paper (Kang et al., 2018, Nature Biotechnology)
- bash script: run_demuxlet_data.sh
- jupyter notebook for analysis: demuxlet_dataset.ipynb