This repository is still in development
Testing of the R implementation of the functions provided in the Python version is ongoing, and the package is not yet fully documented. Please report any problems you encounter with the documentation or functionality of the code as an issue on GitHub. Thank you for your patience.
Contributions to the R implementation via pull request are welcome.
DoubletDetection is an R implementation of a package to detect doublets (technical errors) in single-cell RNA-seq count matrices.
To install DoubletDetection in R:
if (!require(devtools)) {
  install.packages("devtools") # If not already installed
}
devtools::install_github("TomKellyGenetics/DoubletDetection", ref = "r-implementation")
To run basic doublet classification:
library("DoubletDetection")
clf <- BoostClassifier$new()
# raw_counts is a genes by cells count matrix
labels <- clf$fit(raw_counts)$predict()
raw_counts is a scRNA-seq count matrix or data.frame (genes x cells). Note that the orientation of the input matrix differs from the Python version, which takes cells x genes.
labels is a numeric vector with the value 1 representing a detected doublet, 0 a singlet, and NA an ambiguous cell.
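As a minimal sketch of the input and output conventions described above, the following base R code transposes a cells x genes matrix (the Python orientation) into genes x cells and then summarises a label vector. The matrix `py_counts` and the vector `example_labels` are made-up placeholders for illustration, not output of the package.

```r
# Toy cells x genes matrix (3 cells, 4 genes); values are made up for illustration
py_counts <- matrix(c(0, 2, 1,
                      5, 0, 3,
                      1, 1, 0,
                      0, 4, 2),
                    nrow = 3, ncol = 4,
                    dimnames = list(paste0("cell", 1:3), paste0("gene", 1:4)))

# The R implementation expects genes x cells, so transpose before fitting
raw_counts <- t(py_counts)

# Placeholder labels in the documented encoding:
# 1 = detected doublet, 0 = singlet, NA = ambiguous cell
example_labels <- c(0, 1, NA, 0, 0, 1)

# Count each class, keeping NA as its own category
label_counts <- table(example_labels, useNA = "ifany")

# Indices of cells called singlets; which() drops the NA comparisons
singlet_idx <- which(example_labels == 0)
```

Because NA marks ambiguous cells, comparisons such as `example_labels == 0` return NA for those entries, which is why the sketch wraps the comparison in `which()` before subsetting.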
The classifier works best when there are several cell types present in the data. Furthermore, it should be applied individually to each run in an aggregated count matrix.
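Applying the classifier to each run separately can be sketched as below. The helper `classify_by_run`, the argument `run_ids`, and the stand-in `toy_classify` are hypothetical names introduced for this example; none of them are part of the package, and the stand-in rule is not the package's algorithm.

```r
# Hypothetical helper (not part of the package): classify each run separately.
# `classify` is any function mapping a genes x cells matrix to one label per cell.
classify_by_run <- function(counts, run_ids, classify) {
  stopifnot(ncol(counts) == length(run_ids))
  labels <- rep(NA_real_, ncol(counts))
  for (run in unique(run_ids)) {
    idx <- which(run_ids == run)
    # drop = FALSE keeps a matrix even if a run contains a single cell
    labels[idx] <- classify(counts[, idx, drop = FALSE])
  }
  labels
}

# Stand-in classifier for demonstration only: flags cells whose total count
# exceeds twice the median total of their run.
toy_classify <- function(m) {
  totals <- colSums(m)
  as.numeric(totals > 2 * median(totals))
}

# Toy aggregated matrix: 2 genes x 6 cells, three cells from each of two runs
counts <- matrix(c(1, 1,  1, 1,  5, 5,  10, 10,  10, 10,  30, 30), nrow = 2)
run_ids <- c("A", "A", "A", "B", "B", "B")

per_run_labels <- classify_by_run(counts, run_ids, toy_classify)
pooled_labels <- toy_classify(counts)
```

In this toy data the third cell is flagged only when run A is handled on its own, because pooling both runs shifts the median. With the package loaded, `classify` could presumably be `function(m) BoostClassifier$new()$fit(m)$predict()`, following the basic usage shown above.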
These functions and methods (for the Reference Class) have been documented and can be accessed in the R help system. A vignette will be prepared using Rmarkdown in due course.
For the Python implementation, see the original repository: https://github.com/JonathanShor/DoubletDetection
See the Jupyter notebook in that repository for an example on 8k PBMCs from 10x.
Data can be downloaded from the 10x website.
Please cite the R implementation as an R package using citation("DoubletDetection").
Adam J. Gayoso, Jonathan D. Shor, Ambrose J. Carr, and S. Thomas Kelly (2018). DoubletDetection: a package to detect doublets (technical errors) in single-cell RNA-seq count matrices. R package version 2.3.0. https://github.com/TomKellyGenetics/DoubletDetection
Please acknowledge the original contributors when using the R implementation.
A bioRxiv submission is in progress. Please refer to the Python repository (https://github.com/JonathanShor/DoubletDetection) for more details.
This project is licensed under the terms of the MIT license (in accordance with the license of the original repository).