"did not converge" Error on cellbender3
Closed this issue · 8 comments
Dear scDblFinder developer,
This is the first time I am trying to use your tool. Unfortunately, I am getting an error and I'm not sure how to fix it:
Running on Linux (Ubuntu) with 250 GB RAM, 64 CPUs, and 3 TB of free disk space
# 1.0 Validate assay version of the Seurat object - Assay-v5
cell_bender_seurat[["RNA"]] # Assay (v5) data with 36601 features for 77863 cell
# 1.1 Convert v5 to v3.
cell_bender_seurat[["RNA3"]] <- as(object = cell_bender_seurat[["RNA"]], Class = "Assay")
cell_bender_seurat
cell_bender_seurat[["RNA3"]] # Assay data with 36601 features for 77863 cells
# 1.2 Convert to sce
sce = as.SingleCellExperiment(cell_bender_seurat, assay ="RNA3")
sce
class: SingleCellExperiment
dim: 36601 75331
metadata(0):
assays(2): counts logcounts
rownames(36601): MIR1302-2HG FAM138A ... AC007325.4 AC007325.2
rowData names(0):
colnames(75331): L25_ACGCAGCCAAACAACA-1 L25_CGACCTTTCGATCCCT-1 ...
S55_GTTAAGCGTCTAGGTT-1 S55_TGCTACCGTCGCGTGT-1
colData names(23): orig.ident nCount_RNA ... clonotype_id ident
reducedDimNames(5): PCA INTEGRATED.CCA INTEGRATED.RPCA UMAP.CCA
UMAP.SCVI
mainExpName: RNA3
altExpNames(0):
# 1.3 Find doublets (multiple samples x8)
sce.standard <- scDblFinder(sce, samples = "orig.ident", BPPARAM=MulticoreParam(20)) # fails, error message below
Error in manager$availability[[as.character(result$node)]] <- TRUE :
wrong args for environment subassignment
Error in serialize(data, node$con, xdr = FALSE) :
error writing to connection
Warning in (function (A, nv = 5, nu = nv, maxit = 1000, work = nv + 7, reorth = TRUE, :
did not converge--results might be invalid!; try increasing work or maxit
Stop worker failed with the error: wrong args for environment subassignment
I'd appreciate any suggestions.
Thank you
sessionInfo()
R version 4.3.2 (2023-10-31)
Platform: x86_64-conda-linux-gnu (64-bit)
Running under: Ubuntu 20.04.6 LTS
Matrix products: default
BLAS/LAPACK: /data/bin/conda_env_location/PDX_manuscript_2023_v2/lib/libopenblasp-r0.3.26.so; LAPACK version 3.12.0
locale:
[1] LC_CTYPE=C.UTF-8 LC_NUMERIC=C LC_TIME=C.UTF-8
[4] LC_COLLATE=C.UTF-8 LC_MONETARY=C.UTF-8 LC_MESSAGES=C.UTF-8
[7] LC_PAPER=C.UTF-8 LC_NAME=C LC_ADDRESS=C
[10] LC_TELEPHONE=C LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C
time zone: Etc/UTC
tzcode source: system (glibc)
attached base packages:
[1] stats4 stats graphics grDevices utils datasets methods
[8] base
other attached packages:
[1] BiocParallel_1.36.0 scDblFinder_1.16.0
[3] SingleCellExperiment_1.24.0 SummarizedExperiment_1.32.0
[5] Biobase_2.62.0 GenomicRanges_1.54.1
[7] GenomeInfoDb_1.38.1 IRanges_2.36.0
[9] S4Vectors_0.40.2 BiocGenerics_0.48.1
[11] MatrixGenerics_1.14.0 matrixStats_1.2.0
[13] Seurat_5.0.1 SeuratObject_5.0.0
[15] sp_2.1-3
loaded via a namespace (and not attached):
[1] RcppAnnoy_0.0.22 splines_4.3.2
[3] later_1.3.2 BiocIO_1.12.0
[5] bitops_1.0-7 tibble_3.2.1
[7] polyclip_1.10-6 XML_3.99-0.16.1
[9] fastDummies_1.7.3 lifecycle_1.0.4
[11] edgeR_4.0.2 globals_0.16.2
[13] lattice_0.22-5 MASS_7.3-60
[15] magrittr_2.0.3 limma_3.58.1
[17] plotly_4.10.4 yaml_2.3.8
[19] metapod_1.10.0 httpuv_1.6.14
[21] sctransform_0.4.1 spam_2.10-0
[23] spatstat.sparse_3.0-3 reticulate_1.35.0
[25] cowplot_1.1.3 pbapply_1.7-2
[27] RColorBrewer_1.1-3 abind_1.4-5
[29] zlibbioc_1.48.0 Rtsne_0.17
[31] purrr_1.0.2 RCurl_1.98-1.14
[33] GenomeInfoDbData_1.2.11 ggrepel_0.9.5
[35] irlba_2.3.5.1 listenv_0.9.1
[37] spatstat.utils_3.0-4 goftest_1.2-3
[39] RSpectra_0.16-1 dqrng_0.3.2
[41] spatstat.random_3.2-2 fitdistrplus_1.1-11
[43] parallelly_1.36.0 DelayedMatrixStats_1.24.0
[45] leiden_0.4.3.1 codetools_0.2-19
[47] DelayedArray_0.28.0 scuttle_1.12.0
[49] tidyselect_1.2.0 ScaledMatrix_1.10.0
[51] viridis_0.6.5 spatstat.explore_3.2-6
[53] GenomicAlignments_1.38.0 jsonlite_1.8.8
[55] BiocNeighbors_1.20.0 ellipsis_0.3.2
[57] progressr_0.14.0 ggridges_0.5.6
[59] survival_3.5-7 scater_1.30.1
[61] tools_4.3.2 ica_1.0-3
[63] Rcpp_1.0.12 glue_1.7.0
[65] gridExtra_2.3 SparseArray_1.2.2
[67] dplyr_1.1.4 fastmap_1.1.1
[69] bluster_1.12.0 fansi_1.0.6
[71] digest_0.6.34 rsvd_1.0.5
[73] R6_2.5.1 mime_0.12
[75] colorspace_2.1-0 scattermore_1.2
[77] tensor_1.5 spatstat.data_3.0-4
[79] utf8_1.2.4 tidyr_1.3.1
[81] generics_0.1.3 data.table_1.14.10
[83] rtracklayer_1.62.0 httr_1.4.7
[85] htmlwidgets_1.6.4 S4Arrays_1.2.0
[87] uwot_0.1.16 pkgconfig_2.0.3
[89] gtable_0.3.4 lmtest_0.9-40
[91] XVector_0.42.0 htmltools_0.5.7
[93] dotCall64_1.1-1 scales_1.3.0
[95] png_0.1-8 scran_1.30.0
[97] reshape2_1.4.4 rjson_0.2.21
[99] nlme_3.1-164 zoo_1.8-12
[101] stringr_1.5.1 KernSmooth_2.23-22
[103] parallel_4.3.2 miniUI_0.1.1.1
[105] vipor_0.4.7 restfulr_0.0.15
[107] pillar_1.9.0 grid_4.3.2
[109] vctrs_0.6.5 RANN_2.6.1
[111] promises_1.2.1 BiocSingular_1.18.0
[113] beachmat_2.18.0 xtable_1.8-4
[115] cluster_2.1.6 beeswarm_0.4.0
[117] locfit_1.5-9.8 cli_3.6.2
[119] compiler_4.3.2 Rsamtools_2.18.0
[121] rlang_1.1.3 crayon_1.5.2
[123] future.apply_1.11.1 plyr_1.8.9
[125] ggbeeswarm_0.7.2 stringi_1.8.3
[127] viridisLite_0.4.2 deldir_2.0-2
[129] munsell_0.5.0 Biostrings_2.70.1
[131] lazyeval_0.2.2 spatstat.geom_3.2-8
[133] Matrix_1.6-1.1 RcppHNSW_0.6.0
[135] patchwork_1.2.0 sparseMatrixStats_1.14.0
[137] future_1.33.1 ggplot2_3.4.4
[139] statmod_1.5.0 shiny_1.8.0
[141] ROCR_1.0-11 igraph_1.6.0
[143] xgboost_2.0.3.1
Hi,
I've never seen this error, but this could be a memory and/or multithreading issue.
I'd recommend checking the following:
- Monitor your RAM usage when running scDblFinder (e.g. using htop).
- The package per se is not very memory hungry (it's been run on much larger datasets), but the object itself can be. In particular, earlier versions of as.SingleCellExperiment had a bug that made the object huge (although this should be solved in the version you're using). So check the size (e.g. using format(object.size(x), units="Gb")) of both cell_bender_seurat and sce. If you see that sce is much bigger, you can always skip the conversion and run scDblFinder with something like:
sce <- scDblFinder(GetAssayData(cell_bender_seurat, slot="counts", assay="RNA3"),
                   samples=cell_bender_seurat$orig.ident)
- If from htop it does seem to be memory-related, try reducing the number of threads (or eventually using a single one).
Thank you for the prompt response.
- RAM usage looks normal (below 1%).
- Object sizes seem OK:
format(object.size(cell_bender_seurat), units="Gb") # "8 Gb"
format(object.size(sce), units="Gb") # "2.9 Gb"
A) Could it have something to do with how Seurat v5 uses layers (8 samples, so 8 layers for counts, for example), so that when I convert it to Assay v3 it becomes one 36601 x 75331 matrix?
B) Tried to run without threads:
sce.standard <- scDblFinder(sce, samples = "orig.ident")
Warning messages:
1: In rpois(nrow(x) * length(wAd), as.numeric(as.matrix(x[, wAd]))) :
NAs produced
2: In value[[3L]](cond) :
Error in calculating norm factors:Error in .local(x, ...): size factors should be positive
C) Tried this too
sce <- scDblFinder(GetAssayData(cell_bender_seurat, slot="counts", assay="RNA3"),
samples=cell_bender_seurat$orig.ident)
Error in .checkSCE(sce) :
`sce` should be a SingleCellExperiment, a SummarizedExperiment, or an array (i.e. matrix, sparse matric, etc.) of counts.
In addition: Warning message:
The `slot` argument of `GetAssayData()` is deprecated as of SeuratObject 5.0.0.
ℹ Please use the `layer` argument instead.
This warning is displayed once every 8 hours.
Call `lifecycle::last_lifecycle_warnings()` to see where this warning was generated.
Not sure I understand your question A, the original Seurat object also has dimensions 36601 x 75331...
- Can you check class(GetAssayData(cell_bender_seurat, layer="counts", assay="RNA3"))?
- Can you check quantile(colSums(counts(sce)))?
- Can you try this:
sce.standard <- scDblFinder(sce[VariableFeatures(cell_bender_seurat),], samples = "orig.ident")
class(GetAssayData(cell_bender_seurat, layer="counts", assay="RNA3"))
[1] "dgCMatrix"
attr(,"package")
[1] "Matrix"
quantile(colSums(counts(sce)))
0% 25% 50% 75% 100%
201 650 2209 5732 81977
- It has been running for 1 hr; I hope that is a good sign:
sce.standard <- scDblFinder(sce[VariableFeatures(cell_bender_seurat),], samples = "orig.ident")
I'm unsure what the issue is here, but it appears to be related to 1) the fact that you have cells with a very low library size (your 201 is crap, personally I'd have filtered out many) and 2) the feature selection internal to scDblFinder, which might have resulted in some cells not having reads in those features. This appears to have been solved by using the VariableFeatures (which is a perfectly decent way of doing things), and would most likely also be solved by filtering out cells with a low library size (e.g. taking >=400-500).
If you want, you can try again with multithreading, using either of these two solutions.
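The two points above can be illustrated on a toy matrix. This is only a minimal sketch, not scDblFinder's internal code: the matrix, the min_lib threshold, and the selected feature set are made-up stand-ins (on real data the inputs would be counts(sce) and something like VariableFeatures(cell_bender_seurat)). It filters cells by library size, then lists the remaining cells that have no counts in the selected features, which is the situation the explanation above suspects behind the "size factors should be positive" error.

```r
library(Matrix)

# Toy count matrix (4 genes x 5 cells), standing in for counts(sce).
counts <- Matrix(c(
  5, 0, 3, 0, 1,
  0, 0, 2, 0, 0,
  7, 1, 0, 0, 4,
  2, 0, 6, 1, 0
), nrow = 4, byrow = TRUE, sparse = TRUE,
dimnames = list(paste0("gene", 1:4), paste0("cell", 1:5)))

# Hypothetical threshold for this toy example; the thread suggests
# something like 400-500 for real data.
min_lib <- 5

# 1) Drop cells with a low library size (total counts per cell).
lib_size <- colSums(counts)
kept <- counts[, lib_size >= min_lib, drop = FALSE]

# 2) Among the remaining cells, find those with zero counts in the
#    selected features (hypothetical selection, for illustration only).
selected <- c("gene2")
in_selected <- colSums(kept[selected, , drop = FALSE])
empty_cells <- colnames(kept)[in_selected == 0]

print(colnames(kept))
print(empty_cells)
```

If empty_cells is non-empty on real data, either a broader feature set or a stricter library-size filter may be worth trying.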
How long does scDblFinder take to run on average?
- It's been ~5 hrs.
- I filtered the data (see below) and ran it again, which eventually crashed:
quantile(colSums(counts(sce)))
0% 25% 50% 75% 100%
451 1189 3332 6480 81977
sce.standard <- scDblFinder(sce, samples = "orig.ident", BPPARAM=MulticoreParam(8))
Warning in (function (A, nv = 5, nu = nv, maxit = 1000, work = nv + 7, reorth = TRUE, :
convergence criterion below machine epsilon
Warning in (function (A, nv = 5, nu = nv, maxit = 1000, work = nv + 7, reorth = TRUE, :
did not converge--results might be invalid!; try increasing work or maxit
Warning in (function (A, nv = 5, nu = nv, maxit = 1000, work = nv + 7, reorth = TRUE, :
convergence criterion below machine epsilon
Warning in (function (A, nv = 5, nu = nv, maxit = 1000, work = nv + 7, reorth = TRUE, :
did not converge--results might be invalid!; try increasing work or maxit
Stop worker failed with the error: wrong args for environment subassignment
I figured out why I was getting that error; the cause was a few steps back in my analysis.
I removed ambient RNA with CellBender v3, which generated negative values in the count matrix; that's why scDblFinder() was not able to process my data. The issue of CellBender generating negative counts is discussed here: https://github.com/broadinstitute/CellBender/issues/306. To fix it, run CellBender v2 and then re-run scDblFinder().
Everything works now, and quite quickly.
Cheers.
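For anyone hitting the same symptoms, a quick sanity check of the count matrix before running scDblFinder can catch this kind of problem early. The sketch below is an illustration only: check_counts is a hypothetical helper (not part of scDblFinder, Seurat, or CellBender) and the matrix is a made-up toy. On real data it would be called as check_counts(counts(sce)). It also shows why the earlier "NAs produced" warning from rpois() is consistent with negative counts: rpois() returns NaN when given a negative rate.

```r
library(Matrix)

# Hypothetical helper: count values that a count-based method would
# not expect (NA, negative, or non-integer entries).
check_counts <- function(m) {
  # For a dgCMatrix only the stored entries need inspecting; implicit
  # zeros are valid counts. Other inputs are converted to a plain vector.
  vals <- if (inherits(m, "dgCMatrix")) m@x else as.vector(as.matrix(m))
  c(n_na          = sum(is.na(vals)),
    n_negative    = sum(vals < 0, na.rm = TRUE),
    n_non_integer = sum(vals != round(vals), na.rm = TRUE))
}

# Toy sparse matrix with one negative and one fractional entry.
counts <- sparseMatrix(i = c(1, 2, 3, 1), j = c(1, 1, 2, 3),
                       x = c(4, -1, 2.5, 7), dims = c(3, 3))
res <- check_counts(counts)
print(res)

# A negative rate passed to rpois() yields NaN (with a warning).
neg_draw <- suppressWarnings(rpois(1, -1))
print(neg_draw)
```

Any non-zero entry in the result suggests that the matrix is not a plain count matrix and that the upstream step should be revisited before doublet detection.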
Hi,
Great that we have an explanation, thanks for coming back on this.
I've now added a check for this in the devel version, so that a more useful error message is provided.
Best,
Pierre-Luc