HelenaLC/muscat

Potential bug in the prepSim code

Closed this issue · 1 comments

Hi, I was going over the source code for prepSim to try to identify the issue I mentioned in the previous post. After estimating the cell and gene parameters, the code "# drop genes for which estimation failed":

`cs <- y$coefficients

i <- !rowAnyNAs(cs)

cs <- cs[i, , drop = FALSE]

x <- x[i, , drop = FALSE]`

However, when we try to add the dispersion to the x object, it did not remove the genes that are failed for estimation:

`ds <- y$dispersion

names(ds) <- rownames(x)

rowData(x)$disp <- ds`

I could not figure out why this would not cause any error when we run the example from the vignette. As after filtering, the gene number of x will probably no longer corresponding to the length of ds. When I tried to run this using the source code one step by one step, I would encounter the error saying XXX elements in value to replace YYY elements.

sessionInfo()
R version 4.1.2 (2021-11-01)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.3 LTS

Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/liblapack.so.3

locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats4 stats graphics grDevices utils datasets methods
[8] base

other attached packages:
[1] SingleCellExperiment_1.16.0 SummarizedExperiment_1.24.0
[3] Biobase_2.54.0 GenomicRanges_1.46.0
[5] GenomeInfoDb_1.30.0 IRanges_2.28.0
[7] S4Vectors_0.32.2 BiocGenerics_0.40.0
[9] MatrixGenerics_1.6.0 matrixStats_0.61.0
[11] edgeR_3.36.0 limma_3.50.0
[13] SCF_4.1.0

loaded via a namespace (and not attached):
[1] locfit_1.5-9.4 Rcpp_1.0.7 lattice_0.20-45
[4] bitops_1.0-7 grid_4.1.2 zlibbioc_1.40.0
[7] XVector_0.34.0 Matrix_1.3-4 tools_4.1.2
[10] RCurl_1.98-1.5 DelayedArray_0.20.0 compiler_4.1.2
[13] GenomeInfoDbData_1.2.7

Any help? Thanks!

Thanks so much! Yes, I could reproduce the error now. It doesn't happen with the example in the documentation as the coefficient / dispersion estimation doesn't fail for any genes in that data. It's fixed now (available through the master and RELEASE_3_14 branches on GH & should be available through Bioconductor with the next build, which usually takes a couple days).