Error in umi@x : no applicable method for `@` applied to an object of class "matrix" with v2 regularization
Closed this issue · 9 comments
I downloaded an .h5 dataset of the Fetal Embryonic Brain from the HCA for reference mapping of my own dataset. I subsampled (memory requirements were too high for full data) 1000 cells out of the ~1.7e6 cells and converted the data into an .h5ad object in Python, which I then further converted into an .h5Seurat object using Convert()
in R. When specifying the parameters for SCTransform, I get an error when I set vst.flavor = "v2"
but when I use the default parameters the code runs without any issues.
This was the original code I ran.
data <- SCTransform(data, method = "glmGamPoi", vars.to.regress = "MitoFraction",
vst.flavor = "v2",
ncells = 1e4, variable.features.n = 5000)
I had the same error when running the same command as below.
data <- SCTransform(data, method = "glmGamPoi", vst.flavor = "v2")
The output and error from both commands is the following:
data <- SCTransform(data, method = "glmGamPoi", vst.flavor = "v2")
vst.flavor='v2' set, setting model to use fixed slope and exclude poisson genes.
Calculating cell attributes from input UMI matrix: log_umi
Total Step 1 genes: 11521
Total overdispersed genes: 7383
Excluding 4138 genes from Step 1 because they are not overdispersed.
Variance stabilizing transformation of count matrix of size 11521 by 1665
Model formula is y ~ log_umi
Get Negative Binomial regression parameters per gene
Using 2000 genes, 1665 cells
|======================================================================| 100%
Setting estimate of 8 genes to inf as theta_mm/theta_mle < 1e-3
# of step1 poisson genes (variance < mean): 0
# of low mean genes (mean < 0.001): 0
Total # of Step1 poisson genes (theta=Inf; variance < mean): 8
Total # of poisson genes (theta=Inf; variance < mean): 4138
Calling offset model for all 4138 poisson genes
Found 11 outliers - those will be ignored in fitting/regularization step
Ignoring theta inf genes
Replacing fit params for 4138 poisson genes by theta=Inf
Error in umi@x :
no applicable method for `@` applied to an object of class "matrix"
Running either the default parameters or the same command as the first just without v2 works.
data <- SCTransform(data, method = "glmGamPoi", vars.to.regress = "MitoFraction",
ncells = 1e4, variable.features.n = 5000)
Calculating cell attributes from input UMI matrix: log_umi
Variance stabilizing transformation of count matrix of size 11521 by 1665
Model formula is y ~ log_umi
Get Negative Binomial regression parameters per gene
Using 2000 genes, 1665 cells
|======================================================================| 100%
Found 102 outliers - those will be ignored in fitting/regularization step
Second step: Get residuals using fitted parameters for 11521 genes
|======================================================================| 100%
Computing corrected count matrix for 11521 genes
|======================================================================| 100%
Calculating gene attributes
Wall clock passed: Time difference of 25.07768 secs
Determine variable features
Place corrected count matrix in counts slot
Centering data matrix
|======================================================================| 100%
Set default assay to SCT
After my testing, it seems that the issue is the v2 regularization. I'm not sure why I'm encountering this error. I can't tell if it's the dataset that is the problem or if I'm missing something. Any help would be appreciated! Please let me know if you need any more information.
> sessionInfo()
R version 4.3.1 (2023-06-16)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 22.04.1 LTS
Matrix products: default
BLAS: /opt/R/4.3.1/lib/R/lib/libRblas.so
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/liblapack.so.3; LAPACK version 3.10.0
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
time zone: Etc/UTC
tzcode source: system (glibc)
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] lubridate_1.9.2 forcats_1.0.0 stringr_1.5.0
[4] dplyr_1.1.2 purrr_1.0.1 readr_2.1.4
[7] tidyr_1.3.0 tibble_3.2.1 ggplot2_3.4.2
[10] tidyverse_2.0.0 future_1.33.0 sctransform_0.3.5.9006
[13] glmGamPoi_1.12.2 SeuratDisk_0.0.0.9020 SeuratObject_4.1.3
[16] Seurat_4.3.0.1
loaded via a namespace (and not attached):
[1] RColorBrewer_1.1-3 jsonlite_1.8.7
[3] magrittr_2.0.3 spatstat.utils_3.0-3
[5] zlibbioc_1.46.0 vctrs_0.6.3
[7] ROCR_1.0-11 spatstat.explore_3.2-1
[9] RCurl_1.98-1.12 S4Arrays_1.0.5
[11] htmltools_0.5.5 parallelly_1.36.0
[13] KernSmooth_2.23-21 htmlwidgets_1.6.2
[15] ica_1.0-3 plyr_1.8.8
[17] plotly_4.10.2 zoo_1.8-12
[19] igraph_1.5.0.1 mime_0.12
[21] lifecycle_1.0.3 pkgconfig_2.0.3
[23] Matrix_1.6-0 R6_2.5.1
[25] fastmap_1.1.1 GenomeInfoDbData_1.2.10
[27] MatrixGenerics_1.12.3 fitdistrplus_1.1-11
[29] shiny_1.7.4.1 digest_0.6.33
[31] colorspace_2.1-0 patchwork_1.1.2
[33] S4Vectors_0.38.1 tensor_1.5
[35] irlba_2.3.5.1 GenomicRanges_1.52.0
[37] progressr_0.13.0 timechange_0.2.0
[39] fansi_1.0.4 spatstat.sparse_3.0-2
[41] httr_1.4.6 polyclip_1.10-4
[43] abind_1.4-5 compiler_4.3.1
[45] bit64_4.0.5 withr_2.5.0
[47] MASS_7.3-60 DelayedArray_0.26.7
[49] tools_4.3.1 lmtest_0.9-40
[51] httpuv_1.6.11 future.apply_1.11.0
[53] goftest_1.2-3 glue_1.6.2
[55] nlme_3.1-162 promises_1.2.0.1
[57] grid_4.3.1 Rtsne_0.16
[59] cluster_2.1.4 reshape2_1.4.4
[61] generics_0.1.3 hdf5r_1.3.8
[63] gtable_0.3.3 spatstat.data_3.0-1
[65] tzdb_0.4.0 hms_1.1.3
[67] data.table_1.14.8 sp_2.0-0
[69] utf8_1.2.3 XVector_0.40.0
[71] BiocGenerics_0.46.0 spatstat.geom_3.2-4
[73] RcppAnnoy_0.0.21 ggrepel_0.9.3
[75] RANN_2.6.1 pillar_1.9.0
[77] later_1.3.1 splines_4.3.1
[79] lattice_0.21-8 survival_3.5-5
[81] bit_4.0.5 deldir_1.0-9
[83] tidyselect_1.2.0 miniUI_0.1.1.1
[85] pbapply_1.7-2 gridExtra_2.3
[87] IRanges_2.34.1 SummarizedExperiment_1.30.2
[89] scattermore_1.2 stats4_4.3.1
[91] Biobase_2.60.0 matrixStats_1.0.0
[93] stringi_1.7.12 lazyeval_0.2.2
[95] codetools_0.2-19 cli_3.6.1
[97] uwot_0.1.16 xtable_1.8-4
[99] reticulate_1.30 munsell_0.5.0
[101] Rcpp_1.0.11 GenomeInfoDb_1.36.1
[103] globals_0.16.2 spatstat.random_3.1-5
[105] png_0.1-8 parallel_4.3.1
[107] ellipsis_0.3.2 bitops_1.0-7
[109] listenv_0.9.0 viridisLite_0.4.2
[111] scales_1.2.1 fortunes_1.5-4
[113] ggridges_0.5.4 leiden_0.4.3
[115] crayon_1.5.2 rlang_1.1.1
[117] cowplot_1.1.1
Can you share a public link to your object (subset)?
Can you share a public link to your object (subset)?
Sure thing! Sorry about that. I should have included a public link to the object earlier. The folder should have both the .h5ad and .h5Seurat formats. Lmk if you have any problems accessing the folder.
I think it's related to either dataset size, sparsity, or both. I changed my subsample size to 10x larger and while v2 regularization doesn't work, sctransform
is able to run without providing an error.
Ignoring theta inf genes
Replacing fit params for 6125 poisson genes by theta=Inf
vst.flavor = 'v2' failed. Running without it.
Thanks for sharing the h5Seurat object. I am unable to reproduce the error at my end though. Is it the object that was causing error with v2?
obj <- LoadH5Seurat("fetalembryonicdata_sampled_1690922510.h5Seurat")
obj22 <- SCTransform(obj, vst.flavor="v2")
vst.flavor='v2' set, setting model to use fixed slope and exclude poisson genes.
Calculating cell attributes from input UMI matrix: log_umi
Total Step 1 genes: 26736
Total overdispersed genes: 24679
Excluding 2057 genes from Step 1 because they are not overdispersed.
Variance stabilizing transformation of count matrix of size 30229 by 18833
Model formula is y ~ log_umi
Get Negative Binomial regression parameters per gene
Using 2000 genes, 5000 cells
|========================================================| 100%
Setting estimate of 189 genes to inf as theta_mm/theta_mle < 1e-3
# of step1 poisson genes (variance < mean): 0
# of low mean genes (mean < 0.001): 3637
Total # of Step1 poisson genes (theta=Inf; variance < mean): 227
Total # of poisson genes (theta=Inf; variance < mean): 5437
Calling offset model for all 5437 poisson genes
Found 209 outliers - those will be ignored in fitting/regularization step
Ignoring theta inf genes
Replacing fit params for 5437 poisson genes by theta=Inf
Setting min_variance based on median UMI: 0.04
Second step: Get residuals using fitted parameters for 30229 genes
|========================================================| 100%
Computing corrected count matrix for 30229 genes
|========================================================| 100%
Calculating gene attributes
Wall clock passed: Time difference of 2.240513 mins
Determine variable features
Place corrected count matrix in counts slot
Centering data matrix
|========================================================| 100%
Set default assay to SCT
Thanks for sharing the h5Seurat object. I am unable to reproduce the error at my end though. Is it the object that was causing error with v2?
obj <- LoadH5Seurat("fetalembryonicdata_sampled_1690922510.h5Seurat") obj22 <- SCTransform(obj, vst.flavor="v2") vst.flavor='v2' set, setting model to use fixed slope and exclude poisson genes. Calculating cell attributes from input UMI matrix: log_umi Total Step 1 genes: 26736 Total overdispersed genes: 24679 Excluding 2057 genes from Step 1 because they are not overdispersed. Variance stabilizing transformation of count matrix of size 30229 by 18833 Model formula is y ~ log_umi Get Negative Binomial regression parameters per gene Using 2000 genes, 5000 cells |========================================================| 100% Setting estimate of 189 genes to inf as theta_mm/theta_mle < 1e-3 # of step1 poisson genes (variance < mean): 0 # of low mean genes (mean < 0.001): 3637 Total # of Step1 poisson genes (theta=Inf; variance < mean): 227 Total # of poisson genes (theta=Inf; variance < mean): 5437 Calling offset model for all 5437 poisson genes Found 209 outliers - those will be ignored in fitting/regularization step Ignoring theta inf genes Replacing fit params for 5437 poisson genes by theta=Inf Setting min_variance based on median UMI: 0.04 Second step: Get residuals using fitted parameters for 30229 genes |========================================================| 100% Computing corrected count matrix for 30229 genes |========================================================| 100% Calculating gene attributes Wall clock passed: Time difference of 2.240513 mins Determine variable features Place corrected count matrix in counts slot Centering data matrix |========================================================| 100% Set default assay to SCT
Oh did you try using glmGamPoi? I wonder if that could be the source of the error rather than the dataset since you weren't able to reproduce the error. I have some other scripts running currently, but I will test this out soon. Thanks for testing!
V2 relies on glmgampoi for estimation internally so that is going through as well. Let me know what step you are facing issues.
Could I see your sessionInfo() output? I'm still getting an umi@x: no applicable method
... error. It's failing at Replacing fit params for X poisson genes by theta=Inf
Replacing fit params for 6125 poisson genes by theta=Inf
Error in umi@x :
no applicable method for `@` applied to an object of class "matrix"
Here's my session info:
sessionInfo()
R version 4.2.2 Patched (2022-11-10 r83330)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.6 LTS
Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods
[7] base
other attached packages:
[1] SeuratDisk_0.0.0.9020 diffusionMap_1.2.0
[3] pheatmap_1.0.12 destiny_3.12.0
[5] princurve_2.1.6 variancePartition_1.28.9
[7] BiocParallel_1.32.6 limma_3.54.2
[9] patchwork_1.1.2 Matrix_1.5-4.1
[11] nnet_7.3-18 future.apply_1.10.0
[13] future_1.31.0 lubridate_1.9.2
[15] forcats_1.0.0 stringr_1.5.0
[17] dplyr_1.1.2 purrr_1.0.1
[19] readr_2.1.4 tidyr_1.3.0
[21] tibble_3.2.1 ggplot2_3.4.2
[23] tidyverse_2.0.0 SeuratObject_4.1.3
[25] Seurat_4.3.0
loaded via a namespace (and not attached):
[1] rappdirs_0.3.3 rtracklayer_1.58.0
[3] scattermore_0.8 ggthemes_4.2.4
[5] knitr_1.42 bit64_4.0.5
[7] irlba_2.3.5.1 DelayedArray_0.24.0
[9] rpart_4.1.19 data.table_1.14.8
[11] KEGGREST_1.38.0 RCurl_1.98-1.12
[13] doParallel_1.0.17 generics_0.1.3
[15] GenomicFeatures_1.50.4 BiocGenerics_0.44.0
[17] callr_3.7.3 RhpcBLASctl_0.23-42
[19] cowplot_1.1.1 RSQLite_2.3.1
[21] usethis_2.1.6 RANN_2.6.1
[23] proxy_0.4-27 bit_4.0.5
[25] tzdb_0.4.0 xml2_1.3.4
[27] spatstat.data_3.0-0 httpuv_1.6.9
[29] SummarizedExperiment_1.28.0 xfun_0.39
[31] hms_1.1.3 evaluate_0.20
[33] promises_1.2.0.1 restfulr_0.0.15
[35] DEoptimR_1.0-14 fansi_1.0.4
[37] progress_1.2.2 dbplyr_2.3.2
[39] caTools_1.18.2 DBI_1.1.3
[41] igraph_1.4.1 htmlwidgets_1.6.1
[43] spatstat.geom_3.0-6 stats4_4.2.2
[45] ellipsis_0.3.2 RSpectra_0.16-1
[47] backports_1.4.1 aod_1.3.2
[49] sparseMatrixStats_1.10.0 biomaRt_2.54.1
[51] deldir_1.0-6 MatrixGenerics_1.10.0
[53] vctrs_0.6.3 SingleCellExperiment_1.20.1
[55] Biobase_2.58.0 remotes_2.4.2
[57] TTR_0.24.3 ROCR_1.0-11
[59] abind_1.4-5 cachem_1.0.7
[61] RcppEigen_0.3.3.9.3 withr_2.5.0
[63] BSgenome_1.66.3 robustbase_0.99-0
[65] progressr_0.13.0 checkmate_2.2.0
[67] vcd_1.4-11 sctransform_0.3.5
[69] GenomicAlignments_1.34.1 xts_0.13.1
[71] prettyunits_1.1.1 goftest_1.2-3
[73] cluster_2.1.4 lazyeval_0.2.2
[75] laeken_0.5.2 crayon_1.5.2
[77] hdf5r_1.3.8 spatstat.explore_3.0-6
[79] pkgconfig_2.0.3 GenomeInfoDb_1.34.9
[81] nlme_3.1-162 pkgload_1.3.2
[83] devtools_2.4.5 rlang_1.1.1
[85] globals_0.16.2 lifecycle_1.0.3
[87] miniUI_0.1.1.1 filelock_1.0.2
[89] wiggleplotr_1.22.0 BiocFileCache_2.6.1
[91] polyclip_1.10-4 RcppHNSW_0.4.1
[93] matrixStats_0.63.0 lmtest_0.9-40
[95] carData_3.0-5 boot_1.3-28.1
[97] zoo_1.8-11 base64enc_0.1-3
[99] ggridges_0.5.4 processx_3.8.1
[101] rjson_0.2.21 png_0.1-8
[103] viridisLite_0.4.1 bitops_1.0-7
[105] KernSmooth_2.23-22 Biostrings_2.66.0
[107] blob_1.2.4 parallelly_1.34.0
[109] spatstat.random_3.1-3 remaCor_0.0.16
[111] S4Vectors_0.36.2 scales_1.2.1
[113] memoise_2.0.1 magrittr_2.0.3
[115] plyr_1.8.8 hexbin_1.28.3
[117] ica_1.0-3 gplots_3.1.3
[119] zlibbioc_1.44.0 compiler_4.2.2
[121] BiocIO_1.8.0 pertinent_0.0.0.9006
[123] RColorBrewer_1.1-3 pcaMethods_1.90.0
[125] lme4_1.1-33 fitdistrplus_1.1-8
[127] Rsamtools_2.14.0 cli_3.6.1
[129] XVector_0.38.0 urlchecker_1.0.1
[131] listenv_0.9.0 pbapply_1.7-0
[133] ps_1.7.5 htmlTable_2.4.1
[135] Formula_1.2-5 ggplot.multistats_1.0.0
[137] MASS_7.3-60 tidyselect_1.2.0
[139] stringi_1.7.12 glmGamPoi_1.11.7
[141] yaml_2.3.7 ggrepel_0.9.3
[143] grid_4.2.2 tools_4.2.2
[145] timechange_0.2.0 parallel_4.2.2
[147] rstudioapi_0.14 foreign_0.8-84
[149] foreach_1.5.2 gridExtra_2.3
[151] EnvStats_2.7.0 smoother_1.1
[153] scatterplot3d_0.3-44 Rtsne_0.16
[155] digest_0.6.31 shiny_1.7.4
[157] Rcpp_1.0.10 GenomicRanges_1.50.2
[159] car_3.1-2 broom_1.0.5
[161] later_1.3.0 RcppAnnoy_0.0.20
[163] AnnotationDbi_1.60.2 httr_1.4.5
[165] Rdpack_2.4 colorspace_2.1-0
[167] XML_3.99-0.14 fs_1.6.2
[169] tensor_1.5 ranger_0.15.1
[171] reticulate_1.28 IRanges_2.32.0
[173] splines_4.2.2 uwot_0.1.14
[175] spatstat.utils_3.0-1 sp_1.6-0
[177] plotly_4.10.1 sessioninfo_1.2.2
[179] xtable_1.8-4 jsonlite_1.8.4
[181] nloptr_2.0.3 R6_2.5.1
[183] Hmisc_5.1-0 profvis_0.3.7
[185] pillar_1.9.0 htmltools_0.5.4
[187] mime_0.12 glue_1.6.2
[189] fastmap_1.1.1 minqa_1.2.5
[191] VIM_6.2.2 class_7.3-21
[193] codetools_0.2-19 pkgbuild_1.4.0
[195] mvtnorm_1.2-2 utf8_1.2.3
[197] lattice_0.21-8 spatstat.sparse_3.0-0
[199] pbkrtest_0.5.2 curl_5.0.0
[201] leiden_0.4.3 gtools_3.9.4
[203] survival_3.5-5 rmarkdown_2.22
[205] munsell_0.5.0 e1071_1.7-13
[207] GenomeInfoDbData_1.2.9 iterators_1.0.14
[209] reshape2_1.4.4 gtable_0.3.3
[211] rbibutils_2.2.13
Sorry for the late response, but I compared with my sessionInfo earlier in the thread. It looks like we have a few differences but I'm not sure which one cause the problem. For now, I've worked around the issue. Thank you for the taking the time to respond!