seasoncloud/Clonalscope

Issue with RunCovCluster function

Dear Author,

I hope this message finds you well. I am writing to bring an issue to your attention regarding the RunCovCluster function in the Clonalscope package. I encountered an error related to sparse matrices and missing values while using this function. I am currently running spatial Visium data produced by spaceranger.

Input files:

[celltype_pdach13.csv](https://github.com/seasoncloud/Clonalscope/files/11854260/celltype_pdach13.csv)
[features.tsv.gz](https://github.com/seasoncloud/Clonalscope/files/11854261/features.tsv.gz)
[barcodes.tsv.gz](https://github.com/seasoncloud/Clonalscope/files/11854262/barcodes.tsv.gz)
[matrix.mtx.gz](https://github.com/seasoncloud/Clonalscope/files/11854263/matrix.mtx.gz)

Error message:

> set.seed(2022)
> Cov_obj <- RunCovCluster(
+   mtx = Input_filtered$mtx,
+   barcodes = Input_filtered$barcodes,
+   features = Input_filtered$features,
+   bed = bed,
+   celltype0 = celltype0,
+   var_pt = 0.99,
+   var_pt_ctrl = 0.99,
+   include = 'all',
+   alpha_source = 'all',
+   ctrl_region = NULL,
+   seg_table_filtered = seg_table_filtered,
+   size = size,
+   dir_path = dir_path,
+   breaks = 50,
+   prep_mode = 'intersect',
+   seed = 200
+ )
Error in intI(j, n = x@Dim[2], dn[[2]], give.dn = FALSE) : 
  'NA' indices are not (yet?) supported for sparse Matrices

I have examined the code of the RunCovCluster function but was unable to determine the root cause of the error. The function appears to struggle with sparse matrices or missing values, but the code did not give me enough insight into its internal workings to diagnose the issue.

To troubleshoot the problem, I have taken the following steps:

  • Verified that the input arguments, particularly the mtx matrix, conform to the function's requirements and are in the correct format, free from any missing values.
  • Reviewed the code within the RunCovCluster function for any potential limitations or operations that may not support sparse matrices or missing values.
  • Consulted the package documentation for guidance on handling sparse matrices and missing values, but no specific information regarding this issue was found.
  • Ensured that I am using the latest version of the package to benefit from any bug fixes or enhancements.

I kindly request your assistance in resolving this issue. If possible, could you provide guidance on how to handle sparse matrices and missing values within the RunCovCluster function? Alternatively, if an updated package version or a workaround for this issue exists, I would greatly appreciate any information you can offer.

Thank you for your time and attention to this matter. I look forward to your response and greatly value your efforts in maintaining and improving the Clonalscope package.

Best regards,
ateeq khaliq

sessionInfo()
R version 4.2.0 (2022-04-22)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Big Sur/Monterey 10.16

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats4    stats     graphics  grDevices utils     datasets  methods  
[8] base     

other attached packages:
 [1] Clonalscope_1.0.0    rtracklayer_1.58.0   GenomicRanges_1.50.2
 [4] GenomeInfoDb_1.34.9  IRanges_2.32.0       S4Vectors_0.36.2    
 [7] Matrix_1.5-3         Biobase_2.58.0       BiocGenerics_0.44.0 
[10] knitr_1.43           spacexr_2.2.0        Seurat_4.3.0        
[13] SeuratObject_4.1.3   sp_1.6-1            

loaded via a namespace (and not attached):
  [1] NMF_0.26                    plyr_1.8.8                 
  [3] igraph_1.4.3                lazyeval_0.2.2             
  [5] splines_4.2.0               BiocParallel_1.32.6        
  [7] listenv_0.9.0               scattermore_1.1            
  [9] ggplot2_3.4.2               gridBase_0.4-7             
 [11] digest_0.6.31               foreach_1.5.2              
 [13] htmltools_0.5.5             fansi_1.0.4                
 [15] magrittr_2.0.3              tensor_1.5                 
 [17] cluster_2.1.3               doParallel_1.0.17          
 [19] ROCR_1.0-11                 Biostrings_2.66.0          
 [21] globals_0.16.2              matrixStats_1.0.0          
 [23] spatstat.sparse_3.0-1       colorspace_2.1-0           
 [25] ggrepel_0.9.3               xfun_0.39                  
 [27] dplyr_1.1.2                 crayon_1.5.2               
 [29] RCurl_1.98-1.10             jsonlite_1.8.4             
 [31] scatterpie_0.1.9            progressr_0.13.0           
 [33] spatstat.data_3.0-1         survival_3.3-1             
 [35] zoo_1.8-12                  iterators_1.0.14           
 [37] glue_1.6.2                  polyclip_1.10-4            
 [39] registry_0.5-1              gtable_0.3.3               
 [41] nnls_1.4                    zlibbioc_1.44.0            
 [43] XVector_0.38.0              leiden_0.4.3               
 [45] DelayedArray_0.24.0         future.apply_1.11.0        
 [47] SingleCellExperiment_1.20.0 abind_1.4-5                
 [49] scales_1.2.1                DBI_1.1.3                  
 [51] rngtools_1.5.2              spatstat.random_3.1-5      
 [53] miniUI_0.1.1.1              Rcpp_1.0.10                
 [55] viridisLite_0.4.2           xtable_1.8-4               
 [57] reticulate_1.28             htmlwidgets_1.6.2          
 [59] httr_1.4.6                  SPOTlight_1.2.0            
 [61] RColorBrewer_1.1-3          ellipsis_0.3.2             
 [63] ica_1.0-3                   XML_3.99-0.14              
 [65] pkgconfig_2.0.3             farver_2.1.1               
 [67] uwot_0.1.14                 deldir_1.0-9               
 [69] utf8_1.2.3                  tidyselect_1.2.0           
 [71] rlang_1.1.1                 reshape2_1.4.4             
 [73] later_1.3.1                 munsell_0.5.0              
 [75] tools_4.2.0                 cli_3.6.1                  
 [77] generics_0.1.3              ggridges_0.5.4             
 [79] stringr_1.5.0               fastmap_1.1.1              
 [81] yaml_2.3.7                  goftest_1.2-3              
 [83] fitdistrplus_1.1-11         purrr_1.0.1                
 [85] RANN_2.6.1                  pbapply_1.7-0              
 [87] future_1.32.0               nlme_3.1-157               
 [89] mime_0.12                   compiler_4.2.0             
 [91] plotly_4.10.2               png_0.1-8                  
 [93] spatstat.utils_3.0-3        tibble_3.2.1               
 [95] tweenr_2.0.2                stringi_1.7.12             
 [97] lattice_0.20-45             vctrs_0.6.2                
 [99] pillar_1.9.0                lifecycle_1.0.3            
[101] BiocManager_1.30.20         spatstat.geom_3.2-1        
[103] lmtest_0.9-40               RcppAnnoy_0.0.20           
[105] data.table_1.14.8           cowplot_1.1.1              
[107] bitops_1.0-7                irlba_2.3.5.1              
[109] httpuv_1.6.11               patchwork_1.1.2            
[111] BiocIO_1.8.0                R6_2.5.1                   
[113] promises_1.2.0.1            KernSmooth_2.23-20         
[115] gridExtra_2.3               parallelly_1.36.0          
[117] codetools_0.2-18            MASS_7.3-56                
[119] SummarizedExperiment_1.28.0 rjson_0.2.21               
[121] withr_2.5.0                 GenomicAlignments_1.34.1   
[123] Rsamtools_2.14.0            sctransform_0.3.5          
[125] GenomeInfoDbData_1.2.9      mgcv_1.8-40                
[127] parallel_4.2.0              quadprog_1.5-8             
[129] grid_4.2.0                  ggfun_0.0.9                
[131] tidyr_1.3.0                 MatrixGenerics_1.10.0      
[133] Rtsne_0.16                  spatstat.explore_3.2-1     
[135] ggforce_0.4.1               STdeconvolve_1.3.1         
[137] shiny_1.7.4                 restfulr_0.0.15 

Hello @AteeqMKhaliq, thank you for your interest in our tool Clonalscope.

The error looks related to the sparse matrix format or to the matching of matrix entries. We load the standard 10x count matrix with the readMM() function, which returns a dgTMatrix. I also tried the dgCMatrix and dense-matrix forms of our example P5931, and all of them seem to work well. Some straightforward solutions worth trying are:

  • Force the matrix to be a dense matrix, for example by:

setting mtx = as.matrix(Input_filtered$mtx) in RunCovCluster(), or
setting dense_mat = T in RunCovCluster().

  • Check whether the features and barcodes match the row and column names of the Input_filtered$mtx matrix.
  • Check if the celltype0 annotation contains all the barcodes in Input_filtered$mtx (order does not matter here).
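
The checks above can be sketched in a few lines of R. This is only a minimal illustration on toy data, not Clonalscope's own code: the helper check_inputs() and all toy values are hypothetical, and it assumes the barcodes, features, and cell-type barcodes are available as plain character vectors (how to extract them from Input_filtered and celltype0 depends on how the files were read). One plausible route to an NA index, again an assumption, is that match() returns NA for a barcode that has no annotation.

```r
library(Matrix)

# Hypothetical helper (not part of Clonalscope): reports mismatches between
# the count matrix and its barcode / feature / cell-type annotations.
check_inputs <- function(mtx, barcodes, features, celltype_barcodes) {
  list(
    n_barcodes_ok      = length(barcodes) == ncol(mtx),  # one barcode per column
    n_features_ok      = length(features) == nrow(mtx),  # one feature per row
    na_or_dup_barcodes = barcodes[is.na(barcodes) | duplicated(barcodes)],
    missing_celltype   = setdiff(barcodes, celltype_barcodes)
  )
}

# Toy stand-ins for the real inputs (all values are made up).
mtx <- sparseMatrix(i = c(1, 2, 3), j = c(1, 2, 3), x = c(5, 2, 7),
                    dims = c(3, 3))
barcodes <- c("AAAC-1", "AAAG-1", "AAAT-1")
features <- c("geneA", "geneB", "geneC")
celltype_barcodes <- c("AAAT-1", "AAAC-1")  # "AAAG-1" has no annotation

res <- check_inputs(mtx, barcodes, features, celltype_barcodes)
print(res$missing_celltype)

# match() yields NA for the unannotated barcode; an NA like this, if later
# used as an index, is the kind of value the error message complains about.
idx <- match(barcodes, celltype_barcodes)
print(anyNA(idx))

# Dense conversion, as suggested in the first bullet.
dense <- as.matrix(mtx)
print(is.matrix(dense))
```

If missing_celltype is non-empty for the real data, either the annotation file lacks those barcodes or the barcode strings differ in format between the two sources (for example a "-1" suffix present in one and absent in the other).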

Lastly, we just updated the default parameters of the HMM segmentation in the Segmentation_Bulk step; would you mind reinstalling the newest GitHub version?
If the debugging steps above still do not work, I am happy to help by testing a subsample of your de-identified data. Please let me know if it does not work.