hi_loadings() out-of-bounds error
fruce-ki opened this issue · 4 comments
I get the following when trying to extract the top N genes from a primary components object prepared with pcomp()
:
Error in exprTable[names(geneloadings_extreme), ] :
subscript out of bounds
The error happens only for a subset of the primary components and can be resolved by lowering the number N passed to the topN
parameter (I'm asking for just 25 genes out of 500 used in the PCA, but is some case that's already too many apparently).
I suspect that hi_loadings()
does some internal thresholding, and that in some cases where a component's contribution is low, the genes that pass the criteria are fewer than topN
, resulting in the error. If that is the case, a more useful behaviour would be to return the genes that do pass the criteria, with a warning that fewer than N genes were returned.
> sessionInfo()
R version 3.5.1 (2018-07-02)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS 10.14.6
Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] parallel stats4 stats graphics grDevices utils datasets methods base
other attached packages:
[1] htmltools_0.3.6 ggdendro_0.1-20 plotly_4.9.0 ggrepel_0.8.1 gridExtra_2.3
[6] forcats_0.4.0 stringr_1.4.0 dplyr_0.8.3 purrr_0.3.2 readr_1.3.1
[11] tidyr_0.8.3 tibble_2.1.3 ggplot2_3.2.0 tidyverse_1.2.1 pcaExplorer_2.8.1
[16] topGO_2.34.0 SparseM_1.77 GO.db_3.7.0 AnnotationDbi_1.44.0 graph_1.60.0
[21] DESeq2_1.22.2 SummarizedExperiment_1.12.0 DelayedArray_0.8.0 BiocParallel_1.16.6 Biobase_2.42.0
[26] GenomicRanges_1.34.0 GenomeInfoDb_1.18.2 IRanges_2.16.0 S4Vectors_0.20.1 BiocGenerics_0.28.0
[31] matrixStats_0.54.0 data.table_1.12.2
loaded via a namespace (and not attached):
[1] readxl_1.3.1 backports_1.1.4 GOstats_2.48.0 Hmisc_4.2-0 NMF_0.21.0 plyr_1.8.4
[7] igraph_1.2.4.1 lazyeval_0.2.2 GSEABase_1.44.0 shinydashboard_0.7.1 splines_3.5.1 crosstalk_1.0.0
[13] gridBase_0.4-7 digest_0.6.20 foreach_1.4.7 magrittr_1.5 checkmate_1.9.4 memoise_1.1.0
[19] cluster_2.0.7-1 doParallel_1.0.14 limma_3.38.3 annotate_1.60.1 modelr_0.1.4 prettyunits_1.0.2
[25] colorspace_1.4-1 apeglm_1.4.2 rvest_0.3.4 blob_1.2.0 haven_2.1.1 xfun_0.8
[31] jsonlite_1.6 crayon_1.3.4 RCurl_1.95-4.12 genefilter_1.64.0 zeallot_0.1.0 survival_2.42-3
[37] iterators_1.0.12 glue_1.3.1 registry_0.5-1 gtable_0.3.0 zlibbioc_1.28.0 XVector_0.22.0
[43] Rgraphviz_2.26.0 scales_1.0.0 pheatmap_1.0.12 DBI_1.0.0 rngtools_1.4 bibtex_0.4.2
[49] Rcpp_1.0.2 emdbook_1.3.11 viridisLite_0.3.0 xtable_1.8-4 progress_1.2.2 htmlTable_1.13.1
[55] foreign_0.8-70 bit_1.1-14 Formula_1.2-3 DT_0.7 AnnotationForge_1.24.0 htmlwidgets_1.3
[61] httr_1.4.0 threejs_0.3.1 RColorBrewer_1.1-2 shinyAce_0.4.0 acepack_1.4.1 pkgconfig_2.0.2
[67] XML_3.98-1.20 nnet_7.3-12 locfit_1.5-9.1 labeling_0.3 tidyselect_0.2.5 rlang_0.4.0
[73] reshape2_1.4.3 later_0.8.0 cellranger_1.1.0 munsell_0.5.0 tools_3.5.1 cli_1.1.0
[79] generics_0.0.2 RSQLite_2.1.2 broom_0.5.2 evaluate_0.14 shinyBS_0.61 yaml_2.2.0
[85] knitr_1.23 bit64_0.9-7 RBGL_1.58.2 nlme_3.1-137 mime_0.7 xml2_1.2.0
[91] biomaRt_2.38.0 compiler_3.5.1 rstudioapi_0.10 png_0.1-7 geneplotter_1.60.0 stringi_1.4.3
[97] lattice_0.20-35 Matrix_1.2-14 vctrs_0.2.0 pillar_1.4.2 d3heatmap_0.6.1.2 bitops_1.0-6
[103] httpuv_1.5.1 R6_2.4.0 latticeExtra_0.6-28 promises_1.0.1 codetools_0.2-15 MASS_7.3-50
[109] assertthat_0.2.1 pkgmaker_0.27 Category_2.48.1 withr_2.1.2 GenomeInfoDbData_1.2.0 hms_0.5.0
[115] grid_3.5.1 rpart_4.1-13 coda_0.19-3 rmarkdown_1.14 bbmle_1.0.20 numDeriv_2016.8-1.1
[121] lubridate_1.7.4 shiny_1.3.2 base64enc_0.1-3
I tried to replicate this with the airway
demo dataset but could not encounter the error.
As you said, it can be something dataset-specific.
Since I do not do any thresholding (head
and tail
would handle it gently), I am thinking it can be because of non-matching names. Are you uploading a gene annotation table as well?
Ok, your explanation makes sense.
It sounds like a case when an ID could not get mapped to GeneSymbol (say), and the name assigned to it is then a NA
that R does not like when subsetting
Closing this, in case you feel it needs to be reopened, let me know 😉