Working example error
Hi, I'm working through the cBioPortalData vignette and am having trouble getting this query to work:
cbio <- cBioPortal()
acc <- cBioPortalData(
api = cbio,
by = "hugoGeneSymbol",
studyId = "acc_tcga",
genePanelId = "IMPACT341",
molecularProfileIds = c("acc_tcga_rppa", "acc_tcga_linear_CNA")
)
Here's the backtrace of the stuck R process:
Backtrace:
█
1. └─cBioPortalData::cBioPortalData(...)
2. ├─base::do.call(.portalExperiments, exargs)
3. └─(function (api, by, genePanelId, studyId, molecularProfileIds, ...
4. └─base::lapply(...)
5. └─cBioPortalData:::FUN(X[[i]], ...)
6. └─cBioPortalData::getDataByGenePanel(...)
7. └─cBioPortalData::molecularData(...)
8. └─cBioPortalData:::.invoke_bind(...)
9. └─cBioPortalData:::.bind_content(...)
10. └─dplyr::bind_rows(httr::content(x))
11. └─dplyr:::map(dots, function(.x) if (is.data.frame(.x)) .x else tibble(!!!.x))
12. └─base::lapply(.x, .f, ...)
13. └─dplyr:::FUN(X[[i]], ...)
14. └─tibble::tibble(!!!.x)
15. └─tibble:::tibble_quos(xs[!is.null], .rows, .name_repair)
16. └─tibble:::splice_dfs(output)
17. └─vctrs::vec_c(!!!x, .name_spec = "{inner}")
Here's the session info:
sessioninfo::session_info()
─ Session info ───────────────────────────────────────────────────────────────────────────────────────────────────
setting value
version R version 4.0.2 (2020-06-22)
os macOS Catalina 10.15.6
system x86_64, darwin17.0
ui RStudio
language (EN)
collate en_US.UTF-8
ctype en_US.UTF-8
tz America/New_York
date 2020-07-27
─ Packages ───────────────────────────────────────────────────────────────────────────────────────────────────────
package * version date lib source
AnnotationDbi 1.50.3 2020-07-25 [1] Bioconductor
AnVIL * 1.0.3 2020-05-04 [1] Bioconductor
askpass 1.1 2019-01-13 [1] CRAN (R 4.0.0)
assertthat 0.2.1 2019-03-21 [1] CRAN (R 4.0.0)
bb8 * 0.2.17 2020-07-25 [1] local
Biobase * 2.48.0 2020-04-27 [1] Bioconductor
BiocFileCache 1.12.0 2020-04-27 [1] Bioconductor
BiocGenerics * 0.34.0 2020-04-27 [1] Bioconductor
BiocParallel 1.22.0 2020-04-27 [1] Bioconductor
biomaRt 2.44.1 2020-06-17 [1] Bioconductor
Biostrings 2.56.0 2020-04-27 [1] Bioconductor
bit 1.1-15.2 2020-02-10 [1] CRAN (R 4.0.0)
bit64 0.9-7.1 2020-07-15 [1] CRAN (R 4.0.2)
bitops 1.0-6 2013-08-17 [1] CRAN (R 4.0.0)
blob 1.2.1 2020-01-20 [1] CRAN (R 4.0.0)
cBioPortalData * 2.0.7 2020-07-03 [1] Bioconductor
cli 2.0.2 2020-02-28 [1] CRAN (R 4.0.0)
crayon 1.3.4 2017-09-16 [1] CRAN (R 4.0.0)
curl 4.3 2019-12-02 [1] CRAN (R 4.0.0)
data.table 1.13.0 2020-07-24 [1] CRAN (R 4.0.2)
DBI 1.1.0 2019-12-15 [1] CRAN (R 4.0.0)
dbplyr 1.4.4 2020-05-27 [1] CRAN (R 4.0.0)
DelayedArray * 0.14.1 2020-07-14 [1] Bioconductor
digest 0.6.25 2020-02-23 [1] CRAN (R 4.0.0)
dplyr * 1.0.0 2020-05-29 [1] CRAN (R 4.0.0)
ellipsis 0.3.1 2020-05-15 [1] CRAN (R 4.0.0)
fansi 0.4.1 2020-01-08 [1] CRAN (R 4.0.0)
formatR 1.7 2019-06-11 [1] CRAN (R 4.0.0)
futile.logger 1.4.3 2016-07-10 [1] CRAN (R 4.0.0)
futile.options 1.0.1 2018-04-20 [1] CRAN (R 4.0.0)
generics 0.0.2 2018-11-29 [1] CRAN (R 4.0.0)
GenomeInfoDb * 1.24.2 2020-06-15 [1] Bioconductor
GenomeInfoDbData 1.2.3 2020-06-29 [1] Bioconductor
GenomicAlignments 1.24.0 2020-04-27 [1] Bioconductor
GenomicDataCommons 1.12.0 2020-04-27 [1] Bioconductor
GenomicFeatures 1.40.1 2020-07-08 [1] Bioconductor
GenomicRanges * 1.40.0 2020-04-27 [1] Bioconductor
glue 1.4.1 2020-05-13 [1] CRAN (R 4.0.0)
hms 0.5.3 2020-01-08 [1] CRAN (R 4.0.0)
httr 1.4.2 2020-07-20 [1] CRAN (R 4.0.2)
IRanges * 2.22.2 2020-05-21 [1] Bioconductor
jsonlite 1.7.0 2020-06-25 [1] CRAN (R 4.0.0)
lambda.r 1.2.4 2019-09-18 [1] CRAN (R 4.0.0)
lattice 0.20-41 2020-04-02 [2] CRAN (R 4.0.2)
lifecycle 0.2.0 2020-03-06 [1] CRAN (R 4.0.0)
limma 3.44.3 2020-06-12 [1] Bioconductor
magrittr * 1.5 2014-11-22 [1] CRAN (R 4.0.0)
Matrix 1.2-18 2019-11-27 [2] CRAN (R 4.0.2)
matrixStats * 0.56.0 2020-03-13 [1] CRAN (R 4.0.0)
memoise 1.1.0 2017-04-21 [1] CRAN (R 4.0.0)
MultiAssayExperiment * 1.14.0 2020-04-27 [1] Bioconductor
openssl 1.4.2 2020-06-27 [1] CRAN (R 4.0.2)
packrat 0.5.0 2018-11-14 [1] CRAN (R 4.0.0)
pillar 1.4.6 2020-07-10 [1] CRAN (R 4.0.2)
pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 4.0.0)
prettyunits 1.1.1 2020-01-24 [1] CRAN (R 4.0.0)
progress 1.2.2 2019-05-16 [1] CRAN (R 4.0.0)
purrr 0.3.4 2020-04-17 [1] CRAN (R 4.0.0)
R6 2.4.1 2019-11-12 [1] CRAN (R 4.0.0)
RaggedExperiment 1.12.0 2020-04-27 [1] Bioconductor
rapiclient 0.1.3 2020-01-17 [1] CRAN (R 4.0.2)
rappdirs 0.3.1 2016-03-28 [1] CRAN (R 4.0.0)
RCircos 1.2.1 2019-03-12 [1] CRAN (R 4.0.2)
Rcpp 1.0.5 2020-07-06 [1] CRAN (R 4.0.2)
RCurl 1.98-1.2 2020-04-18 [1] CRAN (R 4.0.0)
readr 1.3.1 2018-12-21 [1] CRAN (R 4.0.0)
RJSONIO 1.3-1.4 2020-01-15 [1] CRAN (R 4.0.2)
rlang 0.4.7 2020-07-09 [1] CRAN (R 4.0.2)
Rsamtools 2.4.0 2020-04-27 [1] Bioconductor
RSQLite 2.2.0 2020-01-07 [1] CRAN (R 4.0.0)
rstudioapi 0.11 2020-02-07 [1] CRAN (R 4.0.0)
RTCGAToolbox 2.18.0 2020-04-27 [1] Bioconductor
rtracklayer 1.48.0 2020-04-27 [1] Bioconductor
rvest 0.3.6 2020-07-25 [1] CRAN (R 4.0.2)
S4Vectors * 0.26.1 2020-05-16 [1] Bioconductor
sessioninfo 1.1.1 2018-11-05 [1] CRAN (R 4.0.0)
stringi 1.4.6 2020-02-17 [1] CRAN (R 4.0.0)
stringr 1.4.0 2019-02-10 [1] CRAN (R 4.0.0)
SummarizedExperiment * 1.18.2 2020-07-09 [1] Bioconductor
survival 3.2-3 2020-06-13 [2] CRAN (R 4.0.0)
TCGAutils 1.8.0 2020-04-27 [1] Bioconductor
tibble 3.0.3 2020-07-10 [1] CRAN (R 4.0.2)
tidyselect 1.1.0 2020-05-11 [1] CRAN (R 4.0.0)
vctrs 0.3.2 2020-07-15 [1] CRAN (R 4.0.2)
withr 2.2.0 2020-04-20 [1] CRAN (R 4.0.0)
XML 3.99-0.5 2020-07-23 [1] CRAN (R 4.0.2)
xml2 1.3.2 2020-04-23 [1] CRAN (R 4.0.0)
XVector 0.28.0 2020-04-27 [1] Bioconductor
yaml 2.2.1 2020-02-01 [1] CRAN (R 4.0.0)
zlibbioc 1.34.0 2020-04-27 [1] Bioconductor
[1] /usr/local/koopa/opt/r/4.0/site-library
[2] /Library/Frameworks/R.framework/Versions/4.0/Resources/library
Rerunning this step I was able to generate this error:
Error: Can't subset columns that don't exist.
✖ Column `clinicalAttributeId` doesn't exist.
Backtrace:
█
1. └─cBioPortalData::cBioPortalData(...)
2. ├─base::do.call(clinicalData, clinargs)
3. └─(function (api, studyId = NA_character_) ...
4. ├─tidyr::pivot_wider(...)
5. └─tidyr:::pivot_wider.data.frame(...)
6. └─tidyr::build_wider_spec(...)
7. └─tidyselect::eval_select(enquo(names_from), data)
8. └─tidyselect:::eval_select_impl(...)
9. ├─tidyselect:::with_subscript_errors(...)
10. │ ├─base::tryCatch(...)
11. │ │ └─base:::tryCatchList(expr, classes, parentenv, handlers)
12. │ │ └─base:::tryCatchOne(expr, names, parentenv, handlers[[1L]])
13. │ │ └─base:::doTryCatch(return(expr), name, parentenv, handler)
14. │ └─tidyselect:::instrument_base_errors(expr)
15. │ └─base::withCallingHandlers(...)
16. └─tidyselect:::vars_select_eval(...)
17. └─tidyselect:::as_indices_sel_impl(...)
18. └─tidyselect:::as_indices_impl(x, vars, strict = strict)
19. └─tidyselect:::chr_as_locations(x, vars)
20. └─vctrs::vec_as_location(x, n = length(vars), names = vars)
21. └─(function () ...
22. └─vctrs:::stop_subscript_oob(...)
23. └─vctrs:::stop_subscript(...)
Hi Michael, @mjsteinbaugh
What does BiocManager::valid() give you?
Make sure you have the latest release versions of the packages installed.
> BiocManager::valid()
[1] TRUE
Also, some genes I expect to see are not returned by this:
library(cBioPortalData)
gbm_tcga_pub2013 <- cBioDataPack("gbm_tcga_pub2013")
mat <- assay(gbm_tcga_pub2013, "RNA_Seq_v2_mRNA_median_Zscores")
hugo_genes %in% rownames(mat)
## [1] TRUE TRUE FALSE FALSE
I know the 2 FALSEs should be TRUE because those genes are on the cbioportal.org website and are returned by the cgdsr package methods. Is there a potential gene symbol mapping issue here? I'm happy to help debug.
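A bare TRUE/FALSE vector does not show which symbols are missing. The sketch below is one way to make the check above easier to read. It uses a toy matrix and made-up gene names (GENE1 to GENE4), since the real hugo_genes vector was not shared; only the pattern is meant to carry over.

```r
# Toy stand-in for the assay matrix; row names play the role of gene symbols.
mat <- matrix(
  1:6,
  nrow = 3,
  dimnames = list(c("GENE1", "GENE2", "GENE5"), c("sample1", "sample2"))
)

# Hypothetical genes of interest (placeholder names, not real data).
hugo_genes <- c("GENE1", "GENE2", "GENE3", "GENE4")

# Named logical vector, so each TRUE/FALSE can be traced back to its symbol.
found <- setNames(hugo_genes %in% rownames(mat), hugo_genes)

# Symbols requested but absent from the matrix row names.
missing_genes <- setdiff(hugo_genes, rownames(mat))

print(found)
print(missing_genes)
```

Posting only the output of setdiff() would let a maintainer see whether the missing symbols share a pattern (for example, aliases or retired names) without revealing the full gene list.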
Hi Michael, @mjsteinbaugh
You may have stale files in your cache location.
Try clearing the cache with the unlink() call below.
unlink("~/.cache/cBioPortalData", recursive = TRUE)
suppressPackageStartupMessages(library(cBioPortalData))
cbio <- cBioPortal()
acc <- cBioPortalData(
api = cbio,
by = "hugoGeneSymbol",
studyId = "acc_tcga",
genePanelId = "IMPACT341",
molecularProfileIds = c("acc_tcga_rppa", "acc_tcga_linear_CNA")
)
#> harmonizing input:
#> removing 1 colData rownames not in sampleMap 'primary'
acc
#> A MultiAssayExperiment object of 2 listed
#> experiments with user-defined names and respective classes.
#> Containing an ExperimentList class object of length 2:
#> [1] acc_tcga_rppa: SummarizedExperiment with 57 rows and 46 columns
#> [2] acc_tcga_linear_CNA: SummarizedExperiment with 339 rows and 90 columns
#> Features:
#> experiments() - obtain the ExperimentList instance
#> colData() - the primary/phenotype DFrame
#> sampleMap() - the sample availability DFrame
#> `$`, `[`, `[[` - extract colData columns, subset, or experiment
#> *Format() - convert into a long or wide DFrame
#> assays() - convert ExperimentList to a SimpleList of matrices
sessionInfo()
#> R version 4.0.2 Patched (2020-07-19 r78887)
#> Platform: x86_64-pc-linux-gnu (64-bit)
#> Running under: Ubuntu 20.04 LTS
#>
#> Matrix products: default
#> BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
#> LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0
#>
#> locale:
#> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
#> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
#> [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
#> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
#> [9] LC_ADDRESS=C LC_TELEPHONE=C
#> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
#>
#> attached base packages:
#> [1] parallel stats4 stats graphics grDevices utils datasets
#> [8] methods base
#>
#> other attached packages:
#> [1] cBioPortalData_2.0.7 MultiAssayExperiment_1.14.0
#> [3] SummarizedExperiment_1.18.2 DelayedArray_0.14.1
#> [5] matrixStats_0.56.0 Biobase_2.48.0
#> [7] GenomicRanges_1.40.0 GenomeInfoDb_1.24.2
#> [9] IRanges_2.22.2 S4Vectors_0.26.1
#> [11] BiocGenerics_0.34.0 AnVIL_1.0.3
#> [13] dplyr_1.0.0
#>
#> loaded via a namespace (and not attached):
#> [1] httr_1.4.2 tidyr_1.1.0
#> [3] bit64_0.9-7.1 jsonlite_1.7.0
#> [5] splines_4.0.2 assertthat_0.2.1
#> [7] askpass_1.1 TCGAutils_1.8.0
#> [9] highr_0.8 BiocFileCache_1.12.0
#> [11] blob_1.2.1 Rsamtools_2.4.0
#> [13] GenomeInfoDbData_1.2.3 RTCGAToolbox_2.18.0
#> [15] progress_1.2.2 yaml_2.2.1
#> [17] pillar_1.4.6 RSQLite_2.2.0
#> [19] lattice_0.20-41 glue_1.4.1
#> [21] limma_3.44.3 digest_0.6.25
#> [23] XVector_0.28.0 rvest_0.3.6
#> [25] htmltools_0.5.0 Matrix_1.2-18
#> [27] XML_3.99-0.5 pkgconfig_2.0.3
#> [29] biomaRt_2.44.1 zlibbioc_1.34.0
#> [31] purrr_0.3.4 RCircos_1.2.1
#> [33] rapiclient_0.1.3 BiocParallel_1.22.0
#> [35] openssl_1.4.2 tibble_3.0.3
#> [37] generics_0.0.2 ellipsis_0.3.1
#> [39] GenomicFeatures_1.40.1 survival_3.2-3
#> [41] RJSONIO_1.3-1.4 magrittr_1.5
#> [43] crayon_1.3.4 memoise_1.1.0
#> [45] evaluate_0.14 xml2_1.3.2
#> [47] prettyunits_1.1.1 tools_4.0.2
#> [49] data.table_1.13.0 hms_0.5.3
#> [51] formatR_1.7 lifecycle_0.2.0
#> [53] stringr_1.4.0 Biostrings_2.56.0
#> [55] AnnotationDbi_1.50.3 lambda.r_1.2.4
#> [57] compiler_4.0.2 rlang_0.4.7
#> [59] futile.logger_1.4.3 grid_4.0.2
#> [61] GenomicDataCommons_1.12.0 RCurl_1.98-1.2
#> [63] rappdirs_0.3.1 bitops_1.0-6
#> [65] rmarkdown_2.3 DBI_1.1.0
#> [67] curl_4.3 R6_2.4.1
#> [69] GenomicAlignments_1.24.0 rtracklayer_1.48.0
#> [71] knitr_1.29 bit_1.1-15.2
#> [73] futile.options_1.0.1 readr_1.3.1
#> [75] stringi_1.4.6 RaggedExperiment_1.12.0
#> [77] Rcpp_1.0.5 vctrs_0.3.2
#> [79] dbplyr_1.4.4 tidyselect_1.1.0
#> [81] xfun_0.16
Created on 2020-07-27 by the reprex package (v0.3.0)
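The cache-clearing step at the top of the reprex can be wrapped in a small guard so that it reports whether anything was actually removed. This is a minimal sketch in base R; clear_cache_dir is a hypothetical helper name, not part of cBioPortalData, and the demonstration runs against a throwaway directory rather than a real cache.

```r
# Hypothetical helper: remove a cache directory only if it exists,
# and report whether the removal succeeded.
clear_cache_dir <- function(path) {
  path <- path.expand(path)
  if (!dir.exists(path)) {
    message("No cache directory at ", path)
    return(invisible(FALSE))
  }
  # unlink() returns 0 on success and 1 on failure.
  status <- unlink(path, recursive = TRUE)
  invisible(status == 0 && !dir.exists(path))
}

# Demonstration on a temporary directory, so nothing real is deleted.
demo_dir <- file.path(tempdir(), "demo_cache")
dir.create(demo_dir)
file.create(file.path(demo_dir, "stale_file.rds"))

removed <- clear_cache_dir(demo_dir)        # directory existed and was removed
removed_again <- clear_cache_dir(demo_dir)  # nothing left to remove
```

For the case in this thread the call would be clear_cache_dir("~/.cache/cBioPortalData"), using the path from the reply above. That path looks Linux-style; on macOS the cache may live somewhere else (an assumption worth checking), in which case the helper would report that no directory was found.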
@mjsteinbaugh I am not sure where you are getting hugo_genes from.
If you encounter any issues with the actual data provided in the cBioPortal tarballs, go to https://github.com/cbioportal/datahub and open an issue there.
Not a cache issue as far as I can tell. The hugo_genes object is a vector of genes of interest that I'd prefer not to post publicly at the moment.
I'll try running the vignette inside my Docker images and see if I can reprex. The error above may be a macOS-specific issue.
Hi Michael, @mjsteinbaugh
If the issue turns out to be macOS-specific, feel free to open another issue.
Best,
Marcel