Working example error
Hi, I'm working through the cBioPortalData vignette and am having trouble getting this query to work:
library(cBioPortalData)

cbio <- cBioPortal()
acc <- cBioPortalData(
    api = cbio,
    by = "hugoGeneSymbol",
    studyId = "acc_tcga",
    genePanelId = "IMPACT341",
    molecularProfileIds = c("acc_tcga_rppa", "acc_tcga_linear_CNA")
)
Here's the backtrace of the stuck R process:
1. └─cBioPortalData::cBioPortalData(...)
2. ├─, exargs)
3. └─(function (api, by, genePanelId, studyId, molecularProfileIds, ...
4. └─base::lapply(...)
5. └─cBioPortalData:::FUN(X[[i]], ...)
6. └─cBioPortalData::getDataByGenePanel(...)
7. └─cBioPortalData::molecularData(...)
8. └─cBioPortalData:::.invoke_bind(...)
9. └─cBioPortalData:::.bind_content(...)
10. └─dplyr::bind_rows(httr::content(x))
11. └─dplyr:::map(dots, function(.x) if ( .x else tibble(!!!.x))
12. └─base::lapply(.x, .f, ...)
13. └─dplyr:::FUN(X[[i]], ...)
14. └─tibble::tibble(!!!.x)
15. └─tibble:::tibble_quos(xs[!is.null], .rows, .name_repair)
16. └─tibble:::splice_dfs(output)
17. └─vctrs::vec_c(!!!x, .name_spec = "{inner}")
Here's the session info:
─ Session info ───────────────────────────────────────────────────────────────────────────────────────────────────
setting value
version R version 4.0.2 (2020-06-22)
os macOS Catalina 10.15.6
system x86_64, darwin17.0
ui RStudio
language (EN)
collate en_US.UTF-8
ctype en_US.UTF-8
tz America/New_York
date 2020-07-27
After rerunning this step, I got a different error:
Error: Can't subset columns that don't exist.
✖ Column `clinicalAttributeId` doesn't exist.
1. └─cBioPortalData::cBioPortalData(...)
2. ├─, clinargs)
3. └─(function (api, studyId = NA_character_) ...
4. ├─tidyr::pivot_wider(...)
5. └─
6. └─tidyr::build_wider_spec(...)
7. └─tidyselect::eval_select(enquo(names_from), data)
8. └─tidyselect:::eval_select_impl(...)
9. ├─tidyselect:::with_subscript_errors(...)
10. │ ├─base::tryCatch(...)
11. │ │ └─base:::tryCatchList(expr, classes, parentenv, handlers)
12. │ │ └─base:::tryCatchOne(expr, names, parentenv, handlers[[1L]])
13. │ │ └─base:::doTryCatch(return(expr), name, parentenv, handler)
14. │ └─tidyselect:::instrument_base_errors(expr)
15. │ └─base::withCallingHandlers(...)
16. └─tidyselect:::vars_select_eval(...)
17. └─tidyselect:::as_indices_sel_impl(...)
18. └─tidyselect:::as_indices_impl(x, vars, strict = strict)
19. └─tidyselect:::chr_as_locations(x, vars)
20. └─vctrs::vec_as_location(x, n = length(vars), names = vars)
21. └─(function () ...
22. └─vctrs:::stop_subscript_oob(...)
23. └─vctrs:::stop_subscript(...)
Hi Michael,
What does BiocManager::valid() give you?
Make sure you have the latest release versions of all packages installed.
> BiocManager::valid()
[1] TRUE
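Since the package list is missing from the session info above, it may help to print the installed versions of the packages that appear in the backtraces. This is only a sketch: the choice of packages is an assumption based on the traces, and it uses nothing beyond base R and utils.

```r
# Packages named in the backtraces above; adjust as needed.
pkgs <- c("cBioPortalData", "dplyr", "tibble", "tidyr", "vctrs")

# Report the installed version of each package, or NA if it is not installed.
installed <- vapply(pkgs, function(p) {
    if (requireNamespace(p, quietly = TRUE)) {
        as.character(utils::packageVersion(p))
    } else {
        NA_character_
    }
}, character(1))

print(installed)
```

The result can be compared against the versions listed in a working session.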
Also, some genes that I expect to see are not returned by this:
gbm_tcga_pub2013 <- cBioDataPack("gbm_tcga_pub2013")
mat <- assay(gbm_tcga_pub2013, "RNA_Seq_v2_mRNA_median_Zscores")
hugo_genes %in% rownames(mat)
I know the 2 FALSEs should be TRUE, because those genes appear on the website and are returned by the cgdsr package methods. Is there a potential gene symbol mapping issue here? I'm happy to help debug.
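One way to narrow down a lookup like this is to list the symbols that fail to match and then check whether simple formatting differences (case, stray whitespace) explain them. The sketch below uses made-up stand-ins: the `hugo_genes` values and the mock `mat` are hypothetical, since the real vector was not shared, and the normalization step is just one possible check, not something cBioPortalData does.

```r
# Hypothetical stand-in for the real 'hugo_genes' vector.
hugo_genes <- c("TP53", "EGFR", "PTEN", "NOTAGENE")

# Mock matrix standing in for assay(gbm_tcga_pub2013, ...); the row names
# deliberately contain a trailing space and a lower-case symbol.
mat <- matrix(
    seq_len(9),
    nrow = 3,
    dimnames = list(c("TP53", "EGFR ", "pten"), paste0("sample", 1:3))
)

# Symbols with no exact match among the row names.
missing_genes <- setdiff(hugo_genes, rownames(mat))

# Repeat the lookup after trimming whitespace and upper-casing both sides.
normalize <- function(x) toupper(trimws(x))
still_missing <- hugo_genes[!normalize(hugo_genes) %in% normalize(rownames(mat))]

print(missing_genes)
print(still_missing)
```

Symbols that remain in `still_missing` are absent under any formatting, which would point to an alias or mapping difference rather than a formatting one.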
Hi Michael, @mjsteinbaugh
You may have stale files in your cache.
Try clearing it with the unlink() call below.
unlink("~/.cache/cBioPortalData", recursive = TRUE)
library(cBioPortalData)

cbio <- cBioPortal()
acc <- cBioPortalData(
    api = cbio,
    by = "hugoGeneSymbol",
    studyId = "acc_tcga",
    genePanelId = "IMPACT341",
    molecularProfileIds = c("acc_tcga_rppa", "acc_tcga_linear_CNA")
)
#> harmonizing input:
#> removing 1 colData rownames not in sampleMap 'primary'
#> A MultiAssayExperiment object of 2 listed
#> experiments with user-defined names and respective classes.
#> Containing an ExperimentList class object of length 2:
#> [1] acc_tcga_rppa: SummarizedExperiment with 57 rows and 46 columns
#> [2] acc_tcga_linear_CNA: SummarizedExperiment with 339 rows and 90 columns
#> Features:
#> experiments() - obtain the ExperimentList instance
#> colData() - the primary/phenotype DFrame
#> sampleMap() - the sample availability DFrame
#> `$`, `[`, `[[` - extract colData columns, subset, or experiment
#> *Format() - convert into a long or wide DFrame
#> assays() - convert ExperimentList to a SimpleList of matrices
#> R version 4.0.2 Patched (2020-07-19 r78887)
#> Platform: x86_64-pc-linux-gnu (64-bit)
#> Running under: Ubuntu 20.04 LTS
#> Matrix products: default
#> BLAS: /usr/lib/x86_64-linux-gnu/blas/
#> LAPACK: /usr/lib/x86_64-linux-gnu/lapack/
#> locale:
#> attached base packages:
#> [1] parallel stats4 stats graphics grDevices utils datasets
#> [8] methods base
#> other attached packages:
#> [1] cBioPortalData_2.0.7 MultiAssayExperiment_1.14.0
#> [3] SummarizedExperiment_1.18.2 DelayedArray_0.14.1
#> [5] matrixStats_0.56.0 Biobase_2.48.0
#> [7] GenomicRanges_1.40.0 GenomeInfoDb_1.24.2
#> [9] IRanges_2.22.2 S4Vectors_0.26.1
#> [11] BiocGenerics_0.34.0 AnVIL_1.0.3
#> [13] dplyr_1.0.0
#> loaded via a namespace (and not attached):
#> [1] httr_1.4.2 tidyr_1.1.0
#> [3] bit64_0.9-7.1 jsonlite_1.7.0
#> [5] splines_4.0.2 assertthat_0.2.1
#> [7] askpass_1.1 TCGAutils_1.8.0
#> [9] highr_0.8 BiocFileCache_1.12.0
#> [11] blob_1.2.1 Rsamtools_2.4.0
#> [13] GenomeInfoDbData_1.2.3 RTCGAToolbox_2.18.0
#> [15] progress_1.2.2 yaml_2.2.1
#> [17] pillar_1.4.6 RSQLite_2.2.0
#> [19] lattice_0.20-41 glue_1.4.1
#> [21] limma_3.44.3 digest_0.6.25
#> [23] XVector_0.28.0 rvest_0.3.6
#> [25] htmltools_0.5.0 Matrix_1.2-18
#> [27] XML_3.99-0.5 pkgconfig_2.0.3
#> [29] biomaRt_2.44.1 zlibbioc_1.34.0
#> [31] purrr_0.3.4 RCircos_1.2.1
#> [33] rapiclient_0.1.3 BiocParallel_1.22.0
#> [35] openssl_1.4.2 tibble_3.0.3
#> [37] generics_0.0.2 ellipsis_0.3.1
#> [39] GenomicFeatures_1.40.1 survival_3.2-3
#> [41] RJSONIO_1.3-1.4 magrittr_1.5
#> [43] crayon_1.3.4 memoise_1.1.0
#> [45] evaluate_0.14 xml2_1.3.2
#> [47] prettyunits_1.1.1 tools_4.0.2
#> [49] data.table_1.13.0 hms_0.5.3
#> [51] formatR_1.7 lifecycle_0.2.0
#> [53] stringr_1.4.0 Biostrings_2.56.0
#> [55] AnnotationDbi_1.50.3 lambda.r_1.2.4
#> [57] compiler_4.0.2 rlang_0.4.7
#> [59] futile.logger_1.4.3 grid_4.0.2
#> [61] GenomicDataCommons_1.12.0 RCurl_1.98-1.2
#> [63] rappdirs_0.3.1 bitops_1.0-6
#> [65] rmarkdown_2.3 DBI_1.1.0
#> [67] curl_4.3 R6_2.4.1
#> [69] GenomicAlignments_1.24.0 rtracklayer_1.48.0
#> [71] knitr_1.29 bit_1.1-15.2
#> [73] futile.options_1.0.1 readr_1.3.1
#> [75] stringi_1.4.6 RaggedExperiment_1.12.0
#> [77] Rcpp_1.0.5 vctrs_0.3.2
#> [79] dbplyr_1.4.4 tidyselect_1.1.0
#> [81] xfun_0.16
Created on 2020-07-27 by the reprex package (v0.3.0)
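The cache-clearing step in the reprex above can be made a little more cautious by listing what is there before deleting it. This is a minimal sketch in base R: `clear_cache` and its `dry_run` argument are hypothetical helpers, not part of cBioPortalData, and the demonstration runs on a throwaway directory. The ~/.cache/cBioPortalData path is the one given above; the cache location may differ on macOS, so that path is an assumption there.

```r
# Hypothetical helper: list cached files and optionally delete the directory.
clear_cache <- function(cache_dir, dry_run = TRUE) {
    cache_dir <- path.expand(cache_dir)
    if (!dir.exists(cache_dir)) {
        message("No cache directory at ", cache_dir)
        return(invisible(character(0)))
    }
    cached <- list.files(cache_dir, recursive = TRUE)
    message(length(cached), " cached file(s) under ", cache_dir)
    if (!dry_run) {
        unlink(cache_dir, recursive = TRUE)
    }
    invisible(cached)
}

# Demonstration on a temporary directory rather than the real cache.
demo_dir <- file.path(tempdir(), "cBioPortalData-demo")
dir.create(demo_dir, showWarnings = FALSE)
file.create(file.path(demo_dir, "example.rds"))

listed <- clear_cache(demo_dir, dry_run = TRUE)    # only reports
removed <- clear_cache(demo_dir, dry_run = FALSE)  # reports, then deletes
```

For the real cache, the call would be clear_cache("~/.cache/cBioPortalData", dry_run = FALSE).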
@mjsteinbaugh I am not sure where you are getting hugo_genes from.
If you encounter any issues with the actual data provided from the cBioPortal tarballs,
go to the and open an issue there
Not a cache issue as far as I can tell. The hugo_genes object is a vector of genes of interest that I'd prefer not to post publicly at the moment.
I'll try running the vignette inside my Docker images and see if I can reprex. The error above may be a macOS-specific issue.
Hi Michael, @mjsteinbaugh
If the problem turns out to be specific to macOS, feel free to open a separate issue.