Reducing the number of package dependencies
Hi Bioconductor team,
Is it possible to rework BiocFileCache a bit to not depend on quite so many tidyverse packages?
This is looking pretty heavy at the moment:
Depends | R (>= 3.4.0), dbplyr (>= 1.0.0)
Imports | methods, stats, utils, dplyr, RSQLite, DBI, filelock, curl, httr
AcidDevTools::packageDependencies("BiocFileCache")
## [1] "dbplyr" "methods" "stats" "utils" "dplyr"
## [6] "RSQLite" "DBI" "filelock" "curl" "httr"
## [11] "blob" "cli" "glue" "lifecycle" "magrittr"
## [16] "pillar" "purrr" "R6" "rlang" "tibble"
## [21] "tidyr" "tidyselect" "vctrs" "withr" "generics"
## [26] "jsonlite" "mime" "openssl" "bit64" "memoise"
## [31] "pkgconfig" "plogr" "cpp11" "bit" "cachem"
## [36] "tools" "askpass" "fansi" "utf8" "stringr"
## [41] "graphics" "grDevices" "sys" "fastmap" "stringi"
Happy to help work on this!
Best,
Mike
In particular, can we take out the dplyr / dbplyr dependencies?
> packageDependencies("dplyr")
[1] "cli" "generics" "glue" "lifecycle" "magrittr"
[6] "methods" "pillar" "R6" "rlang" "tibble"
[11] "tidyselect" "utils" "vctrs" "fansi" "utf8"
[16] "pkgconfig" "withr" "grDevices" "graphics" "stats"
> packageDependencies("dbplyr")
[1] "blob" "cli" "DBI" "dplyr" "glue"
[6] "lifecycle" "magrittr" "methods" "pillar" "purrr"
[11] "R6" "rlang" "tibble" "tidyr" "tidyselect"
[16] "utils" "vctrs" "withr" "generics" "fansi"
[21] "utf8" "pkgconfig" "stringr" "cpp11" "graphics"
[26] "grDevices" "stats" "stringi" "tools"
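For anyone who wants to reproduce listings like these without installing AcidDevTools, here is a minimal sketch using `tools::package_dependencies()` from base R. The helper name `hard_deps` is made up for illustration, and the sketch assumes the package of interest is installed locally so that `installed.packages()` can serve as the dependency database. The exact set returned may differ from the output above depending on installed versions.

```r
# Recursive "hard" dependencies (Depends, Imports, LinkingTo) of a package,
# resolved against the locally installed packages.
# `hard_deps` is a hypothetical helper name, not part of any package.
hard_deps <- function(pkg, db = installed.packages()) {
  deps <- tools::package_dependencies(
    pkg,
    db = db,
    which = c("Depends", "Imports", "LinkingTo"),
    recursive = TRUE
  )
  # Flatten the named list into a sorted, de-duplicated character vector.
  sort(unique(as.character(unlist(deps, use.names = FALSE))))
}

# "stats" is used here only because it ships with every R installation.
stats_deps <- hard_deps("stats")
print(stats_deps)
```

With BiocFileCache installed, `length(hard_deps("BiocFileCache"))` should give a rough count of what a downstream package inherits by importing it.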
Here's current session info after only attaching BiocFileCache:
> library(BiocFileCache)
Loading required package: dbplyr
> sessionInfo()
R version 4.3.1 (2023-06-16)
Platform: x86_64-apple-darwin20 (64-bit)
Running under: macOS Ventura 13.5.2
Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRlapack.dylib; LAPACK version 3.11.0
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
time zone: America/New_York
tzcode source: internal
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] BiocFileCache_2.8.0 dbplyr_2.3.3 AcidDevTools_0.6.15
loaded via a namespace (and not attached):
[1] vctrs_0.6.3 httr_1.4.7 cli_3.6.1 rlang_1.1.1
[5] DBI_1.1.3 generics_0.1.3 glue_1.6.2 bit_4.0.5
[9] fansi_1.0.4 filelock_1.0.2 tibble_3.2.1 fastmap_1.1.1
[13] lifecycle_1.0.3 memoise_2.0.1 compiler_4.3.1 dplyr_1.1.3
[17] RSQLite_2.3.1 blob_1.2.4 pkgconfig_2.0.3 R6_2.5.1
[21] tidyselect_1.2.0 utf8_1.2.3 pillar_1.9.0 curl_5.0.2
[25] parallel_4.3.1 magrittr_2.0.3 tools_4.3.1 bit64_4.0.5
[29] cachem_1.0.8
>
The results and output are currently dplyr tbl objects, so the package uses the corresponding functions to filter/mutate/summarize them. It was a conscious decision to use more tidy-like structures in the package.
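For context on what the filter/mutate/summarize verbs do, here is a rough sketch of base R equivalents on a plain data.frame. The table, its column names, and its values are invented for illustration and are not taken from the package's code. Note also that this works on in-memory data, whereas dbplyr translates the verbs to SQL and evaluates them lazily against the database, so it is not a drop-in replacement.

```r
# Toy stand-in for a cache index; columns and values are illustrative only.
cache <- data.frame(
  rid     = c("BFC1", "BFC2", "BFC3", "BFC4"),
  rtype   = c("web", "local", "web", "web"),
  size_mb = c(12.5, 3.0, 40.0, 7.5),
  stringsAsFactors = FALSE
)

# Rough equivalent of filter(): keep only the web resources.
web <- cache[cache$rtype == "web", , drop = FALSE]

# Rough equivalent of mutate(): add a derived column.
web$size_gb <- web$size_mb / 1024

# Rough equivalent of group_by() + summarize(): total size per resource type.
totals <- aggregate(size_mb ~ rtype, data = cache, FUN = sum)
print(totals)
```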
The downside, though, is that any other package that imports BiocFileCache then requires all of those additional dependencies, and it really starts to add up if you also import other informatics tools.
I'm still not in favor of restructuring the entire package just to reduce the dependency count. The tidy structures are stable and allow for efficient, concise code.