Easy way to link signature names back to the data.frame imported with importBugSigDB
sdgamboa commented
I think it would be helpful to have an 'id' or 'index' column in the data.frame imported by importBugSigDB,
so that signature names can be linked back to the data.frame more easily. A toy example is below (here 'length' stands in for other useful per-signature information such as scores, p-values, etc.).
library(bugsigdbr)
bsdb <- importBugSigDB()
#> Using cached version from 2022-08-17 13:43:05
my_sigs <- getSignatures(df = bsdb, tax.id.type = 'ncbi', tax.level = 'mixed')
nrow(bsdb)
#> [1] 2270
length(my_sigs)
#> [1] 2270
x <- lapply(my_sigs, function(x) data.frame(length = length(x)))
df <- do.call(rbind, x)
df$sig_name <- rownames(df)
df$id <- sub('_.*$', '', df$sig_name)
rownames(df) <- NULL
head(df[,c('id', 'length')])
#> id length
#> 1 bsdb:1/1/1 20
#> 2 bsdb:1/1/2 2
#> 3 bsdb:1/2/1 2
#> 4 bsdb:1/2/2 3
#> 5 bsdb:1/3/1 2
#> 6 bsdb:1/4/1 24
id1 <- sub(".* ", "", bsdb$Experiment)
id2 <- sub(".* ", "", bsdb$Study)
id3 <- sub(".* ", "", bsdb$`Signature page name`)
id <- paste0('bsdb:', id1, '/', id2, '/', id3)
mean(duplicated(id))
#> [1] 0
bsdb$id <- id
merged_df <- merge(df, bsdb, by = 'id')
merged_df[, c('Experiment', 'Study', 'Signature page name', 'id', 'length')] |>
  head()
#> Experiment Study Signature page name id length
#> 1 Experiment 1 Study 1 Signature 1 bsdb:1/1/1 20
#> 2 Experiment 1 Study 1 Signature 2 bsdb:1/1/2 2
#> 3 Experiment 1 Study 2 Signature 1 bsdb:1/2/1 2
#> 4 Experiment 1 Study 2 Signature 2 bsdb:1/2/2 3
#> 5 Experiment 1 Study 3 Signature 1 bsdb:1/3/1 2
#> 6 Experiment 1 Study 4 Signature 1 bsdb:1/4/1 24
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#> setting value
#> version R Under development (unstable) (2022-12-25 r83502)
#> os Pop!_OS 22.04 LTS
#> system x86_64, linux-gnu
#> ui X11
#> language (EN)
#> collate en_US.UTF-8
#> ctype en_US.UTF-8
#> tz America/New_York
#> date 2023-01-29
#> pandoc 2.19.2 @ /usr/lib/rstudio/resources/app/bin/quarto/bin/tools/ (via rmarkdown)
#>
#> ─ Packages ───────────────────────────────────────────────────────────────────
#> package * version date (UTC) lib source
#> assertthat 0.2.1 2019-03-21 [2] CRAN (R 4.3.0)
#> BiocFileCache 2.7.1 2022-12-09 [1] Bioconductor
#> bit 4.0.5 2022-11-15 [2] CRAN (R 4.3.0)
#> bit64 4.0.5 2020-08-30 [2] CRAN (R 4.3.0)
#> blob 1.2.3 2022-04-10 [2] CRAN (R 4.3.0)
#> bugsigdbr * 1.5.2 2022-11-24 [1] Bioconductor
#> cachem 1.0.6 2021-08-19 [2] CRAN (R 4.3.0)
#> cli 3.6.0 2023-01-09 [1] CRAN (R 4.3.0)
#> curl 5.0.0 2023-01-12 [2] CRAN (R 4.3.0)
#> DBI 1.1.3 2022-06-18 [2] CRAN (R 4.3.0)
#> dbplyr 2.3.0 2023-01-16 [2] CRAN (R 4.3.0)
#> digest 0.6.31 2022-12-11 [2] CRAN (R 4.3.0)
#> dplyr 1.0.10 2022-09-01 [2] CRAN (R 4.3.0)
#> evaluate 0.20 2023-01-17 [2] CRAN (R 4.3.0)
#> fansi 1.0.4 2023-01-22 [2] CRAN (R 4.3.0)
#> fastmap 1.1.0 2021-01-25 [2] CRAN (R 4.3.0)
#> filelock 1.0.2 2018-10-05 [1] CRAN (R 4.3.0)
#> fs 1.6.0 2023-01-23 [2] CRAN (R 4.3.0)
#> generics 0.1.3 2022-07-05 [2] CRAN (R 4.3.0)
#> glue 1.6.2 2022-02-24 [2] CRAN (R 4.3.0)
#> htmltools 0.5.4 2022-12-07 [2] CRAN (R 4.3.0)
#> httr 1.4.4 2022-08-17 [2] CRAN (R 4.3.0)
#> knitr 1.42 2023-01-25 [2] CRAN (R 4.3.0)
#> lifecycle 1.0.3 2022-10-07 [2] CRAN (R 4.3.0)
#> magrittr 2.0.3 2022-03-30 [2] CRAN (R 4.3.0)
#> memoise 2.0.1 2021-11-26 [2] CRAN (R 4.3.0)
#> pillar 1.8.1 2022-08-19 [2] CRAN (R 4.3.0)
#> pkgconfig 2.0.3 2019-09-22 [2] CRAN (R 4.3.0)
#> purrr 1.0.1 2023-01-10 [1] CRAN (R 4.3.0)
#> R.cache 0.16.0 2022-07-21 [1] CRAN (R 4.3.0)
#> R.methodsS3 1.8.2 2022-06-13 [1] CRAN (R 4.3.0)
#> R.oo 1.25.0 2022-06-12 [1] CRAN (R 4.3.0)
#> R.utils 2.12.2 2022-11-11 [1] CRAN (R 4.3.0)
#> R6 2.5.1 2021-08-19 [2] CRAN (R 4.3.0)
#> Rcpp 1.0.10 2023-01-22 [1] CRAN (R 4.3.0)
#> reprex 2.0.2 2022-08-17 [2] CRAN (R 4.3.0)
#> rlang 1.0.6 2022-09-24 [2] CRAN (R 4.3.0)
#> rmarkdown 2.20 2023-01-19 [2] CRAN (R 4.3.0)
#> RSQLite 2.2.20 2022-12-22 [1] CRAN (R 4.3.0)
#> rstudioapi 0.14 2022-08-22 [2] CRAN (R 4.3.0)
#> sessioninfo 1.2.2 2021-12-06 [1] CRAN (R 4.3.0)
#> styler 1.9.0 2023-01-15 [1] CRAN (R 4.3.0)
#> tibble 3.1.8 2022-07-22 [2] CRAN (R 4.3.0)
#> tidyselect 1.2.0 2022-10-10 [2] CRAN (R 4.3.0)
#> utf8 1.2.2 2021-07-24 [2] CRAN (R 4.3.0)
#> vctrs 0.5.2 2023-01-23 [2] CRAN (R 4.3.0)
#> withr 2.5.0 2022-03-03 [2] CRAN (R 4.3.0)
#> xfun 0.36 2022-12-21 [2] CRAN (R 4.3.0)
#> yaml 2.3.7 2023-01-23 [2] CRAN (R 4.3.0)
#>
#> [1] /home/samuel/R/x86_64-pc-linux-gnu-library/4.3
#> [2] /home/samuel/Apps/R-devel/library
#>
#> ──────────────────────────────────────────────────────────────────────────────
Created on 2023-01-29 with reprex v2.0.2
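If the imported data.frame already carried such an 'id' column, the link could also be made with match() instead of merge(). match() keeps the signatures in their original order and returns NA for an ID that has no counterpart, whereas merge() with its defaults sorts by the key and silently drops unmatched rows. The following is only a minimal sketch on toy data: toy_sigs, toy_bsdb and sig_info are hypothetical names, and everything after the "bsdb:<id>_" prefix in the signature names is made up.

```r
# Toy stand-in for the list returned by getSignatures(). Only the
# "bsdb:<id>_" prefix follows the reprex; the rest of each name and the
# taxon IDs are made up for illustration.
toy_sigs <- list(
  "bsdb:1/1/1_toy-condition_UP"   = c("561", "562", "1301"),
  "bsdb:1/1/2_toy-condition_DOWN" = c("816", "817")
)

# One row per signature: ID parsed from the name, plus the signature length.
sig_info <- data.frame(
  id     = sub("_.*$", "", names(toy_sigs)),
  length = unname(lengths(toy_sigs))
)

# Toy stand-in for the imported data.frame, deliberately in a different row
# order and already carrying the proposed 'id' column (an assumption).
toy_bsdb <- data.frame(
  id = c("bsdb:1/1/2", "bsdb:1/1/1"),
  `Signature page name` = c("Signature 2", "Signature 1"),
  check.names = FALSE
)

# Position of each signature's row in the metadata table.
idx <- match(sig_info$id, toy_bsdb$id)
stopifnot(!anyNA(idx))  # fail loudly if a signature has no metadata row

# Pull the metadata across in signature order.
sig_info$signature <- toy_bsdb[["Signature page name"]][idx]
sig_info
```

Any other column of the metadata table can be pulled across the same way by indexing it with idx.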
sdgamboa commented
Actually, I think the order is Study/Experiment/Signature page name.
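A minimal sketch of building the ID in that order, on toy data rather than the real import. toy_bsdb and trailing_number() are hypothetical names used only for illustration; the three column names are the ones used in the reprex.

```r
# Toy stand-in for the imported data.frame. The second row has different
# study and experiment numbers, so the component order matters for it.
toy_bsdb <- data.frame(
  Study = c("Study 1", "Study 1", "Study 2"),
  Experiment = c("Experiment 1", "Experiment 2", "Experiment 1"),
  `Signature page name` = c("Signature 1", "Signature 1", "Signature 2"),
  check.names = FALSE
)

# Hypothetical helper: keep only the text after the last space,
# e.g. "Study 12" becomes "12".
trailing_number <- function(x) sub(".* ", "", x)

# Assemble the ID as Study/Experiment/Signature.
toy_bsdb$id <- paste0(
  "bsdb:",
  trailing_number(toy_bsdb$Study), "/",
  trailing_number(toy_bsdb$Experiment), "/",
  trailing_number(toy_bsdb$`Signature page name`)
)

# IDs should be unique before they are used as a join key.
stopifnot(!anyDuplicated(toy_bsdb$id))
toy_bsdb$id
```

With the Experiment/Study order, the second toy row would get the ID bsdb:2/1/1 instead of bsdb:1/2/1. The IDs would still be unique, so the duplicate check alone would not catch the swap.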
lgeistlinger commented
Thanks @sdgamboa. It would make sense to incorporate this directly in the export. I am transferring to BugSigDBExports.
lgeistlinger commented
Incorporated via 428eb49.