neurogenomics/MAGMA_Celltyping

*gsa.out: truncated `VARIABLE` names


1. Bug description

For some reason, MAGMA truncates the VARIABLE column in its *.gsa.out files. This breaks any attempt to match the celltype names against those in the CTD.

Not sure if this was always the case, or whether I'm only noticing it now because I'm using some CTDs with long celltype names (e.g. HumanCellLandscape).
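The mismatch can be illustrated without running MAGMA. The sketch below is illustrative only: the celltype names are made up, and the 30-character cutoff is an assumption chosen for the demonstration, not a confirmed MAGMA setting.

```r
# Hypothetical celltype names; not taken from any real CTD.
ctd_celltypes <- c(
    "Fetal_enterocyte",
    "Ventricle_cardiomyocyte_high_NPPA_subtype",
    "Ventricle_cardiomyocyte_high_NPPB_subtype"
)

# Assumed cutoff, for illustration only.
max_width <- 30
variable <- substr(ctd_celltypes, 1, max_width)

# Exact matching fails for every name longer than the cutoff.
matched <- variable %in% ctd_celltypes
print(data.frame(VARIABLE = variable, matched = matched))

# Truncation can also collapse distinct names into the same string,
# so the truncated column alone cannot always be mapped back.
any_collisions <- anyDuplicated(variable) > 0
print(any_collisions)
```

Because two long names can share the same truncated prefix, prefix matching against the CTD is not a safe general fix; an untruncated name column is needed to recover the mapping.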

Console output

[Screenshot of console output, 2023-04-13 at 12:53:53]

Expected behaviour

MAGMA.Celltyping can read in the results files without hitting an error.

2. Reproducible example

Code

magma_dirs <- MAGMA.Celltyping::import_magma_files(ids = c("ieu-a-298"))

ctd <- MAGMA.Celltyping::get_ctd("ctd_HumanCellLandscape")
res <- MAGMA.Celltyping::celltype_associations_pipeline(
    ctd = ctd,
    ctd_name = "ctd_HumanCellLandscape",
    ctd_species = "human",
    magma_dirs = magma_dirs,
    run_linear = TRUE,
    run_top10 = TRUE,
    upstream_kb = 35,
    downstream_kb = 10,
    force_new = TRUE,
    save_dir = here::here("processed_data/MAGMA"))

3. Session info

R Under development (unstable) (2023-03-02 r83926)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 22.04.2 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so;  LAPACK version 3.10.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8   
 [6] LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8       LC_NAME=C                  LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

time zone: Etc/UTC
tzcode source: system (glibc)

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

loaded via a namespace (and not attached):
  [1] fs_1.6.1                      matrixStats_0.63.0            bitops_1.0-7                  lubridate_1.9.2               devtools_2.4.5               
  [6] webshot_0.5.4                 RColorBrewer_1.1-3            httr_1.4.5                    doParallel_1.0.17             dynamicTreeCut_1.63-1        
 [11] gh_1.4.0                      numDeriv_2016.8-1.1           profvis_0.3.7                 tools_4.3.0                   MAGMA.Celltyping_2.0.9       
 [16] backports_1.4.1               utf8_1.2.3                    R6_2.5.1                      uwot_0.1.14                   lazyeval_0.2.2               
 [21] withr_2.5.0                   urlchecker_1.0.1              gridExtra_2.3                 prettyunits_1.1.1             preprocessCore_1.61.0        
 [26] WGCNA_1.72-1                  cli_3.6.1                     Biobase_2.59.0                TSP_1.2-4                     askpass_1.1                  
 [31] ewceData_1.7.1                Rsamtools_2.15.3              yulab.utils_0.0.6             foreign_0.8-84                R.utils_2.12.2               
 [36] sessioninfo_1.2.2             plotrix_3.8-2                 BSgenome_1.67.4               orthogene_1.5.3               maps_3.4.1                   
 [41] limma_3.55.7                  readxl_1.4.2                  impute_1.73.0                 rstudioapi_0.14               RSQLite_2.3.1                
 [46] optimParallel_1.0-2           generics_0.1.3                gridGraphics_0.5-1            BiocIO_1.9.2                  combinat_0.0-8               
 [51] dendextend_1.17.1             car_3.1-2                     dplyr_1.1.1                   homologene_1.4.68.19.3.27     GO.db_3.17.0                 
 [56] Matrix_1.5-3                  fansi_1.0.4                   S4Vectors_0.37.5              abind_1.4-5                   R.methodsS3_1.8.2            
 [61] lifecycle_1.0.3               scatterplot3d_0.3-43          yaml_2.3.7                    carData_3.0-5                 SummarizedExperiment_1.29.1  
 [66] clusterGeneration_1.3.7       BiocFileCache_2.7.2           grid_4.3.0                    blob_1.2.4                    promises_1.2.0.1             
 [71] ExperimentHub_2.7.1           crayon_1.5.2                  miniUI_0.1.1.1                lattice_0.20-45               GenomicFeatures_1.51.4       
 [76] KEGGREST_1.39.0               MungeSumstats_1.7.19          pillar_1.9.0                  knitr_1.42                    GenomicRanges_1.51.4         
 [81] rjson_0.2.21                  boot_1.3-28.1                 codetools_0.2-19              fastmatch_1.1-3               glue_1.6.2                   
 [86] ggfun_0.0.9                   data.table_1.14.8             remotes_2.4.2                 vctrs_0.6.1                   png_0.1-8                    
 [91] treeio_1.23.1                 cellranger_1.1.0              gtable_0.3.3                  assertthat_0.2.1              cachem_1.0.7                 
 [96] xfun_0.38                     mime_0.12                     coda_0.19-4                   survival_3.5-3                gargle_1.3.0                 
[101] seriation_1.4.2               SingleCellExperiment_1.21.1   RNOmni_1.0.1                  iterators_1.0.14              interactiveDisplayBase_1.37.0
[106] ellipsis_0.3.2                nlme_3.1-162                  ggtree_3.7.2                  EWCE_1.7.4                    usethis_2.1.6                
[111] bit64_4.0.5                   progress_1.2.2                filelock_1.0.2                googleAuthR_2.0.1             GenomeInfoDb_1.35.16         
[116] rprojroot_2.0.3               rpart_4.1.19                  Hmisc_5.0-1                   colorspace_2.1-0              BiocGenerics_0.45.3          
[121] DBI_1.1.3                     nnet_7.3-18                   phangorn_2.11.1               mnormt_2.1.1                  tidyselect_1.2.0             
[126] processx_3.8.0                bit_4.0.5                     compiler_4.3.0                curl_5.0.0                    httr2_0.2.2                  
[131] htmlTable_2.4.1               expm_0.999-7                  xml2_1.3.3                    ggdendro_0.1.23               DelayedArray_0.25.0          
[136] plotly_4.10.1                 rtracklayer_1.59.1            checkmate_2.1.0               scales_1.2.1                  quadprog_1.5-8               
[141] callr_3.7.3                   rappdirs_0.3.3                stringr_1.5.0                 digest_0.6.31                 piggyback_0.1.4              
[146] minqa_1.2.5                   rmarkdown_2.21                ca_0.71.1                     XVector_0.39.0                base64enc_0.1-3              
[151] htmltools_0.5.5               pkgconfig_2.0.3               lme4_1.1-32                   MatrixGenerics_1.11.1         dbplyr_2.3.2                 
[156] fastmap_1.1.1                 rlang_1.1.0                   htmlwidgets_1.6.2             shiny_1.7.4                   jsonlite_1.8.4               
[161] BiocParallel_1.33.12          R.oo_1.25.0                   VariantAnnotation_1.45.1      RCurl_1.98-1.12               magrittr_2.0.3               
[166] Formula_1.2-5                 GenomeInfoDbData_1.2.10       ggplotify_0.1.0               patchwork_1.1.2               munsell_0.5.0                
[171] Rcpp_1.0.10                   viridis_0.6.2                 ape_5.7-1                     babelgene_22.9                stringi_1.7.12               
[176] zlibbioc_1.45.0               MASS_7.3-58.3                 AnnotationHub_3.7.4           plyr_1.8.8                    pkgbuild_1.4.0               
[181] parallel_4.3.0                Biostrings_2.67.2             splines_4.3.0                 hms_1.1.3                     ps_1.7.4                     
[186] fastcluster_1.2.3             igraph_1.4.2                  ggpubr_0.6.0                  ggsignif_0.6.4                reshape2_1.4.4               
[191] biomaRt_2.55.4                stats4_4.3.0                  pkgload_1.3.2                 gprofiler2_0.2.1              BiocVersion_3.17.1           
[196] XML_3.99-0.14                 evaluate_0.20                 BiocManager_1.30.20           nloptr_2.0.3                  foreach_1.5.2                
[201] httpuv_1.6.9                  openssl_2.0.6                 grr_0.9.5                     tidyr_1.3.0                   purrr_1.0.1                  
[206] heatmaply_1.4.2               ggplot2_3.4.2                 broom_1.0.4                   xtable_1.8-4                  restfulr_0.0.15              
[211] gitcreds_0.1.2                phytools_1.5-1                tidytree_0.4.2                rstatix_0.7.2                 later_1.3.0                  
[216] viridisLite_0.4.1             googledrive_2.1.0             tibble_3.2.1                  aplot_0.1.10                  registry_0.5-1               
[221] memoise_2.0.1                 AnnotationDbi_1.61.2          GenomicAlignments_1.35.1      IRanges_2.33.1                cluster_2.1.4                
[226] HGNChelper_0.8.1              timechange_0.2.0              here_1.0.1   

Looks like I actually already accounted for this in the past:

if (!is.null(res$FULL_NAME)) {

Bizarrely, the FULL_NAME column isn't even consistently generated. So I had to make this conditional:
[Screenshot, 2023-04-14 at 12:14:39]
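A minimal sketch of that kind of conditional, assuming a results table `res` with a `VARIABLE` column and an optional `FULL_NAME` column. `resolve_celltype_names` is a hypothetical helper written for this illustration; it is not a MAGMA.Celltyping function, and the package's actual handling may differ.

```r
# Hypothetical helper: prefer FULL_NAME when the column exists,
# otherwise fall back to the (possibly truncated) VARIABLE column.
resolve_celltype_names <- function(res) {
    if (!is.null(res$FULL_NAME)) {
        full <- as.character(res$FULL_NAME)
        # Guard against rows where FULL_NAME is missing or empty.
        use_full <- !is.na(full) & nzchar(full)
        ifelse(use_full, full, as.character(res$VARIABLE))
    } else {
        as.character(res$VARIABLE)
    }
}

# Toy inputs (made-up names): one table with FULL_NAME, one without.
with_full <- data.frame(
    VARIABLE = c("Short_name", "A_very_long_celltype_na"),
    FULL_NAME = c("Short_name", "A_very_long_celltype_name"),
    stringsAsFactors = FALSE
)
without_full <- data.frame(
    VARIABLE = c("Short_name", "Other_name"),
    stringsAsFactors = FALSE
)

print(resolve_celltype_names(with_full))
print(resolve_celltype_names(without_full))
```

Falling back to `VARIABLE` keeps the reader working on files that lack `FULL_NAME`, at the cost of returning truncated names in that case.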