cole-trapnell-lab/garnett

garnett_check_markers not working for Fly

Closed this issue · 4 comments

Describe the bug
Hi,
I'm trying to use Garnett on a fly (Drosophila) dataset. Unfortunately, check_markers fails because it does not accept marker names that contain special characters, and some fly gene symbols do, e.g. E(spl)m8-HLH.
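
To see which markers are likely to be affected before running check_markers, a small base-R pre-check along these lines may help. This is only a sketch: `find_special_symbols` is a hypothetical helper, not part of Garnett, and the set of "safe" characters (letters, digits, `.`, `_`, `-`) is an assumption, since the issue does not say exactly which characters are rejected. The example symbols are illustrative.

```r
# Hypothetical helper (not part of Garnett): return the gene symbols that
# contain anything other than letters, digits, '.', '_' or '-'.
# The "safe" character set is an assumption made for this sketch.
find_special_symbols <- function(symbols) {
  symbols[grepl("[^A-Za-z0-9._-]", symbols)]
}

# Illustrative marker symbols; parentheses are common in fly gene names.
markers <- c("E(spl)m8-HLH", "repo", "elav", "Su(H)")

# Symbols worth double-checking in the marker file
flagged <- find_special_symbols(markers)
print(flagged)
```

With the symbols above, only the two names containing parentheses are flagged; the hyphen in E(spl)m8-HLH is treated as safe under the assumption stated above.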

To Reproduce

marker_check <- check_markers(cds = cds, 
                             "Garnett_markers.txt",
                             db = get(x = "org.Dm.eg.db"), 
                             cds_gene_id_type = "SYMBOL",
                             marker_file_gene_id_type = "SYMBOL", 
                             propogate_markers = TRUE,
                             use_tf_idf = TRUE,
                             classifier_gene_id_type = "SYMBOL")

Expected behavior
/

Screenshots
Here is a screenshot of the bug I get:

[image: screenshot of the bug]

Additional context
Here is my R session info:

R version 3.6.3 (2020-02-29)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)

Matrix products: default
BLAS/LAPACK: 

Random number generation:
 RNG:     Mersenne-Twister 
 Normal:  Inversion 
 Sample:  Rounding 
 
locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8   
 [6] LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8       LC_NAME=C                  LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
 [1] splines   stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] Seurat_3.1.4                garnett_0.1.14              monocle_2.14.0              DDRTree_0.1.5              
 [5] irlba_2.3.3                 VGAM_1.1-2                  ggplot2_3.2.1               Matrix_1.2-18              
 [9] org.Dm.eg.db_3.10.0         AnnotationDbi_1.48.0        monocle3_0.2.1              SingleCellExperiment_1.8.0 
[13] SummarizedExperiment_1.16.1 DelayedArray_0.12.2         BiocParallel_1.20.1         matrixStats_0.55.0         
[17] GenomicRanges_1.38.0        GenomeInfoDb_1.22.0         IRanges_2.20.2              S4Vectors_0.24.3           
[21] Biobase_2.46.0              BiocGenerics_0.32.0

Hi @dweemx,

Can you try the new branch I've just made with your data? You can install it using:

devtools::install_github("cole-trapnell-lab/garnett", ref="fly_genes")

If it works, then I'll merge it back into master - I don't have any example fly data to play with!

Thanks, I'll give it a try and let you know.

Just tested your patched fly_genes branch and it's working! 👍

Great, thanks for testing! Just merged it into master!