malucalle/selbal

Error from plot.bal for selbal.cv and numeric response


Hello,

I tried to use selbal.cv with a numeric response and got the following error after the CV run:

############################################################### 
 STARTING selbal.cv FUNCTION 
###############################################################

#-------------------------------------------------------------# 
# ZERO REPLACEMENT . . .


, . . . FINISHED. 
#-------------------------------------------------------------#

#-------------------------------------------------------------# 
# Starting the cross - validation procedure . . .

 . . . finished. 
#-------------------------------------------------------------# 
###############################################################

 The optimal number of variables is: 4 

Error in plot.bal(NUM, DEN, logc, y, covar, col = col, logit.acc) : 
  object 'ROC.plot' not found

This is how I call the function:

pdf('tmp.pdf')
result <- selbal::selbal.cv(
    x=t(feas),
    y=conds, # vector w/ numeric values
    n.fold=args$n_fold,
    n.iter=args$n_iter,
    zero.rep="bayes",
    seed=23
)
dev.off()

Used version (commit hash): 4a156de
Session information:

R version 3.5.1 (2018-07-02)
Platform: x86_64-conda_cos6-linux-gnu (64-bit)
Running under: Ubuntu 16.04.6 LTS

Matrix products: default
BLAS/LAPACK: /home/vgalata/miniconda3/envs/resistaPD3/lib/R/lib/libRlapack.so

locale:
 [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_GB.UTF-8        LC_COLLATE=en_GB.UTF-8    
 [5] LC_MONETARY=en_GB.UTF-8    LC_MESSAGES=en_GB.UTF-8   
 [7] LC_PAPER=en_GB.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
 [1] grid      parallel  tools     stats     graphics  grDevices utils    
 [8] datasets  methods   base     

other attached packages:
 [1] gridExtra_2.3       doParallel_1.0.15   iterators_1.0.12   
 [4] foreach_1.4.7       pROC_1.15.3         Biobase_2.42.0     
 [7] BiocGenerics_0.28.0 zCompositions_1.3.2 miscF_0.1-4        
[10] R2jags_0.5-7        rjags_4-6           coda_0.19-3        
[13] truncnorm_1.0-8     NADA_1.6-1          survival_2.44-1.1  
[16] MASS_7.3-51.4       plyr_1.8.4          ggsci_2.9          
[19] caret_6.0-84        ggplot2_3.1.1       lattice_0.20-38    
[22] selbal_0.1.0        argparse_2.0.1      testit_0.9         

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.2         lubridate_1.7.4    mvtnorm_1.0-11     class_7.3-15      
 [5] assertthat_0.2.1   ipred_0.9-9        R6_2.4.0           MatrixModels_0.4-1
 [9] stats4_3.5.1       pillar_1.4.2       rlang_0.4.0        lazyeval_0.2.2    
[13] data.table_1.12.2  SparseM_1.77       rpart_4.1-15       Matrix_1.2-17     
[17] labeling_0.3       splines_3.5.1      gower_0.2.1        stringr_1.4.0     
[21] munsell_0.5.0      compiler_3.5.1     pkgconfig_2.0.2    findpython_1.0.5  
[25] mcmc_0.9-6         nnet_7.3-12        tidyselect_0.2.5   tibble_2.1.3      
[29] prodlim_2018.04.18 R2WinBUGS_2.1-21   codetools_0.2-16   crayon_1.3.4      
[33] dplyr_0.8.3        withr_2.1.2        recipes_0.1.7      ModelMetrics_1.2.2
[37] nlme_3.1-141       jsonlite_1.6       gtable_0.3.0       magrittr_1.5      
[41] scales_1.0.0       stringi_1.4.3      reshape2_1.4.3     timeDate_3043.102 
[45] generics_0.0.2     boot_1.3-23        lava_1.6.6         glue_1.3.1        
[49] purrr_0.3.2        abind_1.4-5        colorspace_1.4-1   quantreg_5.51     
[53] MCMCpack_1.4-4

Thank you in advance!

Edit: Removed a commented out line.

Hi @VGalata!

The ROC.plot function is used only when the response variable is dichotomous, so I wonder whether conds is really numeric. What is the output of class(conds)?
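A minimal sketch of that check. The vectors and the helper `describe_response` are hypothetical and only illustrate the idea. It assumes the numeric and dichotomous cases are told apart by whether the response is a factor, which this thread does not confirm.

```r
# Hypothetical response vectors, used only for illustration.
y_numeric <- c(2.3, 4.1, 3.7, 5.0)
y_factor  <- factor(c("case", "control", "case", "control"))

# Report the class of a response and whether it looks dichotomous.
# Assumption: "dichotomous" here means a factor with exactly two levels.
describe_response <- function(y) {
  list(
    class       = class(y),
    is_numeric  = is.numeric(y),
    dichotomous = is.factor(y) && nlevels(y) == 2
  )
}

str(describe_response(y_numeric))
str(describe_response(y_factor))
```

If class(conds) is "numeric" and the error still appears, the response type is probably not the cause.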

In addition, I suggest running selbal.cv() first and then saving the graphical representations you want.

Thank you for your comments

Hi @UVic-omics,

Thank you for your reply!

class(conds) returns "numeric". I tried my example again, making sure that I have a numeric vector, and got the same error as reported above.

The pdf(...) ... dev.off() around selbal.cv(...) is not there to save the plots. selbal.cv() tries to plot something, or at least to open a display connection. However, I run the code on a server where R cannot open a pop-up window, so an error is thrown. To avoid that, I open a PDF device and close it after selbal.cv() is done. By the way, the PDF remains empty, so I do not know why it attempts to plot something.
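The workaround described here can be sketched as a small wrapper. `with_null_device` is a hypothetical helper, not part of selbal. It relies on pdf(file = NULL) from grDevices, which opens a device without writing a file, and on on.exit() so the device is closed even if the wrapped call fails.

```r
# Evaluate an expression while a throwaway PDF device is open, so that any
# plotting call inside it has somewhere to draw on a headless machine.
# `with_null_device` is a hypothetical helper, not part of selbal.
with_null_device <- function(expr) {
  grDevices::pdf(file = NULL)            # NULL: no file is written
  dev_id <- grDevices::dev.cur()
  # Close exactly the device opened above, even if `expr` raises an error.
  on.exit(grDevices::dev.off(which = dev_id), add = TRUE)
  force(expr)                            # evaluated here, device still open
}

# Usage: plot() stands in for a function that plots as a side effect.
n_devices_before <- length(grDevices::dev.list())
value <- with_null_device({
  plot(1:10)
  sum(1:10)
})
n_devices_after <- length(grDevices::dev.list())
```

With a plain pdf(); ...; dev.off() sequence, an error in the middle (such as the one reported in this issue) leaves the device open. The on.exit() call avoids that.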

Ok, understood.

Could you run selbal.cv() and save the object? Or at least save the session with the objects? If you can do this, you can then access the plot and save it to a PDF.
selbal.cv() is designed to display the result as a plot directly, without the user requesting it, so running it on a server may be the cause of the problem, as you explain above.

Please try to save the workspace instead of writing the PDF directly, and then save the plot from result$global.plot.
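A rough sketch of that two-step workflow. The `result` list below is a stand-in so the snippet runs without the package. Only the element name global.plot comes from this thread; treating it as a grid object that grid.draw() can render is an assumption, and the file names are hypothetical.

```r
library(grid)

# Stand-in for the list returned by selbal.cv(). Only the element name
# `global.plot` is taken from the thread; that it is a grid object is
# an assumption made for this sketch.
result <- list(global.plot = textGrob("placeholder plot"))

# Step 1: save the whole object right after the run.
rds_path <- file.path(tempdir(), "selbal_cv_result.rds")
saveRDS(result, rds_path)

# Step 2: later (possibly in another session), reload the object and
# draw the stored plot into a PDF file.
restored <- readRDS(rds_path)
pdf_path <- file.path(tempdir(), "global_plot.pdf")
pdf(pdf_path)
grid.draw(restored$global.plot)
dev.off()
```

saveRDS() stores a single object, whereas save.image() would store the whole workspace. Either fits the suggestion above.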

Thank you again. I look forward to your answer

Hi @UVic-omics,

I already do what you suggested: I save the object returned by selbal.cv() and use it later to extract the required information and to save the plots. I can live with the problem caused by the function trying to plot something by itself. :)

My current problem is that I cannot obtain results when using a numeric response variable because of the error described in the first post.

Does it also fail when you run the example included in the package (shown below)?

library(selbal)

data(sCD14)

x <- sCD14[,-ncol(sCD14)]
y <- sCD14[, ncol(sCD14)]

result <- selbal.cv(x,y)

Tried:

library(selbal)
data(sCD14)
x <- sCD14[,-ncol(sCD14)]
y <- sCD14[, ncol(sCD14)]
cv.s <- selbal.cv(x,y)

Output:

############################################################### 
 STARTING selbal.cv FUNCTION 
###############################################################

#-------------------------------------------------------------# 
# ZERO REPLACEMENT . . .


, . . . FINISHED. 
#-------------------------------------------------------------#

#-------------------------------------------------------------# 
# Starting the cross - validation procedure . . .
 . . . finished. 
#-------------------------------------------------------------# 
###############################################################

 The optimal number of variables is: 4 

Error in plot.bal(NUM, DEN, logc, y, covar, col = col, logit.acc) : 
  object 'ROC.plot' not found

Session info:

R version 3.5.1 (2018-07-02)
Platform: x86_64-conda_cos6-linux-gnu (64-bit)
Running under: Ubuntu 16.04.6 LTS

Matrix products: default
BLAS/LAPACK: /home/vgalata/miniconda3/envs/resistaPD3/lib/R/lib/libRlapack.so

locale:
 [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_GB.UTF-8        LC_COLLATE=en_GB.UTF-8    
 [5] LC_MONETARY=en_GB.UTF-8    LC_MESSAGES=en_GB.UTF-8   
 [7] LC_PAPER=en_GB.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] grid      parallel  stats     graphics  grDevices utils     datasets 
[8] methods   base     

other attached packages:
 [1] gridExtra_2.3       ggplot2_3.1.1       doParallel_1.0.15  
 [4] iterators_1.0.12    foreach_1.4.7       pROC_1.15.3        
 [7] Biobase_2.42.0      BiocGenerics_0.28.0 zCompositions_1.3.2
[10] miscF_0.1-4         R2jags_0.5-7        rjags_4-6          
[13] coda_0.19-3         truncnorm_1.0-8     NADA_1.6-1         
[16] survival_2.44-1.1   MASS_7.3-51.4       plyr_1.8.4         
[19] selbal_0.1.0       

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.2         compiler_3.5.1     pillar_1.4.2       tools_3.5.1       
 [5] boot_1.3-23        tibble_2.1.3       gtable_0.3.0       lattice_0.20-38   
 [9] pkgconfig_2.0.2    rlang_0.4.0        Matrix_1.2-17      mvtnorm_1.0-11    
[13] SparseM_1.77       withr_2.1.2        dplyr_0.8.3        MatrixModels_0.4-1
[17] tidyselect_0.2.5   glue_1.3.1         R6_2.4.0           purrr_0.3.2       
[21] magrittr_1.5       scales_1.0.0       codetools_0.2-16   R2WinBUGS_2.1-21  
[25] mcmc_0.9-6         splines_3.5.1      assertthat_0.2.1   abind_1.4-5       
[29] colorspace_1.4-1   labeling_0.3       quantreg_5.51      MCMCpack_1.4-4    
[33] lazyeval_0.2.2     munsell_0.5.0      crayon_1.3.4   

Package description:

Package: selbal
Type: Package
Title: Finding highly - associated balances with the response variable.
Version: 0.1.0
Author: Javier Rivera-Pinto
Maintainer: A. Susin <toni.susin@upc.edu>
Description: it selects a balance associated to a given response variable and evaluates its robustness through a cross - validation procedure.
License: GPL-3
LazyData: TRUE
Imports: compositions, doParallel, foreach, ggplot2, grid, gridExtra,
        gtable, plyr, pROC, qdapRegex, zCompositions, CMA, phyloseq
biocViews:
RoxygenNote: 6.0.1
Suggests: knitr, rmarkdown
VignetteBuilder: knitr
RemoteType: github
RemoteHost: api.github.com
RemoteRepo: selbal
RemoteUsername: UVic-omics
RemoteRef: master
RemoteSha: 4a156debd91b0f8dee91de5bcea393e13c0147be
GithubRepo: selbal
GithubUsername: UVic-omics
GithubRef: master
GithubSHA1: 4a156debd91b0f8dee91de5bcea393e13c0147be
NeedsCompilation: no
Packaged: 2019-11-11 08:48:37 UTC; vgalata
Built: R 3.5.1; ; 2019-11-11 08:48:38 UTC; unix

Hi @VGalata
We have fixed this issue. Please update the package and check that it now works properly.
Thanks for your comments!

It works now! Thank you!