gaow/neuro-twas

Problem with the flash() function in the flash_no_constraint pipeline step.

Closed this issue · 9 comments

The current flash_greedy_workhorse by default use four stopping rules: c("objective", "loadings", "factors", "all_params"). Among them, the objective rules would required an additional unknown parameter, otherwise the following error will occurs:

Error in (function (f, p, ..., hessian = FALSE, typsize = rep(1, length(p)),  : 
  missing value in parameter

Currently, neither flash nor flash_add_greedy function has the capacity to encode this additional parameter, neither can they specify not to use the objective stopping rules.

This issue cause the flash_no_constraint step to break, a current solution is to used the existed customized flash_pipeline() , which is a customized version of the flash function, to handle the flash_no_constraint operation.

It occurs that the positive constraint is imposed by the init_fn = "my_init_fn" option, by specifying init_fn = "udv_si" the default flash shall run without the constraint. Trying it on our data, the two options produce identical result, this could be due to the constraint did not affect our calculation in anyway.

gaow commented

@hsun3163 could you upload here an example data-set and the code you use to reproduce the error?

@hsun3163 could you upload here an example data-set and the code you use to reproduce the error?

After a more extensive testing, it occurs that by incorporating the parameter

ebnm_param = list(l = list(mixcompdist = "normal",
                           optmethod = "mixSQP"),
                  f = list(mixcompdist = "+uniform",
                           optmethod = "mixSQP"))
ebnm_fn = "ebnm_ash"

into flash(f.d, greedy=TRUE, backfit = T,ebnm_fn = ebnm_fn,ebnm_param = ebnm_param) shall remove the errors. The details are demonstrated below.

library("flashr")
# Set parameter
ebnm_param = list(l = list(mixcompdist = "normal",
                           optmethod = "mixSQP"),
                  f = list(mixcompdist = "+uniform",
                           optmethod = "mixSQP"))
ebnm_fn = "ebnm_ash"

# Load data
dat = readRDS("./geneTpmResidualsAgeGenderAdj_rename.rds")
f.d = flash_set_data(as.matrix(dat$strong.z))
# For flash
### Following produce error
flash(f.d, greedy=TRUE, backfit = T)
### Following produce rresult
flash(f.d, greedy=TRUE, backfit = T,ebnm_fn = ebnm_fn,ebnm_param = ebnm_param)

# For flash_add_greedy
### Following produce error
flash_add_greedy(f.d)
### Following produce result
flash_add_greedy(f.d,ebnm_fn = "ebnm_ash",ebnm_param = ebnm_param)

# For flash_greedy_workhorse
### Following produce error
flashr:::flash_greedy_workhorse(f.d)
flashr:::flash_greedy_workhorse(f.d,stopping_rule = "objective")

### Following produce result
flashr:::flash_greedy_workhorse(f.d,stopping_rule = "factor")
flashr:::flash_greedy_workhorse(f.d,ebnm_fn = "ebnm_ash",ebnm_param = ebnm_param)

The testing data can be download from https://www.synapse.org/#!Synapse:syn24464832

If removing the ebnm_param but keep the ebnm_fn = "ebnm_ash", i.e. flash(f.d, greedy=TRUE, backfit = T,ebnm_fn = "ebnm_ash" )
Following warning will always be produced:

Warning message:
In verbose_obj_decrease_warning() :
 An iteration decreased the objective. This happens occasionally, perhaps due to numeric reasons. You could ignore this warning, but you might like to check out https://github.com/stephenslab/flashr/issues/26 for more details.

Otherwise, if removing the ebnm_fn but keep the ebnm_param = ebnm_param, i.e. flash(f.d, greedy=TRUE, backfit = T,ebnm_param = ebnm_param )

The following errors will be produce:

Error in (function (x, s = 1, mode = 0, scale = "estimate", g_init = NULL,  : 
  unused argument (mixcompdist = "normal")

Which is likely due to the default ebnm_fn in flash , i.e. ebnm_pn, not having the argument mixcompdist

gaow commented

@hsun3163 could you share you sessionInfo()? I cannot reproduce it:

> dat = readRDS('~/Downloads/geneTpmResidualsAgeGenderAdj_rename.rds')
> library("flashr")
> f.d = flash_set_data(as.matrix(dat$strong.z))
> flash(f.d, greedy=TRUE, backfit = T)
Fitting factor/loading 1 (stop when difference in obj. is < 1.00e-02):
  Iteration      Objective   Obj Diff
          1         -72.08        Inf
          2         -70.18   1.90e+00
          3         -69.81   3.68e-01
          4         -69.72   9.57e-02
          5         -69.68   3.14e-02
          6         -69.67   1.19e-02
          7         -69.67   4.68e-03
Performing nullcheck...
  Deleting factor 1 decreases objective by 4.69e+00. Factor retained.
  Nullcheck complete. Objective: -69.67
Fitting factor/loading 2 (stop when difference in obj. is < 1.00e-02):
  Iteration      Objective   Obj Diff
          1         -77.33        Inf
          2         -74.34   2.99e+00
          3         -71.30   3.04e+00
          4         -70.53   7.70e-01
          5         -70.53   0.00e+00
Performing nullcheck...
  Deleting factor 2 increases objective by 8.66e-01. Factor zeroed out.
  Nullcheck complete. Objective: -69.67
Backfitting 1 factor/loading(s) (stop when difference in obj. is < 1.00e-02):
  Iteration      Objective   Obj Diff
          1         -69.67        Inf
          2         -69.67   7.14e-04
Performing nullcheck...
  Deleting factor 1 decreases objective by 4.69e+00. Factor retained.
  Nullcheck complete. Objective: -69.67
Summary of flash object:
  Number of factor/loading pairs: 1
  Proportion of variance explained:
    Factor/loading 1: 0.722
  Value of objective function: -69.665
other attached packages:
[1] flashr_0.6-7

My SessionInfo is as followed, my code is ran within the twas_latest.sif singularity image at
/mnt/mfs/statgen/containers/twas_latest.sif

> sessionInfo()
R version 3.6.2 (2019-12-12)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.2 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/atlas/libblas.so.3.10.3
LAPACK: /usr/lib/x86_64-linux-gnu/atlas/liblapack.so.3.10.3

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] flashr_0.6-7

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.6       magrittr_2.0.1   tidyselect_1.1.0 munsell_0.5.0   
 [5] SQUAREM_2021.1   colorspace_2.0-0 lattice_0.20-38  R6_2.5.0        
 [9] rlang_0.4.10     stringr_1.4.0    plyr_1.8.6       ashr_2.2-51     
[13] dplyr_1.0.3      tools_3.6.2      grid_3.6.2       ebnm_0.1-32     
[17] gtable_0.3.0     irlba_2.3.3      invgamma_1.1     ellipsis_0.3.1  
[21] softImpute_1.4   tibble_3.0.5     lifecycle_0.2.0  crayon_1.3.4    
[25] mixsqp_0.3-43    Matrix_1.2-18    reshape2_1.4.4   purrr_0.3.3     
[29] ggplot2_3.3.3    vctrs_0.3.6      trust_0.1-8      glue_1.4.2      
[33] stringi_1.5.3    compiler_3.6.2   pillar_1.4.7     generics_0.1.0  
[37] scales_1.1.1     truncnorm_1.0-8  pkgconfig_2.0.3 
gaow commented

@hsun3163 could you check if the current master on github for flashr works? You can install flashr on your laptop and check it.

@hsun3163 could you check if the current master on github for flashr works? You can install flashr on your laptop and check it.

On the laptop it actually works, but they are in fact the same version.

gaow commented

Which is likely due to the default ebnm_fn in flash , i.e. ebnm_pn, not having the argument mixcompdist

Correct. this is specific to ebnm_ash.

gaow commented

Error reported to:

stephenslab/ebnm#46

This version of ebnm works:

devtools::install_github('stephenslab/ebnm', ref='b56ad3ec0b302d9760b99a20feb3e925acd4983d')

Let's wait for them to fix it before using ebnm master branch