fhollenbach/EBMA

Current check for model weights summing to 1 suffers from floating point imprecision

Closed this issue · 1 comments

if(sum(W[i,]) != 1){

The quoted code sum(W[1,]) == 1 suffers from numerical imprecision if the number of models is sufficiently large. On my Mac M3 this check fails on a structure with M=10 models. I suggest to replace the call with
all.equal(sum(W[1,]), 1), which is more tolerant to floating point imprecision.

I did not check it, but the problem might occur in other places.

P.S. Reproducible example (on Apple M3, might not occur on other machines):


> 
> 
> # Minimal reproducible example to show problem on Apple M3
> library(EBMAforecast)
> set.seed(1)
> pred_train <- matrix( rnorm(900), ncol=10)
> pred_test  <- matrix( rnorm(100), ncol=10)
> truth_train <- apply(pred_train,1, mean) + rnorm(nrow(pred_train), sd=0.1)
> truth_test <- apply(pred_train,1, mean) + rnorm(nrow(pred_test), sd=0.1)
> myForecastData <- makeForecastData(
+   .predCalibration = pred_train,
+   .outcomeCalibration = truth_train,
+   .predTest = pred_test,
+   .outcomeTest = truth_test
+ )
> calibrateEnsemble(myForecastData, model="normal")
Model weights estimated using EM algorithmError in .local(.forecastData, tol, maxIter, method, exp, useModelParams,  : 
  Vector of initial model weights must sum to 1.
> # Problem is in the comparison
> M <- dim(myForecastData@predCalibration)[2]
> W <- matrix(rep(1/M, M, collapse=FALSE), ncol=M)
> W
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,]  0.1  0.1  0.1  0.1  0.1  0.1  0.1  0.1  0.1   0.1
> dim(W)[2] != M
[1] FALSE
> sum(W[1,]) == 1
[1] FALSE
> all.equal(sum(W[1,]), 1)
[1] TRUE
> sessionInfo()
R version 4.3.2 (2023-10-31)
Platform: aarch64-apple-darwin20 (64-bit)
Running under: macOS Sonoma 14.2

Matrix products: default
BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib 
LAPACK: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.11.0

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

time zone: Europe/Berlin
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] EBMAforecast_1.0.31

loaded via a namespace (and not attached):
 [1] utf8_1.2.4         generics_0.1.3     gtools_3.9.5       separationplot_1.4
 [5] stringi_1.8.3      digest_0.6.34      magrittr_2.0.3     RColorBrewer_1.1-3
 [9] evaluate_0.23      grid_4.3.2         fastmap_1.1.1      plyr_1.8.9        
[13] nnet_7.3-19        backports_1.4.1    Formula_1.2-5      gridExtra_2.3     
[17] purrr_1.0.2        fansi_1.0.6        scales_1.3.0       abind_1.4-5       
[21] cli_3.6.2          rlang_1.1.3        munsell_0.5.0      Hmisc_5.1-1       
[25] base64enc_0.1-3    yaml_2.3.8         tools_4.3.2        checkmate_2.3.1   
[29] htmlTable_2.4.2    dplyr_1.1.4        colorspace_2.1-0   ggplot2_3.5.0     
[33] vctrs_0.6.5        R6_2.5.1           rpart_4.1.21       lifecycle_1.0.4   
[37] stringr_1.5.1      htmlwidgets_1.6.4  MASS_7.3-60        foreign_0.8-85    
[41] cluster_2.1.4      pkgconfig_2.0.3    pillar_1.9.0       gtable_0.3.4      
[45] glue_1.7.0         data.table_1.15.0  Rcpp_1.0.12        xfun_0.42         
[49] tibble_3.2.1       tidyselect_1.2.0   rstudioapi_0.15.0  knitr_1.45        
[53] htmltools_0.5.7    rmarkdown_2.25     compiler_4.3.2    
> 

Great, thanks for providing free software!