Current check for model weights summing to 1 suffers from floating point imprecision
Closed this issue · 1 comments
mhoehle commented
EBMA/EBMAforecast/R/fitEnsembleNormal.R
Line 47 in 2961a45
The quoted code sum(W[1,]) == 1
suffers from numerical imprecision if the number of models is sufficiently large. On my Mac M3 this check fails on a structure with M=10 models. I suggest to replace the call with
all.equal(sum(W[1,]), 1)
, which is more tolerant to floating point imprecision.
I did not check it, but the problem might occur in other places.
P.S. Reproducible example (on Apple M3, might not occur on other machines):
>
>
> # Minimal reproducible example to show problem on Apple M3
> library(EBMAforecast)
> set.seed(1)
> pred_train <- matrix( rnorm(900), ncol=10)
> pred_test <- matrix( rnorm(100), ncol=10)
> truth_train <- apply(pred_train,1, mean) + rnorm(nrow(pred_train), sd=0.1)
> truth_test <- apply(pred_train,1, mean) + rnorm(nrow(pred_test), sd=0.1)
> myForecastData <- makeForecastData(
+ .predCalibration = pred_train,
+ .outcomeCalibration = truth_train,
+ .predTest = pred_test,
+ .outcomeTest = truth_test
+ )
> calibrateEnsemble(myForecastData, model="normal")
Model weights estimated using EM algorithmError in .local(.forecastData, tol, maxIter, method, exp, useModelParams, :
Vector of initial model weights must sum to 1.
> # Problem is in the comparison
> M <- dim(myForecastData@predCalibration)[2]
> W <- matrix(rep(1/M, M, collapse=FALSE), ncol=M)
> W
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1
> dim(W)[2] != M
[1] FALSE
> sum(W[1,]) == 1
[1] FALSE
> all.equal(sum(W[1,]), 1)
[1] TRUE
> sessionInfo()
R version 4.3.2 (2023-10-31)
Platform: aarch64-apple-darwin20 (64-bit)
Running under: macOS Sonoma 14.2
Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRlapack.dylib; LAPACK version 3.11.0
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
time zone: Europe/Berlin
tzcode source: internal
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] EBMAforecast_1.0.31
loaded via a namespace (and not attached):
[1] utf8_1.2.4 generics_0.1.3 gtools_3.9.5 separationplot_1.4
[5] stringi_1.8.3 digest_0.6.34 magrittr_2.0.3 RColorBrewer_1.1-3
[9] evaluate_0.23 grid_4.3.2 fastmap_1.1.1 plyr_1.8.9
[13] nnet_7.3-19 backports_1.4.1 Formula_1.2-5 gridExtra_2.3
[17] purrr_1.0.2 fansi_1.0.6 scales_1.3.0 abind_1.4-5
[21] cli_3.6.2 rlang_1.1.3 munsell_0.5.0 Hmisc_5.1-1
[25] base64enc_0.1-3 yaml_2.3.8 tools_4.3.2 checkmate_2.3.1
[29] htmlTable_2.4.2 dplyr_1.1.4 colorspace_2.1-0 ggplot2_3.5.0
[33] vctrs_0.6.5 R6_2.5.1 rpart_4.1.21 lifecycle_1.0.4
[37] stringr_1.5.1 htmlwidgets_1.6.4 MASS_7.3-60 foreign_0.8-85
[41] cluster_2.1.4 pkgconfig_2.0.3 pillar_1.9.0 gtable_0.3.4
[45] glue_1.7.0 data.table_1.15.0 Rcpp_1.0.12 xfun_0.42
[49] tibble_3.2.1 tidyselect_1.2.0 rstudioapi_0.15.0 knitr_1.45
[53] htmltools_0.5.7 rmarkdown_2.25 compiler_4.3.2
>
mhoehle commented
Great, thanks for providing free software!