SugiharaLab/rEDM

time indices, concatenated blocks with model_output with stats_only=FALSE


If I remember correctly, there was briefly a "short_output" argument that truncated the model_output data.frame to just the pred set. Since that has been removed, it looks like the code is padding with NaNs to get NROW(model_output) = NROW(block), but it is not putting the NaNs in the right place and not giving them time indices.

block <- data.frame(time=1:10,x=sin((1:10)/pi),y=cos((1:10)/pi))
out <- block_lnlp(block,tp=2,columns=c("x","y"),target_column = "x",stats_only = FALSE)

out$model_output[[1]]

   time         obs      pred     pred_var
1     3  0.81627311 0.9655879 0.0003981692
2     4  0.95605566 0.9135897 0.0072468536
3     5  0.99978466 0.9284073 0.0024075621
4     6  0.94306673 0.8425109 0.0241078565
5     7  0.79160024 0.7335355 0.0534826032
6     8  0.56060280 0.5976110 0.0790172007
7     9  0.27328240 0.3439944 0.1140423509
8    10 -0.04149429 0.3950724 0.0318934212
9   NaN         NaN       NaN          NaN
10  NaN         NaN       NaN          NaN
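For reference, the layout I would expect can be sketched in base R. `align_to_time()` is a hypothetical helper, not part of rEDM, and the toy data frame only stands in for the output above. The helper drops the rows whose time is NaN and re-attaches the rest to the full time axis, so the padding lands on the times that actually lack a prediction (here times 1 and 2, since tp = 2).

```r
# Hypothetical helper (not part of rEDM): re-index a padded model_output
# onto the full time axis so that missing predictions appear as NA rows
# carrying their own time stamps.
align_to_time <- function(model_output, time) {
  # discard padding rows that have no usable time index
  model_output <- model_output[!is.na(model_output$time), ]
  full <- data.frame(time = time)
  # all.x = TRUE keeps every time stamp; unmatched rows are filled with NA
  merge(full, model_output, by = "time", all.x = TRUE)
}

# Toy stand-in: 8 rows with real time stamps plus 2 all-NaN padding rows
toy <- data.frame(time = 3:10, obs = sin((3:10) / pi), pred = sin((3:10) / pi))
toy <- rbind(toy, data.frame(time = c(NaN, NaN), obs = NaN, pred = NaN))

aligned <- align_to_time(toy, time = 1:10)
print(aligned)
```

This is only a post-processing workaround under those assumptions; it does not change what block_lnlp itself returns.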

Additionally, giving a split library does not appear to stop block_lnlp from making predictions across the gaps.

out <- block_lnlp(block,lib=rbind(c(1,5),c(6,10)),tp=2,columns=c("x","y"),target_column = "x",stats_only = FALSE)
out$model_output[[1]]
   time         obs      pred    pred_var
1     3  0.81627311 0.9593595 0.003831564
2     4  0.95605566 0.8972787 0.011777949
3     5  0.99978466 0.8827621 0.014781495
4     6  0.94306673 0.8959425 0.031103864
5     7  0.79160024 0.5932581 0.058161169
6     8  0.56060280 0.2666997 0.076164443
7     9  0.27328240 0.2824588 0.104255915
8    10 -0.04149429 0.3657842 0.024814822
9   NaN         NaN       NaN         NaN
10  NaN         NaN       NaN         NaN

If you do something similar with simplex(), however, you get the correct breaks in the predictions, corresponding to the breaks given in the library.

out_simplex <- simplex(block$x,lib=rbind(c(1,5),c(6,10)),E=1,tp=2,stats_only = FALSE)
out_simplex$model_output[[1]]
   time         obs        pred     pred_var
1     3  0.81627311  0.42321690 0.2476161338
2     4  0.95605566 -0.03897166 0.0007877017
3     5  0.99978466  0.27779014 0.0012748452
4     6  0.94306673         NaN          NaN
5     7  0.79160024         NaN          NaN
6     8  0.56060280  0.67176509 0.1307101216
7     9  0.27328240  0.99722448 0.0011178303
8    10 -0.04149429  0.95403246 0.0013772935
9   NaN         NaN         NaN          NaN
10  NaN         NaN         NaN          NaN

(The padding at the end of the time series is still wrong here, though.)
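To make explicit which rows should be blank for a split library, here is a small base-R sketch. `crosses_gap()` is a hypothetical helper, not an rEDM function; it only looks at the forecast step tp and ignores any extra rows that a larger embedding dimension would lose. For lib = rbind(c(1,5), c(6,10)) and tp = 2 it flags target times 6 and 7, which are the two rows simplex() leaves as NaN and block_lnlp fills in.

```r
# Hypothetical helper (not part of rEDM): flag target times whose
# prediction would start in one library segment and end in another.
crosses_gap <- function(lib, tp, n) {
  # segment id for each time index 1..n (NA if outside every segment)
  seg <- rep(NA_integer_, n)
  for (i in seq_len(nrow(lib))) {
    seg[lib[i, 1]:lib[i, 2]] <- i
  }
  from <- seq_len(n - tp)  # time the prediction is made from
  to <- from + tp          # time being predicted
  data.frame(time = to, crosses = seg[from] != seg[to])
}

gaps <- crosses_gap(rbind(c(1, 5), c(6, 10)), tp = 2, n = 10)
gap_times <- gaps$time[gaps$crosses]
print(gap_times)
```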

Finally, just as a double check, if you put a break in the time series itself, the NaNs for that are correct.

block_broken <- data.frame(time=1:10,x=sin((1:10)/pi),y=cos((1:10)/pi))
block_broken[5,c('x','y')] <- NA
out_broken <- block_lnlp(block_broken,tp=2,columns=c("x","y"),target_column = "x",stats_only = FALSE)
out_broken$model_output[[1]]
   time         obs      pred    pred_var
1     3  0.81627311 0.9443627 0.003857110
2     4  0.95605566 0.8381385 0.006627352
3     5          NA 0.9284073 0.002407562
4     6  0.94306673 0.7074185 0.055627882
5     7  0.79160024       NaN         NaN
6     8  0.56060280 0.3496210 0.111945403
7     9  0.27328240 0.3071343 0.114579057
8    10 -0.04149429 0.3781828 0.030487316
9   NaN         NaN       NaN         NaN
10  NaN         NaN       NaN         NaN
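The reasoning behind "correct" here can be written down in a few lines of base R. This is just a sketch with illustrative variable names (`bad_from`, `expected_nan_times`), not rEDM code: a row with missing inputs cannot serve as the starting point of a forecast, so the prediction it would have produced tp steps later should be NaN.

```r
# Rebuild the broken block from the example above
block_broken <- data.frame(time = 1:10,
                           x = sin((1:10) / pi),
                           y = cos((1:10) / pi))
block_broken[5, c("x", "y")] <- NA
tp <- 2

# Times at which the embedding columns are incomplete
bad_from <- block_broken$time[!complete.cases(block_broken[, c("x", "y")])]

# A forecast made from a bad row targets the time tp steps ahead
expected_nan_times <- bad_from + tp
print(expected_nan_times)
```

With the NA at time 5 and tp = 2 this gives time 7, which is the row with NaN in pred and pred_var above.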

Thanks for illustrating these issues.

Since this was done on an old version (pre 1.7.5) and encapsulates multiple issues, I'm going to close it and open a new issue focusing on the library break prediction output.

Please note it is recommended to use the new API, as it is consistent across implementations (C++, Python, R) and directly interfaces to cppEDM. Nonetheless, we certainly want to address the legacy (0.7.X) API compatibility.

I believe the NaN alignment issue has been resolved. Here's the output from version 1.8:

> block <- data.frame( time=1:10, x=sin((1:10)/pi), y=cos((1:10)/pi) )
> out <- block_lnlp( block, tp=2, columns=c("x","y"), target_column = "x", stats_only = FALSE )
> out $ model_output
   Index Observations Predictions Pred_Variance Const_Predictions
1      1      0.31296         NaN           NaN               NaN
2      2      0.59448         NaN           NaN               NaN
3      3      0.81627     0.96559     0.0003982           0.31296
4      4      0.95606     0.91359     0.0072469           0.59448
5      5      0.99978     0.92841     0.0024076           0.81627
6      6      0.94307     0.84251     0.0241079           0.95606
7      7      0.79160     0.79119     0.0389745           0.99978
8      8      0.56060     0.59761     0.0790172           0.94307
9      9      0.27328     0.34399     0.1140424           0.79160
10    10     -0.04149     0.39507     0.0318934           0.56060
11    11          NaN     0.09529     0.0413000           0.27328
12    12          NaN     0.17593     0.0557830          -0.04149
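As a rough cross-check of that layout, here is a base-R sketch (not rEDM code; `expected` is just an illustrative name) of the index and padding one would expect for tp = 2. Judging from the table above, the output has n + tp rows, the observations are padded with NaN at the end, and Const_Predictions is the observation shifted forward by tp.

```r
x <- sin((1:10) / pi)
tp <- 2
n <- length(x)

# Expected skeleton of the output: n + tp rows, NaN padding at the
# end of the observations and at the start of the shifted series
expected <- data.frame(
  Index = seq_len(n + tp),
  Observations = c(x, rep(NaN, tp)),
  Const_Predictions = c(rep(NaN, tp), x)
)
print(round(expected, 5))
```

The Predictions and Pred_Variance columns are left out because they depend on the model itself.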