Error in INTERNAL_EmbedDimension(pathIn, dataFile, dataFrame, pathOut, : FindNeighbors(): Library is too small to resolve 2 knn neighbors.
desertnaut commented
I am trying to use the package to analyze and predict the monthly sunspot data in rolling windows. Here is a fully reproducible example:
library(rEDM)
df <- data.frame(yr = as.numeric(time(sunspot.month)),
                 sunspot_count = as.numeric(sunspot.month))
# make indices for 11 rolling splits
train_splits <- rep(NA, 11)
test_splits <- rep(NA, 11)
periods_train <- 12 * 50 # 50 yrs
periods_test <- 12 * 10 # 10 yrs
skip_span <- 12 * 20 # 20 yrs
for (k in 1:11) {
  train_start <- (k - 1) * skip_span + 1
  train_stop <- train_start + periods_train - 1
  test_start <- train_stop + 1
  test_stop <- test_start + periods_test - 1
  train_splits[k] <- paste(as.character(train_start), as.character(train_stop))
  test_splits[k] <- paste(as.character(test_start), as.character(test_stop))
}
# END make indices
# Try embeddings & predictions
k <- 1
E.opt <- EmbedDimension(dataFrame = df,          # input data
                        lib = train_splits[k],   # portion of data to train
                        pred = test_splits[k],   # portion of data to predict
                        columns = "sunspot_count",
                        target = "sunspot_count")
# works OK with k = 1-3
# for k > 3, fails with:
# Error in INTERNAL_EmbedDimension(pathIn, dataFile, dataFrame, pathOut, :
# FindNeighbors(): Library is too small to resolve 2 knn neighbors.
It works OK with k = 1, 2, 3, but for larger values of k (it goes up to 11), it fails with the error message in the title.
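To see exactly which splits fail rather than testing them one at a time, the call can be wrapped in `tryCatch` and run over every k. The following is only a sketch: `run_splits` and `fit_E` are hypothetical helper names, not part of rEDM, and the commented usage assumes `df`, `train_splits` and `test_splits` from the code above.

```r
# Hypothetical helper (not part of rEDM): call `fit(lib, pred)` for every
# split and record the error message, or NA if the call succeeded.
run_splits <- function(fit, train_splits, test_splits) {
  results <- lapply(seq_along(train_splits), function(k) {
    msg <- tryCatch({
      fit(train_splits[k], test_splits[k])
      NA_character_                      # no error for this split
    }, error = function(e) conditionMessage(e))
    data.frame(k = k, lib = train_splits[k], pred = test_splits[k],
               error = msg, stringsAsFactors = FALSE)
  })
  do.call(rbind, results)
}

# Usage with the call from this report (requires rEDM and `df`):
# fit_E <- function(lib, pred) {
#   EmbedDimension(dataFrame = df, lib = lib, pred = pred,
#                  columns = "sunspot_count", target = "sunspot_count")
# }
# run_splits(fit_E, train_splits, test_splits)
```

The returned data frame has one row per split, so the failing `lib` ranges can be read off directly and quoted in a report.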
Since the library has the same size for every split (600 data points) and the first 3 splits work, why does this happen?
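The claim that every library has the same size can be checked directly. The sketch below recomputes the split boundaries in vectorized form using the same constants as above; the `splits` data frame and the `all_in_range` flag are names introduced here for illustration only.

```r
periods_train <- 12 * 50   # 50 yrs
periods_test  <- 12 * 10   # 10 yrs
skip_span     <- 12 * 20   # 20 yrs

k <- 1:11
train_start <- (k - 1) * skip_span + 1
train_stop  <- train_start + periods_train - 1
test_start  <- train_stop + 1
test_stop   <- test_start + periods_test - 1

# One row per split, with the number of rows in each window
splits <- data.frame(k, train_start, train_stop, test_start, test_stop,
                     n_train = train_stop - train_start + 1,
                     n_test  = test_stop - test_start + 1)

# TRUE if the last prediction window still lies inside the series
all_in_range <- max(splits$test_stop) <= length(sunspot.month)

print(splits)
print(all_in_range)
```

If `n_train` is 600 on every row and `all_in_range` is `TRUE`, the window arithmetic itself is not what distinguishes the failing splits from the working ones.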
I see the same behavior with the just-released v1.2.0 and with the previous v1.1.0.
Session info:
> sessionInfo()
R version 3.6.1 (2019-07-05)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1
Matrix products: default
locale:
[1] LC_COLLATE=English_United Kingdom.1252 LC_CTYPE=English_United Kingdom.1252 LC_MONETARY=English_United Kingdom.1252
[4] LC_NUMERIC=C LC_TIME=English_United Kingdom.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] rEDM_1.2.0
loaded via a namespace (and not attached):
[1] compiler_3.6.1 tools_3.6.1 Rcpp_1.0.3
desertnaut commented
Despite no response, checking with the latest rEDM version 1.3.7 shows that the issue is now resolved, so I'm closing this.