Changing Tp in Smap/Simplex results in a time-shift effect?
When I ran SMap on my own data with Tp = 1 and plotted the Observations and Predictions columns of the output directly, I got two similar traces with a clear time shift between them.
I then went back to the rEDM example dataset and found a similar time-shift effect.
Here is my SMap and plotting code for the 3 figures below (only Tp changes between figures).
smap = SMap( dataFrame = df, lib = "1 287", pred = "1 287",
columns = "sunspot_count", target = "sunspot_count",
E = 5 , theta = 2,
Tp = 1 )
plot( df$yr[1:100], df$sunspot_count[1:100], type = "p", xlab = "year", ylab = "sunspots")
lines( df$yr[1:100], df$sunspot_count[1:100], col = "black", lwd = 2)
lines( smap$predictions$yr[1:100], smap$predictions$Predictions[1:100], col = "red")
My understanding of Tp is that it uses x(t) to predict x(t+Tp). So in general, with a larger Tp (if the system has only a short memory), the prediction should get worse, but it should not be time-shifted, right?
Or, in the SMap output table below (Tp = 1 case), should the predicted value (Predictions: 72.286) in the row with yr = 1706 be compared with the observation (Observations: 58) in the row with yr = 1705? If so, we would need to plot obs(t) against pred(t-1).
Thanks for the comments.
Re: My understanding of Tp is that it uses x(t) to predict x(t+Tp). So generally with larger Tp (if the system only has a slight memory), the prediction would be worse, but not time shifted, right?
Agreed. To be a bit more specific, the state-space embedding created from x(t) is used to predict x(t+Tp). If the system is linear or heavily autocorrelated, then prediction may not degrade with a linear projection.
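To make the indexing concrete, here is a minimal base-R sketch of how an embedding vector at time t is paired with its target at t+Tp. It is not rEDM's internal code: `make_embedding` is a hypothetical helper, the series is synthetic, and a delay of one step between embedding coordinates is assumed.

```r
# Hypothetical helper (not part of rEDM): pair each embedding vector
# with the value Tp steps ahead of it.
make_embedding <- function(x, E, Tp) {
  n <- length(x)
  emb <- embed(x, E)        # row for time t is (x[t], x[t-1], ..., x[t-E+1])
  t_idx <- E:n              # times at which a full embedding vector exists
  keep <- t_idx + Tp <= n   # drop vectors whose target falls past the end
  list(t           = t_idx[keep],
       vectors     = emb[keep, , drop = FALSE],
       target_time = t_idx[keep] + Tp,
       target      = x[t_idx[keep] + Tp])
}

x  <- sin(seq(0, 6 * pi, length.out = 60))   # synthetic series, illustration only
em <- make_embedding(x, E = 5, Tp = 1)

em$t[1]             # first time with a complete E = 5 embedding vector
em$target_time[1]   # the time its target value belongs to
```

With E = 5 the first complete vector sits at the fifth sample, and its target is the sample Tp steps later. Under this assumed scheme a prediction is reported at the time of the target, not at the time of the vector that produced it.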
The data/analysis agree with this:
> df = read.csv( 'Yearly_1700-2009.csv' )
> head(df)
Year Sunspot
1 1700 5
2 1701 11
3 1702 16
> smap = SMap( dataFrame = df, lib = "1 310", pred = "1 310", columns = 'Sunspot', target = 'Sunspot', E = 5, theta = 2, Tp = 1 )
> pred = smap $ predictions
> smap4 = SMap( dataFrame = df, lib = "1 310", pred = "1 310", columns = 'Sunspot', target = 'Sunspot', E = 5, theta = 2, Tp = 4 )
> pred4 = smap4 $ predictions
> unlist( ComputeError( pred4$Observations, pred4$Predictions ) )
MAE rho RMSE
21.7855 0.6739 30.0556
> unlist( ComputeError( pred $ Observations, pred $ Predictions ) )
MAE rho RMSE
10.4504 0.9439 13.3845
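For readers without rEDM at hand, the three statistics can be sketched in base R. This assumes the standard definitions (mean absolute error, Pearson correlation, root mean square error) computed over rows where both values are present; `error_stats` is a hypothetical helper, not rEDM's `ComputeError`, and the toy vectors are made up.

```r
# Hypothetical helper: standard error statistics over complete pairs only.
error_stats <- function(obs, pred) {
  ok <- !is.na(obs) & !is.na(pred)   # is.na() is also TRUE for NaN
  o <- obs[ok]
  p <- pred[ok]
  c(MAE  = mean(abs(o - p)),
    rho  = cor(o, p),
    RMSE = sqrt(mean((o - p)^2)))
}

# Toy data (not from the sunspot series): the first prediction is
# missing, as happens for the leading rows of an SMap output.
obs   <- c(1, 2, 3, 4)
pred  <- c(NaN, 2, 4, 4)
stats <- error_stats(obs, pred)
print(round(stats, 4))
```

Rows with a missing prediction are dropped before anything is computed, so the leading NaN rows do not affect the statistics.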
We can see that the output predictions are aligned with the observations: the first Tp prediction values are missing (NaN), and the code explicitly addresses this:
> head(pred,5)
Year Observations Predictions Pred_Variance
1 1704 36 NaN NaN
2 1705 58 44.99 1352
3 1706 29 72.23 1744
4 1707 20 12.09 2585
5 1708 10 10.55 2061
> head(pred4,10)
Year Observations Predictions Pred_Variance
1 1704 36 NaN NaN
2 1705 58 NaN NaN
3 1706 29 NaN NaN
4 1707 20 NaN NaN
5 1708 10 37.412 1634
6 1709 8 43.368 1386
7 1710 3 8.128 3260
8 1711 0 15.403 3215
9 1712 0 47.295 2099
10 1713 2 87.191 2321
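A quick way to confirm this from the tables is to count the leading missing predictions. The sketch below retypes the rows printed above into small data frames (the values are copied from that output, and `leading_missing` is a hypothetical helper), then checks that the count equals Tp.

```r
# Rows copied from the head(pred, 5) and head(pred4, 10) output above.
pred <- data.frame(
  Year         = 1704:1708,
  Observations = c(36, 58, 29, 20, 10),
  Predictions  = c(NaN, 44.99, 72.23, 12.09, 10.55))
pred4 <- data.frame(
  Year         = 1704:1713,
  Observations = c(36, 58, 29, 20, 10, 8, 3, 0, 0, 2),
  Predictions  = c(NaN, NaN, NaN, NaN, 37.412, 43.368,
                   8.128, 15.403, 47.295, 87.191))

# Hypothetical helper: number of missing values before the first prediction.
leading_missing <- function(p) {
  ok <- which(!is.na(p))
  if (length(ok) == 0) length(p) else ok[1] - 1
}

n_missing_tp1 <- leading_missing(pred$Predictions)    # Tp = 1 run
n_missing_tp4 <- leading_missing(pred4$Predictions)   # Tp = 4 run

# First year that carries a prediction in each run.
first_year_tp1 <- pred$Year[!is.na(pred$Predictions)][1]
first_year_tp4 <- pred4$Year[!is.na(pred4$Predictions)][1]
```

Both tables start at 1704, the first year with a complete E = 5 embedding for data beginning in 1700, and the first prediction appears Tp rows later. So Observations and Predictions in the same row refer to the same year and can be plotted against the same Year column without shifting either one.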
When I plot the results they agree with yours.
Perhaps the appearance of a "shift" is somewhat perceptual. I too have noticed this in periodic data. The Tp=4 results seem to show a negative shift in the peaks prior to 1770 and a positive shift afterward, with no apparent shift in the lower extrema. EDM is simply using the nearest neighbors found from the embedding of x(t) to predict Tp timesteps ahead. Perceived shifts are potentially biased by experience/perception. That is not to say that periodic systems do not exhibit time-delayed/advanced dynamics... another topic.
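If you want something less subjective than eyeballing the plot, one option is to correlate the observations with the predictions at several lags and see where the correlation peaks. This is only a sketch in base R on a synthetic sine wave, not something rEDM provides; `lag_cor` is a hypothetical helper.

```r
# Hypothetical helper: correlation between obs[t] and pred[t + k]
# for k = -max_lag, ..., max_lag, ignoring missing values.
lag_cor <- function(obs, pred, max_lag = 5) {
  n <- length(obs)
  lags <- -max_lag:max_lag
  rho <- sapply(lags, function(k) {
    if (k >= 0) {
      o <- obs[1:(n - k)]
      p <- pred[(1 + k):n]
    } else {
      o <- obs[(1 - k):n]
      p <- pred[1:(n + k)]
    }
    cor(o, p, use = "complete.obs")
  })
  data.frame(lag = lags, rho = rho)
}

# Synthetic periodic series (illustration only, not the sunspot data).
t   <- 1:200
obs <- sin(2 * pi * t / 11)

pred_aligned <- obs                            # predictions with no shift
pred_lagged  <- c(NaN, obs[-length(obs)])      # pred[t] = obs[t - 1]

res_aligned <- lag_cor(obs, pred_aligned)
res_lagged  <- lag_cor(obs, pred_lagged)

best_aligned <- res_aligned$lag[which.max(res_aligned$rho)]
best_lagged  <- res_lagged$lag[which.max(res_lagged$rho)]
```

A peak at lag 0 means the two traces line up. A peak at a nonzero lag means the predictions trail or lead the observations by that many steps. On real output you would pass the Observations and Predictions columns of the predictions table, and the peak will usually be much less sharp than in this noiseless example.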
Thank you for the explanation!
It's good to know that my SMap run gives the same results as yours.
I guess, then, that part of the apparent shift in the predictions for my dataset is a property of the data itself.