SMap throws an error if the dataframe has no target values for the pred indices
Closed this issue · 1 comments
I am trying to use SMap for time series forecasting.
Given an input dataframe with
lib = "startLib endLib" and pred = "startPred endPred"
I get good results only if both index ranges have valid values WITHIN the input dataframe.
If I set the pred range beyond the end of the input dataframe, it throws an error.
So I appended a dummy dataframe of zeros to the end of the input dataframe,
sized to cover the pred range.
Then the predictions are horrible.
If I follow the tutorial on my data, I get proper results.
In the tutorial, the dataframe has future values for the target variable.
But if I set the future values to NaN, it throws an error.
If I set the future values to 0.0, it produces junk results.
How should I prepare my dataframe properly so that forecasting can be achieved?
Please provide a minimal example of the problem and version information.
The state space library is created from lib, and predictions are made by finding nearest neighbors to each pred point in the state space. Thus, as you identified, lib and pred must have data.
Forecasting beyond pred can be done with Tp > 0. For example:
>>> import pyEDM
>>> pyEDM.__version__
'2.0.3'
>>> df = pyEDM.sampleData['Lorenz5D']
>>> df.shape
(1000, 6)
>>> df.iloc[-3:,:]
      Time      V1      V2      V3      V4      V5
997  59.85  0.5780 -1.6804  3.7693  8.3641  4.3277
998  59.90 -0.8845 -1.2133  3.3424  9.2297  2.8772
999  59.95 -1.4639 -0.8408  3.0409  9.6364  1.0788
>>> sm = pyEDM.SMap( dataFrame = df, columns = 'V1', target = 'V1',
...                  lib = [1,500], pred = [997,1000], E = 5, Tp = 1, theta = 3. )
>>> sm['predictions']
    Time  Observations  Predictions  Pred_Variance
0  59.80        2.5553          NaN            NaN
1  59.85        0.5780     0.539239      16.484263
2  59.90       -0.8845    -0.900529      15.487685
3  59.95       -1.4639    -1.457071      11.789297
4  60.00           NaN    -1.109293       8.244481
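The last row above (Time 60.00) has no observation because it lies one Tp step past the end of the data. To illustrate why that works without padding the dataframe, here is a minimal pure-Python sketch of the nearest-neighbor idea. It is not pyEDM's implementation: it uses a simplified simplex-style weighted average rather than the locally weighted linear regression that S-Map fits, and the names `embed` and `nn_forecast`, the zero-based inclusive index ranges, and the sine test series are all assumptions made for this illustration.

```python
import math

def embed(series, E):
    """Map each index t to the delay vector (x[t], x[t-1], ..., x[t-E+1])."""
    return {t: tuple(series[t - j] for j in range(E))
            for t in range(E - 1, len(series))}

def nn_forecast(series, lib, pred, E=2, Tp=1, k=None):
    """Hypothetical helper: simplex-style forecast Tp steps ahead.

    lib and pred are inclusive, zero-based (start, end) index pairs.
    Every pred index must exist in the series, but the forecast
    target index t + Tp may lie past the end of the data.
    """
    k = k or E + 1
    vectors = embed(series, E)
    # Library points must have an observed value Tp steps ahead.
    library = [s for s in range(lib[0], lib[1] + 1)
               if s in vectors and s + Tp < len(series)]
    forecasts = {}
    for t in range(pred[0], pred[1] + 1):
        if t not in vectors:
            continue  # not enough history to build a delay vector
        # k nearest library neighbors of the pred point in state space
        ranked = sorted((math.dist(vectors[t], vectors[s]), s)
                        for s in library if s != t)[:k]
        d_min = max(ranked[0][0], 1e-12)  # guard against division by zero
        weights = [math.exp(-d / d_min) for d, _ in ranked]
        # Weighted average of where each neighbor went Tp steps later
        forecasts[t + Tp] = (
            sum(w * series[s + Tp] for w, (_, s) in zip(weights, ranked))
            / sum(weights)
        )
    return forecasts

# Synthetic series with 500 observed points (indices 0..499)
series = [math.sin(0.3 * i) for i in range(500)]
forecasts = nn_forecast(series, lib=(0, 400), pred=(497, 499), E=2, Tp=1)

for t in sorted(forecasts):
    observed = series[t] if t < len(series) else float("nan")
    print(t, observed, forecasts[t])
```

The forecast keyed by index 500 plays the same role as the Time 60.00 row: the pred point at index 499 has real data, and only its projection Tp steps ahead falls outside the series. Padding the series with zeros instead would place the pred points at artificial locations in state space, so their neighbors would be meaningless.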