TimeSeriesRegressor

A wrapper estimator that transforms any sklearn regressor into a time series predictor or sequence-to-sequence mapper. The TimeSeriesRegressor (TSR) internally transforms a regular dataset, whose rows are the terms of a sequence, into a sequence prediction dataset and learns a sequence-to-sequence predictor.

Requires

NumPy, pandas, scikit-learn, pickle (part of the Python standard library)

Usage

To build a predictor that maps the previous two days of S&P 500 stock prices to the next day's price of AAPL stock, try the following:

from TimeSeriesEstimator import TimeSeriesRegressor, time_series_split
from sklearn.linear_model import LinearRegression, Lasso
from utils import datasets

# Load the S&P 500 dataset; the AAPL column is the prediction target
X = datasets('sp500')
y = X['AAPL']
# Split inputs and target into train and test portions
X_train, X_test = time_series_split(X)
y_train, y_test = time_series_split(y)

# Use the previous two days as the input for each prediction
n_prev = 2
tsr = TimeSeriesRegressor(Lasso(), n_prev=n_prev)
tsr.fit(X_train, y_train)
pred_train = tsr.predict(X_train)  # numpy array of length len(X_train) - n_prev
pred_test = tsr.predict(X_test)

To forecast all stocks in the S&P 500 100 days into the future, use the forecast method:

tsr = TimeSeriesRegressor(LinearRegression(), n_prev=2)
tsr.fit(X_train)
fc = tsr.forecast(X_train, 100)
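
This README does not describe how forecast works internally. One common way to forecast several steps ahead with a one-step predictor is to feed each prediction back in as the newest input, and the sketch below illustrates that idea only. The names recursive_forecast and toy_predict_next are hypothetical and are not part of this repo; the toy rule stands in for a fitted model.

```python
import numpy as np

def recursive_forecast(predict_next, history, n_prev, n_ahead):
    """Hypothetical helper: roll a one-step predictor forward n_ahead steps.

    predict_next maps a flattened window of shape (n_prev * n_features,)
    to the next row of shape (n_features,).
    """
    # Start from the last n_prev observed rows
    window = np.asarray(history, dtype=float)[-n_prev:]
    forecasts = []
    for _ in range(n_ahead):
        nxt = np.asarray(predict_next(window.ravel()), dtype=float)
        forecasts.append(nxt)
        # Slide the window: drop the oldest row, append the new prediction
        window = np.vstack([window[1:], nxt])
    return np.array(forecasts)

def toy_predict_next(flat_window):
    """Stand-in for a fitted model: linear extrapolation from two rows."""
    prev, last = flat_window.reshape(2, -1)
    return last + (last - prev)

history = np.array([[1, 1.5], [2, 2.5], [3, 3.5]])
fc = recursive_forecast(toy_predict_next, history, n_prev=2, n_ahead=3)
print(fc)
```

Because each step consumes earlier predictions, errors can compound as the horizon grows, which is worth keeping in mind for a 100-day forecast.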

See the IPython notebook for a longer interactive example!

Install

Clone this repo and import it directly as a module. Automatic install support has not been added yet.

Mechanics

The TSR takes either a single dataset (X) or two datasets (X, Y) of equal length. In the single-dataset case, the TSR assumes you want to predict the next element of the dataset from the previous elements. In either case, it builds its inputs by taking the previous n_prev timesteps and flattening them into a single vector.

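Dataset X

| Feature 1 | Feature 2 |
|-----------|-----------|
| 1         | 1.5       |
| 2         | 2.5       |
| 3         | 3.5       |
| 4         | 4.5       |
| 5         | 5.5       |

New X with n_prev = 2

| Feature 1 | Feature 2 | Feature 3 | Feature 4 |
|-----------|-----------|-----------|-----------|
| 1         | 1.5       | 2         | 2.5       |
| 2         | 2.5       | 3         | 3.5       |
| 3         | 3.5       | 4         | 4.5       |

New Y with n_prev = 2

| Feature 1 | Feature 2 |
|-----------|-----------|
| 3         | 3.5       |
| 4         | 4.5       |
| 5         | 5.5       |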
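
The transformation in the example above can be sketched in a few lines of NumPy. This is a minimal illustration for the single-dataset case, not the TSR's actual implementation; make_windows is a hypothetical name that does not come from this repo.

```python
import numpy as np

def make_windows(X, n_prev):
    """Hypothetical helper: turn a sequence X into (new_X, new_Y).

    Each row of new_X is n_prev consecutive rows of X flattened into one
    vector (oldest first); the matching row of new_Y is the row that follows.
    """
    X = np.asarray(X, dtype=float)
    n_samples = len(X) - n_prev
    if n_samples <= 0:
        raise ValueError("X must have more than n_prev rows")
    new_X = np.array([X[i:i + n_prev].ravel() for i in range(n_samples)])
    new_Y = X[n_prev:]
    return new_X, new_Y

X = np.array([[1, 1.5], [2, 2.5], [3, 3.5], [4, 4.5], [5, 5.5]])
new_X, new_Y = make_windows(X, n_prev=2)
print(new_X)
print(new_Y)
```

With five input rows and n_prev = 2, this yields three samples, which matches the len(X_train) - n_prev output length noted in the usage example.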

The new X and Y datasets can now be fit by any regression technique in sklearn. If the technique cannot handle vector outputs, use the "parallel_models" option so that each output feature sequence is predicted by its own regressor, each mapping the multi-dimensional input to a single output value.
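
The idea of one regressor per output feature can be sketched as follows. This is an assumption about the general pattern rather than a copy of the TSR's code; fit_parallel and predict_parallel are hypothetical names. SVR is used because it accepts only a one-dimensional target.

```python
import numpy as np
from sklearn.base import clone
from sklearn.svm import SVR

def fit_parallel(base_model, new_X, new_Y):
    """Hypothetical helper: fit an independent copy of base_model per output column."""
    return [clone(base_model).fit(new_X, new_Y[:, j])
            for j in range(new_Y.shape[1])]

def predict_parallel(models, new_X):
    """Stack each model's 1-D predictions into one column per output feature."""
    return np.column_stack([m.predict(new_X) for m in models])

# Windowed data from the example above (n_prev = 2)
new_X = np.array([[1, 1.5, 2, 2.5], [2, 2.5, 3, 3.5], [3, 3.5, 4, 4.5]])
new_Y = np.array([[3, 3.5], [4, 4.5], [5, 5.5]])

models = fit_parallel(SVR(), new_X, new_Y)
pred = predict_parallel(models, new_X)
print(pred.shape)
```

scikit-learn also ships sklearn.multioutput.MultiOutputRegressor, which wraps a single-output regressor in the same one-model-per-target way.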