cerlymarco/MEDIUM_NoteBook

Multivariate anomaly detection in time series VAR

LucyGarlapati opened this issue · 1 comments

How do I split the data into train and test sets for multivariate analysis? I have a date column and 4 features.

You can use train_test_split from sklearn with shuffle=False. The rows are then split in their existing order, so make sure the data is sorted by date first:

import numpy as np
from sklearn.model_selection import train_test_split

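# Toy data: 10 time-ordered samples with 4 features each, shape (n_samples, n_features)
X = np.asarray([np.arange(10)] * 4).T

# shuffle=False keeps the row order: the first 80% go to train, the last 20% to test
X_train, X_test = train_test_split(X, test_size=0.2, shuffle=False)
X_train, X_test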

output:

(array([[0, 0, 0, 0],
        [1, 1, 1, 1],
        [2, 2, 2, 2],
        [3, 3, 3, 3],
        [4, 4, 4, 4],
        [5, 5, 5, 5],
        [6, 6, 6, 6],
        [7, 7, 7, 7]]),
 array([[8, 8, 8, 8],
        [9, 9, 9, 9]]))
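
The same approach should carry over to a pandas DataFrame with a date column and 4 features, as described in the question. The following is only a minimal sketch: the column names (date, f1 to f4) and the generated values are hypothetical placeholders, not taken from the original data.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical data: a daily date column plus 4 feature columns.
df = pd.DataFrame(
    np.arange(40).reshape(10, 4),
    columns=["f1", "f2", "f3", "f4"],  # assumed feature names
)
df.insert(0, "date", pd.date_range("2020-01-01", periods=10, freq="D"))

# Sort chronologically and index by date so the split follows time order.
df = df.sort_values("date").set_index("date")

# shuffle=False keeps the earliest 80% of rows for training
# and the latest 20% for testing.
train, test = train_test_split(df, test_size=0.2, shuffle=False)

print(train.shape, test.shape)
print(train.index.max(), test.index.min())
```

Because the rows are sorted by date before splitting, every training row precedes every test row in time, which avoids leaking future observations into the training set.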