philipperemy/keras-tcn

How to use TCN for multidimensional time series prediction?

dingfengqian opened this issue · 6 comments

Hi, thanks very much for your open code. I have a question about prediction for multidimensional data.
If the input to the TCN is (None, 10, 5), it means a window contains 10 time steps and each step has 5 dimensions. I want to know how the TCN makes its prediction: does it create a TCN for each dimension to make a separate prediction?
Looking forward to your reply.

Luux commented

Hi, I'm not sure what you mean by that? For prediction, you also input a time series just like in training. The shapes are the same as with a normal LSTM.
So for (None, 10, 5) you have 10 time steps where each has 5 features. The None is for the batch size/entry of the current sample within a minibatch, so you can ignore that and just look at the (10, 5). For prediction, it is the same; you just input the entire sequence.

When using the standard parameters, you'll get an output with 5 features. So when including the batch info, your input is (None, 10, 5) and the output is (None, 5). However, if you want to make separate predictions for each time step, you can use return_sequences=True; then your output shape will be (None, 10, 5) as well.

Thanks for your reply. For the prediction on (None, 10, 5), I want to know whether it can be regarded as one TCN model with shared parameters making separate predictions for these 5 dimensions. That is, a TCN model that makes 5 separate predictions on inputs of shape (None, 10, 1).

Luux commented

So if you want to make separate predictions where the features are treated as completely independent (i.e. they should not influence or "know about" each other at all), then you'd really have to create different TCN models. (@philipperemy please correct me if I'm wrong and there's some magic argument for that use case... :) )

If you do not want to treat them as completely independent features and just want 5 output dimensions as well, you can use one TCN normally.

@Luux thanks for helping and replying to Ding. I can add a few more things.
@dingfengqian Your example with input_dim=5 is similar to the case of an RGB image with 3 channels processed by a 2D convolution. Here we don't have 2D convolutions but 1D convolutions, because we drop width and height and only consider the time dimension. But the logic is the same.

So yes, there will be some interactions. It is definitely not the case that we fetch the input dimensions one by one and run the TCN on each of them. There is no `for i in range(5)` in the code. Under the hood (inside the convolutional layers), you will find matrices that map input_dim to nb_filters (the output_dim of the convolution). So if you set nb_filters=64, you will find somewhere a 5x64 matrix (I omitted the kernel_size for clarity; if you consider kernel_size=3, then the kernel of your Conv1D will have shape 3x5x64).
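The kernel shape and the channel mixing can be checked with a toy causal convolution in NumPy. This is only a sketch of the core operation, not keras-tcn's actual implementation (which adds dilations, residual blocks, etc.):

```python
import numpy as np

def causal_conv1d(x, w):
    """Causal 1D convolution: the output at time t only sees inputs <= t.

    x: (batch, time, in_channels), w: (kernel_size, in_channels, filters).
    Returns (batch, time, filters): every filter mixes ALL input channels.
    """
    k = w.shape[0]
    # Left-pad the time axis so the receptive field stays strictly causal.
    xp = np.pad(x, ((0, 0), (k - 1, 0), (0, 0)))
    t = x.shape[1]
    return np.stack(
        [np.einsum("bkc,kcf->bf", xp[:, i:i + k, :], w) for i in range(t)],
        axis=1,
    )

rng = np.random.default_rng(0)
x = rng.normal(size=(2, 10, 5))   # (batch, 10 time steps, input_dim=5)
w = rng.normal(size=(3, 5, 64))   # kernel_size=3, input_dim=5, nb_filters=64

seq = causal_conv1d(x, w)         # like return_sequences=True
last = seq[:, -1, :]              # like return_sequences=False

print(seq.shape)   # (2, 10, 64)
print(last.shape)  # (2, 64)
```

The einsum makes the interaction explicit: each of the 64 filters is a weighted sum over all 5 input dimensions, which is why the features are not processed independently.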

IF YOU DO NOT WANT ANY INTERACTIONS: push input_dim into the batch dimension and run your TCN with input_dim=1. If you have (None, 10, 5), reshape it to (None * 5, 10, 1), call the TCN layer to obtain (None * 5, 1) [assuming it's a regression problem], and reshape that back into a list of 5 elements of shape (None, 1).
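The reshape trick can be sketched in NumPy. One caveat worth spelling out: the feature axis has to be moved in front of the time axis first, because a direct reshape of (None, 10, 5) to (None * 5, 10, 1) would scramble time steps and features:

```python
import numpy as np

batch, steps, dims = 4, 10, 5
x = np.arange(batch * steps * dims, dtype=float).reshape(batch, steps, dims)

# Move input_dim into the batch axis: (batch, 10, 5) -> (batch * 5, 10, 1).
# Transpose first so each entry of the new batch is one feature's full series.
x_ind = x.transpose(0, 2, 1).reshape(batch * dims, steps, 1)

# Sanity check: sample 0, feature 2 became batch entry 0 * dims + 2.
assert np.array_equal(x_ind[2, :, 0], x[0, :, 2])

# A TCN with input_dim=1 would now see the 5 features as independent samples.
# Stand-in for the TCN's (batch * 5, 1) regression output:
y = x_ind.mean(axis=1)

# Split back into a list of 5 arrays of shape (batch, 1), one per dimension.
y_per_dim = [y.reshape(batch, dims, 1)[:, d, :] for d in range(dims)]
print(len(y_per_dim), y_per_dim[0].shape)   # 5 (4, 1)
```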

Luux commented

But only if the same filters/weights can be applied to each of these entries. If the features depict entirely different things that cannot be modeled with the same learned filters/weights, you need different TCNs, of course. That depends on your data @dingfengqian :)

I'll close this issue for now as I think we gave ample information.