nachiket92/conv-social-pooling

Considering seq2seq model

First of all, congratulations on this fantastic work!

Regarding the encoder-decoder architecture, have you ever considered using a sequence-to-sequence model? In a seq2seq model, the decoder's input at t=n is the decoder's own output at t=n-1. I don't know whether it would be of any benefit, and it complicates training and inference a little (typically, teacher forcing is used during training).

Thank you for your interest! Yes, a model where the encoding is used to initialize the decoder state could have been used as well.
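For concreteness, here is a minimal PyTorch sketch of such an autoregressive (seq2seq) decoder, where the encoding initializes the decoder state and each step's input is the previous step's output, with optional teacher forcing during training. All names and sizes (`AutoregressiveDecoder`, `enc_size`, `hidden_size`) are hypothetical and are not taken from this repo's model.py:

```python
import torch
import torch.nn as nn

class AutoregressiveDecoder(nn.Module):
    """Hypothetical seq2seq-style decoder (not this repo's architecture):
    the encoding initializes the LSTM state, and the input at step t is
    the decoder's own output at step t-1 (or the ground truth when
    teacher forcing is used during training)."""

    def __init__(self, enc_size=64, hidden_size=128, out_size=2):
        super().__init__()
        self.lstm_cell = nn.LSTMCell(out_size, hidden_size)
        self.enc_to_h = nn.Linear(enc_size, hidden_size)
        self.out = nn.Linear(hidden_size, out_size)

    def forward(self, enc, out_length, targets=None):
        # Use the encoding to initialize the decoder's hidden state.
        h = torch.tanh(self.enc_to_h(enc))
        c = torch.zeros_like(h)
        # Start from a zero "input token" at the first step.
        step_in = enc.new_zeros(enc.size(0), self.out.out_features)
        preds = []
        for t in range(out_length):
            h, c = self.lstm_cell(step_in, (h, c))
            y = self.out(h)
            preds.append(y)
            # Teacher forcing: feed the ground-truth position during
            # training; feed the model's own output at inference time.
            step_in = targets[:, t] if targets is not None else y
        return torch.stack(preds, dim=1)  # [batch, out_length, out_size]
```

During training one would pass `targets` to enable teacher forcing; at inference, `targets=None` makes the loop fully autoregressive.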

As you discussed above, I'm a little confused about the decoder's output. It should cover the 5-second prediction horizon, but the decoder in the model.py file appears to output the parameters of a bivariate Gaussian distribution at only a single time step, which seems inconsistent with the prediction task described in the CVPR 2018 paper.

The decoder generates outputs over the full 5-second horizon; the output has size [batch_size, args['out_length'], 5].
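To make that shape concrete, here is a minimal sketch, assuming an LSTM decoder that is fed the (repeated) encoding at every future step and maps each hidden state to the 5 parameters of a bivariate Gaussian (means, standard deviations, correlation). The names and sizes are placeholders rather than the repo's actual values; with a 5-second horizon at 5 Hz, `out_length` would be 25:

```python
import torch
import torch.nn as nn

class HorizonDecoder(nn.Module):
    """Minimal sketch (hypothetical names/sizes): the decoder unrolls
    over the whole prediction horizon and emits one set of bivariate
    Gaussian parameters per future time step."""

    def __init__(self, enc_size=112, hidden_size=128, out_length=25):
        super().__init__()
        self.out_length = out_length
        self.lstm = nn.LSTM(enc_size, hidden_size, batch_first=True)
        self.op = nn.Linear(hidden_size, 5)  # mu_x, mu_y, sigma_x, sigma_y, rho

    def forward(self, enc):
        # enc: [batch, enc_size]; feed the same encoding at every step.
        dec_in = enc.unsqueeze(1).repeat(1, self.out_length, 1)
        h_dec, _ = self.lstm(dec_in)   # [batch, out_length, hidden_size]
        return self.op(h_dec)          # [batch, out_length, 5]
```

So each of the `out_length` rows holds one bivariate Gaussian, i.e., the decoder predicts a distribution over the entire horizon rather than at a single moment. (Any activation constraining the sigma and rho outputs, as the repo applies, is omitted here for brevity.)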