nachiket92/conv-social-pooling

Considering seq2seq model

First of all, congratulations on this fantastic work!

Regarding the encoder-decoder architecture, have you ever considered using a sequence-to-sequence model? In a seq2seq model, the decoder's input at t=n is the decoder's own output at t=n-1. I don't know whether it would be of any benefit, and it complicates training and inference a little (typically, teacher forcing is used during training).

Thank you for your interest! Yes, a model where the encoding is used to initialize the decoder state could have been used as well.
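For concreteness, here is a minimal PyTorch sketch of such an autoregressive (seq2seq) decoder, where the encoding initializes the decoder state and each step's input is the previous step's output, with optional teacher forcing during training. All names and sizes (`AutoregressiveDecoder`, `enc_size`, `hidden_size`) are hypothetical and are not taken from this repo's model.py:

```python
import torch
import torch.nn as nn

class AutoregressiveDecoder(nn.Module):
    """Hypothetical seq2seq-style decoder (not this repo's architecture):
    the encoding initializes the LSTM state, and the input at step t is
    the decoder's own output at step t-1 (or the ground truth when
    teacher forcing is used during training)."""

    def __init__(self, enc_size=64, hidden_size=128, out_size=2):
        super().__init__()
        self.lstm_cell = nn.LSTMCell(out_size, hidden_size)
        self.enc_to_h = nn.Linear(enc_size, hidden_size)
        self.out = nn.Linear(hidden_size, out_size)

    def forward(self, enc, out_length, targets=None):
        # Use the encoding to initialize the decoder's hidden state.
        h = torch.tanh(self.enc_to_h(enc))
        c = torch.zeros_like(h)
        # Start from a zero "input token" at the first step.
        step_in = enc.new_zeros(enc.size(0), self.out.out_features)
        preds = []
        for t in range(out_length):
            h, c = self.lstm_cell(step_in, (h, c))
            y = self.out(h)
            preds.append(y)
            # Teacher forcing: feed the ground-truth position during
            # training; feed the model's own output at inference time.
            step_in = targets[:, t] if targets is not None else y
        return torch.stack(preds, dim=1)  # [batch, out_length, out_size]
```

During training one would pass `targets` to enable teacher forcing; at inference, `targets=None` makes the loop fully autoregressive.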

As you discussed above, I'm a little confused about the decoder's output. It should cover the 5-second prediction horizon, but the decoder in the model.py file appears to output the parameters of a bivariate Gaussian distribution at only a single time step, which seems inconsistent with the prediction task described in the CVPR 2018 paper.

The decoder generates outputs over the full 5-second horizon; the output has size [batch_size, args['out_length'], 5].
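To make that shape concrete, here is a minimal sketch, assuming an LSTM decoder that is fed the (repeated) encoding at every future step and maps each hidden state to the 5 parameters of a bivariate Gaussian (means, standard deviations, correlation). The names and sizes are placeholders rather than the repo's actual values; with a 5-second horizon at 5 Hz, `out_length` would be 25:

```python
import torch
import torch.nn as nn

class HorizonDecoder(nn.Module):
    """Minimal sketch (hypothetical names/sizes): the decoder unrolls
    over the whole prediction horizon and emits one set of bivariate
    Gaussian parameters per future time step."""

    def __init__(self, enc_size=112, hidden_size=128, out_length=25):
        super().__init__()
        self.out_length = out_length
        self.lstm = nn.LSTM(enc_size, hidden_size, batch_first=True)
        self.op = nn.Linear(hidden_size, 5)  # mu_x, mu_y, sigma_x, sigma_y, rho

    def forward(self, enc):
        # enc: [batch, enc_size]; feed the same encoding at every step.
        dec_in = enc.unsqueeze(1).repeat(1, self.out_length, 1)
        h_dec, _ = self.lstm(dec_in)   # [batch, out_length, hidden_size]
        return self.op(h_dec)          # [batch, out_length, 5]
```

So each of the `out_length` rows holds one bivariate Gaussian, i.e., the decoder predicts a distribution over the entire horizon rather than at a single moment. (Any activation constraining the sigma and rho outputs, as the repo applies, is omitted here for brevity.)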