issues on 'teacher force' implementation
Closed this issue · 1 comment
Hi, thank you for sharing your code. I think your implementation is rather elegant, but two points in the 'teacher force' part still confuse me.
-
In models.py line 260, 'output' is defined as:
output = obs_traj_rel[-1]
Since line 260 is only reached when 'training_step == 3', and 'obs_traj_rel' then contains the coordinates of both the observation steps (length = 8) and the prediction steps (length = 12), I cannot understand why the last coordinate of the prediction steps is used as the input of the first LSTM decoder cell (when 'teacher_force == False').
-
In models.py line 262, 'input_t' in each iteration is defined as:
input_t = obs_traj_rel[-self.pred_len :].chunk(
    obs_traj_rel[-self.pred_len :].size(0), dim=0)[i]
So the value of 'input_t' is actually the ground truth of the i-th time step of the prediction phase, and when 'teacher_force == True' it also serves as the input at the i-th time step. However, according to Deep Learning by Ian Goodfellow, when teacher forcing is active the input of the i-th RNN unit should be the label of the (i-1)-th time step. The implementation here therefore looks more like an autoencoder than teacher-forcing training.
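To make the distinction concrete, here is a minimal sketch (plain Python; the lists and the `last_observed` value are toy stand-ins for the tensors, and all names are hypothetical) of the conventional teacher-forcing input schedule versus the same-step schedule described above:

```python
# Toy ground-truth displacements for pred_len = 4 decoder steps.
gt = [10, 20, 30, 40]
last_observed = 0  # toy stand-in for the last observation-step displacement

# Conventional teacher forcing: the input at step i is the label of
# step i-1; the very first input is the last observed value.
inputs_teacher_forced = [last_observed] + gt[:-1]   # [0, 10, 20, 30]

# Schedule questioned above: the input at step i is the label of
# step i itself, so the decoder is handed its own target.
inputs_same_step = gt                                # [10, 20, 30, 40]

print(inputs_teacher_forced)
print(inputs_same_step)
```

Under the first schedule the decoder never sees the label it is currently asked to predict; under the second it does, which is what prompts the autoencoder comparison.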
Could you please explain these two points? Thanks a lot ;)
@Unbeliever98 I use the last relative coordinate in the observation steps as the input of the first LSTM decoder cell. This operation follows the SGAN model (check here).
As for the code input_t = obs_traj_rel[-self.pred_len :].chunk(obs_traj_rel[-self.pred_len :].size(0), dim=0)[i], note that len(obs_traj_rel) = obs_len + pred_len. And in my opinion, this is teacher forcing.
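A small indexing sketch may help pin down the disagreement (plain Python; a toy list stands in for the tensor, with obs_len = 8 and pred_len = 12 taken from the question). Assuming len(obs_traj_rel) = obs_len + pred_len as stated above, the expressions from both posts pick out the following entries:

```python
obs_len, pred_len = 8, 12

# Toy flattened trajectory: entry k just stores its own time index k,
# so the value printed for each slice reveals which step it refers to.
obs_traj_rel = list(range(obs_len + pred_len))   # [0, 1, ..., 19]

last_obs_step = obs_traj_rel[obs_len - 1]   # time index 7: last observation step
last_entry = obs_traj_rel[-1]               # time index 19: last prediction step

# Ground-truth label handed to the decoder at step i by the quoted code:
i = 0
input_t = obs_traj_rel[-pred_len:][i]       # time index 8: label of step i itself

print(last_obs_step, last_entry, input_t)
```

So if the tensor really holds obs_len + pred_len steps at line 260, obs_traj_rel[-1] is the last prediction-step entry rather than the last observation-step entry; the two readings only coincide when the tensor holds the observed part alone.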