issues on 'teacher force' implementation
Closed this issue · 1 comment
Hi, thank you for sharing your code. I think your implementation is rather elegant, but two points in the 'teacher force' part still confuse me.
-
In models.py line 260, 'output' is defined as:
output = obs_traj_rel[-1]
Since line 260 is only reached when 'training_step == 3', and 'obs_traj_rel' then contains the coordinates of both the observation steps (length = 8) and the prediction steps (length = 12), I cannot understand why the last coordinate of the prediction steps is used as the input of the first LSTM decoder cell (when 'teacher_force == False').
-
In models.py line 262, 'input_t' in each iteration is defined as:
input_t = obs_traj_rel[-self.pred_len :].chunk(
    obs_traj_rel[-self.pred_len :].size(0), dim=0)[i]
So the value of 'input_t' is actually the ground truth of the i-th time step of the prediction phase, and when 'teacher_force == True' it also serves as the input at the i-th time step. However, according to Deep Learning by Ian Goodfellow, when teacher forcing is active the input of the i-th RNN unit should be the label of the (i-1)-th time step. The implementation here therefore looks more like an autoencoder than teacher-forcing training.
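To make the distinction concrete, here is a minimal sketch (plain Python; the lists and the `last_observed` value are toy stand-ins for the tensors, and all names are hypothetical) of the conventional teacher-forcing input schedule versus the same-step schedule described above:

```python
# Toy ground-truth displacements for pred_len = 4 decoder steps.
gt = [10, 20, 30, 40]
last_observed = 0  # toy stand-in for the last observation-step displacement

# Conventional teacher forcing: the input at step i is the label of
# step i-1; the very first input is the last observed value.
inputs_teacher_forced = [last_observed] + gt[:-1]   # [0, 10, 20, 30]

# Schedule questioned above: the input at step i is the label of
# step i itself, so the decoder is handed its own target.
inputs_same_step = gt                                # [10, 20, 30, 40]

print(inputs_teacher_forced)
print(inputs_same_step)
```

Under the first schedule the decoder never sees the label it is currently asked to predict; under the second it does, which is what prompts the autoencoder comparison.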
Could you please explain these two points? Thanks a lot ;)
@Unbeliever98 I use the last relative coordinate in the observation steps as the input of the first LSTM decoder cell. This operation follows the SGAN model (check here).
As for the code input_t = obs_traj_rel[-self.pred_len :].chunk(obs_traj_rel[-self.pred_len :].size(0), dim=0)[i], note that len(obs_traj_rel) = obs_len + pred_len. And in my opinion, this is teacher forcing.
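A small indexing sketch may help pin down the disagreement (plain Python; a toy list stands in for the tensor, with obs_len = 8 and pred_len = 12 taken from the question). Assuming len(obs_traj_rel) = obs_len + pred_len as stated above, the expressions from both posts pick out the following entries:

```python
obs_len, pred_len = 8, 12

# Toy flattened trajectory: entry k just stores its own time index k,
# so the value printed for each slice reveals which step it refers to.
obs_traj_rel = list(range(obs_len + pred_len))   # [0, 1, ..., 19]

last_obs_step = obs_traj_rel[obs_len - 1]   # time index 7: last observation step
last_entry = obs_traj_rel[-1]               # time index 19: last prediction step

# Ground-truth label handed to the decoder at step i by the quoted code:
i = 0
input_t = obs_traj_rel[-pred_len:][i]       # time index 8: label of step i itself

print(last_obs_step, last_entry, input_t)
```

So if the tensor really holds obs_len + pred_len steps at line 260, obs_traj_rel[-1] is the last prediction-step entry rather than the last observation-step entry; the two readings only coincide when the tensor holds the observed part alone.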