Khrylx/PyTorch-RL

GAIL discriminator loss uses complete expert data in each iteration?

SapanaChaudhary opened this issue · 4 comments

discrim_criterion(e_o, zeros((expert_traj.shape[0], 1), device=device))

The number of generator data samples seems to be around 2088, while the number of expert samples is 50000. Shouldn't the number of expert samples match the number of generator samples?

The BCELoss we use for the discriminator averages the loss over the number of samples, so it should be fine.
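To illustrate the point about averaging, here is a minimal sketch (the tensor shapes follow the counts mentioned above; the variable names and random scores are illustrative, not the repo's actual data): with the default reduction='mean', each BCELoss term is a per-sample average, so the generator and expert terms are on the same scale despite the unequal sample counts.

```python
import torch
import torch.nn as nn

# BCELoss with the default reduction='mean' averages over the batch,
# so the two loss terms are comparable despite unequal batch sizes.
criterion = nn.BCELoss()

# Stand-in discriminator scores in (0, 1): 2088 generator samples
# vs 50000 expert samples, as discussed above.
g_o = torch.rand(2088, 1)
e_o = torch.rand(50000, 1)

# Generator samples labeled 1, expert samples labeled 0 (the repo's convention).
g_loss = criterion(g_o, torch.ones(2088, 1))
e_loss = criterion(e_o, torch.zeros(50000, 1))

# Each term is already a per-sample mean, so the sum is not dominated
# by the larger expert batch.
discrim_loss = g_loss + e_loss
```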

That is right. So the whole expert dataset is used in each iteration. Is that fair? If I were to sample the same number of samples (from the complete pool of expert data) as the generator's, how would you suggest I sample (uniformly at random)?

Yes, maybe randomly sample a batch of expert data.

Okay. Thank you.