The KL loss calculation seems to have a mistake.

Question

The KL loss calculation seems to have a mistake.

Nightbringers opened this issue a year ago · 1 comment

code:
kl_div_loss = masked_kl_div(action_probs, old_action_probs, mask = action_masks) * self.kl_div_loss_weight

I think old_action_probs should be y(true) and action_probs should be y(pred), so I think the correct code should be this:
kl_div_loss = masked_kl_div(old_action_probs, action_probs, mask = action_masks) * self.kl_div_loss_weight

Am I right, or am I misunderstanding something?

Answer 1 · 2023-03-22T13:22:48.000Z

No, I think you may be correct. Will make the change! 🙏
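
For anyone reading later, here is a minimal sketch of why the argument order matters: KL divergence is not symmetric, so swapping the two distributions generally changes the loss value. The kl_div helper below is a hypothetical plain-Python stand-in written for this illustration. It is not the repository's masked_kl_div, whose definition and argument convention are not shown in this thread, and the probability values are made up.

```python
import math

def kl_div(p, q, eps=1e-12):
    """Hypothetical helper: KL(p || q) for two discrete distributions.

    p is treated as the reference ("true") distribution and q as the
    approximation ("pred"). eps guards against log(0).
    """
    return sum(
        pi * (math.log(pi + eps) - math.log(qi + eps))
        for pi, qi in zip(p, q)
    )

# Made-up example distributions over three actions.
old_action_probs = [0.7, 0.2, 0.1]
action_probs = [0.5, 0.3, 0.2]

# KL(old || new): old policy is the reference distribution.
kl_old_new = kl_div(old_action_probs, action_probs)

# KL(new || old): the arguments are swapped.
kl_new_old = kl_div(action_probs, old_action_probs)

print("KL(old || new):", kl_old_new)
print("KL(new || old):", kl_new_old)
```

The two printed values differ, so the order passed to the loss function is not interchangeable. Which order is correct for masked_kl_div depends on how that function treats its first and second arguments.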