keiohta/tf2rl

Fix categorical policy

keiohta opened this issue · 0 comments

The current implementation of CategoricalActor contains some bugs that should be fixed. So far I have found at least the following:

  • The input passed to tfp.distributions.Categorical is wrong.
  • The computation of the log probability in call is wrong.
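
The issue does not say what exactly is wrong in either case, so the following is only a minimal sketch of what a correct categorical log probability looks like, not the actual fix. It uses plain Python instead of TensorFlow so it can be run on its own; the helper names log_softmax and categorical_log_prob are hypothetical and do not exist in tf2rl. One common pitfall, assumed here purely for illustration, is passing probabilities where logits are expected: the first positional argument of tfp.distributions.Categorical is logits, and probabilities have to be passed as probs=... instead.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of unnormalized logits."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def categorical_log_prob(logits, action):
    """Log probability of `action` under a categorical distribution
    parameterized by unnormalized `logits` (hypothetical helper)."""
    return log_softmax(logits)[action]


# Assumed example distribution over three discrete actions.
probs = [0.7, 0.2, 0.1]
logits = [math.log(p) for p in probs]

# Correct: logits in, log(probs[action]) out.
correct = categorical_log_prob(logits, 0)

# Pitfall (assumed for illustration): probabilities treated as logits,
# which silently yields a different, flatter distribution.
wrong = categorical_log_prob(probs, 0)

print(correct, math.log(0.7))
print(wrong)
```

A check of this kind (comparing the actor's log probability against log(p) for a known p) could serve as a unit test for CategoricalActor once the two points above are fixed.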

There could be other issues as well, so the fix needs to be evaluated on several discrete-action environments.