Fix categorical policy
keiohta opened this issue · 0 comments
The current implementation of `CategoricalActor` has some bugs that need to be fixed. So far I have found the following:
- The input to `tfp.distributions.Categorical` is wrong.
- The computation of the log probability in `call` is wrong.
There could be other issues as well, so the fix needs to be evaluated on several discrete-action environments.
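The issue does not say exactly what is wrong, so the following is only a sketch of one common pitfall, not a diagnosis of this code. The first positional argument of `tfp.distributions.Categorical` is `logits`, so passing softmax probabilities positionally makes them be treated as logits and changes the distribution. The pure-Python example below (the helper names `log_softmax` and `categorical_log_prob` are hypothetical, chosen for illustration) shows how the log probability of an action differs in that case:

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of unnormalized scores."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def categorical_log_prob(logits, action):
    """Log probability of `action` under a categorical distribution given logits."""
    return log_softmax(logits)[action]


# Hypothetical raw network outputs for three discrete actions.
logits = [2.0, 0.5, -1.0]
probs = [math.exp(lp) for lp in log_softmax(logits)]

# Correct: the raw outputs are interpreted as logits.
correct = categorical_log_prob(logits, 0)

# Pitfall (assumed, not confirmed for this repo): probabilities are passed
# where logits are expected, so softmax is effectively applied twice.
wrong = categorical_log_prob(probs, 0)

print(correct, wrong)
```

With TensorFlow Probability, passing the argument by keyword (`logits=...` or `probs=...`) makes the intended interpretation explicit, and the log probability can then be compared against a reference computation like the one above.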