vwxyzjn/cleanrl

PPO improvements

vwxyzjn opened this issue · 0 comments

Problem Description

The current PPO implementations can be improved in the following ways.

changes that do not affect performance

changes that do affect performance (these require re-running openrlbenchmark)