Reward Smoothing
WangChen100 opened this issue · 0 comments
WangChen100 commented
Hi, Arthur. What do you think about reward smoothing?
The collected rewards have high variance. To show the trend of the reward curve, should we apply some reward smoothing, similar to the smoothing in TensorBoard?
If so, which smoothing method should I choose: exponential smoothing or moving-average smoothing?
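To make the two options concrete, here is a minimal sketch of what I mean, applied only to the plotted curve and not to the rewards used for training. The function names, the `weight` and `window` parameters, and their defaults are my own assumptions for illustration; the exponential version is an ordinary exponential moving average, which I believe is similar in spirit to the TensorBoard smoothing slider but is not guaranteed to match it exactly.

```python
from collections import deque


def exponential_smoothing(rewards, weight=0.6):
    """Exponential moving average of a reward sequence.

    `weight` (hypothetical parameter, 0 <= weight < 1) controls how much
    of the previous smoothed value is kept; higher means smoother.
    """
    smoothed = []
    last = None
    for r in rewards:
        # The first point is passed through unchanged.
        last = r if last is None else weight * last + (1.0 - weight) * r
        smoothed.append(last)
    return smoothed


def moving_average(rewards, window=3):
    """Trailing moving average over the last `window` rewards.

    Early points are averaged over however many rewards exist so far.
    """
    smoothed = []
    recent = deque(maxlen=window)
    total = 0.0
    for r in rewards:
        if len(recent) == window:
            # The oldest value is about to be evicted by the append below.
            total -= recent[0]
        recent.append(r)
        total += r
        smoothed.append(total / len(recent))
    return smoothed


if __name__ == "__main__":
    episode_rewards = [1.0, 5.0, 2.0, 8.0, 3.0]  # made-up values
    print(exponential_smoothing(episode_rewards, weight=0.6))
    print(moving_average(episode_rewards, window=3))
```

As far as I understand, the exponential version weights recent episodes more heavily and needs only one parameter, while the moving average weights the last `window` episodes equally.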