awjuliani/DeepRL-Agents

Reward Smoothing

WangChen100 opened this issue · 0 comments

Hi, Arthur. What do you think about reward smoothing?
The collected rewards have high variance. To show the trend of the reward curve, should we apply a smoothing operation similar to TensorBoard's smoothing?
If so, which smoothing method should I choose: exponential smoothing or a moving average?
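
To make the question concrete, here is a minimal sketch of the two options I mean. It is not code from this repository: the function names `exponential_smoothing` and `moving_average` and the parameters `weight` and `window` are hypothetical, and seeding the exponential average with the first reward is my own assumption rather than a claim about how TensorBoard does it internally.

```python
import numpy as np


def exponential_smoothing(rewards, weight=0.9):
    """Exponential moving average of a reward sequence.

    A higher `weight` (between 0 and 1) gives a smoother curve that
    reacts more slowly to new rewards.
    """
    rewards = np.asarray(rewards, dtype=float)
    if rewards.size == 0:
        return rewards
    smoothed = np.empty_like(rewards)
    last = rewards[0]  # assumption: seed with the first reward
    for i, r in enumerate(rewards):
        last = weight * last + (1.0 - weight) * r
        smoothed[i] = last
    return smoothed


def moving_average(rewards, window=10):
    """Trailing mean over the most recent `window` rewards.

    Early entries average over however many rewards are available,
    so the output has the same length as the input.
    """
    rewards = np.asarray(rewards, dtype=float)
    # Prefix sums let each window mean be computed in constant time.
    cumsum = np.cumsum(np.insert(rewards, 0, 0.0))
    out = np.empty_like(rewards)
    for i in range(len(rewards)):
        start = max(0, i + 1 - window)
        out[i] = (cumsum[i + 1] - cumsum[start]) / (i + 1 - start)
    return out
```

Either function could be applied to the list of per-episode rewards before plotting, e.g. `plt.plot(exponential_smoothing(episode_rewards, weight=0.9))`, where `episode_rewards` is a placeholder name. As far as I understand, the exponential version weights recent episodes more heavily, while the moving average weights the last `window` episodes equally.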