Rewards clipping or scaling
inoryy opened this issue · 1 comment
inoryy commented
Need to investigate if clipping or scaling rewards improves performance.
Does it even make sense if I'm already clipping grads?
How will the agent know that one action is better than another if both get reward = 1?
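To make that last question concrete, here is a minimal sketch, assuming NumPy. The helper names `clip_rewards` and `scale_rewards` and the sample reward values are hypothetical and not taken from the agent's code. It only illustrates the difference: clipping collapses every reward above the bound to the same value, while dividing by a constant keeps the ordering between rewards.

```python
import numpy as np


def clip_rewards(rewards, low=-1.0, high=1.0):
    """Clip each reward into [low, high]; larger rewards all become `high`."""
    return np.clip(np.asarray(rewards, dtype=np.float64), low, high)


def scale_rewards(rewards, scale):
    """Divide each reward by a positive constant; relative ordering is kept."""
    return np.asarray(rewards, dtype=np.float64) / scale


# Hypothetical raw rewards for three different actions.
rewards = np.array([1.0, 5.0, 100.0])

clipped = clip_rewards(rewards)              # all three collapse to 1.0
scaled = scale_rewards(rewards, scale=100.0)  # 0.01, 0.05, 1.0

print("clipped:", clipped)
print("scaled: ", scaled)
```

With clipping, the three actions become indistinguishable from the reward alone; with scaling, the agent can still rank them, although the magnitudes now depend on the chosen constant.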
inoryy commented
Got an answer from DM: they don't use any reward clipping/scaling.