How are the 1 and -1 rewards used?

Question

How are the 1 and -1 rewards used?

guotong1988 opened this issue 8 years ago · 1 comments

guotong1988 commented 8 years ago

I see from here that all the rewards are added to the deque, so the 1 and -1 rewards are only used when they happen to be sampled from it. Do you think this might make training slow?

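Put another way (translated from Chinese): are the transitions with a reward of 1 or -1 also all stored in the deque? If so, isn't the probability of sampling them very low, so that the feedback arrives very slowly?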
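To make the concern concrete, here is a minimal sketch of uniform sampling from a replay deque. It is not the repository's code: the tuple layout, the reward schedule (0.1 on ordinary frames, 1 and -1 on rare frames), the buffer size, and the helper names `fraction_with_sparse_reward` and `expected_sparse_per_batch` are all assumptions made for illustration.

```python
import random
from collections import deque

def fraction_with_sparse_reward(memory):
    """Fraction of stored transitions whose reward is exactly 1 or -1."""
    sparse = sum(1 for (_, _, reward, _, _) in memory if reward in (1, -1))
    return sparse / len(memory)

def expected_sparse_per_batch(memory, batch_size):
    """Expected number of 1/-1 transitions in one uniformly sampled minibatch."""
    return batch_size * fraction_with_sparse_reward(memory)

# Hypothetical replay memory: (state, action, reward, next_state, terminal).
memory = deque(maxlen=50000)
for step in range(1000):
    if step % 50 == 49:
        reward = -1      # assumed: the episode ends on this frame
    elif step % 25 == 24:
        reward = 1       # assumed: an obstacle is passed on this frame
    else:
        reward = 0.1     # assumed: small reward for an ordinary frame
    memory.append((None, 0, reward, None, reward == -1))

# Uniform sampling: every stored transition is equally likely to be drawn,
# regardless of its reward.
minibatch = random.sample(memory, 32)

print(fraction_with_sparse_reward(memory))
print(expected_sparse_per_batch(memory, 32))
```

With this made-up schedule, 4% of the stored transitions carry a 1 or -1 reward, so a uniformly sampled minibatch of 32 contains a little more than one of them on average. The rare rewards are still seen in most batches, just not often; how much that slows learning in practice depends on the real reward frequency and batch size.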

@songrotek Thank you.

Answer 1 · 2017-04-12T11:27:29.000Z

yenchenlin/DeepLearningFlappyBird#32