iosband/TabulaRL

Reward in riverSwim


The reward for taking action 0 in state 0 for riverSwim should be 0.005.
In the code, however, it is set as R_true[0, 0] = (5 / 1000, 0).
Python 2.7 evaluates 5 / 1000 with integer division, which gives 0, so the reward is silently lost.
Please change it to 5.0 / 1000.0.
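
A minimal sketch of the problem, written so it behaves the same on Python 2.7 and Python 3. It uses `//` to reproduce what Python 2.7 does with `5 / 1000`. The dictionary below is only a hypothetical stand-in for the repository's `R_true`; the tuple layout simply mirrors the line quoted above.

```python
# Python 2.7 evaluates 5 / 1000 with integer (floor) division.
# In Python 3 the same behaviour is spelled 5 // 1000.
py2_style_reward = 5 // 1000      # what Python 2.7 computes for 5 / 1000
intended_reward = 5.0 / 1000.0    # float division on every Python version

# Hypothetical stand-in for R_true, keyed by (state, action).
R_true = {}
R_true[0, 0] = (intended_reward, 0)

print(py2_style_reward)  # 0
print(R_true[0, 0])      # (0.005, 0)
```

An alternative that avoids touching each literal would be to put `from __future__ import division` at the top of the module, which makes `/` perform true division in Python 2.7 as well.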