keon/policy-gradient
Minimal Monte Carlo Policy Gradient (REINFORCE) Algorithm Implementation in Keras
Python · MIT
Issues
- 0
Train agent process error
#6 opened by SZH1230456
- 3
Incorrect normalising of discounted rewards
#5 opened by tall-josh
- 2
- 1
Why normalize predicted probabilities?
#3 opened by abhigenie92
- 1
Minor Questions
#1 opened by abhigenie92