cyoon1729/RLcycle

A2C loss calculations

Closed · 1 comment

https://github.com/cyoon1729/Reinforcement-learning/blob/d157d0d86c37734be4b430b7d311eb9bb0379d93/Policy-Gradient-Methods/a2c/a2c.py#L42

Could you please explain why `value_targets = rewards + discounted_rewards`?
Why do the rewards and the discounted rewards need to be added together?

That was a mistake on my part, and sorry for the late reply. I have fixed it in the recent refactoring!
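For anyone landing here later: the corrected code is not quoted in this thread, so the following is only a minimal sketch of how A2C value targets are commonly computed, assuming the targets are plain discounted returns G_t = r_t + gamma * G_{t+1}. The names `discounted_returns`, `gamma`, and `bootstrap_value` are illustrative and not taken from the repository. If `discounted_rewards` already includes the immediate reward r_t, adding `rewards` on top of it would count r_t twice.

```python
def discounted_returns(rewards, gamma=0.99, bootstrap_value=0.0):
    """Compute G_t = r_t + gamma * G_{t+1} for every step of one trajectory.

    bootstrap_value is a hypothetical critic estimate V(s_T) for the state
    after the last step; use 0.0 when the episode actually terminated.
    """
    returns = []
    running = bootstrap_value
    # Walk the trajectory backwards so each return reuses the next one.
    for r in reversed(rewards):
        running = r + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


rewards = [1.0, 1.0, 1.0]
# The discounted returns are the value targets by themselves;
# the immediate reward is already included in each G_t.
value_targets = discounted_returns(rewards, gamma=0.5)
print(value_targets)  # [1.75, 1.5, 1.0]
```

The critic would then be regressed toward `value_targets`, and the advantage used for the actor is typically `value_targets - values`, where `values` stands for the critic's predictions at the same steps.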