Actor update equation in DDPG
Closed this issue · 2 comments
Deleted user commented
Hi,
When you use the gradient of the critic to update the actor here, why do you pass "-action_gdts" instead of "action_gdts" as the third parameter of tf.gradients()? Where does the minus sign come from?
I double-checked the formula and I still don't see why this is the case in your code.
Thanks!
germain-hug commented
Apologies for the late response. The negation here is a simple trick to compute the actor's gradients with a standard optimizer: tf.gradients(ys, xs, grad_ys) returns grad_ys · ∂ys/∂xs, so passing -action_gdts yields -∂Q/∂a · ∂μ/∂θ. Since TensorFlow optimizers minimize, applying that negated gradient performs gradient ascent on Q, which is exactly the deterministic policy gradient update. This is similar to OpenAI's baselines.
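To see why the minus sign gives ascent, here is a minimal NumPy-only sketch (hypothetical toy problem, not the repo's code): a linear actor a = w·s and a concave critic Q(s, a) = -(a - 2)², maximized at a = 2. Negating the critic's action gradient before handing it to a minimizer-style update (w -= lr · grad) drives the action toward the critic's maximum:

```python
import numpy as np

s = 1.0   # fixed state
w = 0.0   # actor parameter
lr = 0.1

def dQ_da(a):
    # gradient of the toy critic Q(s, a) = -(a - 2)^2 w.r.t. the action
    return -2.0 * (a - 2.0)

for _ in range(100):
    a = w * s
    action_gdts = dQ_da(a)               # analogous to tf.gradients(Q, action)
    # Analogue of tf.gradients(actor_out, weights, -action_gdts):
    # chain rule with the *negated* action gradient.
    grad_for_minimizer = s * (-action_gdts)
    # A minimizer subtracts the gradient, so the net effect is ascent on Q.
    w -= lr * grad_for_minimizer

print(round(w * s, 3))  # → 2.0, the action that maximizes the critic
```

Without the negation, the same minimizer-style update would descend on Q and push the action away from the maximum.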
Deleted user commented
Thanks!